[R] the difference between "-" and "!" between base and data.table package

Carl Sutton suttoncarl at ymail.com
Sun Apr 16 02:18:43 CEST 2017


Hi,

I normally use the data.table package, but today I was doing some base R coding and had a problem for a bit, which I finally resolved.  I was attempting to split a data frame into train and test sets, and in base R I was using "!" to exclude the training set indices from the data frame.  All I was getting was zero observations.  I changed to using "-" and it worked.  I recalled that in data.table the "!" function worked, so I created this little bit of code.

#  Base R Functions
str(mtcars)
train_indices <- sample(nrow(mtcars), round(0.75*nrow(mtcars)))
train <- mtcars[train_indices,]
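mode(train_indices); class(train_indices)  #  "numeric"; "integer"
test <- mtcars[!train_indices,]  #  the "!" function returning 0 observations
test_1 <- mtcars[-train_indices,]  #  the "-" function returning the 8 rows not in train
identical(test, test_1)  #  FALSE, since test has 0 rows and test_1 has 8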

#  Using data.table package
library(data.table)
dt1 <- data.table(mtcars)
train_indices <- sample(nrow(dt1), round(0.75*nrow(dt1)))
train <- dt1[train_indices,]
mode(train_indices); class(train_indices)
test <- dt1[!train_indices,]  #  the "!" function
test_1 <- dt1[-train_indices,]
identical(test, test_1)

The documentation appears to me to accept "!" in base R, so am I making some kind of ridiculous error, or is something else going on?
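
For what it is worth, here is a minimal sketch of what base R seems to be doing in the first block, assuming the usual coercion rules: "!" turns a numeric vector into a logical one (any nonzero value becomes TRUE) before negating it, so every element ends up FALSE and no rows are selected. The names idx and mask are hypothetical stand-ins, not part of the code above. data.table appears to treat a leading "!" in i differently, which would explain why both forms agree there.

```r
# Hypothetical toy index, standing in for train_indices
idx <- c(2L, 5L, 7L)

# "!" coerces the integers to logical first: nonzero values are TRUE,
# so the negation is FALSE for every element
neg <- !idx
print(neg)

# An all-FALSE logical index is recycled over the rows and selects none
n_bang <- nrow(mtcars[neg, ])

# "-" keeps the values numeric, so base R drops rows 2, 5 and 7
n_minus <- nrow(mtcars[-idx, ])

# With a logical mask over all rows, "!" does work in base R
mask <- seq_len(nrow(mtcars)) %in% idx
n_mask <- nrow(mtcars[!mask, ])

print(c(n_bang, n_minus, n_mask))
```

So in base R, "!" looks like the right tool for a logical mask and "-" for a vector of row positions.
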
Carl Sutton


