[R] the difference between "-" and "!" between base and data.table package
Carl Sutton
suttoncarl at ymail.com
Sun Apr 16 02:18:43 CEST 2017
Hi
I normally use the data.table package, but today I was doing some base R coding and hit a problem that took me a while to resolve. I was trying to split a data frame into train and test sets, and in base R I used "!" to exclude the training-set indices from the data frame. All I got was zero observations. When I changed to "-" it worked. I recalled that "!" works in data.table, so I put together this little bit of code.
# Base R Functions
str(mtcars)
train_indices <- sample(nrow(mtcars), round(0.75*nrow(mtcars)))
train <- mtcars[train_indices,]
mode(train_indices); class(train_indices)
test <- mtcars[!train_indices,] # the "!" function returning 0 observations
test_1 <- mtcars[-train_indices,]
identical(test, test_1)
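A minimal sketch of why the base R line above returns zero rows, using only base functions. Applied to an integer vector, "!" is logical negation: every non-zero index is coerced to TRUE and then negated to FALSE, so the subset selects nothing. The names idx, neg, keep and the seed are made up for illustration.

```r
set.seed(1)                                  # arbitrary seed, only for reproducibility
idx <- sample(nrow(mtcars), round(0.75 * nrow(mtcars)))

# "!" coerces each non-zero integer to TRUE, then negates it
neg <- !idx
any(neg)                                     # no element is TRUE

# An all-FALSE logical index is recycled over the rows and selects none of them
nrow(mtcars[neg, ])

# Two base R ways to get the complement of the sampled rows
test_minus <- mtcars[-idx, ]                 # negative integer index
keep <- !(seq_len(nrow(mtcars)) %in% idx)    # logical mask, one entry per row
test_mask <- mtcars[keep, ]
identical(test_minus, test_mask)
```

The logical mask is the more defensive of the two: if idx were ever empty, mtcars[-idx, ] would return zero rows, whereas the mask would keep every row.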
# Using data.table package
library(data.table)
dt1 <- data.table(mtcars)
train_indices <- sample(nrow(dt1), round(0.75*nrow(dt1)))
train <- dt1[train_indices,]
mode(train_indices); class(train_indices)
test <- dt1[!train_indices,] # here "!" excludes the training rows
test_1 <- dt1[-train_indices,]
identical(test, test_1)
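A hedged sketch of the data.table side, assuming the data.table package is installed. Its documentation for the i argument says that i may be prefixed with "!" to request a not-join or not-select, so "!" in that position is not evaluated as base R's logical negation. The names dt, idx, not_rows, minus_rows and the seed are made up for illustration.

```r
library(data.table)

set.seed(1)                          # arbitrary seed, only for reproducibility
dt <- as.data.table(mtcars)          # row names are dropped by default
idx <- sample(nrow(dt), round(0.75 * nrow(dt)))

not_rows <- dt[!idx]                 # "!" prefix on i: all rows other than idx
minus_rows <- dt[-idx]               # negative integer index, as in base R

# Both forms should leave the same held-out rows
nrow(not_rows)
nrow(minus_rows)
all.equal(not_rows, minus_rows)
```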
As I read the documentation, base R appears to accept "!" here, so have I made some kind of ridiculous error, or ..??
Carl Sutton