[R] Confused about using data.table package,

David Winsemius dwinsemius at comcast.net
Sun Feb 19 22:01:36 CET 2017


> On Feb 19, 2017, at 11:37 AM, C W <tmrsg11 at gmail.com> wrote:
> 
> Hi R,
> 
> I am a little confused by the data.table package.
> 
> library(data.table)
> 
> df <- data.frame(w=rnorm(20, -10, 1), x= rnorm(20, 0, 1), y=rnorm(20, 10, 1),
> z=rnorm(20, 20, 1))
> 
> df <- data.table(df)

  setDT(df) is preferred. It converts df in place, by reference, so no copy is made and no assignment is needed.
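
A minimal sketch of the difference, using a small made-up data frame named `d` (hypothetical, not from the original post). data.table() builds a new object, while setDT() converts the existing one where it sits:

```r
library(data.table)

# Hypothetical toy data, for illustration only
d <- data.frame(x = 1:3, y = c(2, 1, 5))

d_copy <- data.table(d)      # builds a new data.table; d is still a plain data.frame
was_df <- !is.data.table(d)  # TRUE at this point

setDT(d)                     # converts d itself by reference; no `<-` needed
now_dt <- is.data.table(d)   # TRUE after the conversion
```

For a 20-row example the saving is negligible; it starts to matter when the data frame is large enough that a full copy is expensive.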
> 
> #drop column w
> 
> df_1 <- df[, w := NULL] # I thought you are supposed to do: df_1 <- df[, -w]

Nope. The "[.data.table" function is very different from the "[.data.frame" function. An expression in the `j` position of "[.data.table" is evaluated in the environment of the data.table object, so unquoted column names refer to the columns themselves, and what comes back is the result of whatever function you apply to them. Here that function is just a unary minus, so df[, -w] would return the negated values of w.

Actually "nope" on two counts. You cannot use a unary minus with column names in "[.data.frame" either. It would have needed to be df[ , !colnames(df) %in% "w"]  # logical indexing
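
To make the contrast concrete, here is a hedged sketch on made-up data (the names `d`, `dt`, and the result variables are hypothetical). It shows logical indexing on a data.frame, what a unary minus in `j` actually does in a data.table, and one way to get a copy without a column:

```r
library(data.table)

# Hypothetical toy data, for illustration only
d <- data.frame(w = 1:3, x = 4:6, y = 7:9)

# data.frame: select columns with a logical vector over the column names
d_no_w <- d[, !colnames(d) %in% "w"]

dt <- as.data.table(d)

# data.table: `j` is evaluated with the columns in scope,
# so -w is simply the negated values of column w (a plain vector)
neg_w <- dt[, -w]

# data.table: return a new table without w, leaving dt itself untouched
dt_no_w <- dt[, !"w", with = FALSE]
```

Unlike dt[, w := NULL], the last form does not modify dt, so it is closer to what the data.frame idiom does.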


> 
> df_2 <- df[x<y] # aren't you supposed to do df_2 <- df[x<y]?

I don't see a difference. 

> 
> df_3 <- df[, a := x-y] # created new column a using x minus y, why are we
> using colon equals?

You need to study the extensive documentation more closely. The behavior of the ":=" function, which adds, updates, or removes columns by reference, is discussed in detail there (see ?":=").
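
As a rough illustration of what "by reference" means here, a sketch on made-up data (the names `dt`, `dt_3`, and `dt_4` are hypothetical). The assignment on the left of `<-` does not create a second table:

```r
library(data.table)

# Hypothetical toy data, for illustration only
dt <- data.table(x = c(1, 2, 3), y = c(3, 2, 1))

# := adds column a to dt in place and returns dt invisibly,
# so dt_3 is another name for the same table, not a new one
dt_3 <- dt[, a := x - y]
same_cols <- identical(names(dt), names(dt_3))

# take an explicit copy first when the original must stay unchanged
dt_4 <- copy(dt)[, b := x + y]
```

The same applies to the code in the question: after df_1 <- df[, w := NULL], column w is gone from df as well as from df_1.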

> 
> I am a bit confused by this syntax.

It's non-standard for R but many people find the efficiencies of the package worth the extra effort to learn what is essentially a different evaluation strategy.


> 
> Thanks!
> 
> 	[[alternative HTML version deleted]]

R-help is a plain-text mailing list.

-- 
David

David Winsemius
Alameda, CA, USA


