[R] dplyr and function length()

peter dalgaard pdalgd at gmail.com
Tue Aug 4 11:06:44 CEST 2015


On 04 Aug 2015, at 10:50 , Karl Schilling <karl.schilling at uni-bonn.de> wrote:

> Dear All,
> 
> I have an observation / question about how the function length() works once package dplyr is loaded.
> 
> Say we have a data.frame  df with n rows and m columns. Then a way to get the number of rows is to use
> 
> length(df$m1)  (m1 here stands in for the header of the first column)
> 
> or, alternatively
> 
> length(df[,1]).
> 
> Both commands will return n.
> 
> However, once dplyr is loaded,
> 
> length(df[,1]) will return a value of 1.
> 
> length(df$m1) and also length(df[[1]]) will correctly return n.
> 
> I know that using length() may not be the most elegant or efficient way to get the value of n. However, what puzzles (and somewhat disturbs) me is that loading of dplyr affects how length() works, without there being a warning or masking message upon loading it.
> 
> Any clarification or comment would be welcome.

Presumably, dplyr changes how [.data.frame works (by altering the default for drop=, I expect), so that df[,1] is a data frame with one variable and not a vector. The length() of a data frame is its number of columns, hence the 1. And yes, that _is_ somewhat disturbing.
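
For illustration, the effect of drop= can be reproduced in base R alone, without loading dplyr. This is only a sketch: the data frame df and its columns m1 and m2 are made up, and it shows what length() reports for a vector versus a one-column data frame, not what dplyr actually does internally.

```r
# Base R only; `df`, `m1` and `m2` are hypothetical names for illustration.
df <- data.frame(m1 = c(10, 20, 30, 40),
                 m2 = c("a", "b", "c", "d"))

v  <- df[, 1]                # default drop = TRUE: result is a plain vector
d1 <- df[, 1, drop = FALSE]  # drop = FALSE: result stays a one-column data frame

length(v)        # 4: number of elements, i.e. number of rows
length(d1)       # 1: length() of a data frame counts its columns
length(df[[1]])  # 4: [[ extracts the column itself
nrow(df)         # 4: does not depend on how the column is extracted
```

If the aim is the number of rows, nrow(df) avoids the question of whether the subset came back as a vector or as a data frame.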

-pd 

> 
> Thank you so much,
> 
> Karl
> 
> 
> -- 
> Karl Schilling
> 
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


