[R] Count of rows while looping through data

jim holtman jholtman at gmail.com
Fri May 27 21:45:52 CEST 2011


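When you subset, the factors carry along all of their original levels.
split() makes one piece per level, so families that are absent from the
subset still come back as empty pieces. You can drop the unused levels
during your processing by re-creating the factor: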

x$fac <- factor(x$fac)
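
If several columns are factors, base R's droplevels() does the same thing for every factor column of a data frame in one call. A minimal sketch, assuming a made-up data frame `d` (the names `d` and `sub` are only for illustration, not from the original question):

```r
# Hypothetical data; stringsAsFactors = TRUE forces factor columns,
# which is no longer the default in R >= 4.0.0
d <- data.frame(fam = c("a", "a", "b"), grp = c("1", "2", "3"),
                stringsAsFactors = TRUE)

sub <- d[d$fam == "b", ]        # subsetting keeps all original levels
levels_before <- nlevels(sub$fam)

sub <- droplevels(sub)          # drop unused levels in every factor column
levels_after <- nlevels(sub$fam)

print(c(before = levels_before, after = levels_after))
```

This avoids repeating factor() for each column, at the cost of touching every factor in the data frame.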

> x <- data.frame(fam=c('a','a','b'), grp=c('1','2','3'),
+                 stringsAsFactors=TRUE)  # needed in R >= 4.0.0 to get factors
> # split
> x.s <- split(x, x$fam)
> # notice additional levels
> str(x.s$b)
'data.frame':   1 obs. of  2 variables:
 $ fam: Factor w/ 2 levels "a","b": 2
 $ grp: Factor w/ 3 levels "1","2","3": 3
>
> z <- x.s$b
> str(z)
'data.frame':   1 obs. of  2 variables:
 $ fam: Factor w/ 2 levels "a","b": 2
 $ grp: Factor w/ 3 levels "1","2","3": 3
> z$grp <- factor(z$grp)  # remove extra levels
> str(z)
'data.frame':   1 obs. of  2 variables:
 $ fam: Factor w/ 2 levels "a","b": 2
 $ grp: Factor w/ 1 level "3": 1
>
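
For the split step itself, one possible alternative is the drop argument of split(), which omits pieces for levels of the grouping factor that have no rows. This is only a sketch on invented data (`d`, `sub`, and the column `n` are assumptions, not the poster's real data):

```r
# Hypothetical data with three families; stringsAsFactors = TRUE
# forces factor columns in R >= 4.0.0
d <- data.frame(fam = c("a", "a", "b", "c"), n = 1:4,
                stringsAsFactors = TRUE)

sub <- d[d$fam != "c", ]        # subset without family "c"

# Default: one piece per level, including an empty piece for "c"
all_pieces <- names(split(sub, sub$fam))

# drop = TRUE: only levels that actually occur in the subset
used_pieces <- names(split(sub, sub$fam, drop = TRUE))

print(all_pieces)
print(used_pieces)
```

Note that drop = TRUE only affects which pieces are returned. The factor columns inside each piece still carry the full set of levels, so factor() or droplevels() is still needed if those matter later.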


On Fri, May 27, 2011 at 12:20 PM, Jeanna <stroutj at uw.edu> wrote:
> I may have gotten excited prematurely...
>
> I ended up using the split method since my family indicators are
> alphanumeric. My issue is as follows.
>
> I'm applying this to different subsets of my main data set.  The subsets do
> not contain all families.  When I run the method on one of my subsets I get
> back a table that includes ALL the families.  Those that weren't in the
> subset to which I applied the method have <NA> for all of the fields.
>
> If I export one of the subsets, restart R (to be certain nothing of my
> original playtime is left) and import only the subset, the method works
> perfectly.
>
> The addition of the previously removed rows seems to happen at the 'split'
> step.
>
> Is there something I'm doing incorrectly?  I can't seem to figure out how to
> convince R not to look at my original data frame when deciding how many
> families there are.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


