[R] Filtering out a data.frame
Erik Iverson
eriki at ccbr.umn.edu
Tue Jun 8 06:18:33 CEST 2010
Jeff08 wrote:
> Sample Data.Frame format
>
> Name is Returns.nodup
>
>             X       id ticker      date_ adjClose totret RankStk
> 427225 427225 00174410    AHS 2001-11-13    21.66    100    1235
>
>
> "id" uniquely defines a row
>
>
> What I am trying to do is filter out ids that have fewer than 1500 data
> points (by date)
>
> First, I used
>
> total<-by(Returns.nodup, Returns.nodup$id,nrow)
>
> which subsetted by ID and calculated the number of data points for each ID
>
> Now I am trying to figure out a way to use this to filter out the original
> data.frame (Returns.nodup)
>
> I have tried using the following, but it is VERY slow:
>
> z<-unlist(lapply(1:length(y), function(i) which(a$id==y[i]) ))
> Returns.filtered<-Returns.nodup[z,]
>
> Is there a faster way to do this?
>
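Most likely, yes, but without a reproducible example it is difficult to reason
about the problem. Can you please give us one?
If not, my guess is that you can cobble something together using ?table and
?"%in%": count the rows per id with table(), then keep only the rows whose id
is %in% the set of ids that have enough rows.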
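
For what it's worth, here is a minimal sketch of the table() / %in% idea. It is
untested against the real data: the toy data frame, the min.obs threshold and
the keep.ids name are all made up for illustration, and only the id column from
the original post is assumed to exist.

```r
# Toy stand-in for Returns.nodup (made-up values; only the id column matters)
Returns.nodup <- data.frame(
  id       = c("A", "A", "A", "B", "C", "C"),
  adjClose = c(21.66, 21.70, 21.50, 10.00, 5.00, 5.10),
  stringsAsFactors = FALSE
)

min.obs <- 2  # hypothetical threshold; the original post would use 1500

# Number of rows for each id
counts <- table(Returns.nodup$id)

# ids that have at least min.obs rows
keep.ids <- names(counts)[counts >= min.obs]

# One vectorised membership test instead of one which() call per id
Returns.filtered <- Returns.nodup[Returns.nodup$id %in% keep.ids, ]
```

With the toy data this keeps the rows for ids "A" and "C" and drops the single
row for "B". The likely speed-up comes from %in% doing one lookup over the
whole id column, rather than scanning the column once per id as the lapply()
version does.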