[R] Data Manipulation using R
Stephen Tucker
brown_emu at yahoo.com
Wed Apr 18 19:50:29 CEST 2007
...is this what you're looking for?
donedat <- subset(data,ID < 6000 | ID >= 7000)
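neg <- unique(rapply(donedat[c(3,4,6,8)],function(x)
which( x < 0 )))
findat <- if (length(neg)) donedat[-neg,,drop=FALSE] else donedat
The second line looks through columns 3, 4, 6 and 8 and finds the row indices
of negative values - rapply() returns all of them as a vector; unique() removes
duplicated elements. With negative indexing, the third line removes these rows
from donedat. If there are no negative values, donedat is returned unchanged,
because donedat[-integer(0),] would select zero rows rather than all of them.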
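
The same two steps can also be sketched with logical indexing in place of
negative indexing, which keeps every row when nothing is negative. This is only
a sketch on a made-up toy data frame: the names dat, V2 to V8, chk and has_neg
are assumptions for illustration, not part of the original data, and rows with
NA in a checked column are kept (which() ignores NA in the same way).

```r
# Toy stand-in for the poster's data; the column names V2..V8 are made up.
dat <- data.frame(
  ID = c(1500, 5999, 6000, 6999, 7000, 8200),
  V2 = c(-1,  2,  3,  4,  5,  6),   # not a checked column
  V3 = c( 1,  2,  3,  4, -4,  6),
  V4 = c( 1,  2,  3,  4,  5,  6),
  V5 = c( 1, -2,  3,  4,  5,  6),   # not a checked column
  V6 = c( 1,  2,  3,  4,  5,  6),
  V7 = c( 1,  2,  3,  4,  5,  6),
  V8 = c( 1,  2,  3,  4,  5, NA)
)

# Step 1: drop IDs from 6000 through 6999 with a single subset() call.
donedat <- subset(dat, ID < 6000 | ID >= 7000)

# Step 2: flag rows with a negative value in columns 3, 4, 6 or 8.
# Comparing a data frame with 0 gives a logical matrix; NA counts as
# "not negative" because of na.rm = TRUE.
chk <- c(3, 4, 6, 8)
has_neg <- rowSums(donedat[chk] < 0, na.rm = TRUE) > 0

# Logical indexing keeps all rows when no row is flagged.
findat <- donedat[!has_neg, , drop = FALSE]
```

On the toy data, only the row with ID 7000 is dropped in step 2 (column 3 holds
-4); the negatives in columns 2 and 5 are ignored because those columns are not
checked.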
--- Anup Nandialath <anup_nandialath at yahoo.com> wrote:
> Dear Friends,
>
> I have a data set with around 220,000 rows and 17 columns. One of the columns
> is an id variable which is grouped from 1000 through 9000. I need to
> perform the following operations.
>
> 1) Remove all the observations with id's between 6000 and 6999
>
> I tried using this method.
>
> remdat1 <- subset(data, ID<6000)
> remdat2 <- subset(data, ID>=7000)
> donedat <- rbind(remdat1, remdat2)
>
> I checked the first and last entries and found that they did not have ID
> values in the 6000 range. Therefore I think that this might be correct, but
> is this the most efficient way of doing this?
>
> 2) I need to remove observations within columns 3, 4, 6 and 8 when they are
> negative. For instance if the number in column 3 is -4, then I need to
> delete the entire observation. Can somebody help me with this too?
>
> Thanks and Regards
>
> Anup