[R] remove
P Tennant
philipt900 at iinet.net.au
Sun Feb 12 06:39:26 CET 2017
Hi Val,
The by() function could be used here. With your data in a data frame named dfr (the question calls it df):
# split the data by first name and check for more than one last name
# for each first name
res <- by(dfr, dfr['first'], function(x) length(unique(x$last)) > 1)
# make the result more easily manipulated
res <- as.table(res)
res
# first
#  Alex   Bob  Cory
#  TRUE FALSE FALSE
# then use this result to subset the data
nw.dfr <- dfr[!dfr$first %in% names(res[res]) , ]
# sort if needed
nw.dfr[order(nw.dfr$first) , ]
#   first week last
# 2   Bob    1 John
# 5   Bob    2 John
# 6   Bob    3 John
# 3  Cory    1 Jack
# 4  Cory    2 Jack
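
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. It assumes the example data from the question is loaded under the name dfr. The names bad and clean.tapply are only illustrative. The tapply() cross-check at the end is an alternative way to get the same rows, not part of the by() approach itself.

```r
# Example data copied from the question, stored as dfr to match the code above
dfr <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "first week last
Alex 1 West
Bob 1 John
Cory 1 Jack
Cory 2 Jack
Bob 2 John
Bob 3 John
Alex 2 Joseph
Alex 3 West
Alex 4 West")

# TRUE for every first name that appears with more than one last name
res <- as.table(by(dfr, dfr["first"], function(x) length(unique(x$last)) > 1))

# first names to drop (illustrative helper name)
bad <- names(res[res])

# keep only rows whose first name is not flagged, then sort by name and week
nw.dfr <- dfr[!dfr$first %in% bad, ]
nw.dfr <- nw.dfr[order(nw.dfr$first, nw.dfr$week), ]

# Cross-check with tapply(): count distinct last names per first name
n.last <- tapply(dfr$last, dfr$first, function(x) length(unique(x)))
clean.tapply <- dfr[dfr$first %in% names(n.last)[n.last == 1], ]
clean.tapply <- clean.tapply[order(clean.tapply$first, clean.tapply$week), ]

print(nw.dfr)
```

Both routes keep the original row names, so the result can still be traced back to the rows of the input data.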
Philip
On 12/02/2017 4:02 PM, Val wrote:
> Hi all,
> I have a big data set and want to remove rows conditionally.
> In my data file, each person was recorded for several weeks. Somehow,
> during the recording period, some last names were misreported. For
> each person, the last name should be the same; otherwise, that person
> should be removed from the data. For example, in the following data
> set, Alex was found to have two last names.
>
> Alex West
> Alex Joseph
>
> Alex should be removed from the data. If this happens, I want to
> remove all rows with Alex. Here is my data set:
>
> df<- read.table(header=TRUE, text='first week last
> Alex 1 West
> Bob 1 John
> Cory 1 Jack
> Cory 2 Jack
> Bob 2 John
> Bob 3 John
> Alex 2 Joseph
> Alex 3 West
> Alex 4 West ')
>
> Desired output
>
> first week last
> 1 Bob 1 John
> 2 Bob 2 John
> 3 Bob 3 John
> 4 Cory 1 Jack
> 5 Cory 2 Jack
>
> Thank you in advance