[R] duplicate values

Erik Iverson iverson at biostat.wisc.edu
Sun Nov 16 19:29:01 CET 2008



Oliver Bandel wrote:
> Antje Nöthlich <antno <at> web.de> writes:
> 
> [...]
>> Now for the whole dataframe i would like to delete rows that have the same 
>> "Datetime" value as the prior row.
> 
> Well, if you do this, then you lose data.
> Is this really what you want?
> Throwing away data?
> I would think it makes sense if all columns are equal, so that unique()
> could be used - then you only throw away data that is already registered
> in your data frame.
> 
> But when you throw away different values because of the same date-time,
> then there is the question: WHICH would you throw away?
> All but the first? Or do you want to select a maximum or minimum?

You end up doing this a lot in clinical trials at least, where you might 
only care about the first event per patient for a survival analysis, or 
first measurement of blood pressure for baseline data.  It's not so much 
throwing data away, as limiting it for a certain analysis.  Very common 
for me to do this sort of thing.
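
A minimal sketch of that kind of subsetting, assuming a data frame with a "Datetime" column as in the original question. The data frame `df`, its `Value` column, and all the values are made up for illustration; this is one way to do it, not necessarily the best for the poster's real data.

```r
# Hypothetical data; only the "Datetime" column name comes from the thread.
df <- data.frame(
  Datetime = c("2008-11-16 10:00", "2008-11-16 10:00",
               "2008-11-16 10:05", "2008-11-16 10:05",
               "2008-11-16 10:10"),
  Value    = c(1.2, 3.4, 5.6, 7.8, 9.0),
  stringsAsFactors = FALSE
)

# Keep only the first row for each Datetime value, wherever repeats occur.
first_per_time <- df[!duplicated(df$Datetime), ]

# Alternative: drop a row only when its Datetime equals that of the row
# directly before it (the literal reading of the original request).
same_as_prior <- c(FALSE, df$Datetime[-1] == df$Datetime[-nrow(df)])
no_repeats <- df[!same_as_prior, ]
```

The two versions agree when equal Datetime values are adjacent, as they are here. They differ when a value reappears later in unsorted data: duplicated() drops the later row as well, while the comparison with the prior row keeps it.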

> 
> 
> Your attempt looks strange to me...
> 
> 
> Ciao,
>    Oliver
> 


