[R] Removing duplicates without a for loop

Clint Bowman clint at ecy.wa.gov
Wed Sep 26 23:05:41 CEST 2012


?duplicated
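
For instance, here is a minimal sketch on made-up data. The data frame `dat`, its `date` and `value` columns, and all the values are invented for illustration; only the REQ.NR column name comes from the original question. The idea is to sort so that the oldest date comes first within each tracking number, then drop every row that duplicated() flags as a repeat.

```r
# Toy stand-in for Para.5C (hypothetical data, not from the original post)
dat <- data.frame(
  date   = as.Date("2012-09-01") + 0:5,
  REQ.NR = c(1, 2, 3, 4, 2, 4),
  value  = c(10, 20, 30, 40, 50, 60)
)

# Drop rows with a missing tracking number, as na.omit() did in the loop
dat <- dat[!is.na(dat$REQ.NR), ]

# Sort by tracking number, then by date, so the oldest entry comes first
dat <- dat[order(dat$REQ.NR, dat$date), ]

# duplicated() is TRUE for the second and later occurrences of a value,
# so negating it keeps only the first (oldest) row per tracking number
dedup <- dat[!duplicated(dat$REQ.NR), ]
```

One difference from the loop: if two rows share both the tracking number and the minimum date, the loop keeps both of them, while this sketch keeps only one.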

Clint Bowman			INTERNET:	clint at ecy.wa.gov
Air Quality Modeler		INTERNET:	clint at math.utah.edu
Department of Ecology		VOICE:		(360) 407-6815
PO Box 47600			FAX:		(360) 407-7534
Olympia, WA 98504-7600

         USPS:           PO Box 47600, Olympia, WA 98504-7600
         Parcels:        300 Desmond Drive, Lacey, WA 98503-1274

On Wed, 26 Sep 2012, Rui Barradas wrote:

> Sorry, but in my previous post I've confused the columns. It's by REQ.NR,
> not by date.
>
> REQ.NR <- 1:4
> REQ.NR <- c(REQ.NR, sample(REQ.NR, 2))
> dat <- data.frame(date = Sys.Date() + 1:6, REQ.NR = REQ.NR, value = rnorm(6))
>
> aggregate(dat, by = list(dat$REQ.NR), FUN = tail, 1)
>
> Rui Barradas
> Em 26-09-2012 16:19, wwreith escreveu:
>> I have several thousand rows of shipment data imported into R as a data
>> frame, with two columns of particular interest: col 1 is the entry date, and
>> col 2 is the tracking number (colname is REQ.NR). Tracking numbers should be
>> unique but on occasion aren't because they get entered more than once. This
>> creates two or more rows with the same tracking number but different
>> dates. I wrote a for loop that will keep the row with the oldest date, but it
>> is extremely slow.
>> 
>> Any suggestions of how I should write this so that it is faster?
>> 
>> # Creates a vector of the unique tracking numbers #
>> u<-na.omit(unique(Para.5C$REQ.NR))
>> 
>> # Create Data Frame to rbind unique rows to #
>> Para.5C.final<-data.frame()
>> 
>> # For each value in u, subset Para.5C, find the min date, and rbind
>> # that row to Para.5C.final #
>> for(i in seq_along(u))
>> {
>>    x<-subset(Para.5C,Para.5C$REQ.NR==u[i])
>>    Para.5C.final<-rbind(Para.5C.final,x[which(x[,1]==min(x[,1])),])
>> }
>> 
>> 
>> 
>> --
>> View this message in context: 
>> http://r.789695.n4.nabble.com/Removing-duplicates-without-a-for-loop-tp4644255.html
>> Sent from the R help mailing list archive at Nabble.com.
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>



