[R] Problems of data processing

Jacques VESLOT jacques.veslot at cirad.fr
Mon Jan 16 13:06:50 CET 2006


There is something wrong in the X and Y definitions, but this could work:

# split by Num, then keep the row with the earliest date within each group
do.call("rbind", lapply(split(toto, toto$Num),
    function(x) x[which.min(as.POSIXct(strptime(x$Date, "%d/%m/%y %H:%M"))),]))
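
For a large table, an alternative that avoids splitting into many small data frames is to sort once and drop duplicated Num values. This is only a sketch on the sample data: it assumes the missing X and Y are coded as NA (the post used empty strings), and it uses tz = "UTC" so that parsing does not depend on the local timezone.

```r
# Sample data from the question; missing coordinates coded as NA (assumption).
toto <- data.frame(
  Num   = c(1, 2, 3, 4, 4, 4, 5, 5),
  Date  = c("1/1/04 0:48", "1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14",
            "1/1/04 3:09", "1/1/04 8:02", "1/1/04 9:05", "1/1/04 9:06"),
  Place = c("x1", "x1", "x3", "x4", "x4", "x4", "x5", "x5"),
  X     = c(1, NA, 2, 3, 3, 3, 6, 6),
  Y     = c(1, NA, 9, 7, 7, 7, 8, 8),
  stringsAsFactors = FALSE
)

# Parse the dates once (day/month/two-digit year, 24-hour clock).
when <- as.POSIXct(strptime(toto$Date, "%d/%m/%y %H:%M", tz = "UTC"))

# Sort by Num, then by date, and keep the first row of each Num.
sorted <- toto[order(toto$Num, when), ]
first  <- sorted[!duplicated(sorted$Num), ]
rownames(first) <- NULL
first
```

Here first has one row per Num, namely the one with the earliest date.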

I don't understand the second question: do you want to keep the first line
when there are several lines for the same place?
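
If the aim is instead to copy known coordinates to the rows of the same place where they are missing, here is a minimal sketch. It assumes that missing values are coded as NA (not as empty strings, as in the post) and that the coordinates are constant within a place; fill_first is just an illustrative helper name, not an existing function.

```r
# Sample data from the question; missing coordinates coded as NA (assumption).
toto <- data.frame(
  Num   = c(1, 2, 3, 4, 4, 4, 5, 5),
  Place = c("x1", "x1", "x3", "x4", "x4", "x4", "x5", "x5"),
  X     = c(1, NA, 2, 3, 3, 3, 6, 6),
  Y     = c(1, NA, 9, 7, 7, 7, 8, 8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace the NAs in a vector by its first known value.
# If the vector has no known value at all, it is returned unchanged.
fill_first <- function(v) {
  known <- v[!is.na(v)]
  if (length(known) > 0) v[is.na(v)] <- known[1]
  v
}

# ave() applies the helper within each Place and keeps the original row order.
toto$X <- ave(toto$X, toto$Place, FUN = fill_first)
toto$Y <- ave(toto$Y, toto$Place, FUN = fill_first)
toto
```

A place with no known coordinates at all keeps its NA values.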


Florent Bonneu wrote:

>I have two data-processing problems with my large database (50,000 rows). Here is a sample:
>
>Num <- c(1,2,3,4,4,4,5,5)
>Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14", "1/1/04 3:09", "1/1/04 8:02", "1/1/04 9:05", "1/1/04 9:06")
>Place <- c("x1","x1","x3","x4","x4","x4","x5","x5")
>X <- c(1,"",2,3,3,3,6,6)
>Y <- c(1,"",9,7,7,7,8,8)
>
>toto <- data.frame(Num,Date,Place,X,Y)
>
>The first problem is to keep one line for each Num, the one with the “minimum” (earliest) date. I managed to do it with loops, but I would like a solution without loops, which would be better for my large database.
>
>The other one is to fill in the missing coordinates. For example, for the place “x1”, Num=2 has no X and Y, but we have this information for Num=1.
>
>The example database should end up like this:
>
>Num <- c(1,2,3,4,5)
>Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14", "1/1/04 9:05")
>Place <- c("x1","x1","x3","x4","x5")
>X <- c(1,1,2,3,6)
>Y <- c(1,1,9,7,8)
>
>toto <- data.frame(Num,Date,Place,X,Y)  
>
>
>Does anybody know how to do this?
>Thanks.
>
>Florent Bonneu
>Laboratoire de Statistique et Probabilités
>bureau 148  bât. 1R2
>Université Toulouse 3
>118 route de Narbonne - 31062 Toulouse cedex 9
>bonneu at cict.fr
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html