[R] Problems of data processing
Jacques VESLOT
jacques.veslot at cirad.fr
Mon Jan 16 13:06:50 CET 2006
There is something wrong in the X and Y definitions (the empty entries in c(1,,2,...) should presumably be NA), but this could work:
do.call("rbind", lapply(split(toto, toto$Num),
    function(x) x[which.min(as.POSIXct(strptime(x$Date, "%d/%m/%y %H:%M"))), ]))
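For a large data frame, a variant without split() may be worth trying: sort by Num and date, then keep the first row of each Num. The following is only a sketch. It assumes the empty X and Y entries are meant to be NA, that the dates are day/month/two-digit year, and that a fixed "UTC" time zone is acceptable; the names `when`, `sorted` and `first` are made up for the illustration.

```r
# Sample data from the original post, with NA assumed for the empty X and Y entries
Num <- c(1, 2, 3, 4, 4, 4, 5, 5)
Date <- c("1/1/04 0:48", "1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14",
          "1/1/04 3:09", "1/1/04 8:02", "1/1/04 9:05", "1/1/04 9:06")
Place <- c("x1", "x1", "x3", "x4", "x4", "x4", "x5", "x5")
X <- c(1, NA, 2, 3, 3, 3, 6, 6)
Y <- c(1, NA, 9, 7, 7, 7, 8, 8)
toto <- data.frame(Num, Date, Place, X, Y)

# Parse the dates once for the whole data frame (format and time zone are assumptions)
when <- as.POSIXct(strptime(as.character(toto$Date), "%d/%m/%y %H:%M", tz = "UTC"))

# Sort by Num, then by date within each Num
sorted <- toto[order(toto$Num, when), ]

# Keep only the first (earliest) row of each Num
first <- sorted[!duplicated(sorted$Num), ]
```

Parsing the dates once, outside any per-group function, avoids repeating the conversion for every value of Num.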
I don't understand the second query: do you want to keep the first line
when there are several lines for the same place?
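If the second request instead means filling in missing X and Y from another row of the same Place, one possible reading could be sketched with ave(). This is only a guess at the intent: it assumes the missing values are coded as NA and that the first known value within a Place is the right one to copy. The helper name `fill_from_place` is made up for the illustration.

```r
# Sample data from the original post, with NA assumed for the empty X and Y entries
Num <- c(1, 2, 3, 4, 4, 4, 5, 5)
Place <- c("x1", "x1", "x3", "x4", "x4", "x4", "x5", "x5")
X <- c(1, NA, 2, 3, 3, 3, 6, 6)
Y <- c(1, NA, 9, 7, 7, 7, 8, 8)
toto <- data.frame(Num, Place, X, Y)

# Hypothetical helper: within each place, replace NA by the first known value
fill_from_place <- function(v, place) {
  ave(v, place, FUN = function(z) {
    known <- z[!is.na(z)]
    # Leave the group untouched if no value is known for this place
    if (length(known) > 0) z[is.na(z)] <- known[1]
    z
  })
}

toto$X <- fill_from_place(toto$X, toto$Place)
toto$Y <- fill_from_place(toto$Y, toto$Place)
```

A place with no known coordinates at all keeps its NA values, so such rows remain visible for checking.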
Florent Bonneu wrote:
>I have two problems with the data processing of my large database (50000 rows). For example, a sample is as follows:
>
>Num <- c(1,2,3,4,4,4,5,5)
>Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14", "1/1/04 3:09", "1/1/04 8:02", "1/1/04 9:05", "1/1/04 9:06")
>Place <- c("x1","x1","x3","x4","x4","x4","x5","x5")
>X <- c(1,,2,3,3,3,6,6)
>Y <- c(1,,9,7,7,7,8,8)
>
>toto <- data.frame(Num,Date,Place,X,Y)
>
>The first problem is to keep one line for each Num, the one with the minimum date. I managed to do it with loops, but I would like a solution without loops, which would be better for my large database.
>
>The other one is to fill in the missing coordinates. For example, for the same place x1, Num=2 doesn't have X and Y, but we have this information for Num=1.
>
>The resulting example database should look like this:
>
>Num <- c(1,2,3,4,5)
>Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14", "1/1/04 9:05")
>Place <- c("x1","x1","x3","x4","x5")
>X <- c(1,1,2,3,6)
>Y <- c(1,1,9,7,8)
>
>toto <- data.frame(Num,Date,Place,X,Y)
>
>
>Does somebody know how to do this?
>Thanks.
>
>Florent Bonneu
>Laboratoire de Statistique et Probabilités
>bureau 148 bât. 1R2
>Université Toulouse 3
>118 route de Narbonne - 31062 Toulouse cedex 9
>bonneu at cict.fr <mailto:bonneu at cict.fr>