[R] Problems of data processing

Florent Bonneu bonneu at cict.fr
Tue Jan 17 11:59:58 CET 2006


Thank you very much for your help, but I think there is an error in the 
answer to the first problem. I spent some time looking for a solution but 
could not find one. I tried replacing "which.min" with "which.max", but 
that does not work either, and I have no other idea how to solve this 
problem.

An example:

Num <- c(1,2,4,3,4,4,5,5,5)
Date <- c("1/1/04 0:48","1/1/04 8:02", "1/1/04 1:55", "1/1/04 2:14", "1/1/04 1:19", "1/1/04 1:02", "1/1/04 11:15", "1/1/04 9:06", "1/1/04 10:32")
Place <- c("x1","x1","x4","x3","x4","x4","x5","x5","x5")
X <- c(1,NA,3,2,3,3,6,6,6)
Y <- c(1,NA,7,9,7,7,8,8,8)
toto <- data.frame(Num,Date,Place,X,Y)
toto[order(toto$Num,as.numeric(as.POSIXct(strptime(toto$Date, "%d/%m/%y %H:%M")))),] 

toto <- merge(toto[1:3], unique(na.omit(toto[3:5])),by="Place",all.x=T) 

help <- do.call("rbind", lapply(split(toto, toto$Num),
   function(x) x[which.min(as.numeric(as.POSIXct(strptime(toto$Date, "%d/%m/%y %H:%M")))),]))
help

The expected result is:

Num <- c(1,2,3,4,5)
Date <- c("1/1/04 0:48","1/1/04 8:02", "1/1/04 2:14", "1/1/04 1:02", "1/1/04 9:06")
Place <- c("x1","x1","x3","x4","x5")
X <- c(1,1,2,3,6)
Y <- c(1,1,9,7,8)
toto <- data.frame(Num,Date,Place,X,Y)
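
A note on the attempt above, offered as a sketch rather than a confirmed fix: inside the anonymous function the date is read from toto$Date (every row of the table) instead of x$Date (only the rows of the current Num group), so which.min returns a position in the full table that usually does not exist in the group. The version below assumes that this is the only problem. The names filled, When and earliest are introduced here for illustration and do not appear in the thread; tz = "UTC" and stringsAsFactors = FALSE are likewise assumptions, added so the result does not depend on the local time zone or the R version.

```r
# Sample data copied from the example above.
Num <- c(1,2,4,3,4,4,5,5,5)
Date <- c("1/1/04 0:48","1/1/04 8:02", "1/1/04 1:55", "1/1/04 2:14",
          "1/1/04 1:19", "1/1/04 1:02", "1/1/04 11:15", "1/1/04 9:06",
          "1/1/04 10:32")
Place <- c("x1","x1","x4","x3","x4","x4","x5","x5","x5")
X <- c(1,NA,3,2,3,3,6,6,6)
Y <- c(1,NA,7,9,7,7,8,8,8)
toto <- data.frame(Num, Date, Place, X, Y, stringsAsFactors = FALSE)

# Second problem: copy X and Y from any complete row of the same Place.
filled <- merge(toto[1:3], unique(na.omit(toto[3:5])), by = "Place", all.x = TRUE)

# Parse the dates once (assumed time zone: UTC).
filled$When <- as.POSIXct(strptime(filled$Date, "%d/%m/%y %H:%M", tz = "UTC"))

# First problem: per Num, keep the row with the earliest date.
# Note x$When (the current group), not filled$When (the whole table).
earliest <- do.call("rbind", lapply(split(filled, filled$Num),
    function(x) x[which.min(as.numeric(x$When)), ]))

# Restore the original column order and drop the helper column.
earliest <- earliest[, c("Num", "Date", "Place", "X", "Y")]
earliest
```

On the sample data this should return one row per Num with the coordinates filled in. If a Place ever carried two different (X, Y) pairs, the merge would duplicate rows, so that case would need separate handling.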


Any suggestion is welcome.

Florent Bonneu.



Jacques VESLOT wrote:

> OK ! so try this:
> merge(toto[1:3], unique(na.omit(toto[3:5])),by="Place",all.x=T)
>
>
> Florent Bonneu wrote:
>
>> Indeed,
>> X <- c(1,NA,2,3,3,3,6,6)
>> Y <- c(1,NA,9,7,7,7,8,8)
>>
>> I want to obtain one line for each Num. It is not a problem if there 
>> are several lines for the same place, because my identifier is Num. I 
>> just want to fill in X and Y when they are given in another line for 
>> the same place. For example, "Num=2" is at the place "x1", like 
>> "Num=1", but we don't have the coordinates X and Y for "Num=2". The 
>> same coordinates are known for "Num=1", so I want to copy them into 
>> the X and Y columns of my line "Num=2".
>>
>>
>>
>> Jacques VESLOT wrote:
>>
>>> something wrong in X and Y definitions... but this could work:
>>>
>>> do.call("rbind", lapply(split(toto, toto$Num),
>>>    function(x) x[which.min(as.POSIXct(strptime(toto$Date, "%d/%m/%y %H:%M"))),]))
>>>
>>> I don't understand the second query; do you want to keep the first 
>>> line when there are several lines for the same place?
>>>
>>>
>>> Florent Bonneu wrote:
>>>
>>>> I have two data processing problems with my large database 
>>>> (50000 rows). A sample looks like this:
>>>>
>>>> Num <- c(1,2,3,4,4,4,5,5)
>>>> Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55",
>>>> "1/1/04 2:14", "1/1/04 3:09", "1/1/04 8:02", "1/1/04 9:05", "1/1/04 9:06")
>>>> Place <- c("x1","x1","x3","x4","x4","x4","x5","x5")
>>>> X <- c(1,"",2,3,3,3,6,6)
>>>> Y <- c(1,"",9,7,7,7,8,8)
>>>>
>>>> toto <- data.frame(Num,Date,Place,X,Y)
>>>>
>>>> The first problem is to keep one line for each Num, the one with the 
>>>> “minimum” (earliest) date. I managed to do it with loops, but I would 
>>>> like a solution without loops, which would suit my large database 
>>>> better.
>>>>
>>>> The other one is to fill in the missing coordinates. For example, 
>>>> for the same place “x1”, Num=2 doesn't have X and Y, but we have 
>>>> this information for Num=1.
>>>>
>>>> The resulting example data should look like this:
>>>>
>>>> Num <- c(1,2,3,4,5)
>>>> Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55",
>>>> "1/1/04 2:14", "1/1/04 9:05")
>>>> Place <- c("x1","x1","x3","x4","x5")
>>>> X <- c(1,1,2,3,6)
>>>> Y <- c(1,1,9,7,8)
>>>>
>>>> toto <- data.frame(Num,Date,Place,X,Y)
>>>> Does anybody know how to do this?
>>>> Thanks.
>>>>
>>>> Florent Bonneu
>>>> Laboratoire de Statistique et Probabilités
>>>> bureau 148  bât. 1R2
>>>> Université Toulouse 3
>>>> 118 route de Narbonne - 31062 Toulouse cedex 9
>>>> bonneu at cict.fr
>>>>



