[R] Problems of data processing

Florent Bonneu bonneu at cict.fr
Mon Jan 16 12:09:31 CET 2006


I have two data-processing problems with my large data base (50000 rows). A sample looks like this:

Num <- c(1,2,3,4,4,4,5,5)
Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14", "1/1/04 3:09", "1/1/04 8:02", "1/1/04 9:05", "1/1/04 9:06")
Place <- c("x1","x1","x3","x4","x4","x4","x5","x5")
X <- c(1,"",2,3,3,3,6,6)
Y <- c(1,"",9,7,7,7,8,8)

toto <- data.frame(Num,Date,Place,X,Y)

The first problem is to keep one row for each Num, namely the one with the “minimum” (earliest) date. I managed to do it with loops, but I would like a solution without loops, since that would be better suited to my large data base.
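To make the first problem concrete, here is a minimal loop-free sketch in base R. It assumes the dates are in day/month/two-digit-year hour:minute form (the sample does not settle the day/month order) and that UTC is an acceptable time zone. The names `when`, `sorted` and `first` are only illustrative.

```r
# Sample data from this post (X and Y are left out, they are not needed here)
toto <- data.frame(
  Num   = c(1, 2, 3, 4, 4, 4, 5, 5),
  Date  = c("1/1/04 0:48", "1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14",
            "1/1/04 3:09", "1/1/04 8:02", "1/1/04 9:05", "1/1/04 9:06"),
  Place = c("x1", "x1", "x3", "x4", "x4", "x4", "x5", "x5"),
  stringsAsFactors = FALSE
)

# Assumption: day/month/year order and UTC; adjust format and tz to the real data
when <- as.POSIXct(toto$Date, format = "%d/%m/%y %H:%M", tz = "UTC")

# Sort by Num, then by time, so the earliest row of each Num comes first
sorted <- toto[order(toto$Num, when), ]

# duplicated() flags every repeat of a Num after its first occurrence
first <- sorted[!duplicated(sorted$Num), ]
rownames(first) <- NULL
```

Sorting once and dropping duplicates avoids any explicit loop over Num, which should matter for 50000 rows.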

The second problem is to fill in the missing coordinates. For example, for the same place “x1”, Num=2 has no X and Y, but we have this information for Num=1.
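The second problem could be sketched without loops as a lookup by Place. This assumes that all rows of a given Place share the same coordinates and that the empty entries can be turned into NA. The names `known`, `idx` and `miss` are only illustrative.

```r
# Sample data from this post, with the empty coordinates entered as ""
toto <- data.frame(
  Num   = c(1, 2, 3, 4, 4, 4, 5, 5),
  Place = c("x1", "x1", "x3", "x4", "x4", "x4", "x5", "x5"),
  X     = c("1", "", "2", "3", "3", "3", "6", "6"),
  Y     = c("1", "", "9", "7", "7", "7", "8", "8"),
  stringsAsFactors = FALSE
)

# Convert to numeric; "" becomes NA (as.character() also covers factor columns)
toto$X <- as.numeric(as.character(toto$X))
toto$Y <- as.numeric(as.character(toto$Y))

# Lookup table with one complete (X, Y) pair per Place
# Assumption: coordinates are constant within a Place
known <- toto[!is.na(toto$X) & !is.na(toto$Y), c("Place", "X", "Y")]
known <- known[!duplicated(known$Place), ]

# For each row, find the position of its Place in the lookup table
idx  <- match(toto$Place, known$Place)
miss <- is.na(toto$X) | is.na(toto$Y)

# Fill only the incomplete rows; a Place with no known pair stays NA
toto$X[miss] <- known$X[idx[miss]]
toto$Y[miss] <- known$Y[idx[miss]]
```

Using match() keeps the row order of toto unchanged, so this step can be run before or after reducing the data to one row per Num.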

The resulting example data base should look like this:

Num <- c(1,2,3,4,5)
Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55", "1/1/04 2:14", "1/1/04 9:05")
Place <- c("x1","x1","x3","x4","x5")
X <- c(1,1,2,3,6)
Y <- c(1,1,9,7,8)

toto <- data.frame(Num,Date,Place,X,Y)  


Does anybody know how to do this?
Thanks.

Florent Bonneu
Laboratoire de Statistique et Probabilités
bureau 148  bât. 1R2
Université Toulouse 3
118 route de Narbonne - 31062 Toulouse cedex 9
bonneu at cict.fr



