[R] simplify a dataframe
Arnaud Michel
michel.arnaud at cirad.fr
Wed Jul 17 22:03:34 CEST 2013
Thank you for the answer to question (1).
Sorry for the imprecision in question (2):
Suppose the data frame df1
df1 <- data.frame(
  Debut = c("24/01/1995", "01/05/1997", "31/12/1997", "02/02/1995", "28/02/1995",
            "01/03/1995", "13/03/1995", "01/01/1996", "31/01/1996"),
  Fin   = c("30/04/1997", "30/12/1997", "31/12/1997", "27/02/1995", "28/02/1995",
            "12/03/1995", "30/06/1995", "30/01/1996", "31/01/1996"),
  INDX  = c(6, 6, 6, 11, 11, 11, 4, 5, 5))
I would like to replace df1 with df2
df2 <- data.frame(
  Deb = c("24/01/1995", "02/02/1995", "13/03/1995", "01/01/1996"),
  Fin = c("31/12/1997", "12/03/1995", "30/06/1995", "31/01/1996"))
Explanation:
Rows 1, 2 and 3 of df1 (which share INDX = 6) are replaced by a single row with
  Deb of df2 = Debut of row 1 of df1
  Fin of df2 = Fin of row 3 of df1
Rows 4, 5 and 6 of df1 (which share INDX = 11) are replaced by a single row with
  Deb of df2 = Debut of row 4 of df1
  Fin of df2 = Fin of row 6 of df1
Row 7 of df1 (the only row with INDX = 4) is replaced by a single row with
  Deb of df2 = Debut of row 7 of df1
  Fin of df2 = Fin of row 7 of df1
  ==> no change
Rows 8 and 9 of df1 (which share INDX = 5) are replaced by a single row with
  Deb of df2 = Debut of row 8 of df1
  Fin of df2 = Fin of row 9 of df1
df1
       Debut        Fin INDX
1 24/01/1995 30/04/1997    6
2 01/05/1997 30/12/1997    6
3 31/12/1997 31/12/1997    6
4 02/02/1995 27/02/1995   11
5 28/02/1995 28/02/1995   11
6 01/03/1995 12/03/1995   11
7 13/03/1995 30/06/1995    4
8 01/01/1996 30/01/1996    5
9 31/01/1996 31/01/1996    5
df2
         Deb        Fin
1 24/01/1995 31/12/1997
2 02/02/1995 12/03/1995
3 13/03/1995 30/06/1995
4 01/01/1996 31/01/1996
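The rule described above (first Debut and last Fin of each block of equal INDX) could be sketched as follows. This is only a minimal sketch, not an answer given in the thread: it assumes that the rows of each group are contiguous and already in chronological order, it leaves the dates as character strings, and the names run, first, last and res are made up for the illustration. It reuses the cumsum/diff idea from question (1) to number the blocks.

```r
# df1 as defined above (dates kept as character strings)
df1 <- data.frame(
  Debut = c("24/01/1995", "01/05/1997", "31/12/1997", "02/02/1995", "28/02/1995",
            "01/03/1995", "13/03/1995", "01/01/1996", "31/01/1996"),
  Fin   = c("30/04/1997", "30/12/1997", "31/12/1997", "27/02/1995", "28/02/1995",
            "12/03/1995", "30/06/1995", "30/01/1996", "31/01/1996"),
  INDX  = c(6, 6, 6, 11, 11, 11, 4, 5, 5),
  stringsAsFactors = FALSE
)

# Number the blocks of consecutive equal INDX values (assumption: each
# group occupies adjacent rows): a new block starts wherever INDX changes
run <- cumsum(c(TRUE, diff(df1$INDX) != 0))

# Flag the first and the last row of every block
first <- !duplicated(run)
last  <- !duplicated(run, fromLast = TRUE)

# Debut comes from the first row of each block, Fin from the last row
res <- data.frame(Deb = df1$Debut[first],
                  Fin = df1$Fin[last],
                  stringsAsFactors = FALSE)
print(res)
```

If the same INDX value can reappear later in the data as a separate block, numbering the runs this way keeps the two blocks apart, whereas grouping directly on INDX would merge them.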
Thank you for your help
Michel
On 17/07/2013 19:57, Rui Barradas wrote:
> Hello,
>
> As for question (1), try the following.
>
>
> y2 <- cumsum(c(TRUE, diff(x1) > 0))
> identical(as.integer(y1), y2) # y1 is of class "numeric"
>
>
> As for question (2) I'm not understanding it.
>
> Hope this helps,
>
> Rui Barradas
>
> On 17-07-2013 18:21, Arnaud Michel wrote:
>> Hi Arun
>>
>> I have two more questions about simplifying a data frame
>>
>> I would like
>> 1) to transform the vector x1 into the vector y1
>> x1 <- c(1,1,1,-1000, 1,-1000, 1,1,1,1,1,1,-1000)
>> y1 <- c(1,1,1,1, 2,2, 3,3,3,3,3,3,3)
>>
>>
>> 2) to transform the vectors Debut and Fin into the two vectors Deb and Fin,
>> taking INDX into account
>> Debut <- c (
>> "24/01/1995", "01/05/1997" ,"31/12/1997", "02/02/1995" ,"28/02/1995"
>> ,"01/03/1995",
>> "13/03/1995", "01/01/1996", "31/01/1996", "24/01/1995", "01/07/1995"
>> ,"01/09/1995",
>> "01/07/1997", "01/01/1998", "01/08/1998", "01/01/2000",
>> "17/01/2000","29/02/2000")
>>
>> Fin <- c (
>> "30/04/1997", "30/12/1997" ,"31/12/1997", "27/02/1995", "28/02/1995",
>> "12/03/1995",
>> "30/06/1995", "30/01/1996", "31/01/1996", "30/06/1995", "31/08/1995",
>> "30/06/1997",
>> "31/12/1997", "31/07/1998", "31/12/1999", "16/01/2000", "28/02/2000",
>> "29/02/2000")
>>
>> INDX <- c(6,6,6, 11,11,11, 4, 5,5)
>>
>>
>> Deb <- c("24/01/1995", "02/02/1995", "13/03/1995",
>> "01/01/1996")
>> Fin <- c("31/12/1997", "12/03/1995", "30/06/1995",
>> "31/01/1996")
>>
>>
>> Debut         Fin           INDX
>> *24/01/1995*  30/04/1997    6
>> 01/05/1997    30/12/1997    6
>> 31/12/1997    *31/12/1997*  6
>> *02/02/1995*  27/02/1995    11
>> 28/02/1995    28/02/1995    11
>> 01/03/1995    *12/03/1995*  11
>> *13/03/1995*  *30/06/1995*  4
>> *01/01/1996*  30/01/1996    5
>> 31/01/1996    *31/01/1996*  5
>> ................
>>
>> Thanks for your help
>>
>>
>>
>>
>
--
Michel ARNAUD
Chargé de mission auprès du DRH
DGDRD-Drh - TA 174/04
Av Agropolis 34398 Montpellier cedex 5
tel : 04.67.61.75.38
fax : 04.67.61.57.87
port: 06.47.43.55.31