[R] Using "rep", but don't know what to put after each =

jim holtman jholtman at gmail.com
Fri Feb 20 14:49:05 CET 2009


Here is one way to do it.  Note that it does no error checking for the
case where the first data frame has too few values for a given CHR_NR:

> x1 <- read.table(textConnection("CHR_NR      diffdatoperiode
+ 11377                29
+ 11377                59
+ 11377                78"), header=TRUE)
>
> x2 <- read.table(textConnection("CHR_NR  variab
+ 11377       1
+ 11377       0
+ 11377       1
+ 11377       0
+ 11377       0
+ 11377       0
+ 11377       1
+ 11377       0
+ 11377       0
+ 11377       0
+ 11377       0
+ 11377       0
+ 11377       0"), header=TRUE)
> closeAllConnections()
> # partition the keys (x1)
> x1.p <- split(x1, x1$CHR_NR)
> # partition the data and then process each section
> x2.p <- split(x2, x2$CHR_NR)
> # process each partition using the 'name' so you can access the key data
> result <- lapply(names(x2.p), function(.name){
+     # determine the index to use by using cumsum
+     change <- cumsum(x2.p[[.name]]$variab)
+     # add new column
+     cbind(x2.p[[.name]], x=x1.p[[.name]]$diffdatoperiode[change])
+ })
> do.call(rbind, result)
   CHR_NR variab  x
1   11377      1 29
2   11377      0 29
3   11377      1 59
4   11377      0 59
5   11377      0 59
6   11377      0 59
7   11377      1 78
8   11377      0 78
9   11377      0 78
10  11377      0 78
11  11377      0 78
12  11377      0 78
13  11377      0 78
>
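
Why the cumsum() index works: each 1 in 'variab' starts a new block, so
the running total of 'variab' is the block number of every row, and that
number picks the matching row of the first data frame.  Below is a rough
sketch of the same idea with the missing checks added.  The helper name
add_x is made up for illustration, and the two stop() conditions are only
my assumption about what should count as an error (a group that starts
with 0, or more 1s than there are key values).

```r
# Hypothetical helper (not part of the code above): same cumsum idea,
# but it stops instead of silently returning a misaligned result.
add_x <- function(keys, data) {
    keys.p <- split(keys, keys$CHR_NR)
    data.p <- split(data, data$CHR_NR)
    result <- lapply(names(data.p), function(.name) {
        # running count of 1s = which key value each row belongs to
        change <- cumsum(data.p[[.name]]$variab)
        values <- keys.p[[.name]]$diffdatoperiode
        # a leading 0 gives index 0, which R drops without warning
        if (any(change < 1))
            stop("CHR_NR ", .name, ": 'variab' does not start with 1")
        # an index past the end of 'values' would give NA
        if (max(change) > length(values))
            stop("CHR_NR ", .name, ": not enough values in the key data frame")
        cbind(data.p[[.name]], x = values[change])
    })
    do.call(rbind, result)
}

# same example data as above, built without textConnection
x1 <- data.frame(CHR_NR = 11377, diffdatoperiode = c(29, 59, 78))
x2 <- data.frame(CHR_NR = 11377,
                 variab = c(1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0))
out <- add_x(x1, x2)
```

Whether stopping is the right reaction depends on your data; padding with
NA instead may suit you better if incomplete groups are expected.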


On Fri, Feb 20, 2009 at 6:49 AM, joe1985 <johannes at dsr.life.ku.dk> wrote:
>
> Hello
>
> I have one DF (detheleny1periode) with some variables that match, in some
> way, variables in another DF (y2).
>
> The DF named detheleny1periode looks like this (I have not included all
> variables):
>
> CHR_NR      diffdatoperiode
> 11377                29
> 11377                59
> 11377                78
>
> with many different CHR_NR's.
>
> And the other DF, named y2, looks like this (I have not included all
> variables):
>
> CHR_NR  variab
> 11377       1
> 11377       0
> 11377       1
> 11377       0
> 11377       0
> 11377       0
> 11377       1
> 11377       0
> 11377       0
> 11377       0
> 11377       0
> 11377       0
> 11377       0
>
> again with many different CHR_NR's.
>
> So what I want is to make it look like this:
>
> CHR_NR  variab             x
> 11377       1               29
> 11377       0               29
> 11377       1               59
> 11377       0               59
> 11377       0               59
> 11377       0               59
> 11377       1               78
> 11377       0               78
> 11377       0               78
> 11377       0               78
> 11377       0               78
> 11377       0               78
> 11377       0               78
>
> So I thought I could use y2$x <- rep(detheleny1periode$diffdatoperiode,
> each= ), but my problem is that I don't know what to put after "each=",
> because the number of rows differs a lot between each "CHR_NR". What I do
> know is that I want each detheleny1periode$diffdatoperiode value repeated
> for the rows from one "1" in "variab" up to the next, as I illustrated above.
>
> Hope you can help me



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



