[R] dplyr - add/expand rows

jim holtman jholtman at gmail.com
Mon Nov 27 02:42:59 CET 2017


try this:

##########################################

library(dplyr)

input <- tribble(
  ~station, ~from, ~to, ~record,
 "07EA001" ,    1960  ,  1960  , "QMS",
 "07EA001"  ,   1961 ,   1970  , "QMC",
 "07EA001" ,    1971  ,  1971  , "QMM",
 "07EA001" ,    1972  ,  1976  , "QMC",
 "07EA001" ,    1977  ,  1983  , "QRC"
)

result <- input %>%
  rowwise() %>%
  do(tibble(station = .$station,
            year = seq(.$from, .$to),
            record = .$record)
  )

###########################



Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Sun, Nov 26, 2017 at 2:10 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:

> To David W.'s point about lack of a suitable reprex ("reproducible
> example"), Bill's solution seems to be for only one station.
>
> Here is a reprex and modification that I think does what was requested for
> multiple stations, again using base R and data frames, not dplyr and
> tibbles.
>
> First the reprex with **two** stations:
>
> > d <- data.frame( station = rep(c("one","two"),c(5,4)),
>                from = c(60,61,71,72,76,60,65,82,83),
>                 to = c(60,70,71,76,83,64, 81, 82,83),
>                 record = c("A","B","C","B","D","B","B","D","E"))
>
> > d
>   station from to record
> 1     one   60 60      A
> 2     one   61 70      B
> 3     one   71 71      C
> 4     one   72 76      B
> 5     one   76 83      D
> 6     two   60 64      B
> 7     two   65 81      B
> 8     two   82 82      D
> 9     two   83 83      E
>
> ## Now the conversion code using base R, especially by():
>
> > out <- by(d, d$station, function(x) with(x, {
> +    i <- to - from +1
> +    data.frame(YEAR =sequence(i) -1 +rep(from,i), RECORD =rep(record,i))
> + }))
>
>
> > out <- data.frame(station =
> rep(names(out),sapply(out,nrow)),do.call(rbind,out), row.names = NULL)
>
>
> > out
>    station YEAR RECORD
> 1      one   60      A
> 2      one   61      B
> 3      one   62      B
> 4      one   63      B
> 5      one   64      B
> 6      one   65      B
> 7      one   66      B
> 8      one   67      B
> 9      one   68      B
> 10     one   69      B
> 11     one   70      B
> 12     one   71      C
> 13     one   72      B
> 14     one   73      B
> 15     one   74      B
> 16     one   75      B
> 17     one   76      B
> 18     one   76      D
> 19     one   77      D
> 20     one   78      D
> 21     one   79      D
> 22     one   80      D
> 23     one   81      D
> 24     one   82      D
> 25     one   83      D
> 26     two   60      B
> 27     two   61      B
> 28     two   62      B
> 29     two   63      B
> 30     two   64      B
> 31     two   65      B
> 32     two   66      B
> 33     two   67      B
> 34     two   68      B
> 35     two   69      B
> 36     two   70      B
> 37     two   71      B
> 38     two   72      B
> 39     two   73      B
> 40     two   74      B
> 41     two   75      B
> 42     two   76      B
> 43     two   77      B
> 44     two   78      B
> 45     two   79      B
> 46     two   80      B
> 47     two   81      B
> 48     two   82      D
> 49     two   83      E
>
> Cheers,
> Bert
>
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Sat, Nov 25, 2017 at 4:49 PM, William Dunlap via R-help <
> r-help at r-project.org> wrote:
>
> > dplyr may have something for this, but in base R I think the following
> does
> > what you want.  I've shortened the name of your data set to 'd'.
> >
> > i <- rep(seq_len(nrow(d)), d$YEAR_TO-d$YEAR_FROM+1)
> > j <- sequence(d$YEAR_TO-d$YEAR_FROM+1)
> > transform(d[i,], YEAR=YEAR_FROM+j-1, YEAR_FROM=NULL, YEAR_TO=NULL)
> >
> >
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com
> >
> > On Sat, Nov 25, 2017 at 11:18 AM, Hutchinson, David (EC) <
> > david.hutchinson at canada.ca> wrote:
> >
> > > I have a returned tibble of station operational record similar to the
> > > following:
> > >
> > > > data.collection
> > > # A tibble: 5 x 4
> > >   STATION_NUMBER YEAR_FROM YEAR_TO RECORD
> > >            <chr>     <int>   <int>  <chr>
> > > 1        07EA001      1960    1960    QMS
> > > 2        07EA001      1961    1970    QMC
> > > 3        07EA001      1971    1971    QMM
> > > 4        07EA001      1972    1976    QMC
> > > 5        07EA001      1977    1983    QRC
> > >
> > > I would like to reshape this to one operational record (row) per year
> per
> > > station. Something like:
> > >
> > > 07EA001              1960      QMS
> > > 07EA001              1961      QMC
> > > 07EA001              1962      QMC
> > > 07EA001              1963      QMC
> > > ...
> > > 07EA001              1971      QMM
> > >
> > > Can this be done in dplyr easily?
> > >
> > > Thanks in advance,
> > >
> > > David
> > >
> > >         [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/
> > > posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> > posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

	[[alternative HTML version deleted]]



More information about the R-help mailing list