[R] problem applying the same function twice

Curtis Burkhalter curtisburkhalter at gmail.com
Tue Mar 10 21:35:21 CET 2015


Sarah,

This strategy works great for this small dataset, but when I attempt your
method with my own dataset I hit the maximum allowable memory allocation:
the operation stalls and then stops completely before it finishes.
Do you know of a way around this?

Thanks
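
Without the real data it is hard to say where the memory goes. One
possibility is that expand.grid() over every unique value of every column
builds far more rows than are needed. The sketch below is one way to keep
the grid small, under assumptions not confirmed in this thread: it numbers
the rows within each group with ave() (so no sort order is assumed), crosses
only the animal/year combinations that actually occur with the rep numbers,
and names the merge columns explicitly. padToMaxReps() is a hypothetical
name, and the toy data is rebuilt here so the block runs on its own.

```r
# Toy data from the thread, deliberately shuffled to show that no
# particular row order is assumed.
comAn <- data.frame(
  animals     = rep(c("bird", "dog", "cat"), times = c(6, 6, 4)),
  animalYears = c(rep(1:2, each = 3), rep(1:2, each = 3), 1L, 1L, 1L, 2L),
  animalMass  = c(29L, 48L, 36L, 20L, 34L, 34L, 21L, 28L, 25L, 35L,
                  18L, 11L, 46L, 33L, 48L, 21L),
  stringsAsFactors = FALSE
)
set.seed(1)
comAn <- comAn[sample(nrow(comAn)), ]

# padToMaxReps() is a hypothetical helper, not a function from the thread.
padToMaxReps <- function(dat) {
  # Number the rows within each animals x animalYears group.
  dat$reps <- ave(seq_len(nrow(dat)), dat$animals, dat$animalYears,
                  FUN = seq_along)
  maxReps <- max(dat$reps)

  # Keep only the combinations that actually occur in the data.
  combos <- unique(dat[c("animals", "animalYears")])

  # Repeat each observed combination once per rep number.
  grid <- combos[rep(seq_len(nrow(combos)), each = maxReps), , drop = FALSE]
  grid$reps <- rep(seq_len(maxReps), times = nrow(combos))

  # Name the join columns explicitly so no other shared column is matched.
  merge(grid, dat, by = c("animals", "animalYears", "reps"), all.x = TRUE)
}

comAn.full <- padToMaxReps(comAn)
```

If the real data has further grouping columns, they would need to be added
to both the ave() call and the merge keys; whether this resolves the memory
problem depends on what the 14000x12 data actually looks like.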

On Tue, Mar 10, 2015 at 2:04 PM, Sarah Goslee <sarah.goslee at gmail.com>
wrote:

> Hi,
>
> I didn't work through your code, because it looked overly complicated.
> Here's a more general approach that does what you appear to want:
>
> # use dput() to provide reproducible data please!
> comAn <- structure(list(animals = c("bird", "bird", "bird", "bird", "bird",
> "bird", "dog", "dog", "dog", "dog", "dog", "dog", "cat", "cat",
> "cat", "cat"), animalYears = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
> 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L), animalMass = c(29L, 48L, 36L,
> 20L, 34L, 34L, 21L, 28L, 25L, 35L, 18L, 11L, 46L, 33L, 48L, 21L
> )), .Names = c("animals", "animalYears", "animalMass"), class =
> "data.frame", row.names = c("1",
> "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
> "14", "15", "16"))
>
>
> # add reps to comAn
> # assumes comAn is already sorted on animals, animalYears
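> # lapply() rather than sapply(), so the result stays a list even when
> # every group happens to have the same length
> comAn$reps <- unlist(lapply(rle(do.call("paste",
> comAn[,1:2]))$lengths, seq_len))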
>
> # create full set of combinations
> outgrid <- expand.grid(animals=unique(comAn$animals),
> animalYears=unique(comAn$animalYears), reps=unique(comAn$reps),
> stringsAsFactors=FALSE)
>
> # combine with comAn
> comAn.full <- merge(outgrid, comAn, all.x=TRUE)
>
> > comAn.full
>    animals animalYears reps animalMass
> 1     bird           1    1         29
> 2     bird           1    2         48
> 3     bird           1    3         36
> 4     bird           2    1         20
> 5     bird           2    2         34
> 6     bird           2    3         34
> 7      cat           1    1         46
> 8      cat           1    2         33
> 9      cat           1    3         48
> 10     cat           2    1         21
> 11     cat           2    2         NA
> 12     cat           2    3         NA
> 13     dog           1    1         21
> 14     dog           1    2         28
> 15     dog           1    3         25
> 16     dog           2    1         35
> 17     dog           2    2         18
> 18     dog           2    3         11
> >
>
> On Tue, Mar 10, 2015 at 3:43 PM, Curtis Burkhalter
> <curtisburkhalter at gmail.com> wrote:
> > Hey everyone,
> >
> > I've written a function that adds NAs to a dataframe where data is
> > missing, and it seems to work great if I only need to run it once, but
> > if I run it two times in a row I run into problems. I've created a
> > workable example to explain what I mean and why I would do this.
> >
> > In my dataframe there are areas where I need to add two rows of NAs
> > (b/c I need to have 3 rows per animal x year combo, and for cat in
> > year 2 I only have one), so I thought that I'd just run my code twice
> > using the function in the code below. Everything works great when I
> > run it the first time, but when I run it again it says that the value
> > returned to the list 'x' is of length 0. I don't understand why the
> > function works the first time around and adds an NA to the
> > 'animalMass' column, but won't do it again. I've used
> > print(str(dataframe)) to see if there is a change in class or type
> > when the function runs through the original dataframe, and there is
> > for 'animalYears', but I just convert it back before rerunning the
> > function for the second time.
> >
> > Any thoughts on this would be greatly appreciated b/c the actual
> > dataframe I have to input into WinBUGS is 14000x12, so it's not a
> > trivial thing to just add in an NA here or there.
> >
> >>comAn
> >    animals animalYears animalMass
> > 1     bird           1         29
> > 2     bird           1         48
> > 3     bird           1         36
> > 4     bird           2         20
> > 5     bird           2         34
> > 6     bird           2         34
> > 7      dog           1         21
> > 8      dog           1         28
> > 9      dog           1         25
> > 10     dog           2         35
> > 11     dog           2         18
> > 12     dog           2         11
> > 13     cat           1         46
> > 14     cat           1         33
> > 15     cat           1         48
> > 16     cat           2         21
> >
> > So every animal has 3 measurements per year, except for the cat in
> > year two, which has only 1. I run the code below and get:
> >
> > #combs defines the different combinations of
> > #animals and animalYears
> > combs<-paste(comAn$animals,comAn$animalYears,sep=':')
> > #counts defines how long the different combinations are
> > counts<-ave(1:nrow(comAn),combs,FUN=length)
> > #missing collects the combs that have fewer than 2 rows, together
> > #with their counts, in the data frame missing
> > missing<-data.frame(vals=combs[counts<2],count=counts[counts<2])
> >
> > genRows<-function(dat){
> >         vals<-strsplit(dat[1],':')[[1]]
> >                 #dat[2] arrives as a string because apply() converts
> >                 #the data frame to a character matrix first
> >         newRows<-2-as.numeric(dat[2])
> >         newDf<-data.frame(animals=rep(vals[1],newRows),
> >                           animalYears=rep(vals[2],newRows),
> >                           animalMass=rep(NA,newRows))
> >         return(newDf)
> >         }
> >
> >
> > x<-apply(missing,1,genRows)
> > comAn=rbind(comAn,
> >         do.call(rbind,x))
> >
> >> comAn
> >    animals animalYears animalMass
> > 1     bird           1         29
> > 2     bird           1         48
> > 3     bird           1         36
> > 4     bird           2         20
> > 5     bird           2         34
> > 6     bird           2         34
> > 7      dog           1         21
> > 8      dog           1         28
> > 9      dog           1         25
> > 10     dog           2         35
> > 11     dog           2         18
> > 12     dog           2         11
> > 13     cat           1         46
> > 14     cat           1         33
> > 15     cat           1         48
> > 16     cat           2         21
> > 17     cat           2       <NA>
> >
> > So far so good, but then I adjust the code so that it reads (**notice the
> > change in the specification in 'missing' to counts<3**):
> >
> > #combs defines the different combinations of
> > #animals and animalYears
> > combs<-paste(comAn$animals,comAn$animalYears,sep=':')
> > #counts defines how long the different combinations are
> > counts<-ave(1:nrow(comAn),combs,FUN=length)
> > #missing collects the combs that have fewer than 3 rows, together
> > #with their counts, in the data frame missing
> > missing<-data.frame(vals=combs[counts<3],count=counts[counts<3])
> >
> > genRows<-function(dat){
> >         vals<-strsplit(dat[1],':')[[1]]
> >                 #dat[2] arrives as a string because apply() converts
> >                 #the data frame to a character matrix first
> >         newRows<-2-as.numeric(dat[2])
> >         newDf<-data.frame(animals=rep(vals[1],newRows),
> >                           animalYears=rep(vals[2],newRows),
> >                           animalMass=rep(NA,newRows))
> >         return(newDf)
> >         }
> >
> >
> > x<-apply(missing,1,genRows)
> > comAn=rbind(comAn,
> >         do.call(rbind,x))
> >
> > The result for 'x' then reads:
> >
> >> x
> > [[1]]
> > [1] animals     animalYears animalMass
> > <0 rows> (or 0-length row.names)
> >
> > Any thoughts on why it might be doing this instead of adding an
> > additional row to get the result:
> >
> >> comAn
> >    animals animalYears animalMass
> > 1     bird           1         29
> > 2     bird           1         48
> > 3     bird           1         36
> > 4     bird           2         20
> > 5     bird           2         34
> > 6     bird           2         34
> > 7      dog           1         21
> > 8      dog           1         28
> > 9      dog           1         25
> > 10     dog           2         35
> > 11     dog           2         18
> > 12     dog           2         11
> > 13     cat           1         46
> > 14     cat           1         33
> > 15     cat           1         48
> > 16     cat           2         21
> > 17     cat           2       <NA>
> > 18     cat           2       <NA>
> >
> > Thanks
> > --
> > Curtis Burkhalter
>
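
On the original question: the zero-row result seems to follow from the
hard-coded 2 in genRows(). It computes newRows as 2 minus the current
count, so once cat:2 already has two rows (one real, one NA) it asks for
2 - 2 = 0 new rows, whatever threshold is used to build 'missing'. The
sketch below makes the target count a parameter instead. padGroups() and
its 'target' argument are hypothetical names, not something from the
original code, and the toy data is rebuilt so the block runs on its own.

```r
comAn <- data.frame(
  animals     = rep(c("bird", "dog", "cat"), times = c(6, 6, 4)),
  animalYears = c(rep(1:2, each = 3), rep(1:2, each = 3), 1L, 1L, 1L, 2L),
  animalMass  = c(29L, 48L, 36L, 20L, 34L, 34L, 21L, 28L, 25L, 35L,
                  18L, 11L, 46L, 33L, 48L, 21L),
  stringsAsFactors = FALSE
)

# What the original genRows() computes on the second pass for cat:2:
countAfterFirstPass <- 2
newRowsSecondPass <- 2 - countAfterFirstPass   # 0, hence a 0-row data frame

# Hypothetical helper: pad every animals x animalYears group with NA rows
# until it has `target` rows, in a single pass.
padGroups <- function(dat, target = 3) {
  combs  <- paste(dat$animals, dat$animalYears, sep = ":")
  counts <- table(combs)                 # one count per combination
  short  <- counts[counts < target]      # combinations that need padding
  if (length(short) == 0) return(dat)

  keys    <- strsplit(names(short), ":", fixed = TRUE)
  deficit <- target - as.integer(short)  # rows still missing per combination

  newRows <- data.frame(
    animals     = rep(vapply(keys, `[`, "", 1), deficit),
    animalYears = as.integer(rep(vapply(keys, `[`, "", 2), deficit)),
    animalMass  = NA_integer_,
    stringsAsFactors = FALSE
  )
  rbind(dat, newRows)
}

comAn.padded <- padGroups(comAn, target = 3)
```

Because the deficit is computed from the target rather than from a fixed
2, running padGroups() a second time leaves the data unchanged.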



-- 
Curtis Burkhalter

https://sites.google.com/site/curtisburkhalter/
