[R] problem applying the same function twice

Curtis Burkhalter curtisburkhalter at gmail.com
Tue Mar 10 21:35:49 CET 2015


William,

You say not to use apply here, but what would you use in its place?

Thanks

On Tue, Mar 10, 2015 at 2:13 PM, William Dunlap <wdunlap at tibco.com> wrote:

> The key to your problem may be that
>    x<-apply(missing,1,genRows)
> converts 'missing' to a matrix, with the same type for all columns,
> and then makes x either a list or a matrix, but never a data.frame.
> Those features of apply may mess up the rest of your calculations.
>
> Don't use apply().
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
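A minimal sketch of the coercion described above, using a small hypothetical data frame (`df` and its values are illustrative, not taken from the original post). apply() converts a data frame with as.matrix() before looping, so a mix of character and numeric columns ends up as a character matrix. That would explain why dat[2] reaches genRows as a string.

```r
# Hypothetical data frame shaped like 'missing' in the post
df <- data.frame(vals = c("cat:2", "dog:1"), count = c(1L, 2L),
                 stringsAsFactors = FALSE)

# Mixed column types collapse to a single type: character
m <- as.matrix(df)
is.character(m)

# Each row is handed to FUN as a character vector
row_classes <- apply(df, 1, function(row) class(row))

# 'count' arrives as a string, and the result is not a data.frame
res <- apply(df, 1, function(row) row[["count"]])
is.data.frame(res)
```

Inside the function the count has to be converted back with as.numeric(), which is the workaround already present in genRows.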
> On Tue, Mar 10, 2015 at 12:43 PM, Curtis Burkhalter <curtisburkhalter at gmail.com> wrote:
>
>> Hey everyone,
>>
>> I've written a function that adds NAs to a dataframe where data is missing
>> and it seems to work great if I only need to run it once, but if I run it
>> two times in a row I run into problems. I've created a workable example to
>> explain what I mean and why I would do this.
>>
>> In my dataframe there are areas where I need to add two rows of NAs (b/c I
>> need to have 3 animal x year combos and for cat in year 2 I only have one),
>> so I thought that I'd just run my code twice using the function in the code
>> below. Everything works great when I run it the first time, but when I run
>> it again it says that the value returned to the list 'x' is of length 0. I
>> don't understand why the function works the first time around and adds an
>> NA to the 'animalMass' column, but won't do it again. I've used
>> print(str(dataframe)) to see if there is a change in class or type when
>> the function runs through the original dataframe, and there is for
>> 'animalYears', but I just convert it back before rerunning the function
>> for the second time.
>>
>> Any thoughts on this would be greatly appreciated b/c the actual
>> dataframe I have to input into WinBUGS is 14000x12, so it's not a trivial
>> thing to just add in an NA here or there.
>>
>> >comAn
>>    animals animalYears animalMass
>> 1     bird           1         29
>> 2     bird           1         48
>> 3     bird           1         36
>> 4     bird           2         20
>> 5     bird           2         34
>> 6     bird           2         34
>> 7      dog           1         21
>> 8      dog           1         28
>> 9      dog           1         25
>> 10     dog           2         35
>> 11     dog           2         18
>> 12     dog           2         11
>> 13     cat           1         46
>> 14     cat           1         33
>> 15     cat           1         48
>> 16     cat           2         21
>>
>> So every animal has 3 measurements per year, except for the cat in year
>> two, which has only 1. I run the code below and get:
>>
>> #combs defines the different combinations of
>> #animals and animalYears
>> combs<-paste(comAn$animals,comAn$animalYears,sep=':')
>> #counts defines how long the different combinations are
>> counts<-ave(1:nrow(comAn),combs,FUN=length)
>> #missing holds the combs that have fewer than 2 rows, together with
>> #their counts, in the data frame missing
>> missing<-data.frame(vals=combs[counts<2],count=counts[counts<2])
>>
>> genRows<-function(dat){
>>         vals<-strsplit(dat[1],':')[[1]]
>>                 #not sure why dat[2] is being converted to a string
>>         newRows<-2-as.numeric(dat[2])
>>         newDf<-data.frame(animals=rep(vals[1],newRows),
>>                           animalYears=rep(vals[2],newRows),
>>                           animalMass=rep(NA,newRows))
>>         return(newDf)
>>         }
>>
>>
>> x<-apply(missing,1,genRows)
>> comAn=rbind(comAn,
>>         do.call(rbind,x))
>>
>> > comAn
>>    animals animalYears animalMass
>> 1     bird           1         29
>> 2     bird           1         48
>> 3     bird           1         36
>> 4     bird           2         20
>> 5     bird           2         34
>> 6     bird           2         34
>> 7      dog           1         21
>> 8      dog           1         28
>> 9      dog           1         25
>> 10     dog           2         35
>> 11     dog           2         18
>> 12     dog           2         11
>> 13     cat           1         46
>> 14     cat           1         33
>> 15     cat           1         48
>> 16     cat           2         21
>> 17     cat           2       <NA>
>>
>> So far so good, but then I adjust the code so that it reads (**notice the
>> change in the specification in 'missing' to counts<3**):
>>
>> #combs defines the different combinations of
>> #animals and animalYears
>> combs<-paste(comAn$animals,comAn$animalYears,sep=':')
>> #counts defines how long the different combinations are
>> counts<-ave(1:nrow(comAn),combs,FUN=length)
>> #missing holds the combs that have fewer than 3 rows, together with
>> #their counts, in the data frame missing
>> missing<-data.frame(vals=combs[counts<3],count=counts[counts<3])
>>
>> genRows<-function(dat){
>>         vals<-strsplit(dat[1],':')[[1]]
>>                 #not sure why dat[2] is being converted to a string
>>         newRows<-2-as.numeric(dat[2])
>>         newDf<-data.frame(animals=rep(vals[1],newRows),
>>                           animalYears=rep(vals[2],newRows),
>>                           animalMass=rep(NA,newRows))
>>         return(newDf)
>>         }
>>
>>
>> x<-apply(missing,1,genRows)
>> comAn=rbind(comAn,
>>         do.call(rbind,x))
>>
>> The result for 'x' then reads:
>>
>> > x
>> [[1]]
>> [1] animals     animalYears animalMass
>> <0 rows> (or 0-length row.names)
>>
>> Any thoughts on why it might be doing this instead of adding an additional
>> row to get the result:
>>
>> > comAn
>>    animals animalYears animalMass
>> 1     bird           1         29
>> 2     bird           1         48
>> 3     bird           1         36
>> 4     bird           2         20
>> 5     bird           2         34
>> 6     bird           2         34
>> 7      dog           1         21
>> 8      dog           1         28
>> 9      dog           1         25
>> 10     dog           2         35
>> 11     dog           2         18
>> 12     dog           2         11
>> 13     cat           1         46
>> 14     cat           1         33
>> 15     cat           1         48
>> 16     cat           2         21
>> 17     cat           2       <NA>
>> 18     cat           2       <NA>
>>
>> Thanks
>> --
>> Curtis Burkhalter
>>
>> https://sites.google.com/site/curtisburkhalter/
>>
>>
>
>
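One likely reason the second pass returns zero rows: genRows computes newRows as 2 - count, and once 'cat:2' has two rows that difference is 0, so rep() builds an empty data frame. Below is a rough sketch of an apply-free alternative that pads every animal x year combination up to a target size in a single pass. The function name `pad_groups` and the `target` argument are made up for illustration, and the sketch assumes 'animals' is a character column (stringsAsFactors = FALSE).

```r
# Example data reconstructed from the table in the original post
comAn <- data.frame(
  animals     = rep(c("bird", "dog", "cat"), times = c(6, 6, 4)),
  animalYears = c(rep(c(1L, 2L), each = 3), rep(c(1L, 2L), each = 3),
                  1L, 1L, 1L, 2L),
  animalMass  = c(29, 48, 36, 20, 34, 34, 21, 28, 25, 35, 18, 11,
                  46, 33, 48, 21),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pad each animal x year combination to 'target' rows
pad_groups <- function(df, target = 3L) {
  combos <- unique(df[c("animals", "animalYears")])
  # Rows currently present for each combination, in the order of 'combos'
  have <- mapply(function(a, y) sum(df$animals == a & df$animalYears == y),
                 combos$animals, combos$animalYears)
  need <- pmax(target - have, 0L)
  # Repeat each combination once per missing row; column types are unchanged
  extra <- combos[rep(seq_len(nrow(combos)), times = need), , drop = FALSE]
  extra$animalMass <- rep(NA_real_, nrow(extra))
  out <- rbind(df, extra)
  rownames(out) <- NULL
  out
}

padded <- pad_groups(comAn)
tail(padded, 3)
```

Because the new rows are built by indexing a data frame rather than by splitting pasted strings, 'animalYears' stays an integer column, and calling pad_groups() a second time adds nothing further.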


-- 
Curtis Burkhalter

https://sites.google.com/site/curtisburkhalter/
