[R] apply is making me crazy...

David Winsemius dwinsemius at comcast.net
Thu Jul 28 21:45:22 CEST 2011


On Jul 28, 2011, at 3:13 PM, Gene Leynes wrote:

> Very clever (as usual)…  It works, but I wanted to switch the
> rows and columns, which would require this:
>  answer.slightly.clumsy  =
>      lapply(exampBad, function(x) matrix(apply(x ,1, cumsum),   
> ncol=nrow(x)))
>
> However, with a slight modification of your code I can use a wrapper  
> function for apply.  This gives me the functionality, with clean  
> syntax.  I will probably add it to my “standard” library.
> apply1 = function(mat, fun){
>                 matrix(apply(mat ,1, fun),  nrow=nrow(mat), byrow=TRUE)
> }
> res = lapply(exampBad, function(x) apply1(x , cumsum))

Hmmm. I originally wrote nrow=nrow(mat) but changed it after it did  
not give your specified output. But `apply` typically transposes its  
results so you are using this matrix property: t(t(.)) == (.)
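A small sketch of that identity, using made-up values rather than the data from this thread. It assumes, as with cumsum, that the function returns one value per element of the row; apply1 is the wrapper quoted above.

```r
# Illustrative values only; apply1 is the wrapper quoted above.
m <- matrix(1:6, nrow = 3)            # 3 rows, 2 columns
res <- apply(m, 1, cumsum)            # one column per row of m
dim(res)                              # 2 3

apply1 <- function(mat, fun) {
  # Filling by row undoes the transposition that apply() introduces
  matrix(apply(mat, 1, fun), nrow = nrow(mat), byrow = TRUE)
}
all(apply1(m, cumsum) == t(res))      # TRUE

m1 <- matrix(1:3, ncol = 1)           # the troublesome one-column case
dim(apply(m1, 1, cumsum))             # NULL: apply returns a plain vector
dim(apply1(m1, cumsum))               # 3 1
```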

>
> As I mentioned in my email to Dennis, I am typically dealing with  
> highly nested lists of matrices that have one thing in common: 1000  
> rows.  This fact makes the lapply, sapply, apply, and do.call
> family extremely effective, and makes this apply problem something  
> that is worth solving.
>
> PS: I still wish that there were a drop=TRUE option in apply!

ITYM ..., drop=FALSE, since drop = TRUE is the default behavior that  
causes loss of matrix dimensions in "[".
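For reference, a minimal illustration of the "[" behavior in question (the matrix is invented for the example):

```r
m <- matrix(1:6, nrow = 3)
dim(m[, 1])                  # NULL: the default drop = TRUE discards the dimension
dim(m[, 1, drop = FALSE])    # 3 1: still a one-column matrix
```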

-- 
david.

>
> Thanks again,
>
> Gene
>
> On Thu, Jul 28, 2011 at 12:05 PM, David Winsemius <dwinsemius at comcast.net 
> > wrote:
>
> On Jul 28, 2011, at 12:31 PM, Gene Leynes wrote:
>
> (As I mentioned in my other reply to Dennis, I think I'll stick with  
> for loops, but I wanted to respond.)
>
> By "almost does it" I meant that using as.matrix helps because it
> puts the vector into a column. That "almost does it" because half
> the problem is that the output is a non-dimensional vector when
> apply is passed a matrix with one column.
>
> However, since the output of the apply function is transposed when  
> you’re doing row margins, the as.matrix doesn’t help because it’s  
> putting your result into a column, while the apply function is  
> putting everything else into rows. I tried several combinations of
> using t() before, after, and during (changing margin=1 to margin=2)  
> the function; but none did the trick.
>
> I was not as diligent about using your margin=1:1 suggestion in all
> my trials; it didn't seem to be different from using margin=1.
>
> The problem is a bit hard to describe using a natural language, and
> I think it is more apparent from the code.  Of course, that could be my
> shortcoming.
>
> Here is the structure of the proposed solution, which I still
> think makes the problem apparent.
> > str(answerProposed)
> List of 3
>  $ : num [1:1000, 1] 0.5658 0.1759 1.2444 -0.0456 0.0236 ...
>  $ : num [1:2, 1:1000] 0.0392 0.7047 0.1834 -0.6644 -0.6952 ...
>  $ : num [1:3, 1:1000] -0.835 -0.0461 -0.1725 0.8365 0.7835 ...
> >
>
> Sometimes I need to be hit over the head a few times for things to  
> sink in. I hadn't noticed the reversal of dimensions in the "1" row  
> case:
>
>  answer.not.Bad  = lapply(exampBad, function(x) matrix(apply(x , 
> 1,cumsum),  ncol=nrow(x)))
>
> > str(answer.not.Bad)
> List of 3
>  $ : num [1, 1:1000] -0.159 -0.035 -0.386 -1.81 1.123 ...
>  $ : num [1:2, 1:1000] -0.7801 0.6004 -0.0869 -0.1611 -0.3594 ...
>  $ : num [1:3, 1:1000] -1.14 -2.81 -3.45 3.16 2.54 ...
>
> The 1:1 dodge was useless, anyway. And just to be sure, you did want  
> the row and col dimensions reversed? And you did want the first  
> element to just be a (transposed) copy of its argument?
>
> Are we good now?
>
> -- 
> david.
>
> I want it to do this:
> > str(answerDesired)
> List of 3
>  $ : num [1, 1:1000] 0.5658 0.1759 1.2444 -0.0456 0.0236 ...
>  $ : num [1:2, 1:1000] 0.0392 0.7047 0.1834 -0.6644 -0.6952 ...
>  $ : num [1:3, 1:1000] -0.835 -0.0461 -0.1725 0.8365 0.7835 ...
> >
>
> There are a lot of reasons why I would want the apply function to  
> work this way, or at least have an option to work this way.  One  
> reason is so that you could perform do.call(rbind, mylist) later.
>
> I guess this behavior is described in the apply documentation:
> “If each call to FUN returns a vector of length n, then apply  
> returns an array of dimension c(n, dim(X)[MARGIN]) if n > 1. If  
> n equals 1, apply returns a vector if MARGIN has length 1 and an
> array of dimension dim(X)[MARGIN] otherwise. If n is 0, the result  
> has length 0 but not necessarily the ‘correct’ dimension.”
>
>
> I just wish that it had an option to return an array of dimension
> c(n, dim(X)[MARGIN]) if n >= 1
>
> On Wed, Jul 27, 2011 at 8:25 PM, David Winsemius <dwinsemius at comcast.net 
> > wrote:
>
> On Jul 27, 2011, at 7:44 PM, Gene Leynes wrote:
>
> David,
>
> Thanks for the suggestion, but I think your answer only works  
> because I was printing the wrong thing (because apply with margin=1  
> transposes the results,
>
> And if you want to change that,  then the t() function is readily at  
> hand.
>
> something I always forget).
>
> Check this to see what I mean:
>    str(answerGood)
>    str(answerBad)
>
> Adding "as.matrix" is interesting and almost does it,
>
> "It" ... What is "it"? In a natural language,  ...  English  
> preferably.
>
> -- 
> david.
>
> however the results are still transposed.
>
> Sorry to be confusing with the initial example.
>
> Here's an updated example (adding as.matrix doesn't make a difference)
>
>
> ## Make three example matrices
> exampGood = lapply(2:4, function(x)matrix(rnorm(1000*x),ncol=x))
> exampBad  = lapply(1:3, function(x)matrix(rnorm(1000*x),ncol=x))
> ## Two ways to see what was created:
> for(k in 1:length(exampGood)) print(dim(exampGood[[k]]))
> for(k in 1:length(exampBad)) print(dim(exampBad[[k]]))
>
> ##  Take the cumsum of each row of each matrix
> answerGood =      lapply(exampGood, function(x) apply(x ,1,cumsum))
> answerBad  =      lapply(exampBad, function(x) apply(x ,1,cumsum))
> answerProposed  = lapply(exampBad, function(x) as.matrix(apply(x , 
> 1:1,cumsum)))
> str(answerGood)
> str(answerBad)
> str(answerProposed)
>
> ##  Take the first element of the final column of each answer
> for(mat in answerGood){
>    mat = t(mat)  ## To get back to 1000 rows
>    LastColumn = ncol(mat)
>    print(mat[2,LastColumn])
> }
> for(mat in answerBad){
>    mat = t(mat)  ## To get back to 1000 rows
>    LastColumn = ncol(mat)
>    print(mat[2,LastColumn])
> }
> for(mat in answerProposed){
>    mat = t(mat)  ## To get back to 1000 rows
>    LastColumn = ncol(mat)
>    print(mat[2,LastColumn])
> }
>
>
>
> On Wed, Jul 27, 2011 at 5:45 PM, David Winsemius <dwinsemius at comcast.net 
> > wrote:
>
> On Jul 27, 2011, at 6:22 PM, Gene Leynes wrote:
>
> I have tried a lot of ways around this, but I can't find a way to make
> apply work in a generalized way because it causes a failure whenever it
> reduces the dimensions of its output.
> The following example is easier to understand than the question.
>
> I wish it had a "drop=TRUE/FALSE" option like the "["  (and I wish I had
> found the drop option a year ago, and I wish that I had 1e6 dollars...
> Oops, I mean euros).
>
>
>  ## Make three example matrices
>  exampGood = lapply(2:4, function(x)matrix(rnorm(1000*x),ncol=x))
>  exampBad  = lapply(1:3, function(x)matrix(rnorm(1000*x),ncol=x))
>  ## Two ways to see what was created:
>  for(k in 1:length(exampGood)) print(dim(exampGood[[k]]))
>  for(k in 1:length(exampBad)) print(dim(exampBad[[k]]))
>
>  ##  Take the cumsum of each row of each matrix
>  answerGood = lapply(exampGood, function(x) apply(x ,1,cumsum))
>  answerBad  = lapply(exampBad, function(x) apply(x ,1,cumsum))
>
> Try instead:
>
> answerBad  = lapply(exampBad, function(x) as.matrix(apply(x , 
> 1:1,cumsum)))
>
>
> I also find wrapping as.matrix() around vector results inside a  
> print() call often makes my console output much more to my liking.
>
>
>  str(answerGood)
>  str(answerBad)
>
>  ##  Take the first element of the final column of each answer
>  for(mat in answerGood){
>      LastColumn = ncol(mat)
>      print(mat[1,LastColumn])
>  }
>  for(mat in answerBad){
>      LastColumn = ncol(mat)
>      print(mat[1,LastColumn])
>  }

David Winsemius, MD
West Hartford, CT


