[R] applying cbind (or any function) across all components in a list
Rui Barradas
rui1174 at sapo.pt
Sat May 26 00:43:07 CEST 2012
Hello,
The use of function(i) is because there are two lists to be processed;
with just one I would have used a simpler form of lapply. I don't
remember exactly who wrote this in a post some time ago (Michael
Weylandt?) but imagine a list is a train. Then list[ i ] is a car and
list[[ i ]] is the passengers. In your case the passengers are matrices.
If only one list were to be processed, this would do:
lapply(l1, colSums) # for each matrix, apply the function
# a more complicated function, typically not part of base R (or other
# packages)
lapply( l1, function(x) log(1 + colSums(x)) )
# same as above, but the former uses an unnamed, temporary function
# that we can dispose of
f <- function(x) log(1 + colSums(x))
lapply(l1, f)
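
To try the single-list forms above without the real data, here is a small self-contained sketch. The list 'l1' of two 2 x 3 matrices is made up for illustration; it is not the data from the original question.

```r
# Hypothetical data: a list of two small matrices (not the original data)
l1 <- list(a = matrix(1:6, nrow = 2), b = matrix(7:12, nrow = 2))

# Named function, applied to each matrix in turn
f <- function(x) log(1 + colSums(x))
res_named <- lapply(l1, f)

# Same computation with an anonymous function
res_anon <- lapply(l1, function(x) log(1 + colSums(x)))

# Both forms return a list with one numeric vector per matrix
identical(res_named, res_anon)
```

Either form should give the same list; the named version is only worth it if 'f' is going to be reused.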
In my previous post, seq_len guarantees that the index vector is well
formed. If the list has zero elements, the form 1:length(l1) becomes 1:0,
which is c(1, 0), whereas seq_len(0) is integer(0).
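
The difference is easiest to see with an empty list. The sketch below uses a made-up empty list 'l0' and two counters, 'n_bad' and 'n_good', that exist only for this illustration.

```r
l0 <- list()                 # hypothetical list with zero elements

1:length(l0)                 # counts down from 1 to 0
seq_len(length(l0))          # empty index vector
seq_along(l0)                # base R shorthand for the line above

# Count how many times a loop body runs with each form of index
n_bad <- 0
for (i in 1:length(l0)) n_bad <- n_bad + 1

n_good <- 0
for (i in seq_len(length(l0))) n_good <- n_good + 1

c(n_bad, n_good)
```

With 1:length(l0) the loop body runs for i = 1 and i = 0 even though there is nothing to process; with seq_len it does not run at all.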
Then, function(i) is a function of the indices 1, 2, ..., length(l1).
And both 'l1' and 'l2' can be indexed at the same time.
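
As a simpler example of function(i) with two lists, here is a minimal sketch. The matrices in 'l1' and 'l2' are made up for illustration, and plain cbind stands in for whatever function is to be applied to each pair.

```r
# Hypothetical data: two lists of the same length (made up for illustration)
l1 <- list(matrix(1:6, nrow = 2), matrix(7:12, nrow = 2))   # 3 columns each
l2 <- list(matrix(1:4, nrow = 2), matrix(5:8, nrow = 2))    # 2 columns each

# i runs over 1, 2, ..., length(l1); the same i picks the matching
# element from each list
res <- lapply(seq_len(length(l1)), function(i) cbind(l1[[i]], l2[[i]]))

# One alternative: Map() from base R walks several lists in parallel
# without an explicit index
res_map <- Map(cbind, l1, l2)

dim(res[[1]])
```

The index form is handy when the function also needs i itself (for example to label the result); otherwise Map is a reasonable alternative for this kind of pairwise processing.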
Rui Barradas
On 25-05-2012 22:30, Hans Thompson wrote:
> Thank you everyone for working through the confusion from me posting from
> Nabble and missing context.
>
> Both Rui's and David's solutions are working for my problem. Rui's first
> interpretation is the application I was looking for but I was also more
> generally interested in how to do the second one if I wanted.
>
> I still don't understand how I would use function(i) for working with
> components of lists in the future though. Is there a simpler example? I'm
> going to play with using seq_len also.
>
> On Fri, May 25, 2012 at 8:20 AM, Rui Barradas<ruipbarradas at sapo.pt> wrote:
>
>> Hello,
>>
>> Let me give it a try.
>> This last post made it clear, I hope. I have two interpretations of your
>> problem.
>>
>> 1. 'l1' only has three columns, corresponding to clusters (genotypes) XX,
>> XY and YY, and 'l2' has one less column, corresponding to the midpoints
>> between their closest genotype cluster.
>>
>> 2. 'l1' can have any number of columns and 'l2' is the same as above,
>> i.e., has one less column.
>>
>> In any case, the result is not the pairwise products of all possible
>> combinations of columns of 'l1' and 'l2' matrices, but only those at a
>> certain distance. In this case, fun2 below is more general.
>>
>> fun1<- function(x, y){
>> cbind((x[, 1] + y[, 1])/2, (x[, 2] + y[, 1])/2,
>> (x[, 2] + y[, 2])/2, (x[, 3] + y[, 2])/2)
>> }
>>
>>
>> fun2<- function(x, y){
>> midpoint<- function(i, j) (x[, i] + y[, j])/2
>>
>> colx<- ncol(x)
>> res<- matrix(nrow = nrow(x), ncol = 2*colx - 2)
>> k<- 1
>> res[, k]<- midpoint(1, 1)
>> for(cx in seq_len(colx)[-c(1, colx)])
>> for(dist in 1:0)
>> res[, k<- k + 1]<- midpoint(cx, cx - dist)
>> res[, k + 1]<- midpoint(colx, colx - 1)
>> res
>> }
>>
>> lapply(seq_len(length(l1)), function(i) fun1(l1[[i]], l2[[i]]))
>> lapply(seq_len(length(l1)), function(i) fun2(l1[[i]], l2[[i]]))
>>
>>
>> If I'm wrong, sorry for the mess.
>>
>> Rui Barradas
>>
>>
>> On 25-05-2012 11:00, r-help-request at r-project.org wrote:
>>
>>> Date: Thu, 24 May 2012 15:37:51 -0700 (PDT)
>>> From: Hans Thompson <hans.thompson1 at gmail.com>
>>> To:r-help at r-project.org
>>> Subject: Re: [R] applying cbind (or any function) across all
>>> components in a list
>>> Message-ID: <1337899071674-4631260.post at n4.nabble.com>
>>> Content-Type: text/plain; charset=us-ascii
>>>
>>>
>>> The function I am giving for context is cbind. Are you asking how I would
>>> like to apply the answer to my question?
>>>
>>> I am trying to take the results of a Fluidigm SNP microarray, organized by
>>> assay into a list (each component is the results of one assay), find
>>> coordinate midpoints ([1,] and [2,]) of my XX, XY, and YY clusters (these are
>>> genotypes) and is represented by l1. l2 is the midpoint between XX/XY and
>>> XY/YY although I did not give this in my example for simplicity, and I am
>>> now trying to find the midpoint between these new midpoints and their
>>> closest genotype clusters. This is represented as
>>>
>>> cbind((l1[[1]][,1]+l2[[1]][,1])/2, (l1[[1]][,2]+l2[[1]][,1])/2,
>>> (l1[[1]][,2]+l2[[1]][,2])/2, (l1[[1]][,3]+l2[[1]][,2])/2)
>>>
>>> but only works for one assay in the list of 96. I want to apply this to the
>>> entire list. My entire code so far is:
>>>
>>> ## OPEN .CSV and ORGANIZE BY ASSAY
>>>
>>>
>>> file=""
>>> {
>>> rawdata<- read.csv(file, skip = 15)
>>> OrgAssay<- split(rawdata, rawdata$Assay)
>>>
>>>
>>> ## RETURN MIDPOINTS FOR EACH CLUSTER WITHOUT NO CALLS
>>>
>>> #for loop
>>> ClustMidPts<-list()
>>>
>>> for(locus in 1:length(names(OrgAssay))){
>>> ClustMidPts[[locus]]<-t(cbind(tapply(OrgAssay[[locus]][,"Allele.X.1"],
>>> OrgAssay[[locus]][,"Final"], mean,na.rm=T),
>>> tapply(OrgAssay[[locus]][,"Allele.Y.1"],
>>> OrgAssay[[locus]][,"Final"], mean,na.rm=T)))}
>>>
>>> names(ClustMidPts)=names(OrgAssay)
>>>
>>>
>>> ## CREATE CLUSTER-CLUSTER MIDPOINT
>>>
>>> #for loop
>>> ClustClustMidPts<- list()
>>>
>>> for(locus in 1:length(names(ClustMidPts))){
>>> ClustClustMidPts[[locus]]<-
>>> cbind(XXYX=(ClustMidPts[[locus]][,"XX"]+ClustMidPts[[locus]][,"YX"])/2,
>>> YXYY=(ClustMidPts[[locus]][,"YX"]+ClustMidPts[[locus]][,"YY"])/2)
>>> }
>>>
>>> names(ClustClustMidPts)=names(ClustMidPts)
>>>
>>>
>>> Please also let me know how I messed up the formatting because it shows up
>>> fine in gmail even when I post on Nabble. How did I assume you were using
>>> Nabble? Is this topic included in the posting guide?
>>>
>>> --
>>> View this message in context: http://r.789695.n4.nabble.com/applying-cbind-or-any-function-across-all-components-in-a-list-tp4631128p4631260.html
>>>
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>>
>>>