[BioC] RGList class is really slow, why?

Skewes,Aaron ASkewes at mdanderson.org
Thu Feb 5 20:52:26 CET 2009


Thanks for the reply,

I abbreviated the code to make my question clear. I am actually combining only certain rows and averaging the intensities based on certain criteria. A variation of your suggestion:

" Or perhaps you want i to be a vector,
either of integer values to indicate which rows to include, or a
logical vector of length equal to the number of rows in RG$R, and then
do (just once, not in a loop)"

might suffice, I'll think about it.

But... there is a significant difference in my case (ca. 10 minutes vs. 30 seconds) between writing to an RGList and to a matrix, respectively. There is no reason for a copy of the RGList to be made (if that is what happens). Anyway, I am just puzzled why RGList is so slow.

Thanks,

-A

-----Original Message-----
From: Martin Morgan [mailto:mtmorgan at fhcrc.org] 
Sent: Thursday, February 05, 2009 1:39 PM
To: Skewes,Aaron
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] RGList class is really slow, why?

"Skewes,Aaron" <ASkewes at mdanderson.org> writes:

> I am using RGList class, and desire to transform intensity values (R,Rb,G,Gb) and overwrite them into RGList object. If I do it row-wise:
>
> RG$R[i,]<-(R+RG2$R[i,])/(2)

This probably copies the entire RG list, rather than just modifying
the ith row of the R element of the RG list. This is necessary because
R has 'copy on change' semantics: R makes a copy of an object (in this
case RG) when a component of it (in this case RG$R) is changed.
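
As a small illustration of the 'copy on change' behaviour described above, here is a minimal sketch using a plain list as a stand-in for an RGList (an assumption; the names `a` and `b` are made up for the example, and limma is not required):

```r
# Plain list holding one matrix; a hypothetical stand-in for an RGList.
a <- list(R = matrix(0, nrow = 2, ncol = 2))

# Assignment alone does not change 'a'; 'b' behaves as an independent value.
b <- a

# Changing a component of 'b' leaves 'a' untouched, because R modifies
# a copy rather than the object that 'a' still refers to.
b$R[1, ] <- 1

a$R[1, 1]   # still 0
b$R[1, 1]   # now 1
```

Whether and how much data R physically copies at each step depends on the R version and on how many references to the object exist, so this only shows the semantics, not the cost.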

Not sure what R or RG2 are, but what you might be aiming for is just

  RG$R <- R + RG$R / 2

this extracts the RG$R matrix, divides every element by two (which I
guess you're doing but in a loop, with an index 'i') and then adds the
content of whatever 'R' is. Or perhaps you want i to be a vector,
either of integer values to indicate which rows to include, or a
logical vector of length equal to the number of rows in RG$R, and then
do (just once, not in a loop)

  RG$R[i,] <- R + RG$R[i,] / 2

There's still a copy possible (R can be clever and not make a copy if
there are no other references to RG) but either way it should be so
fast that speed isn't important.
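
The difference between the row-by-row loop and the single vectorized assignment can be sketched as follows. This is only an illustration under stated assumptions: plain lists stand in for the RGList objects, the data are random, the selection vector `keep` and the averaging of `RG$R` with `RG2$R` are hypothetical choices based on the description in this thread, and actual timings will depend on the R version and the data size.

```r
# Stand-ins for two RGList objects: plain lists holding intensity matrices.
# (Assumption: a real RGList supports `$` and `[<-` in the same way.)
set.seed(1)
n_rows <- 2000
n_cols <- 4
RG  <- list(R = matrix(runif(n_rows * n_cols), n_rows, n_cols))
RG2 <- list(R = matrix(runif(n_rows * n_cols), n_rows, n_cols))

# Hypothetical row selection: a logical vector with one entry per row.
keep <- rep(c(TRUE, FALSE), length.out = n_rows)

# Row-by-row update inside a loop (the pattern from the question).
RG_loop <- RG
for (i in which(keep)) {
  RG_loop$R[i, ] <- (RG$R[i, ] + RG2$R[i, ]) / 2
}

# The same update done once, using the whole index vector.
RG_vec <- RG
RG_vec$R[keep, ] <- (RG$R[keep, ] + RG2$R[keep, ]) / 2

# Check that both approaches produce the same matrix.
same_result <- identical(RG_loop$R, RG_vec$R)
```

Wrapping each variant in `system.time()` is one way to compare their cost on real data.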

Hope that helps,

Martin

> The overwriting is really SLOWWWWWW. On the other hand, If I create a dummy matrix:
>
> QR=matrix(nrow=dim(RG)[1], ncol=dim(RG)[2])
>
> And write the values to it:
>
> QR[i,]<-(R+RG2$R[i,])/(2)
>
> It is relatively fast (of course I can do it this way, then simply RG$R <-QR, which is ok)
>
> But I'd really like to know why writing to a matrix is so much faster than RGList? Is RGList making a copy each time or something?
>
>
> -Aaron

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793


