[Bioc-devel] question about affy::plotLocation - scripts

James W. MacDonald jmacdon at uw.edu
Wed Jun 18 18:21:49 CEST 2014


Hi Kristof,

This has been fixed in both the release and devel versions of affy. 
Ideally we would change the API so the user has to tell plotLocation() 
what array they are using, so the number of rows can be extracted from a 
reputable source.

However, the current implementation is pretty fragile to begin with 
(e.g., it simply assumes the end user has already generated an image), 
and it took 12 years for anybody to notice a problem (which implies to 
me that this is an underused function), so I took the expedient of just 
getting the number of rows from the image itself.
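
As a rough illustration of that expedient (a sketch only, not the code that was committed to affy), the row count can be read back from the y-range of the plot that is already on the device. The helper names `mirror_y`, `rows_from_plot` and `plot_location_sketch` are made up for this example. It assumes the current plot is an image() of the chip drawn at one unit per row, and that the coordinates are 0-based as in Kristof's script.

```r
# Hypothetical helper: flip 0-based y coordinates within a chip of `nrows` rows,
# i.e. the same arithmetic as 743 - y for a 744-row chip.
mirror_y <- function(xy, nrows) {
  xy <- as.matrix(xy)
  cbind(x = xy[, 1], y = (nrows - 1) - xy[, 2])
}

# Hypothetical helper: estimate the number of rows from the user coordinates
# of the current plot. ASSUMPTION: an image() of the chip has already been
# drawn with one unit per row, so the y-range spans exactly `nrows` units.
rows_from_plot <- function(usr = par("usr")) {
  round(usr[4] - usr[3])
}

# Sketch of a plotLocation-like function that does not hard-code 743.
plot_location_sketch <- function(x, col = "green", pch = 22, ...) {
  if (is.list(x)) {
    x <- do.call(rbind, x)   # stack a list of x/y matrices into one matrix
  }
  xy <- mirror_y(x, rows_from_plot())
  points(xy[, 1], xy[, 2], pch = pch, col = col, ...)
}
```

Like the existing function, this only makes sense after image() has been called, and whether an extra offset of one is needed depends on how the image axes are numbered.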

The new version(s) should propagate through the build machines in a day 
or so.

Best,

Jim

On 6/16/2014 10:35 AM, James W. MacDonald wrote:
> Hi Kristóf,
>
> On 6/16/2014 10:20 AM, Kristóf Jakab wrote:
>> It seems I can't send attachments, so I am copying the code here.
>>
>>
>> test_plotLocation_affy.R
>>
>> #!/usr/bin/env Rscript
>> #kristof.jakab at hegelab.org
>>
>> # MAKE AFFYBATCH
>> #----------------------------------------------------------
>> # download CEL file
>> library(GEOquery)
>> getGEOSuppFiles("GSM229005")
>>
>> #----------------------------------------------------------
>> # read CEL file
>> library(affy)
>> geoS <- ReadAffy(filenames=paste("GSM229005","GSM229005.CEL.gz",
>> sep="/"))
>>
>> # PLOTTING TO PNG
>> #----------------------------------------------------------
>> # raw
>> png(filename="geo_testing_spot_locations_raw.png",height=744*10,width=744*10,res=1200)
>>
>>
>> ## image (log scale intensities)
>> image(geoS,transfo=log)
>> ## perfect matches
>> l <- indexProbes(geoS, which="pm", geneNames(geoS))
>> lapply(l,function(li){
>>     xy <- indices2xy(li, abatch=geoS)
>>     plotLocation(xy,col="tomato",pch=18,cex=0.075)
>> })
>> ## mismatches
>> l <- indexProbes(geoS, which="mm", geneNames(geoS))
>> lapply(l,function(li){
>>     xy <- indices2xy(li, abatch=geoS)
>>     plotLocation(xy,col="aquamarine",pch=18,cex=0.075)
>> })
>> dev.off()
>>
>> #----------------------------------------------------------
>> # mirrored
>> png(filename="geo_testing_spot_locations_mirrored.png",height=744*10,width=744*10,res=1200)
>>
>>
>> ## image (log scale intensities)
>> image(geoS,transfo=log)
>> ## perfect matches
>> l <- indexProbes(geoS, which="pm", geneNames(geoS))
>> lapply(l,function(li){
>>     xy <- indices2xy(li, abatch=geoS)
>>     xy <- cbind(x=xy[,1],y=(743-xy[,2])) # mirroring
>>     plotLocation(xy,col="tomato",pch=18,cex=0.075)
>> })
>> ## mismatches
>> l <- indexProbes(geoS, which="mm", geneNames(geoS))
>> lapply(l,function(li){
>>     xy <- indices2xy(li, abatch=geoS)
>>     xy <- cbind(x=xy[,1],y=(743-xy[,2])) # mirroring
>>     plotLocation(xy,col="aquamarine",pch=18,cex=0.075)
>> })
>> dev.off()
>>
>>
>> correction_for_plotLocation.R
>>
>> plotLocation <- function(x, col="green", pch=22, ...) {
>>     if (is.list(x)) {
>>       x <- cbind(unlist(lapply(x, function(x) x[,1])),
>>                  unlist(lapply(x, function(x) x[,2])))
>>     }
>>     # mirroring a 744×744 matrix, numbered from 0 to 743
>>     points(x[,1], 743-x[,2], pch=pch, col=col, ...)
>> }
>
> Thanks for pointing this out. It's apparent almost nobody ever uses this
> code, as it has been in the affy package since pretty much the beginning
> (2002), and you are the first to notice this.
>
> Unfortunately, hard-coding the number of rows isn't the answer, since
> Affy arrays have different dimensions. Probably the best fix is to add
> an additional required argument 'affybatch' that we can use to extract
> the chip dimensions from.
>
> Best,
>
> Jim
>
>
>>
>>
>> On 06/16/2014 10:59 AM, Kristóf Jakab wrote:
>>> Dear BiocDevelR!
>>>
>>> I work a lot with the excellent *affy package* by Rafael A.
>>> Irizarry, and I find it very useful.
>>>
>>> I have run into somewhat strange behavior with its *plotLocation
>>> function*. It seems *I have to mirror the Y coordinates* to plot
>>> properly. Perhaps this is because the CEL file is read starting from
>>> the top, while plotting starts from the bottom.
>>>
>>> I'm not sure I'm right; can you check that I haven't made a mistake?
>>> If I am right, I suggest a (simple) solution for this.
>>>
>>> I attach two plots made from a GEO GSM CEL file (see script).
>>> First I plotted all the gene names (probe sets) on the CEL file image,
>>> then I plotted them again after mirroring the Y coordinates.
>>> As you can see in the raw plot, there are points on the chip name
>>> (printed by the BioB spots).
>>>
>>> I also attach my plotting script and a potential correction for
>>> affy::plotLocation. (I've tried it, and it seems to work.)
>>>
>>> Yours sincerely,
>>> Kristóf Jakab
>>>
>>> I've linked 2 files to this email:
>>> geo_testing_spot_locations_mirrored.png (6.0 MB)
>>> https://www.box.com/shared/ow3q5sn3fpmyz3u8w533
>>>
>>> geo_testing_spot_locations_raw.png (6.1 MB)
>>> https://www.box.com/shared/3sj9i3lpkixkq85qar0r
>>>
>>
>>
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


