[BioC] sd==0 for some probesets when using gcrma

Fri Dec 12 00:02:40 CET 2008

Just a cautionary tale for others who might run into this problem.  I did

affy.gcrma <- gcrma(affyRaw,type=c("fullmodel"))

on some moe430a arrays.  Then I got odd warnings when running my GSA code:

Warning messages:
1: In init.fit$sd < s0 :
   longer object length is not a multiple of shorter object length

The s0 variable had unexpectedly become a vector.  Looking through the GSA code showed me that some of the standard deviations were 0.  Looking further, I noticed that the gcrma normalization had given several hundred of the lowest-signal probesets identical values across arrays, so sd=0 for those probesets.

I am using rma now with success.
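
For anyone who wants to check for this before running downstream code, here is a minimal sketch of how one might look for zero-sd probesets. It uses a small simulated matrix (the names expr, probe.sd, zero.sd and expr.filtered are made up for illustration); with real data the matrix would presumably come from something like exprs(affy.gcrma). Whether dropping such probesets is appropriate depends on the analysis.

```r
# Simulated log2 expression matrix: rows are probesets, columns are arrays.
# (Hypothetical data, not from the arrays discussed above.)
set.seed(1)
expr <- matrix(rnorm(40, mean = 8), nrow = 10,
               dimnames = list(paste0("probe", 1:10), paste0("array", 1:4)))
expr[1:3, ] <- 2   # mimic low-signal probesets that all received one value

# Per-probeset standard deviation across arrays
probe.sd <- apply(expr, 1, sd)

# Probesets with no variation at all
zero.sd <- names(probe.sd)[probe.sd == 0]
print(zero.sd)

# One option: drop them before passing the matrix to downstream code
expr.filtered <- expr[probe.sd > 0, , drop = FALSE]
print(dim(expr.filtered))
```

The drop = FALSE argument keeps the result a matrix even if only one probeset survives the filter.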

sessionInfo()
R version 2.7.2 (2008-08-25) 
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] moe430acdf_2.2.0     moe430a_2.2.0        affy_1.18.2          preprocessCore_1.2.1 affyio_1.8.1         Biobase_2.0.1

Cheers,
Dick
*******************************************************************************
Richard P. Beyer, Ph.D.	University of Washington
Tel.:(206) 616 7378	Env. & Occ. Health Sci. , Box 354695
Fax: (206) 685 4696	4225 Roosevelt Way NE, # 100
 			Seattle, WA 98105-6099
http://depts.washington.edu/ceeh/ServiceCores/FC5/FC5.html
http://staff.washington.edu/~dbeyer
*******************************************************************************

On Thu, 11 Dec 2008 bioconductor-request at stat.math.ethz.ch wrote:

> Today's Topics:
>
>   1. Re: Bimodal Distrinbution (Mayer, Claus-Dieter)
>   2. Re: Rkeys function from AnnotationDbi returns all Rkeys for a
>      subset (Marc Carlson)
>   3. Re: Rkeys function from AnnotationDbi returns all Rkeys for a
>      subset (Hervé Pagès)
>   4. Error message when fitting linear model to Affy data
>      (Martin McCabe)
>   5. Re: Error message when fitting linear model to Affy data
>      (Mark Robinson)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 10 Dec 2008 15:44:35 +0000
> From: "Mayer, Claus-Dieter" <c.mayer at abdn.ac.uk>
> Subject: Re: [BioC] Bimodal Distrinbution
> To: "'Francesco Mancuso'" <francesco.mancuso at ifom-ieo-campus.it>
> Cc: "bioconductor at stat.math.ethz.ch" <bioconductor at stat.math.ethz.ch>
> Message-ID:
> 	<E37513BDA9275C4384A3CD195A8270953BE7BCD651 at VMAILA.uoa.abdn.ac.uk>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi Francesco!
>
> You are not very specific about what you mean by bimodal distribution, but I assume that you mean the distribution across ALL proteins. This would suggest that you can roughly classify your measurements into two groups: small ones (mode1) and large ones (mode2). It wouldn't have direct implications though if you want to find differentially expressed proteins, because there you only compare the values for the same protein.
>
> So for example the aim of a normalization would not be to remove the bi-modality but to make sure that the bi-modal distribution is more or less the same for each sample (at least for the non-changing proteins).
>
> Claus
>
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Francesco Mancuso
> Sent: 09 December 2008 18:19
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] Bimodal Distrinbution
>
> Hi all!
> I'm a little newbie with R...
>
> I'm working with quantitative proteomics data that have a bimodal
> distribution.
> In your opinion, what is the best function for working with this type of data?
>
> Thanks in advance!
> Francesco
>
> --
> *Francesco Mattia Mancuso*
>
> /Proteomics and Functional Genomics Group/
> http://www.ifom-ieo-campus.it/research/bonaldi.php
>
> /Mass Spectrometry Unit/
> http://www.ifom-ieo-campus.it/services/masspectrometry.php
>
> European Institute of Oncology
> Via Adamello 16 - 20139 Milano
> [Ph] +39-02-94375102
> [email] francesco.mancuso at ifom-ieo-campus.it
> <mailto:francesco.mancuso at ifom-ieo-campus.it>
>
>        [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
> The University of Aberdeen is a charity registered in Scotland, No SC013683.
>
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 10 Dec 2008 09:05:48 -0800
> From: Marc Carlson <mcarlson at fhcrc.org>
> Subject: Re: [BioC] Rkeys function from AnnotationDbi returns all
> 	Rkeys for a	subset
> To: Laurent Gautier <laurent at cbs.dtu.dk>
> Cc: "James W. MacDonald" <jmacdon at med.umich.edu>,
> 	bioconductor at stat.math.ethz.ch
> Message-ID: <493FF6EC.2090107 at fhcrc.org>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Laurent Gautier wrote:
>> Inversion of "edge" and "vertex" in parts of my previous email.
>>
>> Some people will have unconsciously corrected it. The others will be
>> very confused.
>>
>> Here is what it should read:
>>
>> Here the subset operation takes a subset of the "mapping", that is, of
>> the edges in the bipartite graph, without eliminating the unconnected
>> vertices. I suppose that this choice can be defended by the fact that
>> vertices in an AnnDbBimap object can exist without any associated edge,
>> which makes sense. For example, in the context of microarrays, some
>> probes can be on the array with no association attached to them, yet it
>> is practical to have such probe IDs defined in one part (left or right)
>> of the BiMap.
>>
>
> I believe that Laurent has the correct interpretation of our motives.
> These mappings are all based on database joins behind the scenes, so
> frequently it will be the case that things will not be connected, and
> often these unconnected things are of interest (and sometimes they are
> not).  The Lkeys() and Rkeys() functions just give all the left or all
> of the right keys, whether or not they are mapped to anything on the
> other side.  mappedRkeys() and mappedLkeys() are what you want if you
> only want keys that actually "connect" to something.
>
>
>  Marc
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 10 Dec 2008 11:53:43 -0800
> From: Hervé Pagès <hpages at fhcrc.org>
> Subject: Re: [BioC] Rkeys function from AnnotationDbi returns all
> 	Rkeys for a	subset
> To: Marc Carlson <mcarlson at fhcrc.org>
> Cc: Laurent Gautier <laurent at cbs.dtu.dk>,	"James W. MacDonald"
> 	<jmacdon at med.umich.edu>,	bioconductor at stat.math.ethz.ch
> Message-ID: <49401E47.7000503 at fhcrc.org>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> The first motivation for keeping keys that are not mapped to
> anything was to be backward compatible with the old
> environment-based annotations. For example the hgu95av2PMID
> map in the hgu95av2 package is a "real" environment containing one
> symbol per probeset id. And the value of those symbols that are not
> mapped to a PubMed id is set to NA.
>
> This allows all *direct* maps (i.e. maps that go from probeset ids
> to some other ids) to have the same set of keys (which is the set
> of all probeset ids defined for the chip). I personally find this
> to be a nice property because it makes the set of maps defined in
> a given package more coherent.
>
> Cheers,
> H.
>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M2-B876
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fhcrc.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
>
>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 10 Dec 2008 22:54:35 +0000
> From: Martin McCabe <mcm41 at cam.ac.uk>
> Subject: [BioC] Error message when fitting linear model to Affy data
> To: bioconductor at stat.math.ethz.ch
> Message-ID: <3BC4160C-EA9A-4F5F-81F7-BA1E3400212E at cam.ac.uk>
> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
>
> Hi.  I've been looking at some Affymetrix U133Plus2.0 chip data on a
> series of primary tumours.  I've used the same script with minor
> variations numerous times to compare subsets of tumours but having
> recently upgraded to R2.8.0 and reloaded bioconductor I've started
> getting an error message.  My script is:
>
> > library(limma)
> > library(affy)
> > design.gain1q=model.matrix(~0+factor(c(1,2,1,1,2,2,1,1,2,2,1,1,1,1,1,2,1,1,1,2,1,2,1,2,1,2,1,2,1,1,1,1,1,1)))
>
> {where design.gain1q describes the tumours with gain of chromosome 1q}
>
> > fit=lmFit(rmaMBData, design=design.gain1q)
> > colnames(design.gain1q)=c("Normal1q", "Gain1q")
> > contrast.matrix=makeContrasts(Gain1q-Normal1q, levels=design.gain1q)
> > fit2=contrasts.fit(fit, contrast.matrix)
>
> Then I get the message:
>
> Warning message:
> In contrasts.fit(fit, contrast.matrix) :
>   row names of contrasts don't match col names of coefficients
>
> Remaining code:
> > fit2=eBayes(fit2)
> > results=topTable(fit2, coef=1, adjust="fdr", number=54675)
>
> Should I worry about this?  What does it mean?
> Grateful for any help!
>
> Martin
>
>
> --------------------------------------------
> Dr. Martin G. McCabe
>
> Cancer Research UK Clinical Research Training Fellow
> Cambridge University Department of Pathology
> Division of Molecular Histopathology
> Box 231
> Level 3, Lab Block
> Addenbrooke's Hospital
> Hills Road
> Cambridge
> CB2 2QQ
>
> Tel:		01223 762084
> Fax:		01223 586670
> email:	mcm41 at cam.ac.uk
>
>
>
> ------------------------------
>
> Message: 5
> Date: Thu, 11 Dec 2008 10:31:46 +1100
> From: Mark Robinson <mrobinson at wehi.EDU.AU>
> Subject: Re: [BioC] Error message when fitting linear model to Affy
> 	data
> To: Martin McCabe <mcm41 at cam.ac.uk>
> Cc: BioC <bioconductor at stat.math.ethz.ch>
> Message-ID: <AE6A5604-0D0A-41EA-BA4D-3DE9B916A014 at wehi.edu.au>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
> Hi Martin.
>
> If you set the 'colnames' of design.gain1q BEFORE lmFit, this warning
> (not an error) should go away.
>
> As the warning message says, the contrast rownames don't match the
> fit$coef colnames ... but they would if you set the colnames before
> fitting, right?
>
> Mark
>
> ------------------------------
> Mark Robinson
> Epigenetics Laboratory, Garvan
> Bioinformatics Division, WEHI
> e: m.robinson at garvan.org.au
> e: mrobinson at wehi.edu.au
> p: +61 (0)3 9345 2628
> f: +61 (0)3 9347 0852
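
The naming issue described in Message 5 can be illustrated with a rough base-R sketch. It uses lm.fit() as a stand-in for limma's lmFit(), on the assumption that the same logic applies: coefficient names are taken from the design's column names at the time of fitting. The grouping vector and the response y below are made up for illustration.

```r
# Two-group design; the grouping values here are hypothetical.
grp <- factor(c(1, 2, 1, 1, 2, 2))
design <- model.matrix(~ 0 + grp)
print(colnames(design))               # default names: "grp1" "grp2"

set.seed(1)
y <- rnorm(length(grp))               # stand-in for one probeset's values
new.names <- c("Normal1q", "Gain1q")

# Fit first, rename afterwards: the fitted coefficients keep the old names,
# so a contrast built from the renamed design would not match them.
fit.before <- lm.fit(design, y)
print(identical(names(fit.before$coefficients), new.names))   # FALSE

# Rename first, then fit: coefficient names and design names agree.
colnames(design) <- new.names
fit.after <- lm.fit(design, y)
print(identical(names(fit.after$coefficients), new.names))    # TRUE
```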
>
>
>
> ------------------------------
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>
>
> End of Bioconductor Digest, Vol 70, Issue 11
> ********************************************
>