[BioC] Re: Puma question
Manca Marco (PATH)
m.manca at maastrichtuniversity.nl
Thu Oct 21 11:28:09 CEST 2010
Dear Richard and dear BioC fellows
I'm following up on my previous help request.
I have corrected the flaw in my phenoData which now looks as follows:
> phenoData(Data)
An object of class "AnnotatedDataFrame"
sampleNames: 090320 Blanche 02_(MoGene-1_0-st-v1).CEL, 090320 Blanche 13_(MoGe
ne-1_0-st-v1).CEL, ..., 090320 Blanche 30_(MoGene-1_0-st-v1).CEL (12 total)
varLabels and varMetadata description:
Group: Group
Angiotensin: Angiotensin administration
> pData(Data)
Group Angiotensin
02_(MoGene-1_0-st-v1).CEL WT 0
13_(MoGene-1_0-st-v1).CEL WT 0
23_(MoGene-1_0-st-v1).CEL WT 0
07_(MoGene-1_0-st-v1).CEL KO 0
08_(MoGene-1_0-st-v1).CEL KO 0
18_(MoGene-1_0-st-v1).CEL KO 0
31_(MoGene-1_0-st-v1).CEL WT 1
10_(MoGene-1_0-st-v1).CEL WT 1
11_(MoGene-1_0-st-v1).CEL WT 1
09_(MoGene-1_0-st-v1).CEL KO 1
20_(MoGene-1_0-st-v1).CEL KO 1
30_(MoGene-1_0-st-v1).CEL KO 1
But still puma is giving me the same error:
> Data.mmgmos<-mmgmos(Data)
Error in exprs(object)[mmIndex, ] <- value :
NAs are not allowed in subscripted assignments
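For readers who hit the same message: base R raises this error when the index on the left-hand side of an assignment contains NA and more than one value is being assigned. The sketch below reproduces it on a toy matrix. The names `toy`, `idx` and `try_assign` are made up for illustration, and reading `mmIndex` as a vector of row indices is an assumption based on the error text alone, not on the puma or affy sources.

```r
# Toy stand-in for an intensity matrix: 4 probes x 2 arrays.
toy <- matrix(0, nrow = 4, ncol = 2)

# Hypothetical index vector containing a missing entry.
idx <- c(1L, NA, 3L)

# Attempt the subscripted assignment and capture any error instead of stopping.
try_assign <- function(m, i, value) {
  tryCatch({
    m[i, ] <- value
    m
  }, error = function(e) e)
}

bad  <- try_assign(toy, idx, matrix(1, nrow = 3, ncol = 2))
good <- try_assign(toy, idx[!is.na(idx)], matrix(1, nrow = 2, ncol = 2))

inherits(bad, "error")   # the NA in idx makes the assignment fail
is.matrix(good)          # without the NA the same assignment goes through
```

On the real AffyBatch, something like `any(is.na(unlist(indexProbes(Data, "mm"))))` from affy may show whether the mismatch-probe indices contain NAs; that this is the vector used internally is an assumption.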
I have also performed RMA normalization as a reference, as Richard suggested, and that runs fine:
> Data.rma<-rma(Data)
Background correcting
Normalizing
Calculating Expression
> Data.rma
ExpressionSet (storageMode: lockedEnvironment)
assayData: 34760 features, 12 samples
element names: exprs
phenoData
sampleNames: 090320 Blanche 02_(MoGene-1_0-st-v1).CEL, 090320 Blanche 13_(MoGe
ne-1_0-st-v1).CEL, ..., 090320 Blanche 30_(MoGene-1_0-st-v1).CEL (12 total)
varLabels and varMetadata description:
Group: Group
Angiotensin: Angiotensin administration
featureData
featureNames: 10338001, 10338003, ..., 10608724 (34760 total)
fvarLabels and fvarMetadata description: none
experimentData: use 'experimentData(object)'
Annotation: mogene10sttranscriptcluster.db
> phenoData(Data.rma)
An object of class "AnnotatedDataFrame"
sampleNames: 090320 Blanche 02_(MoGene-1_0-st-v1).CEL, 090320 Blanche 13_(MoGe
ne-1_0-st-v1).CEL, ..., 090320 Blanche 30_(MoGene-1_0-st-v1).CEL (12 total)
varLabels and varMetadata description:
Group: Group
Angiotensin: Angiotensin administration
> pData(Data.rma)
Group Angiotensin
02_(MoGene-1_0-st-v1).CEL WT 0
13_(MoGene-1_0-st-v1).CEL WT 0
23_(MoGene-1_0-st-v1).CEL WT 0
07_(MoGene-1_0-st-v1).CEL KO 0
08_(MoGene-1_0-st-v1).CEL KO 0
18_(MoGene-1_0-st-v1).CEL KO 0
31_(MoGene-1_0-st-v1).CEL WT 1
10_(MoGene-1_0-st-v1).CEL WT 1
11_(MoGene-1_0-st-v1).CEL WT 1
09_(MoGene-1_0-st-v1).CEL KO 1
20_(MoGene-1_0-st-v1).CEL KO 1
30_(MoGene-1_0-st-v1).CEL KO 1
Have you got any idea what is going on here? I'm sincerely lost =P
Thank you in advance for your attention and for any feedback.
Marco.
P.S.: The code I used to patch the varMetadata is the following:
pData(Data) <- data.frame(
  "Group" = c("WT","WT","WT","KO","KO","KO","WT","WT","WT","KO","KO","KO"),
  "Angiotensin" = c("0","0","0","0","0","0","1","1","1","1","1","1"),
  row.names = rownames(pData(Data)))
varMetadata(Data) <- data.frame(labelDescription = c("Group", "Angiotensin administration"))
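Because the patch above fills the columns by position, it only lines up if the rows are already in the expected order. A minimal base-R sketch of a name-keyed alternative follows. The design values are the ones shown in the pData output earlier in this message; `design`, `sample_order` and `pheno` are illustrative names, and the sample order used here is hypothetical (in practice it would come from something like `sampleNames(Data)`).

```r
# Design as listed in the pData output, keyed by CEL file name.
design <- data.frame(
  Group       = rep(c("WT", "KO", "WT", "KO"), each = 3),
  Angiotensin = rep(c("0", "1"), each = 6),
  row.names   = paste0(
    c("02", "13", "23", "07", "08", "18", "31", "10", "11", "09", "20", "30"),
    "_(MoGene-1_0-st-v1).CEL"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical sample order, standing in for the order of the loaded arrays.
sample_order <- paste0(
  c("02", "07", "08", "09", "10", "11", "13", "18", "20", "23", "30", "31"),
  "_(MoGene-1_0-st-v1).CEL"
)

# Look each sample up by name; an NA here means a sample has no design row.
idx <- match(sample_order, rownames(design))
stopifnot(!any(is.na(idx)))

# Reorder the design so that row i describes sample i.
pheno <- design[idx, , drop = FALSE]
pheno
```

Matching by name fails loudly when a file name is misspelled or missing, instead of silently assigning a group to the wrong array.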
My session info is, again:
> sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu
locale:
[1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
[3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8
[5] LC_MONETARY=C LC_MESSAGES=en_US.utf8
[7] LC_PAPER=en_US.utf8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] puma_1.12.0 mogene10stv1.r3cdf_2.5.0 affy_1.24.2
[4] Biobase_2.6.1
loaded via a namespace (and not attached):
[1] affyio_1.14.0 preprocessCore_1.8.0 tools_2.10.1
--
Marco Manca, MD
University of Maastricht
Faculty of Health, Medicine and Life Sciences (FHML)
Cardiovascular Research Institute (CARIM)
Mailing address: PO Box 616, 6200 MD Maastricht (The Netherlands)
Visiting address: Experimental Vascular Pathology group, Dept of Pathology - Room5.08, Maastricht University Medical Center, P. Debyelaan 25, 6229 HX Maastricht
E-mail: m.manca at maastrichtuniversity.nl
Office telephone: +31(0)433874633
Personal mobile: +31(0)626441205
Twitter: @markomanka
________________________________________
From: Richard Pearson [richard.pearson at well.ox.ac.uk]
Sent: Thursday, 14 October 2010 18:28
To: Manca Marco (PATH)
Cc: bioconductor mailing list
Subject: Re: [BioC] Puma question
Hi Marco
My guess is that there is a problem with your targets.csv file. It seems that your "adf" object has no varMetadata:
varLabels and varMetadata description:
X.Group.:
X.Treatment.:
To confirm this is not a problem specifically with puma, could you try a different summarisation method on your AffyBatch object, e.g. what do you get
if you try:
rma(Data)
If you'd like to send your targets.csv file to me I could have a quick look to see if I can spot the problem.
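One thing that can produce labels like `X.Group.` is a file whose fields are wrapped in double quotes being read with quote handling switched off. The base-R sketch below shows the effect on a small in-memory table. The file contents are a hypothetical excerpt (the real targets.csv is not shown in this thread), the `Name` header is invented, and whether `read.AnnotatedDataFrame` disables quoting by default is an assumption to check against its help page.

```r
# Hypothetical excerpt of a tab-separated targets file with quoted fields.
targets_txt <- paste(
  '"Name"\t"Group"\t"Treatment"',
  '"02_(MoGene-1_0-st-v1).CEL"\t"WT"\t"0"',
  '"07_(MoGene-1_0-st-v1).CEL"\t"KO"\t"0"',
  sep = "\n"
)

# Quoting disabled: quote characters stay in the data and the headers get mangled.
raw <- read.table(text = targets_txt, header = TRUE, sep = "\t",
                  quote = "", row.names = 1, stringsAsFactors = FALSE)
names(raw)    # "X.Group." "X.Treatment."

# Quoting enabled: quotes are stripped before the names are made syntactic.
clean <- read.table(text = targets_txt, header = TRUE, sep = "\t",
                    quote = "\"", row.names = 1, stringsAsFactors = FALSE)
names(clean)  # "Group" "Treatment"
```

If this is the cause, either removing the quotes from the file or passing an explicit `quote` argument through to the reader would be the things to try.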
Best wishes
Richard
On 14/10/2010 10:49, Manca Marco (PATH) wrote:
>
>
> Dear BioC members,
>
> I'm trying to analyse a set of mouse microarrays (Affymetrix Mouse Gene 1.0-ST Array, transcript cluster) using the package puma. I'm quite new to this package, so I'm trying to follow the vignette, but I'm getting stuck on a very early error that I'm unable to interpret and tackle:
>
>> Data.mmgmos<- mmgmos(Data)
> Error in exprs(object)[mmIndex, ]<- value :
> NAs are not allowed in subscripted assignments
>
> Below is my whole sequence of commands, and my sessionInfo, for your convenience:
>
>
>> library("affy", "mogene10stv1.r3cdf")
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
> Vignettes contain introductory material. To view, type
> 'openVignette()'. To cite Bioconductor, see
> 'citation("Biobase")' and for packages 'citation(pkgname)'.
>
>> getwd();
> [1] "/home/..."
>> workingDir = "/home/...";
>> setwd(workingDir);
>> #loading the data
>> Data<-read.affybatch("02_(MoGene-1_0-st-v1).CEL","07_(MoGene-1_0-st-v1).CEL","08_(MoGene-1_0-st-v1).CEL","09_(MoGene-1_0-st-v1).CEL","10_(MoGene-1_0-st-v1).CEL","11_(MoGene-1_0-st-v1).CEL","13_(MoGene-1_0-st-v1).CEL","18_(MoGene-1_0-st-v1).CEL","20_(MoGene-1_0-st-v1).CEL","23_(MoGene-1_0-st-v1).CEL","30_(MoGene-1_0-st-v1).CEL","31_(MoGene-1_0-st-v1).CEL", cdfname = "mogene10stv1.r3cdf")
> Warning message:
> In read.affybatch("02_(MoGene-1_0-st-v1).CEL", "07_(MoGene-1_0-st-v1).CEL", :
> Incompatible phenoData object. Created a new one.
>
>> annotation(Data) = "mogene10sttranscriptcluster.db"
>> Data
> AffyBatch object
> size of arrays=1050x1050 features (21 kb)
> cdf=mogene10stv1.r3cdf (34760 affyids)
> number of samples=12
> number of genes=34760
> annotation=mogene10sttranscriptcluster.db
> notes=
>> adf<- read.AnnotatedDataFrame("targets.csv",header=TRUE, sep="\t")
>> adf
> An object of class "AnnotatedDataFrame"
> rowNames: "02_(MoGene-1_0-st-v1).CEL", "07_(MoGe
> ne-1_0-st-v1).CEL", ..., "08_(MoGene-1_0-st-v1).CEL" (12 total)
> varLabels and varMetadata description:
> X.Group.:
> X.Treatment.:
>> phenoData(Data)<-adf
>> library("puma")
>> Data.mmgmos<- mmgmos(Data)
> Error in exprs(object)[mmIndex, ]<- value :
> NAs are not allowed in subscripted assignments
>
>> sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-pc-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
> [3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8
> [5] LC_MONETARY=C LC_MESSAGES=en_US.utf8
> [7] LC_PAPER=en_US.utf8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] puma_1.12.0 mogene10stv1.r3cdf_2.5.0 affy_1.24.2
> [4] Biobase_2.6.1
>
> loaded via a namespace (and not attached):
> [1] affyio_1.14.0 preprocessCore_1.8.0 tools_2.10.1
>
>
> Thank you in advance for your attention. Any comment or suggestion would be highly appreciated.
>
> Marco
>
> ________________________________________
> From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-bounces at stat.math.ethz.ch] on behalf of Sean Davis [sdavis2 at mail.nih.gov]
> Sent: Tuesday, 12 October 2010 11:40
> To: Georgia Tsiliki
> Cc: Bioconductor Newsgroup
> Subject: Re: [BioC] GEOquery question
>
> On Tue, Oct 12, 2010 at 4:51 AM, Georgia Tsiliki <g_tsiliki at hotmail.com> wrote:
>
>> Dear Dr Davis,
>> I am a biostatistician at BRFAA, Athens. I am currently using the
>> 'GEOquery' package with Bioconductor/R. I had a problem with GSE3494 series;
>> particularly, I cannot download the 'Data Table of the Clinicopathological
>> variables of the Upsala cohort header description' and the 'GEO Sample
>> accession numbers and associated Patient IDs header description' files. Both
>> of them are included in the GEO accession Viewer with an option to download
>> them, but I'm not sure how I can do that via the GEOquery package. I don't
>> think there's a soft file for that particular series, do you think that
>> might be the problem?
>>
>>
> Hi, Georgia.
>
> I realized a few months ago that this GSE (and others like it) existed. I
> added a function to GEOquery to grab the GSE data tables. In the case of
> GSE3494, there are two of these data tables, so the function will return a
> list of two data.frames.
>
> gsedt = getGSEDataTables('GSE3494')
>
> Now, gsedt is a list of length 2 and holds each of the GSE data tables in
> the list. You can use getGEOSuppFiles to get the actual raw data. With the
> two pieces, it is not difficult to generate an ExpressionSet using the
> normal affy/Bioc tools.
>
> Hope that helps.
>
> Sean
>
>
>
>> Thank you very much for your time,
>> Georgia Tsiliki
>>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
--
Dr Richard D Pearson richard.pearson at well.ox.ac.uk
Wellcome Trust Centre for Human Genetics http://www.well.ox.ac.uk/~rpearson
University of Oxford Tel: +44 (0)1865 617890
Roosevelt Drive Mob: +44 (0)7971 221181
Oxford OX3 7BN, UK Fax: +44 (0)1865 287664