[BioC] annotating microarray data with mogene10stv1
Jakub Stanislaw Nowak
jakub.nowak at ed.ac.uk
Tue Jul 22 21:41:43 CEST 2014
Hi Jim,
Thanks for your suggestion. Somehow I overlooked the select() function. Now I think I am getting closer.
I have a problem applying select() to my probes. I think it may be because my ID object, which holds the probes, is an ExpressionSet rather than a character vector.
So first, as explained before, I generated ID, containing the main probes from my dataset:
> > ID <- getMainProbes(eset)
> > ID
> ExpressionSet (storageMode: lockedEnvironment)
> assayData: 28858 features, 6 samples
> element names: exprs
> protocolData
> rowNames: mock1 mock2 ... siLin28a2 (6 total)
> varLabels: exprs dates
> varMetadata: labelDescription channel
> phenoData
> rowNames: mock1 mock2 ... siLin28a2 (6 total)
> varLabels: index
> varMetadata: labelDescription channel
> featureData: none
> experimentData: use 'experimentData(object)'
> Annotation: pd.mogene.1.0.st.v1
Then I tried to annotate using select(), and I got this error:
> > tmp <- select(mogene10sttranscriptcluster.db, ID, c("SYMBOL","GENENAME","ENTREZID"))
> Error in .testForValidKeys(x, keys, keytype) :
> 'keys' must be a character vector
However, if I use an ID generated with featureNames(), select() works, but I think this approach does not remove the control probes you described before.
Is there a way to convert a value of type ExpressionSet to a character vector? Or alternatively, what should I do to make it work?
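One possible way to combine the two steps, as a minimal sketch rather than a definitive answer: apply featureNames() to the ExpressionSet returned by getMainProbes(), so the keys passed to select() are a character vector that already excludes the control probesets. The helper name annotate_main_probes is hypothetical, and the sketch assumes that eset comes from oligo::rma() and that affycoretools, Biobase, AnnotationDbi and mogene10sttranscriptcluster.db are installed.

```r
# Sketch: annotate only the main (non-control) probesets.
# Assumption: `eset` is the ExpressionSet returned by oligo::rma().
# `annotate_main_probes` is a hypothetical helper name, not a package function.
annotate_main_probes <- function(eset,
                                 columns = c("SYMBOL", "GENENAME", "ENTREZID")) {
  # Drop control probesets; the result is still an ExpressionSet.
  main <- affycoretools::getMainProbes(eset)

  # featureNames() returns the probeset IDs as a character vector,
  # which is the kind of 'keys' argument select() expects.
  ids <- Biobase::featureNames(main)

  AnnotationDbi::select(
    mogene10sttranscriptcluster.db::mogene10sttranscriptcluster.db,
    keys    = ids,
    columns = columns,
    keytype = "PROBEID"
  )
}

# Usage (not run here; needs the CEL-derived `eset`):
# tmp <- annotate_main_probes(eset)
```

Note that select() can return more than one row for a probeset that maps to several genes, so the result may have more rows than there are probeset IDs.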
Many thanks,
Jakub
On 22 Jul 2014, at 17:21, James W. MacDonald <jmacdon at uw.edu> wrote:
> Hi Jakub,
>
> Please don't take questions off-list (use 'Reply-all' when responding).
>
> On 7/22/2014 12:06 PM, Jakub Stanislaw Nowak wrote:
>> Hi Jim,
>>
>> I think I have a couple of follow-up questions, as I got stuck trying to use the getMainProbes function.
>> As I am still a beginner with R, my questions might sound quite naive.
>>
>> 1. My first question is about loading data using the oligo package. Which approach would you use, or do they both give the same output?
>>
>>>> celFiles<-list.celfiles()
>>>> mydata <- read.celfiles(celFiles)
>>> Platform design info loaded.
>>> Reading in : GSM910962.CEL
>>> Reading in : GSM910963.CEL
>>> Reading in : GSM910964.CEL
>>> Reading in : GSM910965.CEL
>>> Reading in : GSM910966.CEL
>>> Reading in : GSM910967.CEL
>>
>> or
>>
>>>> adf<-read.AnnotatedDataFrame("target.txt",row.names=1, header=TRUE, as.is=TRUE)
>>>> mydata2 <- read.celfiles(filenames=pData(adf)$FileName,phenoData=adf)
>>> Platform design info loaded.
>>> Reading in : GSM910962.CEL
>>> Reading in : GSM910963.CEL
>>> Reading in : GSM910964.CEL
>>> Reading in : GSM910965.CEL
>>> Reading in : GSM910966.CEL
>>> Reading in : GSM910967.CEL
>>> Warning message:
>>> In read.celfiles(filenames = pData(adf)$FileName, phenoData = adf) :
>>> 'channel' automatically added to varMetadata in phenoData.
>
> There should be no difference between the two, other than the obvious difference in the phenoData slot.
>
>>
>> 2. How would you use the getMainProbes function?
>>
>> I tried this and ended up getting an error:
>>
>>>> eset <- rma(mydata)
>>> Background correcting
>>> Normalizing
>>> Calculating Expression
>>
>>>> ID <- getMainProbes(eset)
>>>> ID
>>> ExpressionSet (storageMode: lockedEnvironment)
>>> assayData: 28858 features, 6 samples
>>> element names: exprs
>>> protocolData
>>> rowNames: mock1 mock2 ... siLin28a2 (6 total)
>>> varLabels: exprs dates
>>> varMetadata: labelDescription channel
>>> phenoData
>>> rowNames: mock1 mock2 ... siLin28a2 (6 total)
>>> varLabels: index
>>> varMetadata: labelDescription channel
>>> featureData: none
>>> experimentData: use 'experimentData(object)'
>>> Annotation: pd.mogene.1.0.st.v1
>
> You didn't get an error. You were returned an ExpressionSet containing only the 28,858 main probes (you started with 35K or so, IIRC).
>
>>
>>>> symbol <- getSYMBOL(ID, "pd.mogene.1.0.st.v1")
>>> Error in unlist(lookUp(x, data, "SYMBOL")) :
>>> error in evaluating the argument 'x' in selecting a method for function 'unlist': Error in mget(x, envir = getAnnMap(what, chip = data, load = load), ifnotfound = NA) :
>>> error in evaluating the argument 'envir' in selecting a method for function 'mget': Error in (function (classes, fdef, mtable) :
>>> unable to find an inherited method for function ‘columns’ for signature ‘"AffyGenePDInfo"’
>>
>> I think getMainProbes and featureNames return different output formats, so maybe my reasoning is wrong when I try to obtain symbols.
>> Also, which annotation would you use: pd.mogene.1.0.st.v1 or mogene10sttranscriptcluster.db?
>
> I gave you a suggestion previously that you shouldn't be using getSYMBOL(), or lookUp() or any of the old-style annotation functions. That suggestion still holds! Use select() instead!
>
> Also, pd.mogene.1.0.st.v1 isn't an annotation package. It is similar in spirit to the cdf packages that you use with the affy package, and is used to map probes to probesets, among other things.
>
> The annotation package for this array, when summarized at the 'core' level (which is the default for oligo::rma()), is the mogene10sttranscriptcluster.db package. Refer to my previous email to see how to use this package to annotate your data.
>
> Best,
>
> Jim
>
>
>>
>> I will be grateful if you can give me some suggestions.
>>
>> Thanks,
>>
>> Jakub
>>
>>
>>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099