[BioC] How to read a subset of the .CEL files
Roayaei, Jean (DMS) [Contr]
roayaeij at css.ncifcrf.gov
Tue Jun 27 18:30:43 CEST 2006
Dear all,
Henrik's explanation is correct. Similar queries made against NCBI
soybean data sets yield the same number of genes.
Jean Roayaei
DMS, NCI-Frederick
-----Original Message-----
From: henrik.bengtsson at gmail.com [mailto:henrik.bengtsson at gmail.com] On
Behalf Of Henrik Bengtsson
Sent: Monday, June 26, 2006 6:31 AM
To: James W. MacDonald
Cc: Alvord, Greg (DMS) [Contr]; Roayaei, Jean (DMS) [Contr];
bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] How to read a subset of the .CEL files
See the affxparser package, e.g. readCelUnits(filenames,
units=c(1,600:612,45)), where units takes unit (probeset) indices. At the
moment, you have to take it from there yourself.
Henrik Bengtsson
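
As a rough sketch of what "taking it from there" might involve: readCelUnits() selects units by index, so probeset names first have to be mapped to unit indices in the CDF. The example below assumes the unit names come from affxparser's readCdfUnitNames(). The file names and ID vectors are hypothetical placeholders, and the affxparser calls only run when the package and files are actually present.

```r
# Hypothetical stand-ins: in practice unit_names would come from
# affxparser::readCdfUnitNames(cdf_file), and wanted_ids from the
# Affy.ID column of Glycine.max.Species.AffyID.df.
unit_names <- c("AFFX-BioB-3_at", "AFFX-BioB-5_at", "AFFX-BioB-M_at",
                "Hypothetical.1_at", "soybean_rRNA_918_at")
wanted_ids <- c("AFFX-BioB-5_at", "soybean_rRNA_918_at", "not_on_chip_at")

# Map probeset names to unit indices; IDs absent from the CDF give NA.
unit_idx <- match(wanted_ids, unit_names)
missing_ids <- wanted_ids[is.na(unit_idx)]
unit_idx <- unit_idx[!is.na(unit_idx)]

# Placeholder paths (assumptions, not taken from the thread).
cdf_file  <- "Soybean.cdf"
cel_files <- list.files(pattern = "\\.CEL$", ignore.case = TRUE)

if (requireNamespace("affxparser", quietly = TRUE) &&
    file.exists(cdf_file) && length(cel_files) > 0) {
  # Same mapping, this time against the real CDF unit names.
  real_names <- affxparser::readCdfUnitNames(cdf_file)
  real_idx <- match(wanted_ids, real_names)
  real_idx <- real_idx[!is.na(real_idx)]
  # Read intensities only for the selected units.
  cel_units <- affxparser::readCelUnits(cel_files, units = real_idx,
                                        cdf = cdf_file)
}
```

Reporting missing_ids before reading anything makes it easier to spot IDs that were mistyped or belong to a different chip type.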
On 6/24/06, James W. MacDonald <jmacdon at med.umich.edu> wrote:
> Hi Greg,
>
> Alvord, Greg (DMS) [Contr] wrote:
> >
> >
> > Dear List -
> >
> > I am new to BioConductor and R, working under Windows with a gig of RAM,
> > version R-2.2.1 of R. I have successfully read in six .CEL files and
> > created the following AffyBatch object.
> >
> > > soy.ab
> > AffyBatch object
> > size of arrays=1164x1164 features (63516 kb)
> > cdf=Soybean (61170 affyids)
> > number of samples=6
> > number of genes=61170
> > annotation=soybean
> >
> > The investigator for whom I'm working is interested in an analysis of
> > differential gene expression on a subset of affyids in this AffyBatch
> > object, specifically in 37,744 of the 61,170 affyids (indicated above)
> > that relate specifically to the soybean genome. I have learned that the
> > relevant species of interest is labeled 'Glycine max'. I obtained this
> > information from another source and have not (due to my ignorance) been
> > able to identify any slot in soy.ab AffyBatch object that identifies
> > this species. Here is a table of the species on the soy.ab AffyBatch
> > object (which I obtained from another source).
> >
> > > cbind(table(Species))
> >                                           [,1]
> > Alfalfa mosaic virus                         3
> > Bean pod mottle virus strain G-7             2
> > Bean pod mottle virus strain K-Hancock1      1
> > Clover yellow vein virus                     1
> > Glycine max                              37744
> > Heterodera glycines                       7539
> > Phytophthora sojae                       15864
> > S. saman                                     4
> > Southern bean mosaic virus strain SBMV-S     1
> > Soybean mosaic virus                         1
> > Soybean mosaic virus strain G5               3
> > Soybean mosaic virus strain G7               1
> > Soybean mosaic virus strain N                1
> > Tobacco ringspot virus                       2
> > Tobacco streak virus                         3
> >
> > I want to select from the soy.ab AffyBatch object the relevant
> > information for the species 'Glycine max' only. I have created a data
> > frame containing those Affy.ID's for species 'Glycine max', e.g.,
> >
> > > Glycine.max.Species.AffyID.df[c(1:3,37742:37744),]
> >           Species                Affy.ID
> > 8     Glycine max         AFFX-BioB-3_at
> > 9     Glycine max         AFFX-BioB-5_at
> > 10    Glycine max         AFFX-BioB-M_at
> > 37749 Glycine max soybean_rRNA_838_RC_at
> > 37750 Glycine max    soybean_rRNA_918_at
> > 37751 Glycine max soybean_rRNA_918_RC_at
> >
> > > dim(Glycine.max.Species.AffyID.df)
> > [1] 37744     2
> >
> > How do I extract/create an AffyBatch object containing only the
> > appropriate Affy.ID's related to the 'Glycine max' species?
>
> An AffyBatch object isn't the best for subsetting this way. Better would
> be to compute expression values using rma() or your favorite method, and
> then subset.
>
> eset <- rma(soy.ab)
> subsetted.exprset <- eset[Glycine.max.Species.AffyID.df[,2],]
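
One caveat when subsetting by name, sketched here in base R rather than on the actual objects: for a plain matrix, indexing with a row name that does not exist raises a "subscript out of bounds" error, and indexing with a factor uses its integer codes as row positions rather than its labels. The small matrix below is made up and merely stands in for the expression values; the ID vector is hypothetical, and whether the Affy.ID column is really a factor is an assumption.

```r
# Made-up expression matrix standing in for exprs(eset); rows are probesets.
expr <- matrix(1:8, nrow = 4,
               dimnames = list(c("A_at", "B_at", "C_at", "D_at"),
                               c("s1", "s2")))

# Hypothetical ID column; as.character() guards against the column being a
# factor, whose integer codes would otherwise be used as row positions.
ids <- as.character(factor(c("D_at", "B_at", "Z_at")))

# Keep only IDs present in the matrix and note the ones that are not.
keep <- intersect(ids, rownames(expr))
dropped <- setdiff(ids, rownames(expr))

# drop = FALSE keeps the result a matrix even if only one row survives.
subsetted <- expr[keep, , drop = FALSE]
```

The same pattern (convert to character, intersect with the available feature names, then subset) should carry over to the rma() output, though that has not been checked against the soybean data here.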
>
> HTH,
>
> Jim
>
> --
> James W. MacDonald
> University of Michigan
> Affymetrix and cDNA Microarray Core
> 1500 E Medical Center Drive
> Ann Arbor MI 48109
> 734-647-5623
>