[BioC] Creating a new instance of oligoSnpSet
Martin Morgan
mtmorgan at fhcrc.org
Wed Nov 26 22:22:44 CET 2008
Hi Steven --
Steven McKinney wrote:
> Hi all,
>
> Thanks to Robert Scharpf for a quick and detailed
> off-line response. For anyone else that may encounter
> this issue: my problem was that my featureData object's
> 'data' slot data frame did not have names "chromosome"
> and "position" .
>
> I originally defined my featureData object as
>
>> cclfd <-
> + new("AnnotatedDataFrame",
> + data = data.frame(position = pData(featureData(ccld)[, "MapInfo"]),
> + chromosome = pData(featureData(ccld)[, "CHR"]),
> + stringsAsFactors = FALSE),
> + varMetadata = data.frame(labelDescription = c("position", "chromosome")))
>
> extracting directly from my ccld object (a SnpSetIllumina object
> from beadarraySNP command read.SnpSetIllumina()
> ccld <- read.SnpSetIllumina(samplesheet = "ccl_CNV370SampleSheet_8samples.csv",
> reportfile = "ccl_FinalReport_2.txt")
> )
>
>
> This yielded an AnnotatedDataFrame object with slot 'data'
> containing a data frame whose names were not those I had
> put in the data.frame() code above (namely "position"
> and "chromosome").
>
>> str(cclfd)
> Formal class 'AnnotatedDataFrame' [package "Biobase"] with 4 slots
> ..@ varMetadata :'data.frame': 2 obs. of 1 variable:
> .. ..$ labelDescription: chr [1:2] "position" "chromosome"
> ..@ data :'data.frame': 373397 obs. of 2 variables:
> .. ..$ MapInfo: num [1:373397] 1.64e+08 1.66e+08 1.66e+08 1.66e+08 1.67e+08 ...
> .. ..$ CHR : Factor w/ 25 levels "1","10","11",..: 18 18 18 18 18 18 18 18 18 18 ...
> .. .. ..- attr(*, "names")= chr [1:373397] "cnvi0000001" "cnvi0000002" "cnvi0000003" "cnvi0000004" ...
> ..@ dimLabels : chr [1:2] "rowNames" "columnNames"
> ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slots
> .. .. ..@ .Data:List of 1
> .. .. .. ..$ : int [1:3] 1 1 0
>
> So that's my R lesson for today - names specified in a
> data.frame() call don't necessarily stick!
Hmm, I'm not sure that's the right lesson -- you don't have to be that
suspicious of data.frame.
It might be AnnotatedDataFrame or oligoSnpSet, though. I wonder what
your sessionInfo() is? Also what does str(featureData(ccld)) say? An
unusual thing is the 'names' attribute of cclfd. Any chance of creating
a reproducible example (i.e., without access to your files, maybe by
referencing help pages [using the 'example()' function] or making a
version with just a few features and using dput)?
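A small reproducible version could be built along these lines. This is only a sketch: the data frame `fd` below is invented and merely stands in for a few rows of the real feature data.

```r
# Invented stand-in for the first few features of the real data.
fd <- data.frame(MapInfo = c(100L, 200L, 300L),
                 CHR = factor(c("1", "1", "2")),
                 row.names = c("snp1", "snp2", "snp3"))

# dput() prints R code that rebuilds the object; paste that into the email.
dput(fd)

# A reader can recreate the object by evaluating the pasted text.
txt <- paste(capture.output(dput(fd)), collapse = "\n")
fd2 <- eval(parse(text = txt))
all.equal(fd, fd2)
```

Posting the dput() output lets others rebuild the same object without access to the original files.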
A couple of short-cuts / tips. fData(obj) gives you direct access to
pData(featureData(obj)). 'Extract-then-subset' -- fData(obj)[, "cols"] --
will usually be more efficient than subset-then-extract; there's also a
subtle difference that might be causing problems here (as you do it, you
end up with a 1-column data frame for 'chromosome', whereas
extract-then-subset results in a vector). '[[' pulls out a single column
with featureData(obj)[["cols"]] (also [[<- can be useful for defining a
single column and creating a labelDescription; obj[["cols"]] gives
direct access to pData(obj)[["cols"]]).
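The subtle difference above can be seen with base R alone (no Biobase needed). The sketch below uses a made-up data frame `fd` as a stand-in for fData(ccld); its values are invented. When data.frame() is handed a one-column data frame, that column keeps its own name and the argument name is dropped, whereas a plain vector takes the argument name.

```r
# Toy stand-in for fData(ccld); values are invented for illustration.
fd <- data.frame(MapInfo = c(100, 200, 300),
                 CHR = c("1", "1", "2"),
                 row.names = c("snp1", "snp2", "snp3"),
                 stringsAsFactors = FALSE)

# Subset but keep a data frame: a 1-column data frame carrying its own name.
as_df <- fd[, "MapInfo", drop = FALSE]
# Extract the column itself: a plain numeric vector.
as_vec <- fd[, "MapInfo"]

# One-column data frames: the inner column names win over the argument names.
from_df <- data.frame(position = as_df,
                      chromosome = fd[, "CHR", drop = FALSE])
names(from_df)   # "MapInfo" "CHR"

# Vectors: the argument names are used.
from_vec <- data.frame(position = as_vec,
                       chromosome = fd[, "CHR"],
                       row.names = rownames(fd),
                       stringsAsFactors = FALSE)
names(from_vec)  # "position" "chromosome"
```

This would be consistent with the str() output quoted earlier, where the 'data' slot shows columns MapInfo and CHR rather than position and chromosome.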
Martin
> Explicitly forcing column names and
> mode "character" for the chromosome column
> solves the problem
>
> ccld.position <- pData(featureData(ccld)[, "MapInfo"])
> names(ccld.position) <- "position"
> ccld.chromosome <- pData(featureData(ccld)[, "CHR"])
> names(ccld.chromosome) <- "chromosome"
> ccld.chromosome$chromosome <- as.character(ccld.chromosome$chromosome)
>
> cclfd <-
> new("AnnotatedDataFrame",
> data = data.frame(position = ccld.position,
> chromosome = ccld.chromosome,
> stringsAsFactors = FALSE),
> varMetadata = data.frame(labelDescription = c("position", "chromosome")))
>
> and I can create the oligoSnpSet object successfully.
>
>> cclss <-
> + new("oligoSnpSet", copyNumber = logR, calls = gt,
> + phenoData = annotatedDataFrameFrom(logR, byrow = FALSE),
> + featureData = cclfd, annotation = "HumanCNV370-Quad")
>> str(cclss)
> Formal class 'oligoSnpSet' [package "oligoClasses"] with 6 slots
>
>
> So it was the absence of columns named "chromosome" and "position"
> in the 'data' slot of the featureData object that caused internal
> code to attempt to acquire chromosome positional information from
> an annotation source.
>
> With the featureData@data data frame having the correct column
> labels "chromosome" and "position", the annotation argument
> is not processed further (it is just added to the SnpSet
> object's 'annotation' slot).
>
> Thanks again to Robert Scharpf.
>
> Best
>
> Steve McKinney
>
>
>
>
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch on behalf of Steven McKinney
> Sent: Tue 11/25/2008 9:56 PM
> To: Bioconductor at stat.math.ethz.ch
> Subject: [BioC] Creating a new instance of oligoSnpSet
>
> Hello All,
>
> I am trying to get some Illumina HumanCNV370-Quad
> data into VanillaICE to do some copy number analysis.
>
> In attempting to create an object of class "oligoSnpSet"
> I can not seem to specify an annotation that works.
>
> e.g. as specified in a vignette
>
>> cclss <-
> + new("oligoSnpSet", copyNumber = logR, calls = gt,
> + phenoData = annotatedDataFrameFrom(logR, byrow = FALSE),
> + featureData = cclfd, annotation = "Illumina550k")
> Loading required package: Illumina550k
> Error in db(object) : Illumina550k package not available
> In addition: Warning message:
> In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
> there is no package called 'Illumina550k'
> Error in dbGetQuery(db(object), sql) :
> error in evaluating the argument 'conn' in selecting a method for function 'dbGetQuery'
>
> or even if I specify some annotation that does exist
>
>> cclss <-
> + new("oligoSnpSet", copyNumber = logR, calls = gt,
> + phenoData = annotatedDataFrameFrom(logR, byrow = FALSE),
> + featureData = cclfd, annotation = "hgu133plus2cdf")
> Loading required package: hgu133plus2cdf
> Error in db(object) :
> trying to get slot "getdb" from an object of a basic class ("environment") with no slots
> Error in dbGetQuery(db(object), sql) :
> error in evaluating the argument 'conn' in selecting a method for function 'dbGetQuery'
>
>
> Is there a way to work around this annotation bit of building
> an eSet object?
>
> I can't figure out from documentation, reading source code, or
> experimenting, as to what will work for this annotation argument.
>
> I'm a bit hooped as there does not yet appear to be annotation
> for the Illumina HumanCNV370-Quad, but I have annotation
> information from other files from Illumina etc.
>
> Can I put some dummy object as an argument for annotation
> and patch it up with my known info?
>
> Any ideas?
>
>
> Steven McKinney
>
> Statistician
> Molecular Oncology and Breast Cancer Program
> British Columbia Cancer Research Centre
>
> email: smckinney +at+ bccrc +dot+ ca
>
> tel: 604-675-8000 x7561
>
> BCCRC
> Molecular Oncology
> 675 West 10th Ave, Floor 4
> Vancouver B.C.
> V5Z 1L3
> Canada
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M2 B169
Phone: (206) 667-2793