[BioC] How to create a MAList or ExprSet object from a matrix
Marcus Davy
mdavy at hortresearch.co.nz
Thu Aug 17 05:02:49 CEST 2006
Further to Martin's email, this code might be useful to you for what looks
like your probe set information after normalization.
Marcus
tmp <- scan(what=character(0))
56071 1052 1062 3061 3081 8052 8072 10061 10062 10072 1415670_at
8.430148 8.899385 8.625973 8.708319 8.759182 8.281378 8.905347 8.625347
9.029528 1415671_at 9.039655 9.244914 9.121714 9.002296 8.97237 8.599152
9.004381 9.267188 9.115415 1415672_at 8.86041 8.998826 9.077138 8.994297
8.885136 8.918512 9.087072 8.867808 8.841663 1415673_at 6.565344 6.384893
6.856466 6.17951 5.786523 6.507357 6.371563 5.886887 6.42499 1415674_a_at
7.877212 8.038635 8.120319 8.067843 7.56546 7.846677 7.921398 7.629843
7.787807 1415675_at 7.524559 7.496189 7.718928 7.164805 7.102158 7.331314
7.226036 7.424044 7.368011 1415676_a_at 9.315694 9.134394 9.224642 8.821193
8.886963 8.702572 8.883647 9.028728 8.921372
""
# Drop the first 10 tokens (they look like slide IDs). Each probe has only
# 9 values, so placeholder letters are used as slide names instead.
SlideIDs <- LETTERS[1:9]
tmp <- tmp[-(1:10)]
# Index and get the annotation
IDindex <- seq(1,length(tmp), by=10)
probeIDs <- tmp[ IDindex ]
# Construct a matrix of expressions
expressions <- matrix(as.numeric(tmp[-IDindex]), ncol=10-1, byrow=TRUE)
# Attach the probe IDs as row names
rownames(expressions) <- probeIDs
pd <- new("phenoData",
pData=data.frame(Slide=1:(10-1), row.names=SlideIDs),
varLabels=list(Slide="Slide identifiers"))
eset <- new("exprSet", phenoData=pd, exprs=expressions)
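To check the reshaping step without pasting at the console, the same idea can be sketched in base R with scan(text=...). This is only an illustration on the first two probes from the data above, with the header tokens left out; the names txt, tok, nSlides, idIndex, ids and m are mine and not part of the code above.

```r
# Minimal sketch: first two probes of the data above, header tokens omitted.
txt <- "1415670_at 8.430148 8.899385 8.625973 8.708319 8.759182 8.281378
8.905347 8.625347 9.029528 1415671_at 9.039655 9.244914 9.121714 9.002296
8.97237 8.599152 9.004381 9.267188 9.115415"
tok <- scan(text = txt, what = character(0), quiet = TRUE)

nSlides <- 9                                   # values per probe (assumed from the data)
idIndex <- seq(1, length(tok), by = nSlides + 1)  # every 10th token is a probe ID
ids <- tok[idIndex]

# One row per probe, one column per slide; letters are placeholder slide names
m <- matrix(as.numeric(tok[-idIndex]), ncol = nSlides, byrow = TRUE,
            dimnames = list(ids, LETTERS[1:nSlides]))

# Each probe should contribute exactly nSlides numeric values
stopifnot(length(tok) %% (nSlides + 1) == 0, !anyNA(m))
dim(m)
```

The stopifnot() line is there because a stray token (for example an empty string picked up by scan) would shift every later probe ID out of position.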
On 8/16/06 11:28 AM, "Martin Morgan" <mtmorgan at fhcrc.org> wrote:
> swang <swang2000 at gmail.com> writes:
>
>> Dear List:
>>
>> I have a file like the following; I guess the data are M (log2 expression
>> ratios) from a microarray:
>>
>> 56071 1052 1062 3061 3081 8052 8072 10061 10062 10072 1415670_at
>> 8.430148 8.899385 8.625973 8.708319 8.759182 8.281378 8.905347 8.625347
>
>> The rows are Affymetrix probes and the columns are different mouse numbers
>> (arrays). I need to do a category analysis using the category package, so I
>> need to generate an MAList or ExprSet object.
>
> Starting with a data matrix
>
>> samples <- 3
>> sampleNames <- letters[1:samples]
>> features <- 1000
>> ## raw data
>> exprMatrix <- matrix(0, ncol=samples,
> + nrow=features,
> + dimnames=list(1:features, sampleNames))
>
> To create an old-style exprSet (not sure what an ExprSet is, or which
> package you mean by Category ;):
>
>> ## phenoData for exprSet
>> pd2 <- new("phenoData",
> + pData=data.frame(id=1:samples,
> + row.names=sampleNames),
> + varLabels=list(id="sample identifier"))
>> new("exprSet",
> + phenoData=pd2,
> + exprs=exprMatrix)
> Expression Set (exprSet) with
> 1000 genes
> 3 samples
> phenoData object with 1 variables and 3 cases
> varLabels
> id: sample identifier
>
> To create an ExpressionSet (using this will require different commands
> from the vignette that comes with Category) object:
>
>> ## phenoData for ExpressionSet
>> pd1 <- new("AnnotatedDataFrame",
> + data=
> + data.frame(sampleId=1:samples,
> + row.names=sampleNames),
> + varMetadata=
> + data.frame(labelDescription=I(c("Sample numeric identifier")),
> + row.names=c("sampleId")))
>> new("ExpressionSet",
> + phenoData=pd1, exprs=exprMatrix)
> Instance of ExpressionSet
>
> assayData
> Storage mode: lockedEnvironment
> featureNames: 1, 2, 3, ..., 999, 1000 (1000 total)
> Dimensions:
> exprs
> Rows 1000
> Samples 3
>
> phenoData
> sampleNames: a, b, c
> varLabels and descriptions:
> sampleId: Sample numeric identifier
>
> Experiment data
> Experimenter name:
> Laboratory:
> Contact information:
> Title:
> URL:
> PMIDs:
> No abstract available.
>
> Annotation character(0)
>
> Much of the functionality of exprSet and ExpressionSet comes from
> associating phenoData with expression values; the skeletons above do
> not have any meaningful phenoData. Typically you might incorporate
> this by reading phenotypic data from a spreadsheet or tab-delimited
> file (e.g., using read.table) into data.frames, and then incorporating
> the data.frame into an ExpressionSet as outlined above.
>
>
>> sessionInfo()
> Version 2.3.1 Patched (2006-06-20 r38364)
> x86_64-unknown-linux-gnu
>
> attached base packages:
> [1] "tools" "methods" "stats" "graphics" "grDevices" "utils"
> [7] "datasets" "base"
>
> other attached packages:
> Biobase
> "1.10.1"
>
>
> Martin
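
Martin's closing suggestion (reading phenotypic data with read.table and pairing it with the expression matrix) might look roughly like the sketch below. The column names strain and treatment, their values, and the inline text standing in for a tab-delimited file are all made up for illustration. Checking that the phenotype row names match the matrix column names is my own suggested precaution and does not come from the thread.

```r
# Hypothetical phenotype table; in practice this would come from a file,
# e.g. read.table("pheno.txt", header = TRUE, row.names = 1, sep = "\t")
phenoText <- "sample\tstrain\ttreatment
a\tB6\tcontrol
b\tB6\ttreated
c\tDBA\tcontrol"
pheno <- read.table(text = phenoText, header = TRUE, row.names = 1, sep = "\t")

# Small stand-in for Martin's exprMatrix (all zeros, 3 samples)
exprMatrix <- matrix(0, nrow = 5, ncol = 3,
                     dimnames = list(1:5, c("a", "b", "c")))

# Samples must line up before the two are combined into one object
stopifnot(identical(rownames(pheno), colnames(exprMatrix)))
```

The resulting pheno data.frame could then be supplied as the data argument of new("AnnotatedDataFrame", ...) in the way Martin shows above.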