[BioC] Extracting patient ID from AffyBatch objects
Sean Davis
seandavi at gmail.com
Tue Apr 27 13:05:47 CEST 2010
On Tue, Apr 27, 2010 at 2:41 AM, Popa Tiberiu <popatiberiuo at yahoo.com> wrote:
> I have a set of 31 CEL files which I read into an AffyBatch object:
>
>> myAB = ReadAffy()
>
>> sampleNames(myAB)
> [1] "GSM2474.CEL.gz" "GSM2475.CEL.gz" "GSM2476.CEL.gz" "GSM2477.CEL.gz" "GSM2478.CEL.gz" "GSM2479.CEL.gz" "GSM2480.CEL.gz" "GSM2481.CEL.gz"
> [9] "GSM2482.CEL.gz" "GSM2483.CEL.gz" "GSM2484.CEL.gz" "GSM2485.CEL.gz" "GSM2486.CEL.gz" "GSM2487.CEL.gz" "GSM2488.CEL.gz" "GSM2489.CEL.gz"
> [17] "GSM2490.CEL.gz" "GSM2491.CEL.gz" "GSM2492.CEL.gz" "GSM2493.CEL.gz" "GSM2494.CEL.gz" "GSM2495.CEL.gz" "GSM2496.CEL.gz" "GSM2497.CEL.gz"
> [25] "GSM2498.CEL.gz" "GSM2499.CEL.gz" "GSM2500.CEL.gz" "GSM2501.CEL.gz" "GSM2502.CEL.gz" "GSM2503.CEL.gz" "GSM2504.CEL.gz"
>
> I have a CSV file containing some extra data for our samples.
>
>> disease <- as.matrix(read.table("s12.csv", header=TRUE, sep=",", row.names=1))
>> rownames(disease)
> [1] "968-1" "928-1" "934-1" "709-1" "930-1" "524-1" "455-1" "370-1" "810-1" "1146-1" "1161-1" "1006-1" "942-1" "1060-1" "1255-1" "441-1"
> [17] "780-1" "815-2" "829-1" "861-1" "925-1" "1008-1" "1086-1" "1105-1" "1145-1" "1327-1" "1352-1" "1379-1" "533-1" "679-1" "692-1"
>
> What I am trying to do is attach the extra data to the corresponding samples.
>
> Each CEL file contains this line, which specifies the patient's ID (709 in this case):
>
> ...
> DatHeader=[59..46191] 709 Ta gr2 ...
> ...
>
> Is there any way to get the sample ID list from the AffyBatch object?
Not a direct answer, but since these data are from NCBI GEO, why not
use GEOquery to get the information about samples?
> library(GEOquery)
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material. To view, type
'openVignette()'. To cite Bioconductor, see
'citation("Biobase")' and for packages 'citation(pkgname)'.
Loading required package: RCurl
Loading required package: bitops
> gse <- getGEO('GSE88')[[1]]
Found 1 file(s)
GSE88_series_matrix.txt.gz
trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE88/GSE88_series_matrix.txt.gz'
ftp data connection made, file length 492380 bytes
opened URL
==================================================
downloaded 480 Kb
File stored at:
/var/folders/F+/F+PwkbXqF6WeunvinD8pZk+++TI/-Tmp-//Rtmp2M4lZS/GPL80.soft
> head(pData(gse))
                        title geo_accession                status
GSM2474  Bladder sample 709-1       GSM2474 Public on Dec 08 2002
GSM2475  Bladder sample 928-1       GSM2475 Public on Dec 08 2002
GSM2476  Bladder sample 930-1       GSM2476 Public on Dec 08 2002
GSM2477  Bladder tumour 934-1       GSM2477 Public on Dec 08 2002
GSM2478  Bladder sample 968-1       GSM2478 Public on Dec 08 2002
GSM2479 Bladder sample 1006-1       GSM2479 Public on Dec 08 2002
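The titles printed above end in the same IDs that appear as row names in the CSV file, so one way to line the two up is to take the last token of each title and match on the GEO accession. The following is only a sketch in base R: the six rows are typed in by hand from the output above rather than pulled from pData(gse), and the names pd, cel_names, accession and patient_id are made up for illustration.

```r
# Sample annotation copied from the head(pData(gse)) output (first six rows only).
pd <- data.frame(
  title = c("Bladder sample 709-1", "Bladder sample 928-1",
            "Bladder sample 930-1", "Bladder tumour 934-1",
            "Bladder sample 968-1", "Bladder sample 1006-1"),
  geo_accession = c("GSM2474", "GSM2475", "GSM2476",
                    "GSM2477", "GSM2478", "GSM2479"),
  stringsAsFactors = FALSE
)

# First six sample names as returned by sampleNames(myAB) in the question.
cel_names <- c("GSM2474.CEL.gz", "GSM2475.CEL.gz", "GSM2476.CEL.gz",
               "GSM2477.CEL.gz", "GSM2478.CEL.gz", "GSM2479.CEL.gz")

# Assumption: the ID is the last whitespace-separated token of the title.
pd$patient_id <- sub("^.*\\s", "", pd$title)

# Strip the file extension so the names line up with the GEO accessions.
accession <- sub("\\.CEL\\.gz$", "", cel_names)

# Look up the ID for each array, keeping the AffyBatch order.
patient_id <- pd$patient_id[match(accession, pd$geo_accession)]
names(patient_id) <- cel_names
print(patient_id)
```

With the real objects, the same lookup on pData(gse) and sampleNames(myAB) should give a vector that can index the CSV matrix by row name, along the lines of disease[patient_id, , drop = FALSE], provided every ID really does occur among rownames(disease).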
> sessionInfo()
R version 2.11.0 Under development (unstable) (2009-11-13 r50424)
i386-apple-darwin10.2.0
locale:
[1] en_US/en_US/C/C/en_US/en_US
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] GEOquery_2.11.2 RCurl_1.3-1 bitops_1.0-4.1 Biobase_2.7.3
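Returning to the DatHeader line quoted in the question: if the header text is already available as a string, the ID can be pulled out with a regular expression. This is a rough sketch that assumes the ID is always the first field after the bracketed range, which the single example in the question suggests but does not prove. The string below is the quoted line with the trailing "..." dropped, and header_line, fields and patient are illustrative names. How the header is read from the CEL file is not shown here; read.celfile.header() in the affyio package with info = "full" may expose it, but that should be checked against its documentation.

```r
# Header line as quoted in the question (trailing "..." dropped).
header_line <- "DatHeader=[59..46191] 709 Ta gr2"

# Drop the "DatHeader=[min..max]" prefix, then split the rest on whitespace.
rest <- sub("^DatHeader=\\[[^]]*\\]\\s*", "", header_line)
fields <- strsplit(rest, "\\s+")[[1]]

# Assumption: the first remaining field is the patient ID.
patient <- fields[1]
print(patient)
```

Note that the CSV row names carry a suffix ("709-1" rather than "709"), so matching against them would need that suffix removed first, for example with sub("-.*$", "", rownames(disease)).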