[Bioc-devel] ExperimentHub::GSE62944 outdated
Ludwig Geistlinger
Ludwig.Geistlinger at bio.ifi.lmu.de
Thu Jun 2 16:06:49 CEST 2016
Hi,
I would like to do some analysis on the TCGA data as provided in
ExperimentHub's GSE62944 ExpressionSet.
The Description of the dataset reads:
"TCGA re-processed RNA-Seq data from 9264 Tumor Samples and 741 normal
samples across 24 cancer types"
However, when loading the dataset via
> library(ExperimentHub)
> eh <- ExperimentHub()
> query(eh, "GSE62944")
> tcga_data <- eh[["EH1"]]
and counting the samples
> dim(tcga_data)
Features Samples
23368 7706
as well as the cancer types
> length(table(pData(tcga_data)[,"CancerType"]))
I observe discrepancies with the above description (e.g. 7706 samples
instead of 9264 + 741 = 10005), indicating that this is an outdated
version of the dataset.
Would it be possible to
(1) update it accordingly, and
(2) include a varLabel, i.e. a pData column, indicating whether each sample
is a tumor or an adjacent normal sample for the respective cancer type?
That would be great!
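To illustrate what I mean by (2), here is a minimal sketch of how such a column could be derived, assuming the sample names are full TCGA barcodes (in the TCGA barcode scheme, characters 14-15 hold the sample-type code: 01-09 tumor, 10-19 normal, 20-29 control). The helper name sample_type_from_barcode and the example barcodes are made up for illustration and are not part of ExperimentHub or the dataset.

```r
# Hypothetical helper (not part of ExperimentHub or GSE62944):
# classify TCGA barcodes by the sample-type code at characters 14-15.
sample_type_from_barcode <- function(barcodes) {
  # Non-numeric or too-short barcodes yield NA rather than an error
  code <- suppressWarnings(as.integer(substr(barcodes, 14, 15)))
  type <- rep(NA_character_, length(barcodes))
  type[!is.na(code) & code >= 1  & code <= 9]  <- "tumor"
  type[!is.na(code) & code >= 10 & code <= 19] <- "normal"
  type[!is.na(code) & code >= 20 & code <= 29] <- "control"
  type
}

# Made-up barcodes, for illustration only
example_barcodes <- c("TCGA-AA-0001-01A-01R-0000-07",
                      "TCGA-AA-0001-11A-01R-0000-07")
sample_types <- sample_type_from_barcode(example_barcodes)
```

If the sample names of tcga_data do follow this scheme (an assumption), something like pData(tcga_data)$SampleType <- sample_type_from_barcode(sampleNames(tcga_data)) would add the column.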
Thx & Best,
Ludwig
--
Dr. Ludwig Geistlinger
Lehr- und Forschungseinheit für Bioinformatik
Institut für Informatik
Ludwig-Maximilians-Universität München
Amalienstrasse 17, 2. Stock, Büro A201
80333 München
Tel.: 089-2180-4067
eMail: Ludwig.Geistlinger at bio.ifi.lmu.de