[BioC] GSVA questions
Robert Castelo
robert.castelo at upf.edu
Fri Sep 6 09:46:58 CEST 2013
Dear Joe,
the function gsva() needs to match the identifiers from the
ExpressionSet object with those from the gene sets. It does this with
the existing Bioconductor annotation infrastructure, which relies
on gene-centric annotation packages typically anchored at Entrez Gene
identifiers. Note that the object 'c2BroadSets' defines its gene sets
in terms of Entrez identifiers, which facilitates the
matching operation with ExpressionSet objects.
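
As a rough illustration of why the identifier type matters, here is a minimal base R sketch of the matching-and-filtering idea. It is not GSVA's actual code: the function filter_sets(), the set names, and the feature vector are all hypothetical, and only the identifiers themselves are borrowed from the printouts quoted below.

```r
# Hypothetical feature names of an expression matrix keyed by Entrez Gene IDs
feature_ids <- c("5167", "57191", "100288400")

# Toy gene sets (made-up names): one in Entrez IDs, one in gene symbols
entrez_sets <- list(SET_A = c("5167", "57191", "999999"))
symbol_sets <- list(SET_B = c("HNRPK", "XRCC6", "PGM1"))

# Keep only the genes found among the features, then drop every set that
# ends up with fewer than min_size genes (illustrative helper, not GSVA's)
filter_sets <- function(sets, features, min_size = 1) {
  matched <- lapply(sets, intersect, features)
  matched[vapply(matched, length, integer(1)) >= min_size]
}

kept_entrez <- filter_sets(entrez_sets, feature_ids)
kept_symbol <- filter_sets(symbol_sets, feature_ids)

length(kept_entrez)  # the Entrez-based set survives
length(kept_symbol)  # the symbol-based set matches nothing and is dropped
```

With symbols on one side and Entrez identifiers on the other, no set survives even with a minimum size of 1, which is consistent with the "gene set list is empty" error.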
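Your 'C5allBroadSets' object instead holds gene symbols with a
NullIdentifier type, so none of its identifiers match and every gene
set is filtered out. In principle, this should not be a problem if you
download the .gmt files from the Broad that contain the gene set
definitions in terms of Entrez Gene identifiers.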
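
A minimal sketch of reading such a file with GSEABase, assuming the package is installed. The temporary file and the set names are made up and stand in for a real Broad download; the point is only to declare the identifier type when calling getGmt().

```r
library(GSEABase)

# Toy two-set GMT file standing in for a Broad download (hypothetical names);
# columns are: set name, description, then tab-separated gene identifiers
gmt_file <- tempfile(fileext = ".gmt")
writeLines(c("TOY_SET_A\tna\t5167\t57191",
             "TOY_SET_B\tna\t100288400\t5167"),
           gmt_file)

# Declare the identifier type explicitly; with the default, getGmt()
# records NullIdentifier, as in the 'C5allBroadSets' printout
toy_sets <- getGmt(gmt_file, geneIdType = EntrezIdentifier())

toy_sets
```

If only the symbol-based file is at hand, reading it with geneIdType=SymbolIdentifier() and then converting with GSEABase's mapIdentifiers() and an annotation package such as org.Hs.eg.db may be an alternative; that route is an assumption and has not been checked against this data.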
cheers,
robert.
On 09/06/2013 07:46 AM, Joe [guest] wrote:
>
> Dear Markus,
>
> I did as you said: "c2BroadSets" comes from the package, and "C5allBroadSets" is loaded from a GMT file downloaded from the Broad Institute,
> as follows:
>> c2BroadSets
> GeneSetCollection
> names: NAKAMURA_CANCER_MICROENVIRONMENT_UP, NAKAMURA_CANCER_MICROENVIRONMENT_DN, ..., ST_PHOSPHOINOSITIDE_3_KINASE_PATHWAY (3272 total)
> unique identifiers: 5167, 100288400, ..., 57191 (29340 total)
> types in collection:
> geneIdType: EntrezIdentifier (1 total)
> collectionType: BroadCollection (1 total)
>
>> C5allBroadSets
> GeneSetCollection
> names: NUCLEOPLASM, EXTRINSIC_TO_PLASMA_MEMBRANE, ..., INOSITOL_OR_PHOSPHATIDYLINOSITOL_KINASE_ACTIVITY (1454 total)
> unique identifiers: HNRPK, XRCC6, ..., PGM1 (8299 total)
> types in collection:
> geneIdType: NullIdentifier (1 total)
> collectionType: NullCollection (1 total)
>
> When I use the "c2BroadSets" GeneSetCollection, it works. "NSCLC_norm_GSE32474_rma_Filter" is an ExpressionSet. Because of the customized CDF, the file uses unique gene IDs, so I set min.sz to 1:
>> NSCLC_gsva_c2<- gsva(NSCLC_norm_GSE32474_rma_Filter, c2BroadSets,min.sz=1, max.sz=500, verbose=TRUE)$es.obs
>
> When I use "C5allBroadSets", it reports an error:
>> NSCLC_gsva_c5<- gsva(NSCLC_norm_GSE32474_rma_Filter, C5allBroadSets,min.sz=1, max.sz=500, verbose=TRUE)$es.obs
> Mapping identifiers between gene sets and feature names
> Error in GSVA:::.gsva(Biobase::exprs(expr), mapped.gset.idx.list, method, :
> The gene set list is empty! Filter may be too stringent.
>
> So, how should I set the parameters to make gsva work?
>
> Thanks,
> Joe
>
> -- output of sessionInfo():
>
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936
> [2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
> [3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
> [4] LC_NUMERIC=C
> [5] LC_TIME=Chinese (Simplified)_People's Republic of China.936
>
> attached base packages:
> [1] splines grid parallel stats graphics grDevices utils
> [8] datasets methods base
>
> other attached packages:
> [1] GSVA_1.8.0 GSVAdata_0.99.10
> [3] hgu95a.db_2.9.0 hgu133plus2hsentrezgprobe_17.1.0
> [5] hgu133plus2hsentrezgcdf_17.1.0 hgu133plus2hsentrezg.db_17.1.0
> [7] hgu95av2.db_2.9.0 a4Classif_1.8.0
> [9] varSelRF_0.7-3 randomForest_4.6-7
> [11] pamr_1.54.1 survival_2.37-4
> [13] ROCR_1.0-5 gplots_2.11.3
> [15] KernSmooth_2.23-10 caTools_1.14
> [17] gdata_2.13.2 gtools_3.0.0
> [19] MLInterfaces_1.40.0 sfsmisc_1.0-24
> [21] cluster_1.14.4 rda_1.0.2-2
> [23] rpart_4.1-3 MASS_7.3-29
> [25] a4Preproc_1.8.0 a4Core_1.8.0
> [27] glmnet_1.9-5 Matrix_1.0-12
> [29] lattice_0.20-23 GSEABase_1.22.0
> [31] affy_1.38.1 GOstats_2.26.0
> [33] graph_1.38.3 Category_2.26.0
> [35] VennDiagram_1.6.5 pheatmap_0.7.6
> [37] statmod_1.4.17 limma_3.16.7
> [39] biomaRt_2.16.0 annotate_1.38.0
> [41] genefilter_1.42.0 primeviewhsentrezgprobe_17.1.0
> [43] primeviewhsentrezg.db_17.1.0 org.Hs.eg.db_2.9.0
> [45] RSQLite_0.11.4 DBI_0.2-7
> [47] primeviewhsentrezgcdf_17.1.0 AnnotationDbi_1.22.6
> [49] Biobase_2.20.1 BiocGenerics_0.6.0
> [51] rj_1.1.3-1
>
> loaded via a namespace (and not attached):
> [1] affyio_1.28.0 AnnotationForge_1.2.2 BiocInstaller_1.10.3
> [4] bitops_1.0-6 GO.db_2.9.0 IRanges_1.18.3
> [7] mboost_2.2-2 preprocessCore_1.22.0 RBGL_1.36.2
> [10] RCurl_1.95-4.1 rj.gd_1.1.3-1 stats4_3.0.1
> [13] tools_3.0.1 XML_3.98-1.1 xtable_1.7-1
> [16] zlibbioc_1.6.0
>
--
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550