[BioC] Pathview with non-KEGG organism
Luo Weijun
luo_weijun at yahoo.com
Mon Sep 16 17:48:11 CEST 2013
Christian,
You’ve done the gene ID mapping to KO correctly. To proceed with the GAGE pathway analysis, you will need the KO gene set data (which I will send you next). The KO gene set data will also be provided in the next release of the gageData package.
To see whether KEGG includes your research species, you may check:
library(pathview)
data(korg)
head(korg)
If it is included, you don’t really have to map your gene IDs to KO, given that you can get the corresponding gene set data for that species.
As you have multiple samples/replicates, you may choose to visualize either the average gene expression of all samples together or each individual sample separately using Pathview. Pathview will also be able to integrate/plot multiple states/samples on the same graph by splitting each node, from the next devel release (version 1.17): http://bioconductor.org/packages/devel/bioc/html/pathview.html. So stay tuned.
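The averaging option can be sketched in base R as below. This is a minimal sketch, assuming replicate columns are named like DIET14, DIET14.1, ... (as in MAlist) or simply repeat the treatment name (as in the mol.sum output). The toy values and the names treatment and gene.avg are made up for illustration.

```r
# Toy KO-level matrix; the values are made up for illustration only.
gene.data <- matrix(
  c(1, 2, 3, 4, 5, 6,
    0, 3, 3, 6, 6, 6),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("K00006", "K00008"),
                  c("DIET14", "DIET14.1", "DIET14.2",
                    "DIET02", "DIET02.1", "DIET02.2"))
)

# Derive the treatment label by dropping a trailing ".1", ".2", ... suffix.
# Column names without such a suffix are left unchanged.
treatment <- sub("\\.[0-9]+$", "", colnames(gene.data))

# Average the replicate columns within each treatment.
gene.avg <- sapply(unique(treatment), function(tr) {
  rowMeans(gene.data[, treatment == tr, drop = FALSE])
})

print(gene.avg)
```

The result has one column per treatment and KO IDs as row names, so it could be passed as gene.data to pathview() with species = "ko". That last step is not shown here.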
HTH.
Weijun
--------------------------------------------
On Mon, 9/16/13, Christian De Santis <christian.desantis at stir.ac.uk> wrote:
Subject: Pathview with non-KEGG organism
To: "'bioconductor at r-project.org'" <bioconductor at r-project.org>
Date: Monday, September 16, 2013, 4:44 AM
Hi Weijun,
I am new to BIOC and the Pathview/GAGE packages. I am analysing microarray data from an experiment on Atlantic salmon and I am attempting to visualize the results in Pathview, if possible.
Following up a previous thread (https://stat.ethz.ch/pipermail/bioconductor/2013-August/054161.html), I have been trying to do a similar thing and I believe I have a similar limitation. Like the previous user, I have obtained KEGG Orthology annotation using KAAS. Briefly, the principal steps of my workflow look like the following:
> DIET12_14_KO <- read.csv("DIET12_14_KO.csv", header=T, sep=",") # Upload the KEGG annotation file from KAAS
> DIET12_14_KO[1:3,]
     ProbeName     KO
1 Omy#AB024321 K04079
2 Omy#BG360545 K13506
3 Omy#BX072887 K00412
> MAlist[1:3,1:6] # Visualize my expression list
                  DIET14    DIET14.1   DIET14.2   DIET14.3      DIET02    DIET02.1
Omy#AB024321  0.06296557  0.08865075  0.1186315 -0.1847021 -0.41212414 -0.42385673
Omy#BG360545 -0.50762181 -0.35763304 -0.4939668 -0.6973216 -0.11339368  0.15489712
Omy#BX072887  0.23447458  0.22487856  0.3930821  0.1515031 -0.04694996 -0.04836203
> dim(MAlist)
[1] 7955   16
> D2 <- as.matrix(DIET12_14_KO) # create the two column character matrix for id.map argument
> D2[1:3,]
     ProbeName      KO
[1,] "Omy#AB024321" "K04079"
[2,] "Omy#BG360545" "K13506"
[3,] "Omy#BX072887" "K00412"
> gene.data <- mol.sum(MAlist, id.map = D2)
> gene.data [1:3,1:6]
           DIET14     DIET14     DIET14     DIET14       DIET02     DIET02
K00006  0.7170382  0.5351467  0.1207924  0.1782242  0.228860514 -0.5426538
K00008 -0.8112601 -0.5910453 -0.7691811 -0.1919992 -0.003848065  0.1771637
K00011  1.9645823  1.2305297  2.3335377  1.4813718  0.185036373 -1.2886788
> dim(gene.data)
[1] 2449   16
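The probe-to-KO collapse in the steps above can be pictured with a small base R sketch. This is an illustration under stated assumptions, not pathview's mol.sum implementation: it simply sums all probes that map to the same KO using rowsum(). The toy matrix and the names expr, id.map, ko and ko.data are hypothetical.

```r
# Toy probe-level matrix; the values are made up for illustration only.
expr <- matrix(
  c(1, 2,
    3, 4,
    5, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("probeA", "probeB", "probeC"), c("S1", "S2"))
)

# Two-column character matrix: probe ID, then KO ID (same layout as D2).
id.map <- cbind(ProbeName = c("probeA", "probeB", "probeC"),
                KO        = c("K00001", "K00001", "K00002"))

# Look up the KO for each row of expr; unmapped probes get NA and are dropped.
ko   <- id.map[match(rownames(expr), id.map[, "ProbeName"]), "KO"]
keep <- !is.na(ko)

# Sum the rows that share a KO, giving one row per KO ID.
ko.data <- rowsum(expr[keep, , drop = FALSE], group = ko[keep])

print(ko.data)
```

Here probeA and probeB both map to K00001, so their rows are added together, which is why the KO-level matrix has fewer rows than the probe-level one.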
I am a bit stuck here. I should now have the data in the correct format for the pathview argument “gene.data”, with genes as rows, samples as columns, and KO IDs as row names. From my understanding, to proceed I will now need KO gene set data for a non-model species? Or could I use one from a close species like zebrafish?
Also, one thing that is not clear to me is whether gene.data should include the expression values of all samples (i.e. biological replicates) or the average value per treatment.
Your help will be very much appreciated.
Regards,
Christian