[BioC] using easyRNASeq to calculate RPKM values
Fatemehsadat Seyednasrollah
fatsey at utu.fi
Tue Jan 22 17:10:43 CET 2013
Many Thanks.
________________________________________
From: Nicolas Delhomme [delhomme at embl.de]
Sent: Tuesday, January 22, 2013 4:46 PM
To: Fatemehsadat Seyednasrollah
Cc: bioconductor at r-project.org
Subject: Re: [BioC] using easyRNASeq to calculate RPKM values
Dear Fatemehsadat,
It is indeed possible. The function RPKM would do that for you. Have a look at the help page by doing ?RPKM after loading easyRNASeq. The last example takes as arguments a matrix (your count table), the gene sizes (or the sizes of whatever feature you used, e.g. transcripts) and the sizes of your RNA-Seq libraries. The last two arguments should be named vectors whose names are the rownames and colnames of your count table, respectively. The library sizes can be retrieved simply by summing your columns, i.e. colSums(count.table).
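As a rough illustration of the inputs described above, here is a minimal base-R sketch that computes RPKM by hand with the usual formula, count * 1e9 / (feature length in bp * library size). It does not call easyRNASeq's RPKM; check ?RPKM for the actual argument names. The toy counts are a subset of the table quoted below, and the gene lengths are made-up values used only for illustration.

```r
# Toy count table shaped like the HTSeq output quoted in this thread:
# the first column holds gene names, the other columns hold per-sample counts.
a <- data.frame(
  V1 = c("A1BG", "A1BG-AS1", "A2M"),
  V2 = c(200, 24, 5),
  V3 = c(93, 28, 5),
  stringsAsFactors = FALSE
)

# Turn it into a numeric matrix with gene names as rownames.
count.table <- as.matrix(a[, -1])
rownames(count.table) <- a$V1

# Hypothetical gene lengths in bp (made-up values), named by gene.
feature.size <- c("A1BG" = 2000, "A1BG-AS1" = 1500, "A2M" = 5000)

# Library sizes, named by sample (the colnames of the count table).
lib.size <- colSums(count.table)

# Make sure the lengths are in the same order as the rows of the table.
feature.size <- feature.size[rownames(count.table)]

# RPKM = count * 1e9 / (feature length in bp * library size).
# Dividing a matrix by a vector of length nrow() scales each row;
# sweep() over margin 2 then scales each column by its library size.
rpkm <- sweep(count.table * 1e9 / feature.size, 2, lib.size, "/")

print(rpkm)
```

With a real table, feature.size would come from your annotation (for example the summed exon lengths per gene), and the toy library sizes here are far smaller than real ones, so the printed values are not meaningful in themselves.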
A word of caution though: RPKM is a correction and not a normalization, so it is fine for visualizing the data, but I would not use it as input to any statistical tool such as DESeq, edgeR, etc. Moreover, depending on how you counted your reads per feature, you might have counted some reads multiple times, in which case it is better to retrieve your library size from your original BAM file using samtools.
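One possible way to get such a library size from R is to shell out to samtools. The sketch below is only an assumption about what "library size" should mean here: it counts primary, mapped alignments by excluding the flags 0x4 (unmapped), 0x100 (secondary) and 0x800 (supplementary). The helper names samtools_count_args and bam_lib_size are hypothetical, samtools must be on your PATH, and for paired-end data this counts reads rather than fragments.

```r
# Build the arguments for: samtools view -c -F 0x904 <bam>
# -c counts alignments instead of printing them; -F 0x904 drops unmapped,
# secondary and supplementary records (an assumption about what to count).
samtools_count_args <- function(bam) {
  c("view", "-c", "-F", "0x904", shQuote(bam))
}

# Hypothetical helper: run samtools and return the count as a number.
bam_lib_size <- function(bam) {
  out <- system2("samtools", samtools_count_args(bam), stdout = TRUE)
  as.numeric(out)
}

# Usage (not run here, the file names are placeholders):
# lib.size <- sapply(c(V2 = "sample1.bam", V3 = "sample2.bam"), bam_lib_size)
```

The resulting named vector could then be used in place of colSums(count.table).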
HTH,
Nico
---------------------------------------------------------------
Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory
Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------
On Jan 22, 2013, at 2:03 PM, Fatemehsadat Seyednasrollah wrote:
> Dear list,
>
> I have used HTSeq to get the count table of an RNA-seq dataset which has 8 samples across two conditions (so 4 biological replicates for each condition), and the count table looks like this:
>
>> head(a)
>
>   V1           V2  V3  V4  V5  V6  V7  V8  V9
> 1 1/2-SBSRNA4   3   5   4   4   2   3   1   1
> 2 A1BG        200  93 246 102  86  46  58  85
> 3 A1BG-AS1     24  28  16  32  17  10  19  14
> 4 A1CF          1   1   1   2   1   0   0   1
> 5 A2LD1       100  71  98  97  59 128  88 114
> 6 A2M           5   5  23   1   5   6  10   5
>
> Now, to get familiar with the expression level of each gene, I want to calculate the RPKM values. Can I use the easyRNASeq package on the above count table to calculate them?
>
> Thank you in advance