[BioC] how to calculate gene length to be used in rpkm() in edgeR
Ryan
rct at thompsonclan.org
Sat May 3 00:15:07 CEST 2014
Hi Shirley,
The appropriate gene length to use is whatever gene length was used to
compute RPKM values for data set B. If you don't have that information,
then I don't see how you can compute comparable RPKM values for your
data.
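
If the annotation and length convention behind data set B can be found, a length vector could be built from the GTF along the following lines. This is only a rough sketch under stated assumptions: it takes "gene length" to mean the number of bases covered by at least one annotated exon of the gene (the union of exons across all isoforms), which is one common convention but not necessarily the one used for data set B. The function names and the toy GTF records are made up for illustration; the last function just spells out the usual reads-per-kilobase-per-million arithmetic and is not edgeR's code.

```python
from collections import defaultdict


def union_exon_lengths(gtf_lines):
    """Return {gene_id: number of bases covered by at least one exon}.

    Assumes standard 9-column, tab-separated GTF records with a
    gene_id "..." entry in the attribute column.
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon records contribute to the length
        start, end = int(fields[3]), int(fields[4])  # 1-based, inclusive
        gene_id = None
        for attr in fields[8].split(";"):
            attr = attr.strip()
            if attr.startswith("gene_id "):
                gene_id = attr.split(" ", 1)[1].strip('"')
                break
        if gene_id is not None:
            exons[gene_id].append((start, end))

    lengths = {}
    for gene_id, intervals in exons.items():
        intervals.sort()
        total = 0
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end + 1:
                # overlapping or adjacent exon: extend the current block
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        total += cur_end - cur_start + 1
        lengths[gene_id] = total
    return lengths


def rpkm_value(count, gene_length, library_size):
    """Reads per kilobase of gene length per million reads in the library."""
    return count / (gene_length / 1e3) / (library_size / 1e6)


# Toy records (hypothetical): G1 has two isoforms with overlapping exons.
toy_gtf = [
    "\t".join(["1", "toy", "exon", "101", "200", ".", "+", ".",
               'gene_id "G1"; transcript_id "T1";']),
    "\t".join(["1", "toy", "exon", "151", "300", ".", "+", ".",
               'gene_id "G1"; transcript_id "T2";']),
    "\t".join(["1", "toy", "exon", "401", "500", ".", "+", ".",
               'gene_id "G1"; transcript_id "T1";']),
    "\t".join(["1", "toy", "exon", "1001", "1500", ".", "-", ".",
               'gene_id "G2"; transcript_id "T3";']),
]
gene_lengths = union_exon_lengths(toy_gtf)
```

With lengths like these, overlapping exons from different isoforms are counted once, and introns are not counted at all. The resulting per-gene values would then be supplied as the gene length vector to rpkm(), in the same gene order as the count table. Again, this only gives comparable numbers if data set B was computed with the same convention and the same annotation.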
-Ryan
On Fri May 2 15:01:32 2014, shirley zhang wrote:
> Dear List,
>
> I've been using edgeR for differential expression analysis of data
> generated from the same tissue under different conditions.
>
> Now I have an RNA-seq data set A (n=20), and would like to compare it with
> another RNA-seq data set B (n=1,000 across different tissues). Since data B
> consists of normalized, batch-effect-adjusted RPKM values, I need to generate
> RPKM values for my own data A.
>
> I already have a count table, and would like to use rpkm() in edgeR, but
> first I have to get a gene length vector. My question is how to compute gene
> length from an "Ensembl.gtf" file, taking into account the following:
>
> 1. Gene 1 is much longer than Gene 2 when both exons and introns are
> included. But Gene 1 has only 3 exons and Gene 2 has 10 exons, so for the
> transcripts, Gene 2 > Gene 1.
>
> 2. For the same gene, there is more than one transcript isoform. In
> different tissues, different transcript isoforms will be expressed.
>
> Many thanks,
> Shirley