[BioC] Limma Voom R package
James W. MacDonald
jmacdon at uw.edu
Tue Jan 8 17:26:05 CET 2013
Hi Pedro,
On 1/7/2013 3:47 PM, Pedro Blecua wrote:
> Dear Sir/Madam,
>
> I am a postdoctoral researcher at Chris Mason's lab at the ICB Cornell
> Medical College in NYC.
> I would be very interested in using your R package for RNA-seq
> analysis of some raw data we have.
> We went through the manual quickly, and it is not clear to me how to
> start the analysis, i.e., what the input file should be.
>
> To be more specific: given the fastq file (or binary fastq.tbz), could
> we use it as input for Voom, and then
> use the result for Limma? Or should we align first our raw fastq data
> and then use the sam or bam files as
> input for the Voom or Limma packages? How should I proceed to start an
> analysis from raw fastq files?
You need to align using a gapped aligner (bowtie2, gsnap, etc), and then
use the resulting bam files to get counts per feature (gene or
transcript), which is the input to voom.
Once you have the aligned data, you can use GenomicFeatures and the
appropriate TxDb annotation package to get the counts using
summarizeOverlaps(). Given aligned bam files, I usually do something like:
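library(Rsamtools)
library(GenomicFeatures)
## in current Bioconductor releases summarizeOverlaps() lives in GenomicAlignments
library(GenomicAlignments)
bflst <- BamFileList(<character vector of bam files, including path if not in working dir>)
## substitute the TxDb package for the applicable species here
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
## exons grouped by gene, so reads are counted per gene
feat <- exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, by = "gene")
olaps <- summarizeOverlaps(feat, bflst)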
Then you can do:
counts <- assays(olaps)$counts
voom(counts)
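
To connect this back to the original question about using the voom result in limma, here is a minimal sketch of the usual downstream steps. It is not taken from the message above: the simulated count matrix stands in for assays(olaps)$counts, and the two-group layout, the sample names, and the use of edgeR for normalization factors are assumptions made only for illustration.

```r
library(limma)
library(edgeR)   # assumption: used here only for DGEList() and calcNormFactors()

set.seed(1)
# Simulated stand-in for assays(olaps)$counts: 200 genes x 6 samples
counts <- matrix(rnbinom(200 * 6, mu = 50, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200),
                                 paste0("sample", 1:6)))

# Hypothetical experimental layout: 3 controls followed by 3 treated samples
group  <- factor(rep(c("control", "treated"), each = 3))
design <- model.matrix(~ group)

# Library-size normalization factors, then the voom transformation
dge <- DGEList(counts = counts)
dge <- calcNormFactors(dge)
v   <- voom(dge, design)

# Standard limma linear model fit on the voom output
fit <- lmFit(v, design)
fit <- eBayes(fit)

# Top genes for the treated vs control coefficient
top <- topTable(fit, coef = 2, number = 10)
```

With real data, the design matrix would be built from the actual sample annotation rather than the hypothetical group factor used here.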
Best,
Jim
>
> I would highly appreciate an answer at your earliest convenience.
>
> Thank you very much in advance for your attention,
>
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099