[BioC] edgeR GLM using factor that varies for each gene
Daniel Lang [guest]
guest at bioconductor.org
Thu May 8 09:33:05 CEST 2014
Hi,
After going over the user guide and searching this mailing list, I'm still not clear on how best to address my specific situation:
I'd like to test for differential "expression" of specific splicing events between a mutant and the wild type in a replicated design. To do so, I've counted, for each gene, the reads that are specific to a given splicing event.
e.g.
event          AS.type       mutant.line1.rep1  mutant.line1.rep2  mutant.line2.rep1  mutant.line2.rep2  wt.rep1  wt.rep2
S102-F_10.883  alt_donor                     4                  7                  4                  7        0        1
S102-F_12.884  alt_donor                     0                  1                  0                  1        0        2
S102-F_10.887  alt_donor                     0                  0                  0                  0       30       33
S102-F_10.886  alt_acceptor                  0                  0                  0                  0       22       21
S102-F_11.890  alt_donor                     0                  0                  0                  0        0        0
S102-F_11.889  alt_acceptor                  0                  0                  0                  0        0        0
S102-F_10.891  alt_acceptor                  0                  0                  0                  0        0        0
S103-R_3.901   alt_acceptor                  4                  5                  4                  5       10       11
S103-R_2.904   skipped_exon                  2                  4                  2                  4       33       28
S103-R_2.902   alt_acceptor                  4                  5                  4                  5        0        0
S103-R_1.906   alt_acceptor                  0                  1                  0                  1        1        0
It's not clear from this example, but overall the different types of alternative splicing differ in abundance and noise level. I'd like to correct for this, but also assess it, using a GLM. Ideally, I'd like to find splicing events that are differentially abundant between the mutant and the wild type, irrespective of line and biological replicate.
As far as I understand the User's Guide and the reference manual, the design always refers to factors describing the libraries/experiments the counts are derived from, rather than to per-gene (here: per-event) factors such as AS.type.
If I were using a "normal" GLM, what I want to do would look like glm(count ~ AS.type + genotype + line + biological.replicate).
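To illustrate my understanding, here is a minimal base-R sketch of such a library-level design for the six libraries in the table. The `samples` data frame and its column names are my own labels for illustration, not anything prescribed by edgeR. I've only included genotype, since line would be confounded with it in this layout.

```r
# One row per library; the factor describes the library, not the event.
# 'samples' and its columns are hypothetical names chosen for this sketch.
samples <- data.frame(
  library  = c("mutant.line1.rep1", "mutant.line1.rep2",
               "mutant.line2.rep1", "mutant.line2.rep2",
               "wt.rep1", "wt.rep2"),
  genotype = factor(c("mutant", "mutant", "mutant", "mutant", "wt", "wt"),
                    levels = c("wt", "mutant")),
  stringsAsFactors = FALSE
)

# Design matrix with an intercept (wild type) and a mutant-vs-wt column.
design <- model.matrix(~ genotype, data = samples)
rownames(design) <- samples$library
print(design)
```

A matrix like this is what I would pass as the `design` argument of glmFit(). It has one row per library, so I don't see where a per-event factor such as AS.type would go.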
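As a rough sketch of what I mean, here is a plain Poisson glm() on the example rows above, reshaped to long format (so this is base R, not edgeR). The object names are hypothetical, and I've left out line and biological.replicate because in this small layout line is confounded with genotype. A Poisson family is only a stand-in here; it ignores the overdispersion that edgeR would model.

```r
# Counts copied from the table in this post (events x libraries).
libs <- c("mutant.line1.rep1", "mutant.line1.rep2",
          "mutant.line2.rep1", "mutant.line2.rep2",
          "wt.rep1", "wt.rep2")
events <- c("S102-F_10.883", "S102-F_12.884", "S102-F_10.887",
            "S102-F_10.886", "S102-F_11.890", "S102-F_11.889",
            "S102-F_10.891", "S103-R_3.901", "S103-R_2.904",
            "S103-R_2.902", "S103-R_1.906")
AS.type <- c("alt_donor", "alt_donor", "alt_donor", "alt_acceptor",
             "alt_donor", "alt_acceptor", "alt_acceptor", "alt_acceptor",
             "skipped_exon", "alt_acceptor", "alt_acceptor")
counts <- matrix(
  c(4, 7, 4, 7,  0,  1,
    0, 1, 0, 1,  0,  2,
    0, 0, 0, 0, 30, 33,
    0, 0, 0, 0, 22, 21,
    0, 0, 0, 0,  0,  0,
    0, 0, 0, 0,  0,  0,
    0, 0, 0, 0,  0,  0,
    4, 5, 4, 5, 10, 11,
    2, 4, 2, 4, 33, 28,
    4, 5, 4, 5,  0,  0,
    0, 1, 0, 1,  1,  0),
  ncol = 6, byrow = TRUE, dimnames = list(events, libs)
)

# Long format: one row per (event, library) pair.
# as.vector() unrolls the matrix column by column, matching the rep() calls.
long <- data.frame(
  event   = rep(events, times = ncol(counts)),
  library = rep(libs, each = nrow(counts)),
  AS.type = rep(AS.type, times = ncol(counts)),
  count   = as.vector(counts),
  stringsAsFactors = FALSE
)
long$genotype <- factor(
  ifelse(grepl("^mutant", long$library), "mutant", "wt"),
  levels = c("wt", "mutant")
)

# "Normal" GLM with a per-event factor (AS.type) and a per-library factor.
fit <- glm(count ~ AS.type + genotype, family = poisson, data = long)
print(coef(summary(fit)))
```

Here AS.type varies by row (event) and genotype by column (library), which is exactly the combination I can't see how to express in a single edgeR design matrix.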
Can I accomplish this with edgeR without splitting up the events into different data sets per splice type?
Any advice on this would be greatly appreciated.
Best,
Daniel
-- output of sessionInfo():
R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
[3] LC_TIME=de_DE.utf8 LC_COLLATE=en_US.utf8
[5] LC_MONETARY=de_DE.utf8 LC_MESSAGES=en_US.utf8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.utf8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
--
Sent via the guest posting facility at bioconductor.org.