[Bioc-sig-seq] using Cuffdiff with biological replicates

Martin Morgan mtmorgan at fhcrc.org
Thu Sep 1 19:17:28 CEST 2011


On 09/01/2011 01:45 AM, Jane Merlevede wrote:
> Hello,
>
> I am trying to use Cuffdiff to analyze differentially expressed isoforms.

Hi Jane -- this mailing list is for Bioconductor software; please ask 
questions exclusively about cuffdiff elsewhere, e.g., SeqAnswers, 
Biostar, or at the link at http://cufflinks.cbcb.umd.edu/manual.html.

Or are you looking for help using edgeR or DESeq ?

Martin


> I am studying 2 conditions (Raman - HM1) and there are 3 replicates
> per condition. There are 5295 genes and 6432 isoforms.
> In the documentation, it's written "If you have more than one
> replicate for a sample, supply the SAM files for the sample as a
> single comma-separated list".
> So I ran Cuffdiff this way:
>
> cuffdiff -L Raman,HM1 -N --FDR 0.05 /path/fichier.gtf
> /path/Cufflinks/L2/accepted_hits.bam,/path/Cufflinks/L4/accepted_hits.bam,/path/Cufflinks/L6/accepted_hits.bam
> /path/Cufflinks/L3/accepted_hits.bam,/path/Cufflinks/L7/accepted_hits.bam,/path/Cufflinks/L8/accepted_hits.bam
>
> The replicates for the condition Raman are L2, L4 and L6. The
> replicates for the condition HM1 are L3, L7 and L8.
> I'm interested in the output files isoform_exp.diff:
>
> test_id	gene	locus	sample_1	sample_2	status	value_1	value_2	ln(fold_change)	test_stat	p_value	significant
> EHI_000010.ref	-	DS571600:2419-3622	q1	q2	NOTEST	8.28385	21.4211	0.950069	-2.32208	0.0202284	no
> EHI_000130.alt1	-	DS571600:7792-8309	q1	q2	OK	108.42	6.20207	-2.86113	4.25521	2.08856e-05	yes
> EHI_000130.ref	-	DS571600:7792-8309	q1	q2	OK	1152.64	2299.79	0.690763	-16.1849	0	yes
> EHI_000240.ref	-	DS571186:1554-2669	q1	q2	OK	558.654	857.323	0.428284	-7.87676	3.33067e-15	yes
> EHI_000250.ref	-	DS571186:2850-3551	q1	q2	OK	134.444	301.066	0.80618	-7.77203	7.77156e-15	yes
> ...
> EHI_C00159.ref	-	EH4264:45-392	q1	q2	NOTEST	11.8076	27.8643	0.858599	-2.4726	0.0134133	no
> EHI_000010.ref	-	DS571600:2419-3622	q1	q3	NOTEST	8.28385	14.2453	0.542123	-1.24073	0.214705	no
> EHI_000130.alt1	-	DS571600:7792-8309	q1	q3	OK	108.42	4.26615	-3.2353	4.30991	1.63319e-05	yes
> EHI_000130.ref	-	DS571600:7792-8309	q1	q3	OK	1152.64	1736.74	0.409956	-9.25409	0	yes
> ...
> EHI_C00155.ref	-	DS571588:780-994	q5	q6	OK	4382.71	4702.45	0.0704184	-3.35392	0.000796741	yes
> EHI_C00156.ref	-	DS571646:5447-6016	q5	q6	NOTEST	12.2637	40.6463	1.19826	-3.67794	0.000235123	no
> EHI_C00157.ref	-	DS571705:1482-1899	q5	q6	OK	776.201	1585.64	0.714329	-16.3066	0	yes
> EHI_C00159.ref	-	EH4264:45-392	q5	q6	NOTEST	9.16532	4.74531	-0.65827	1.16396	0.244442	no
>
> Here is my problem. The file contains 96480 (15*6432) lines instead
> of 6432. 15 is the number of pairwise combinations of the 6 data sets...
> I think Cuffdiff did not treat the 3 biological replicates as one
> condition!
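
The arithmetic here can be checked quickly. Below is a minimal Python
sketch, assuming the output file is tab-separated with the sample_1 and
sample_2 columns shown in the header; count_comparisons is a hypothetical
helper name, and the three rows are copied from the quoted output as a
stand-in for the real file.

```python
import csv
import io
from collections import Counter
from math import comb


def count_comparisons(handle):
    """Count rows per (sample_1, sample_2) pair in a tab-separated table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter((row["sample_1"], row["sample_2"]) for row in reader)


# Six inputs treated as separate samples give C(6, 2) pairwise comparisons.
n_pairs = comb(6, 2)
# One row per isoform per comparison, with 6432 isoforms.
n_rows = n_pairs * 6432

# Header and three rows taken from the quoted output (stand-in for the file).
header = ["test_id", "gene", "locus", "sample_1", "sample_2", "status",
          "value_1", "value_2", "ln(fold_change)", "test_stat", "p_value",
          "significant"]
rows = [
    ["EHI_000010.ref", "-", "DS571600:2419-3622", "q1", "q2", "NOTEST",
     "8.28385", "21.4211", "0.950069", "-2.32208", "0.0202284", "no"],
    ["EHI_000130.alt1", "-", "DS571600:7792-8309", "q1", "q2", "OK",
     "108.42", "6.20207", "-2.86113", "4.25521", "2.08856e-05", "yes"],
    ["EHI_000010.ref", "-", "DS571600:2419-3622", "q1", "q3", "NOTEST",
     "8.28385", "14.2453", "0.542123", "-1.24073", "0.214705", "no"],
]
example = "\n".join("\t".join(fields) for fields in [header] + rows) + "\n"

pairs = count_comparisons(io.StringIO(example))
print(n_pairs, n_rows)
for pair, count in sorted(pairs.items()):
    print(pair, count)
```

To run this on the real output, pass an open file handle instead of the
StringIO object. If the replicates had been pooled into two conditions,
one would expect a single sample pair rather than 15.
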
>
> Moreover, I don't know why I don't have the adjusted p-value information...
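
For reference, one common way to derive adjusted p-values from a column
of raw p-values is the Benjamini-Hochberg procedure. The sketch below is
a generic Python illustration and is not a description of what Cuffdiff
computes internally; benjamini_hochberg is a hypothetical helper name,
and the inputs are the p_value entries of the four OK rows quoted for
the q1/q2 comparison. In R, p.adjust(p, method = "BH") performs the same
adjustment.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# p_value column of the four OK rows in the quoted q1/q2 output.
raw = [2.08856e-05, 0.0, 3.33067e-15, 7.77156e-15]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(p, q)
```

In practice the adjustment would be applied to all tested rows of a
single comparison, not to four rows; the short list only keeps the
sketch readable.
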
>
> How can I use Cuffdiff to get differentially expressed isoforms ?
> Thanks for your help,
> Jane
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


