[Bioc-sig-seq] using Cuffdiff with biological replicates

Jane Merlevede jane.merlevede at gmail.com
Thu Sep 1 10:45:57 CEST 2011


Hello,

I am trying to use Cuffdiff to analyze differentially expressed isoforms.
I am studying 2 conditions (Raman - HM1) and there are 3 replicates
per condition. There are 5295 genes and 6432 isoforms.
The documentation says: "If you have more than one
replicate for a sample, supply the SAM files for the sample as a
single comma-separated list".
So I ran Cuffdiff this way:

cuffdiff -L Raman,HM1 -N --FDR 0.05 /path/fichier.gtf \
/path/Cufflinks/L2/accepted_hits.bam,/path/Cufflinks/L4/accepted_hits.bam,/path/Cufflinks/L6/accepted_hits.bam \
/path/Cufflinks/L3/accepted_hits.bam,/path/Cufflinks/L7/accepted_hits.bam,/path/Cufflinks/L8/accepted_hits.bam

The replicates for the condition Raman are L2, L4 and L6. The
replicates for the condition HM1 are L3, L7 and L8.
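
To make the grouping explicit, here is a minimal sketch of how the two comma-separated replicate lists could be built, one per condition, with no spaces inside a list. The helper name join_commas is only illustrative, and the command is printed rather than run. It reuses the options and paths from this post and is not meant as a confirmed fix.

```shell
# Join all arguments with commas (no spaces), e.g. a b c -> a,b,c.
# Runs in a subshell so the IFS change does not leak out.
join_commas() (
    IFS=,
    echo "$*"
)

# Replicates per condition, as described in this post.
raman=$(join_commas /path/Cufflinks/L2/accepted_hits.bam \
                    /path/Cufflinks/L4/accepted_hits.bam \
                    /path/Cufflinks/L6/accepted_hits.bam)
hm1=$(join_commas /path/Cufflinks/L3/accepted_hits.bam \
                  /path/Cufflinks/L7/accepted_hits.bam \
                  /path/Cufflinks/L8/accepted_hits.bam)

# Print the resulting command line for inspection instead of executing it.
echo cuffdiff -L Raman,HM1 -N --FDR 0.05 /path/fichier.gtf "$raman" "$hm1"
```

Each list stays a single shell argument, so the command receives exactly two sample arguments after the GTF file.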
I'm interested in the output files isoform_exp.diff:

test_id	gene	locus	sample_1	sample_2	status	value_1	value_2	ln(fold_change)	test_stat	p_value	significant
EHI_000010.ref	-	DS571600:2419-3622	q1	q2	NOTEST	8.28385	21.4211	0.950069	-2.32208	0.0202284	no
EHI_000130.alt1	-	DS571600:7792-8309	q1	q2	OK	108.42	6.20207	-2.86113	4.25521	2.08856e-05	yes
EHI_000130.ref	-	DS571600:7792-8309	q1	q2	OK	1152.64	2299.79	0.690763	-16.1849	0	yes
EHI_000240.ref	-	DS571186:1554-2669	q1	q2	OK	558.654	857.323	0.428284	-7.87676	3.33067e-15	yes
EHI_000250.ref	-	DS571186:2850-3551	q1	q2	OK	134.444	301.066	0.80618	-7.77203	7.77156e-15	yes
...
EHI_C00159.ref	-	EH4264:45-392	q1	q2	NOTEST	11.8076	27.8643	0.858599	-2.4726	0.0134133	no
EHI_000010.ref	-	DS571600:2419-3622	q1	q3	NOTEST	8.28385	14.2453	0.542123	-1.24073	0.214705	no
EHI_000130.alt1	-	DS571600:7792-8309	q1	q3	OK	108.42	4.26615	-3.2353	4.30991	1.63319e-05	yes
EHI_000130.ref	-	DS571600:7792-8309	q1	q3	OK	1152.64	1736.74	0.409956	-9.25409	0	yes
...
EHI_C00155.ref	-	DS571588:780-994	q5	q6	OK	4382.71	4702.45	0.0704184	-3.35392	0.000796741	yes
EHI_C00156.ref	-	DS571646:5447-6016	q5	q6	NOTEST	12.2637	40.6463	1.19826	-3.67794	0.000235123	no
EHI_C00157.ref	-	DS571705:1482-1899	q5	q6	OK	776.201	1585.64	0.714329	-16.3066	0	yes
EHI_C00159.ref	-	EH4264:45-392	q5	q6	NOTEST	9.16532	4.74531	-0.65827	1.16396	0.244442	no
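
One way to see which comparisons the file contains is to tally rows per sample pair. This is a sketch that assumes the tab-separated layout shown above, with sample_1 in column 4 and sample_2 in column 5. The function name count_pairs is hypothetical.

```shell
# Tally data rows per (sample_1, sample_2) pair in a Cuffdiff .diff file.
# Assumes a header line, tab-separated columns, sample_1 in column 4
# and sample_2 in column 5.
count_pairs() {
    awk -F'\t' 'NR > 1 { n[$4 " vs " $5]++ }
                END    { for (p in n) print p "\t" n[p] }' "$1" | sort
}

# Run on the output file if it is present in the current directory.
if [ -f isoform_exp.diff ]; then
    count_pairs isoform_exp.diff
fi
```

A file with a single comparison would give one line of output. Several lines (q1 vs q2, q1 vs q3, ...) mean the inputs were tested as separate samples.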

Here is my problem: the file contains 96480 (15*6432) lines instead
of 6432. 15 is the number of pairwise combinations of the 6 data sets,
so I think Cuffdiff did not treat the 3 biological replicates as one
condition!
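
The arithmetic behind that line count can be checked quickly. This is a minimal sketch, assuming every pair of the 6 BAM files was tested as a separate sample. The function name pairwise_count is only illustrative.

```shell
# Number of unordered pairs among n samples: n * (n - 1) / 2
pairwise_count() {
    echo $(( $1 * ($1 - 1) / 2 ))
}

isoforms=6432
pairs_6=$(pairwise_count 6)   # 6 files treated as 6 separate samples
pairs_2=$(pairwise_count 2)   # 2 conditions with replicates grouped

echo "6 samples:    $pairs_6 comparisons, $(( pairs_6 * isoforms )) rows"
echo "2 conditions: $pairs_2 comparison, $(( pairs_2 * isoforms )) rows"
```

With 6 separate samples this gives 15 comparisons and 96480 rows. With 2 conditions it gives 1 comparison and 6432 rows, which is what I expected.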

Moreover, I don't know why the output has no adjusted p-value column.

How can I use Cuffdiff to get differentially expressed isoforms?
Thanks for your help,
Jane


