[Bioc-sig-seq] using Cuffdiff with biological replicates
Jane Merlevede
jane.merlevede at gmail.com
Thu Sep 1 10:45:57 CEST 2011
Hello,
I am trying to use Cuffdiff to analyze differentially expressed isoforms.
I am studying 2 conditions (Raman and HM1), with 3 replicates
per condition. There are 5295 genes and 6432 isoforms.
The documentation says: "If you have more than one
replicate for a sample, supply the SAM files for the sample as a
single comma-separated list".
So I ran Cuffdiff this way:
cuffdiff -L Raman,HM1 -N --FDR 0.05 /path/fichier.gtf \
  /path/Cufflinks/L2/accepted_hits.bam,/path/Cufflinks/L4/accepted_hits.bam,/path/Cufflinks/L6/accepted_hits.bam \
  /path/Cufflinks/L3/accepted_hits.bam,/path/Cufflinks/L7/accepted_hits.bam,/path/Cufflinks/L8/accepted_hits.bam
The replicates for the condition Raman are L2, L4 and L6. The
replicates for the condition HM1 are L3, L7 and L8.
I'm interested in the output file isoform_exp.diff:
test_id gene locus sample_1 sample_2 status value_1 value_2 ln(fold_change) test_stat p_value significant
EHI_000010.ref - DS571600:2419-3622 q1 q2 NOTEST 8.28385 21.4211 0.950069 -2.32208 0.0202284 no
EHI_000130.alt1 - DS571600:7792-8309 q1 q2 OK 108.42 6.20207 -2.86113 4.25521 2.08856e-05 yes
EHI_000130.ref - DS571600:7792-8309 q1 q2 OK 1152.64 2299.79 0.690763 -16.1849 0 yes
EHI_000240.ref - DS571186:1554-2669 q1 q2 OK 558.654 857.323 0.428284 -7.87676 3.33067e-15 yes
EHI_000250.ref - DS571186:2850-3551 q1 q2 OK 134.444 301.066 0.80618 -7.77203 7.77156e-15 yes
...
EHI_C00159.ref - EH4264:45-392 q1 q2 NOTEST 11.8076 27.8643 0.858599 -2.4726 0.0134133 no
EHI_000010.ref - DS571600:2419-3622 q1 q3 NOTEST 8.28385 14.2453 0.542123 -1.24073 0.214705 no
EHI_000130.alt1 - DS571600:7792-8309 q1 q3 OK 108.42 4.26615 -3.2353 4.30991 1.63319e-05 yes
EHI_000130.ref - DS571600:7792-8309 q1 q3 OK 1152.64 1736.74 0.409956 -9.25409 0 yes
...
EHI_C00155.ref - DS571588:780-994 q5 q6 OK 4382.71 4702.45 0.0704184 -3.35392 0.000796741 yes
EHI_C00156.ref - DS571646:5447-6016 q5 q6 NOTEST 12.2637 40.6463 1.19826 -3.67794 0.000235123 no
EHI_C00157.ref - DS571705:1482-1899 q5 q6 OK 776.201 1585.64 0.714329 -16.3066 0 yes
EHI_C00159.ref - EH4264:45-392 q5 q6 NOTEST 9.16532 4.74531 -0.65827 1.16396 0.244442 no
Here is my problem. The file contains 96480 (15*6432) lines instead
of 6432. 15 is the number of pairwise combinations of the 6 data sets
(6*5/2), and the sample columns show q1, q2, ... rather than Raman and
HM1, so I think Cuffdiff did not treat the 3 biological replicates as
one condition!
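The line count above can be checked directly against the file. This is only a small sketch: it assumes the file is whitespace-separated with one header row and that sample_1 and sample_2 are columns 4 and 5, as in the excerpt. The function names count_pairs and pairs_among, and the file path, are made up for illustration.

```shell
# Count how many rows each (sample_1, sample_2) pair contributes.
# Assumption: one header row, sample_1 in column 4, sample_2 in column 5.
count_pairs() {
    awk 'NR > 1 { n[$4 " " $5]++ } END { for (p in n) print p, n[p] }' "$1" | sort
}

# Number of unordered pairs among n samples: n * (n - 1) / 2.
pairs_among() {
    echo $(( $1 * ($1 - 1) / 2 ))
}

# Hypothetical usage; the path is an assumption.
diff_file="isoform_exp.diff"
if [ -f "$diff_file" ]; then
    count_pairs "$diff_file"
fi
```

With 6 samples, pairs_among 6 gives 15, and 15 * 6432 = 96480, which matches the number of lines. If the replicates had been grouped into 2 conditions, a single pair with 6432 rows would be expected.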
Also, I don't understand why the output has no adjusted p-value column.
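As a possible workaround, adjusted p-values could be computed outside Cuffdiff. The sketch below applies the Benjamini-Hochberg procedure to a list of p-values, one per line. It assumes that Benjamini-Hochberg is the wanted correction and that a sort supporting -g (general numeric sort, needed for values like 2.08856e-05) is available. The function name bh_adjust is made up for illustration and this is not Cuffdiff's own code.

```shell
# Benjamini-Hochberg adjustment (illustrative sketch, not Cuffdiff's code).
# Input: one p-value per line on stdin.
# Output: "p_value adjusted_p_value", sorted by increasing p-value.
bh_adjust() {
    sort -g | awk '
        { p[NR] = $1 }
        END {
            m = NR
            min = 1
            # Walk from the largest p-value down, keeping the running minimum
            # of p * m / rank so that adjusted values never decrease with rank.
            for (i = m; i >= 1; i--) {
                q = p[i] * m / i
                if (q < min) min = q
                adj[i] = min
            }
            for (i = 1; i <= m; i++) print p[i], adj[i]
        }'
}
```

A hypothetical use on the file above, keeping only the rows for one comparison with status OK (columns 4, 5, 6 and 11 assumed from the header line), would be: awk '$4 == "q1" && $5 == "q2" && $6 == "OK" { print $11 }' isoform_exp.diff | bh_adjust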
How can I use Cuffdiff to get differentially expressed isoforms?
Thanks for your help,
Jane