Hi Naomi-

For MACS .xls files, the score is derived from column 7, -10*log10(pvalue). Can you let me know where it is documented as column 8, so I can fix that?

These scores aren't very important. They are used only in the correlation heatmaps and PCA plots generated right after the peaks are read in, and they are discarded once reads are counted for a consensus peakset. In general, these peak-derived plots are driven more by which samples have a given peak called than by the specific scores of the called peaks.
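
To make the score concrete, here is a rough sketch in Python (for illustration only; this is not DiffBind's R code). It computes -10*log10(pvalue), as stored in column 7 of the MACS .xls file, and then rescales the values to lie between 0 and 1. The min-max rescaling, the function names, and the example p-values are all assumptions made for the sketch; the exact normalization DiffBind applies may differ.

```python
import math


def macs_score(pvalue):
    """Return -10*log10(p-value), the quantity MACS reports in column 7."""
    return -10.0 * math.log10(pvalue)


def rescale_unit(scores):
    """Rescale scores to [0, 1] by min-max scaling.

    Assumption for illustration: DiffBind's own normalization may use a
    different formula.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: no spread to rescale.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Hypothetical p-values for three peaks (not taken from any real data set).
pvalues = [1e-5, 1e-10, 1e-20]
scores = [macs_score(p) for p in pvalues]
unit_scores = rescale_unit(scores)
```

Because the rescaled values depend on the smallest and largest score in the peak set, they are only meaningful relative to other peaks in the same set, which is one more reason not to read much into them.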

Cheers-
Rory

On Sat Apr 5 00:06:26 CEST 2014, Naomi Altman wrote:

Hi All,
I am still trying to understand DiffBind.

After reading in my data, I find that the peaks component looks
something like this:

head(myCHIP$peaks[[1]])
      V1      V2      V3         V8
1 chr19 3182597 3183033 0.10326322
2 chr19 3589475 3589990 0.09515837
3 chr19 3831795 3832326 0.06208947
4 chr19 4122385 4123105 0.06524229
5 chr19 4504682 4505416 0.15118871
6 chr19 4558434 4559635 0.22387278


The peaks were called by MACS, and looking at the code, dba pulls the
peak score out of "column 8" and normalizes it to be between 0 and 1.

However, the peak spreadsheet has 9 columns and none of them appear to
be normalizable to obtain the numbers in this column.  Where are these
numbers coming from?  What do they mean?  And should I care?

Thanks,
Naomi
