[BioC] applicability of tilingArray package

Fri Nov 7 21:42:58 CET 2008

Dear Michael,

> thanks for your thoughts. i have to say i'm afraid i only sort of follow 
> what you've said. in an effort to clarify, it sounds like you've said 
> the methods in the tilingArray package probably aren't a good approach 
> to do the segmentation given the data i have.

I think that, for the data you have, there are *two* different 
segmentation tasks.

(i) segmentation of what is transcribed (at all) in each of the conditions

(ii) identification of what is *differentially* transcribed between the 
conditions.

The method in the tilingArray package was designed for (i). Perhaps it 
would be helpful if you could clarify which one you are after. 
Personally, I think that solving (ii) without at least attempting (i) 
will leave you with problems of biological interpretation, and will 
underuse the data.

> you've said TAS's approach to the segmentation might be good, but 
> finding the best parameters might be difficult. you've also said that 
> for (i) i could use the methods of David et al and Huber et al using MM 
> probes. if i do that, i'll be left with two separate collections (wt and 
> mut) of normalized data, which i'll then need to find (ii), ie, the 
> differentially transcribed regions and then segment those results.

Yes. And, more precisely, you could get two separate collections of 
expressed segments (one for wt and one for mutant); then you need to find 
some sort of consensus segmentation that is a compromise and superset of 
them both, and then you can ask:
- which segments have different expression levels between the two conditions
- which segments change size (transcription start and stop sites) 
between the conditions
But for this there is no ready-made software that I am aware of.
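
As a rough illustration of the consensus idea (a minimal sketch, not an 
existing tool): one simple choice is to take the union of the two sets of 
segment boundaries, which refines both segmentations, and then compare 
the per-segment mean levels. The function names, the toy intensities and 
the breakpoints below are all hypothetical; in practice the breakpoints 
would come from whatever segmentation you ran separately on wt and mutant.

```python
from statistics import mean

def consensus_breakpoints(bp_wt, bp_mut):
    """Union of two breakpoint lists: a common refinement of both."""
    return sorted(set(bp_wt) | set(bp_mut))

def segment_means(values, breakpoints):
    """Mean of values within each half-open segment [start, end)."""
    return [mean(values[s:e])
            for s, e in zip(breakpoints[:-1], breakpoints[1:])]

# Hypothetical probe-level log-intensities in chromosomal order
wt  = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8, 1.0, 1.2]
mut = [1.0, 0.9, 1.1, 1.0, 5.1, 4.9, 5.0, 1.1]

# Hypothetical breakpoints (probe indices) from two separate segmentations
bp_wt  = [0, 3, 6, 8]
bp_mut = [0, 4, 7, 8]

bp = consensus_breakpoints(bp_wt, bp_mut)
# Per-segment difference in mean level, mutant minus wt
diff = [m - w for w, m in zip(segment_means(wt, bp),
                              segment_means(mut, bp))]

for (s, e), d in zip(zip(bp[:-1], bp[1:]), diff):
    print(f"probes [{s}, {e}): mut - wt = {d:+.2f}")
```

Segments with a large difference are candidates for differential 
expression. The short segments that only appear through the union (here 
the single probes 3 and 6) point to boundaries that shift between the 
conditions. How to decide which of these shifts are real is exactly the 
ad hoc part.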

> the confusing part for me is connecting what you've said about TAS to 
> using the MM normalizing methods. i don't see how i could use the MM 
> normalizing methods and get 2 data sets of expression levels and then 
> use TAS to find the differentially transcribed data and segments. 

Me neither.

> maybe 
> you're suggesting one or the other, ie, stick with TAS to do it all, 

Yes, that is an option.

> or 
> use huber et al, MM for normalizing and then find some other method to 
> find the differentially transcribed regions and segmentation?

Yes, that is another option. See above. It seems that the second option 
might turn out to be more flexible to adapt to your biological 
questions, and possibly more sensitive, but it's also more work for you.
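
To make the running window idea for (ii) concrete, here is a minimal 
sketch. It is not what TAS does internally; it only assumes that you 
already have some probe-wise statistic (for example a t-statistic of mut 
versus wt) in chromosomal order. The function names, the window 
half-width and the threshold are hypothetical, and choosing the last two 
is the hard part.

```python
from statistics import mean

def running_mean(x, half_width):
    """Mean over +/- half_width probes, truncated at the ends."""
    n = len(x)
    return [mean(x[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]

def call_regions(score, threshold):
    """Half-open index ranges [start, end) where |score| > threshold."""
    regions, start = [], None
    for i, s in enumerate(score):
        if abs(s) > threshold and start is None:
            start = i                      # a region opens here
        elif abs(s) <= threshold and start is not None:
            regions.append((start, i))     # the region closes before i
            start = None
    if start is not None:                  # region runs to the last probe
        regions.append((start, len(score)))
    return regions

# Hypothetical probe-wise statistic, in chromosomal order
stat = [0.1, -0.2, 0.0, 2.1, 1.9, 2.3, 2.0, 0.1, -0.1, 0.2]

smooth = running_mean(stat, half_width=1)
regions = call_regions(smooth, threshold=1.0)
print(regions)
```

A wider window gives smoother scores but blurs the region boundaries, 
and the threshold trades sensitivity against false calls, so both need 
to be checked against regions where you know what to expect.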

Best wishes
  Wolfgang

----------------------------------------------------
Wolfgang Huber, EMBL-EBI, http://www.ebi.ac.uk/huber

>> Hi Michael,
>>
>> there are two separate issues:
>> (i) finding the transcribed regions, separately in each of the samples
>> (wt, mut).
>> (ii) finding the differentially transcribed regions.
>>
>> For (i), you could use an approach similar to that in the David et al.
>> and Huber et al. papers. Since you don't have the DNA reference hybes,
>> you could use the MM probes. This is described in Section 4.2 of the
>> vignette
>> http://www.bioconductor.org/packages/2.3/bioc/vignettes/tilingArray/inst/doc/assessNorm.pdf 
>>
>> and as the benchmarks in Section 5 show, it is not quite as good, but
>> still pretty good.
>>
>> Don't think of this in terms of "normalising" the mutant against the
>> "wt" type, that doesn't make much sense.
>>
>> For (ii), if you want to segment e.g. a probe-wise (moderated)
>> t-statistic, the piecewise constant model used in the tilingArray
>> package is not useful. A running window approach (like in TAS) makes
>> sense, the hard part is of course tuning its parameters.
>>
>> AfaIk, there are methods for (i) and (ii) separately, and to join /
>> align them, the approaches are ad hoc. It would be nice if there were a
>> clean method that does (i) and (ii) jointly - maybe someone else has
>> insights in this?
>>
>> Best wishes
>>  Wolfgang
>>
>> ------------------------------------------------------------------
>> Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber
>>
>>
>> 04/11/2008 16:42 Michael Palumbo scripsit
>>  
>>> hello,
>>>
>>> i have general questions regarding the applicability of the tilingArray
>>> package to my problem/data. i've used bioconductor in the past, but by
>>> no means am i an expert.
>>>
>>> i have data from affy yeast tiling arrays - 3 mut and 3 wild type. i've
>>> run affy's TAS program on the CEL files - as a two sample analysis, ie,
>>> comparing wt to mut and viewed the results in IGB. my initial goal is to
>>> segment the results as was done in David et al, PNAS 2006. it seems to
>>> me there are fundamental differences in my data and the data of David et
>>> al. e.g., the normalization step described in tilingArray doc uses DNA
>>> hybridized to the chips as a reference - i don't have that, although i
>>> do have the wt data. a colleague thought i might be able to use the wt
>>> data in the normalization step, but that doesn't seem quite right to me.
>>> it is also described that normalization can occur by MM probes - maybe i
>>> can normalize the mut chip data w/ MM probes and completely ignore the
>>> wt data? i realize that if i did that, the result would no longer be a
>>> comparison of mut and wt and what i would 'see' would be different from
>>> what i currently see in IGB of the two sample TAS analysis. this also
>>> seems like it's not the best approach.
>>>
>>> on the other hand, again, all i really want to do is segment the
>>> two-sample analysis that i've done. is there anything wrong with using
>>> the results of TAS's analysis? TAS does a normalization and has
>>> bandwidth averaging - as a non-expert, these are convenient and seem
>>> good to me.
>>>
>>> thanks in advance for any and all responses/thoughts,
>>> mike palumbo
>>>
>>>     