[Bioc-devel] FDR estimation for Biological ChIP-Seq replicate in the context of GRanges.

Leonardo Collado Torres lcollado at jhu.edu
Wed Apr 13 17:14:57 CEST 2016


I see that you posted this at
https://support.bioconductor.org/p/80782/ which is more appropriate
for your question than this mailing list.

On Wed, Apr 13, 2016 at 10:28 AM, Jurat Shayidin <juratbupt at gmail.com> wrote:
> Hi, BioC devel:
>
> I have been working on my package and it is nearly finished, except for
> FDR estimation. I read and load three replicates (BED format) into
> GRanges objects, and I have to handle the cases where the chosen sample
> is a biological or a technical replicate.
>
> To find colocalization evidence across the three GRanges objects, the
> package follows this general workflow:
>
> -> read and load multiple samples (BED format) into GRanges
> -> find overlapping regions conditionally, in parallel
> -> filter with a first threshold value (i.e., count the overlapping
>    regions overall, in parallel)
> -> run chisq.test() on the data that passed the previous step
> -> filter further on the combined p-value with a second threshold value
> -> return the final output as GRanges (the data that passed the previous
>    step is also kept, but not exported to disk)
>
> On the first run of my package (a is the chosenSample; b and c are the
> supportingSamples):
>
> ov_ab_1 <- as(findOverlaps(a, b), "List")
> ov_ac_1 <- as(findOverlaps(a, c), "List")
>
> On the second run, I have to switch the parameters (b is the chosenSample;
> a and c are the supportingSamples):
>
> ov_ba_2 <- as(findOverlaps(b,a), "List")
> ov_bc_2 <- as(findOverlaps(b,c), "List")
>
> On the third run, I do the following (c is the chosenSample; a and b are
> the supportingSamples):
>
> ov_ca_3 <- as(findOverlaps(c,a), "List")
> ov_cb_3 <- as(findOverlaps(c,b), "List")
>
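The three runs above differ only in which sample plays the chosen role, so one option is to loop over the samples and keep the results in a named list rather than in separate environments. Here is a minimal sketch, assuming the GenomicRanges package is installed. The toy coordinates and the names `samples` and `overlaps_by_chosen` are made up for illustration and are not part of your package:

```r
library(GenomicRanges)

# Toy peak sets standing in for the three replicates (made-up coordinates).
samples <- list(
  a = GRanges("chr1", IRanges(c(100, 500, 900), width = 50)),
  b = GRanges("chr1", IRanges(c(120, 700, 910), width = 50)),
  c = GRanges("chr1", IRanges(c(130, 520, 2000), width = 50))
)

# For each chosen sample, overlap it against every other (supporting) sample.
overlaps_by_chosen <- lapply(names(samples), function(chosen) {
  supporting <- setdiff(names(samples), chosen)
  hits <- lapply(supporting, function(s) {
    as(findOverlaps(samples[[chosen]], samples[[s]]), "List")
  })
  names(hits) <- supporting
  hits
})
names(overlaps_by_chosen) <- names(samples)

# Equivalent of ov_ab_1 from the first run.
overlaps_by_chosen$a$b
```

With this layout, `overlaps_by_chosen$b$c` plays the role of `ov_bc_2`, and a fourth replicate would only need one more entry in `samples`.
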
> However, I still need to implement FDR estimation for a, b and c across the
> first, second and third runs, where each processed sample has three
> different outputs.
>
> For example:
> a_preserved_first_test, a_preserved_second_test, a_preserved_third_test, and
> the same output format for b and c.
>
> *Objective*: for biological replicates, I want to retrieve the common
> regions found in at least two runs (I am still looking for a way to do
> this), pass these regions to p.adjust() to get adjusted p-values, filter
> them further with a third threshold value, and finally generate output for
> the regions that pass this last step.
>
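One possible way to get the "found in at least two runs" set is to pool the regions from the three outputs and count, for each pooled region, how many runs contain it. This is a rough sketch under several assumptions: the three objects below are hypothetical stand-ins for one sample's three outputs, the metadata column `p` is an invented name for the combined p-value, regions are matched by exact coordinates, and the "BH" method and the 0.05 cutoff are placeholders for your own choices. Whether adjusting only the retained regions gives the FDR you want is a statistical question this sketch does not settle.

```r
library(GenomicRanges)

# Hypothetical outputs of one sample across three runs; "p" is an assumed
# metadata column holding the combined p-value of each region.
run1 <- GRanges("chr1", IRanges(c(100, 500, 900), width = 50),
                p = c(1e-6, 0.01, 0.2))
run2 <- GRanges("chr1", IRanges(c(100, 900), width = 50),
                p = c(1e-6, 0.2))
run3 <- GRanges("chr1", IRanges(c(500, 900, 1500), width = 50),
                p = c(0.01, 0.2, 0.04))
runs <- list(run1, run2, run3)

# All distinct regions seen in any run (unique() compares coordinates only
# and keeps the metadata of the first occurrence).
pooled <- unique(do.call(c, runs))

# Number of runs in which each pooled region appears (exact match).
n_runs <- Reduce(`+`, lapply(runs, function(r) {
  countOverlaps(pooled, r, type = "equal") > 0
}))

# Keep regions supported by at least two runs, then adjust their p-values.
keep <- pooled[n_runs >= 2]
keep$padj <- p.adjust(keep$p, method = "BH")

# Third filtering step with an assumed cutoff.
final <- keep[keep$padj <= 0.05]
final
```

If regions from different runs can differ slightly in their coordinates, `type = "equal"` would be too strict and a tolerance such as `type = "any"` with `minoverlap` might fit better.
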
> *Question*:
>
> To do FDR estimation, I need to run my package three times (when three
> samples are given as input), and I may store the result of each run in a
> dedicated R environment (I am not sure this is the right thing to do). Is
> there a way to avoid manually running the package three times, for example
> by automatically switching to the next run when the previous one is done?
>
> I am not sure whether I should create a sub-environment to save the result
> of each run. I hope there is a better solution. My question may seem
> straightforward to you; forgive me if it is naive. Any approach,
> suggestion, or recommended Bioconductor package that may help with the
> question above would be highly appreciated. Thanks a lot.
>
> Best regards
>
> --
> Jurat Shahidin
> Ph.D. candidate
> Dipartimento di Elettronica, Informazione e Bioingegneria
> Politecnico di Milano
> Piazza Leonardo da Vinci 32 - 20133 Milano, Italy
> Mobile : +39 3279366608
