[BioC] FW: Request for the assistance to use MEDME

Mon Dec 1 11:29:47 CET 2008

Dear Mattia,

I am a little confused. Although read.table works on the .txt file, I have now converted the normalized .txt output to .gff using a script. We get the chromosomal assignment and probe position information in the raw as well as the normalized data. The position is given as a range (e.g. start - end), but in the example below you provide a single value for the "pos" slot. I have gone through the paper again and understood that you did some additional work (as explained in the "Weighting of MeDIP enrichment" section) to get the position as a single value. Could you please explain the "Weighting of MeDIP enrichment" a little more, and what care needs to be taken in this step? Is there an R command for this, or do we have to write a script for it? To implement MEDME successfully in our lab as a third party, do we need a reference genome in which all the "C"s are methylated (as explained in the "Derivation of fully methylated DNA" section of the paper)?

As you can see in the attached files, logRatio is not in matrix form; we get only the logFC value. Could you please tell me how I should proceed?

Thanking you and Dr. Molinaro in anticipation for the support.

Regards,

Prashantha

-----Original Message-----
From: Mattia Pelizzola [mailto:mattia.pelizzola at yale.edu]
Sent: Wed 11/26/2008 8:25 PM
To: Prashantha Hebbar Kiradi [MU-MLSC]
Cc: Annette Molinaro
Subject: Re: FW: Request for the assistance to use MEDME

Hi Prashantha,

thanks for using MEDME. Unfortunately, the Agilent file format that you 
are using is not supported, so MEDME.readFiles can't be used. In this 
case you have to use a few basic R functions to "manually" load 
the data into R and create a MEDMEset object.

As you can find in the documentation of the MEDMEset class (type 
"class?MEDMEset" in R) there are several data slots (chr, pos, logR, 
etc.). Only a few of these are actually mandatory to create a minimal 
MEDMEset (chr, pos, logR and organism). You can find details on all of 
these in the documentation page above, but I'll provide some examples 
here:

 > chr = c("chr1","chr1","chr5","chr6")
 > pos = c(1000, 3000, 100, 5000)
 > # logR is a matrix with N columns for N samples
 > logR = cbind(c(2.2, 4.1, -0.5, 0.1), c(3, 0, 0.2, -1))
 > rownames(logR) = letters[1:4] # probes are named a,b,c,d here
 > organism = "hsa"    # for human, or "mmu" for mouse

Finally, you use these to initialize a new MEDMEset:

 > Mset = new('MEDMEset', chr=chr, pos=pos, logR=logR, organism=organism)
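
Before calling new(), it can help to check the inputs against the requirements mentioned in this thread (logR as a matrix with unique probe IDs as rownames, and chr and pos with one entry per probe). The following is a minimal base-R sketch; the helper name check_medme_inputs is hypothetical and is not part of MEDME, and the checks are assumptions drawn from the messages above rather than the package's own validation.

```r
# Hypothetical helper (not part of MEDME): sanity-check the inputs
# before passing them to new('MEDMEset', ...).
check_medme_inputs <- function(chr, pos, logR) {
  stopifnot(
    is.character(chr),
    is.numeric(pos),
    is.matrix(logR),                  # logR must be a matrix, not a vector
    !is.null(rownames(logR)),         # probe IDs are expected as rownames
    !anyDuplicated(rownames(logR)),   # probe IDs should be unique
    length(chr) == nrow(logR),        # one chromosome per probe
    length(pos) == nrow(logR)         # one position per probe
  )
  invisible(TRUE)
}

chr  <- c("chr1", "chr1", "chr5", "chr6")
pos  <- c(1000, 3000, 100, 5000)
logR <- cbind(c(2.2, 4.1, -0.5, 0.1), c(3, 0, 0.2, -1))
rownames(logR) <- letters[1:4]
ok <- check_medme_inputs(chr, pos, logR)
```

If any check fails, stopifnot() stops with a message naming the failed condition, which is usually easier to read than an error raised during object initialization.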

Now, in your case you have to load all these data from your files. I 
could not find the chromosomal assignment and probe position in the 
header of your file (e.g. "test_cis_reg.gff"), so I guess these are 
available (or can be exported from the Agilent software) in a separate 
file. Be careful to match chr and pos with the data in 
"test_cis_reg.gff", respecting the probe order.

You can load the data from your file with the function read.table:

 > data = read.table(file = "test_cis_reg.gff", sep = "\t", header = TRUE,
 +                   row.names = NULL, stringsAsFactors = FALSE)

Now you can extract the MeDIP data with:

 > # assuming that this is the column containing the MeDIP logRatio
 > MeDIPdata = data[,"logFC"]

and the probe names with:

 > probeNames = data[,"ProbeName"]
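
For a single sample, the extracted logFC vector still has to become a matrix with probe names as rownames. A minimal sketch of that step follows; the inline text stands in for the real file, and the probe names, values and the column label "sample1" are made up for illustration.

```r
# Small stand-in for the tab-delimited file; only the two columns used
# in this thread are included.
txt <- "ProbeName\tlogFC\nA_1\t0.52\nA_2\t-1.10\nA_3\t2.03"
data <- read.table(text = txt, sep = "\t", header = TRUE,
                   row.names = NULL, stringsAsFactors = FALSE)

MeDIPdata  <- data[, "logFC"]
probeNames <- data[, "ProbeName"]

# Probe names have to be unique to serve as rownames
stopifnot(!anyDuplicated(probeNames))

# One sample -> a matrix with a single column, probes as rownames
logR <- matrix(MeDIPdata, ncol = 1,
               dimnames = list(probeNames, "sample1"))
```

If the real file contains repeated probe names (for example control probes), they would need to be filtered or made unique before this step; how to do that depends on the array design.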

If you have many samples, you have to repeat this for each file (you 
can use a "for" loop to iterate over the file names) and then put the 
columns together with the "cbind" function that I used in the example 
above (be careful that the order of probeNames is consistent across 
files!). Finally you assign the probe names and you are set.
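
The loop described above could look roughly like this. It is only a sketch: the two files are written to a temporary directory so that the example runs on its own, the file names and values are made up, and match() is used here as one way to enforce a consistent probe order.

```r
# Write two small stand-in files so the sketch is self-contained;
# real file names would replace these.
files <- file.path(tempdir(), c("sample1.txt", "sample2.txt"))
write.table(data.frame(ProbeName = c("A_1", "A_2", "A_3"),
                       logFC = c(0.5, -1.1, 2.0)),
            files[1], sep = "\t", quote = FALSE, row.names = FALSE)
# The second file lists the same probes in a different order
write.table(data.frame(ProbeName = c("A_3", "A_1", "A_2"),
                       logFC = c(1.8, 0.4, -0.9)),
            files[2], sep = "\t", quote = FALSE, row.names = FALSE)

logR <- NULL
probeNames <- NULL
for (f in files) {
  d <- read.table(f, sep = "\t", header = TRUE,
                  row.names = NULL, stringsAsFactors = FALSE)
  # The first file defines the reference probe order
  if (is.null(probeNames)) probeNames <- d[, "ProbeName"]
  # Reorder each file to the reference order before binding
  idx <- match(probeNames, d[, "ProbeName"])
  stopifnot(!anyNA(idx))            # every probe must be present
  logR <- cbind(logR, d[idx, "logFC"])
}
rownames(logR) <- probeNames
colnames(logR) <- basename(files)
```

The result is a matrix with one row per probe and one column per file, which is the shape expected for the logR slot.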

You have to do something similar with another file to get probe chr and 
pos. Then you can initialize the MEDMEset object.
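
As a rough illustration of that last step, the sketch below aligns a separate annotation table to the probe order used for logR. The annotation table and its column names (ProbeName, chr, start, end) are hypothetical. Taking the midpoint of a start-end range is only one possible way to obtain a single position per probe; it is an assumption made here, not something confirmed in this thread or taken from the MEDME paper.

```r
# Hypothetical annotation table; the column names are assumptions
annot <- data.frame(
  ProbeName = c("A_2", "A_1", "A_3"),
  chr   = c("chr1", "chr1", "chr5"),
  start = c(3000, 1000, 100),
  end   = c(3060, 1060, 160),
  stringsAsFactors = FALSE
)
probeNames <- c("A_1", "A_2", "A_3")   # order used for the rows of logR

# Look up each probe's annotation row, in the order of probeNames
idx <- match(probeNames, annot$ProbeName)
stopifnot(!anyNA(idx))                 # every probe needs an annotation row

chr <- annot$chr[idx]
# Assumption: collapse each start-end range to its midpoint to get a
# single position per probe
pos <- round((annot$start[idx] + annot$end[idx]) / 2)
```

After this, chr and pos are in the same order as the rows of logR and can be passed to new('MEDMEset', ...) as in the example above.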

Let me know if you have any problem,

mattia

Annette Molinaro wrote:
> Hi Mattia -
> Can you follow up on this email?
> 
> Many thanks,
> 
> Annette
> 
> *From:* Prashantha Hebbar Kiradi [MU-MLSC] 
> [mailto:prashantha.hebbar at manipal.edu]
> *Sent:* Wednesday, November 26, 2008 1:41 AM
> *To:* annette.molinaro at yale.edu
> *Subject:* Request for the assistance to use MEDME
> 
> Dear Dr. Molinaro,
> 
> I am Prashantha from Manipal Life Sciences Center, Manipal University, 
> India. Recently, I have gone through MEDME (Pelizzola et al., Genome Res. 
> 2008) and tried to implement the algorithm on our data. I used limma for 
> the normalization, but now I am stuck in MEDME, so I would like to 
> discuss the issues with you. In case you do not have the time, please put 
> me in touch with any of your lab mates.
> 
> Following are my doubts:
> 
> 1. The normalized data is in .tsv format and has 'Row', 'Col', 
> 'ProbeUID', 'ControlType', 'ProbeName', 'GeneName', 'SystematicName', 
> 'Description', 'logFC', 'AveExpr', 't', 'P.Value', 'adj.P.Val', 'B' as 
> columns. As you know, the GFF format has 9 fields (seqname, source, 
> feature, start, end, score, strand, frame, group). Which of the fields 
> in the normalized data did you use for the "score" field in the GFF 
> format? We use Agilent's image extraction software for feature 
> extraction here, and I think there is no way to get the data in GFF.
> 
> 2. In order to test my data, I have now created a dummy GFF by taking 
> logFC as "score". When I use the following command
>>  MEDME.readFiles(path = getwd(), files ="test_cis_reg.gff", 
> format="gff", organism="hsp")
> 
> I get the following error message:
> ---------------------------------------------------------------------------------------
> Error in initialize(value, ...) :
>   logR has to be a matrix with probeIds as rownames ..
> In addition: Warning message:
> In MEDME.readFiles(path = getwd(), files = "test_cis_reg.gff", format = 
> "gff",  :
>   no unique probe names provided on column 3; the resulting dataset 
> lacks rownames ..
> -----------------------------------------------------------------------------------------
> 
> I kindly request your help in resolving these problems.
> 
> Thank you in anticipation for the support.
> 
> Sincerely,
> 
> Mr. Prashantha Hebbar
> Bioinformatician
> Manipal Life Sciences Center,
> Manipal University,
> Manipal, India
> PIN:576104
> Ph:+91-9886359007
> 
> ------------------------------------------------------------------------
> 
> This e-mail is privileged and confidential. If you are not the
> intended recipient please delete the message and notify the sender.
> Any views or opinions presented are solely those of the author.
> 
> ------------------------------------------------------------------------

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test.txt
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20081201/99bd9679/attachment.txt>