[BioC] help needed on avereps function

Francois Pepin fpepin at cs.mcgill.ca
Wed Jun 17 00:51:56 CEST 2009


Hi Erika,

I'm bringing the discussion back to the list so other people can chime 
in and so it's archived for future reference.

What are you using for the ID argument in avereps? Since the code 
doesn't seem to work for you (i.e. you still have duplicates), I'm 
guessing it's not using the proper identifiers. Without any code, it's 
impossible for us to understand what is happening.
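
To illustrate the guess above, here is a minimal sketch in Python. It is not limma's avereps code; the function average_by_key, the field names and the toy values are all hypothetical. It only shows that averaging collapses replicates when the grouping key is the probe identifier, and collapses nothing when the key is unique for every spot.

```python
from collections import defaultdict


def average_by_key(rows, key):
    """Average the 'value' field over rows that share the same rows[key]."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["value"])
    return {k: sum(v) / len(v) for k, v in groups.items()}


# Toy data (made up): three replicate spots of one probe, one single-copy probe.
spots = [
    {"spot": 1, "probe_id": "A_23_P135769", "value": 1.0},
    {"spot": 2, "probe_id": "A_23_P135769", "value": 2.0},
    {"spot": 3, "probe_id": "A_23_P135769", "value": 3.0},
    {"spot": 4, "probe_id": "A_23_P31323", "value": 5.0},
]

by_probe = average_by_key(spots, "probe_id")  # replicates share a key and collapse
by_spot = average_by_key(spots, "spot")       # every key is unique, nothing collapses

print(len(by_probe), len(by_spot))
```

If the identifiers passed to the averaging step behave like the "spot" column here, every row survives and the output still contains duplicated probe IDs.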

As for the lists of differentially expressed genes, you'd have to tell 
us how many genes you get with each method and how different the 
lists are. Methods like limma borrow information from the other genes 
when calculating significance, so changing the set of probes can change 
the p-values. In addition, the multiple testing correction will also be 
affected if you have a different number of probes.
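
As a rough illustration of the multiple testing point, the sketch below implements a Benjamini-Hochberg style adjustment in plain Python. The assumption is that the adjusted p-values come from a BH-type correction; the function bh_adjust and the p-values are made up for the example and are not taken from limma.

```python
def bh_adjust(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvals)
    # Walk from the largest p-value down, keeping a running minimum
    # so that the adjusted values are monotone in the raw p-values.
    order = sorted(range(n), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, i in enumerate(order):
        rank = n - position  # 1-based rank when sorted in ascending order
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# The same raw p-value (0.01) tested alongside few or many other probes.
few_tests = [0.01, 0.5, 0.9]
many_tests = [0.01, 0.5, 0.9, 0.6, 0.7, 0.8]

adj_few = bh_adjust(few_tests)[0]
adj_many = bh_adjust(many_tests)[0]

print(adj_few, adj_many)
```

The raw p-value of 0.01 receives a larger adjusted value among six tests than among three, so reducing the number of probes by averaging can by itself move genes across a significance cutoff.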

So other than guessing, there's not much that we can do. Sending your 
code (including sessionInfo()) and giving us more details of your 
results will allow people to get a better idea of what is happening and 
how to fix it, if necessary.

Francois

Erika Melissari wrote:
> 
> Dear Dr Pepin,
>  
> sorry to disturb you, but I have sent several emails to the Bioconductor 
> list about some problems that I have with the avereps function and have 
> received no answer.
> Perhaps my question is unimportant for the Bioconductor list, but I 
> noticed some unexpected results when using this function that concern me 
> and that I cannot explain.
> If you have a little time and are willing to help me, I would like to 
> have your opinion on these problems.
> As the limma help suggests, I use the avereps function after normalization 
> and before lmFit; that is, I run lmFit on data that have been normalized 
> and averaged.
> I noticed two strange results:
> 1) I obtain a different list of differentially expressed genes depending 
> on whether or not I use avereps. If I have understood this function 
> correctly, its effect is to average the M, A and weights values for spots 
> with the same probe ID (in my case an Agilent code). Why should my 
> statistical significance change, and which list of differentially 
> expressed genes is right, or at least safer?
> 2) When I checked the averaged list of genes, I found spots with the same 
> probe ID that had not been averaged. You can see an example below. What 
> could prevent the averaging?
>  
> Maybe the problems that I see are not a consequence of using the avereps 
> function, particularly point 1), and should be explained in other terms?
>  
> I apologize again for the trouble I am causing you, and I thank you 
> in advance for any help you can give me.
>  
> Best regards
>  
> Erika
>  
>  
> Erika Melissari
> Ph.D. student
> Department of Experimental Pathology, MBIE,
> University of Pisa
> Santa Chiara Hospital, via Roma 67
> 56126 Pisa
> e-mail: erika.melissari at bioclinica.unipi.it 
> <mailto:erika.melissari at bioclinica.unipi.it>
> ----- Original Message -----
> *From:* Erika Melissari <mailto:erika.melissari at bioclinica.unipi.it>
> *To:* bioconductor at stat.math.ethz.ch 
> <mailto:bioconductor at stat.math.ethz.ch> ; Francois Pepin 
> <mailto:fpepin at cs.mcgill.ca>
> *Sent:* Friday, June 05, 2009 18:23 PM
> *Subject:* avereps function
> 
> Dear list,
>  
> I used the avereps function after normalization and before lmFit to 
> average replicate spots on the microarrays.
> I noticed that although many spots have been averaged (the total number 
> of spots was reduced from 43000 to 41000), other spots have not.
> See this example:
>  
> Block  Column  Row  ID            Name       Sequence                                                      ProbeUID  GeneName  logFC     adj.P.Val  B
> 1      85      183  A_23_P135769  NM_001101  TTTAAAAACTGGAACGGTGAAGGTGACAGCAGTCGGTTGGAGCGAGCATCCCCCAAAGTT  2871      ACTB      0.266302  0.048228   0.181434
> 1      20      393  A_23_P135769  NM_001101  TTTAAAAACTGGAACGGTGAAGGTGACAGCAGTCGGTTGGAGCGAGCATCCCCCAAAGTT  2871      ACTB      -0.20687  0.068295   -0.6233
> 1      56      294  A_23_P135769  NM_001101  TTTAAAAACTGGAACGGTGAAGGTGACAGCAGTCGGTTGGAGCGAGCATCCCCCAAAGTT  2871      ACTB      0.110065  0.382405   -4.54642
> 1      22      299  A_23_P135769  NM_001101  TTTAAAAACTGGAACGGTGAAGGTGACAGCAGTCGGTTGGAGCGAGCATCCCCCAAAGTT  2871      ACTB      0.085017  0.405978   -4.66767
> 1      53      457  A_23_P135769  NM_001101  TTTAAAAACTGGAACGGTGAAGGTGACAGCAGTCGGTTGGAGCGAGCATCCCCCAAAGTT  2871      ACTB      0.080708  0.483304   -5.0517
> 1      17      39   A_23_P135769  NM_001101  TTTAAAAACTGGAACGGTGAAGGTGACAGCAGTCGGTTGGAGCGAGCATCCCCCAAAGTT  2871      ACTB      0.063279  0.710629   -5.73913
> 1      45      199  A_23_P135769  NM_001101  TTTAAAAACTGGAACGGTGAAGGTGACAGCAGTCGGTTGGAGCGAGCATCCCCCAAAGTT  2871      ACTB      0.051584  0.778778   -5.87993
> 1      64      279  A_23_P135769  NM_001101  TTTAAAAACTGGAACGGTGAAGGTGACAGCAGTCGGTTGGAGCGAGCATCCCCCAAAGTT  2871      ACTB      -0.04158  0.800246   -5.91735
> 1      16      358  A_23_P135769  NM_001101  TTTAAAAACTGGAACGGTGAAGGTGACAGCAGTCGGTTGGAGCGAGCATCCCCCAAAGTT  2871      ACTB      0.024847  0.880504   -6.03386
> 1      21      435  A_23_P135769  NM_001101  TTTAAAAACTGGAACGGTGAAGGTGACAGCAGTCGGTTGGAGCGAGCATCCCCCAAAGTT  2871      ACTB      0.000153  0.999438   -6.11393
> 1      4       111  A_23_P31323   NM_001101  ACTCTTCCAGCCTTCCTTCCTGGGCATGGAGTCCTGTGGCATCCACGAAACTACCTTCAA  8562      ACTB      0.283846  0.043577   0.472915
> 1      17      275  A_24_P226554  NM_001101  GCACCCAGCACAATGAAGATCAAGATCATTGCTCCTCCTGAGCGCAAGTACTCCGTGTGG  21338     ACTB      0.030637  0.848504   -5.9958
> 1      74      251  A_32_P137939  NM_001101  AGGCAGCCAGGGCTTACCTGTACACTGACTTGAGACCAGTTGAATAAAAGTGCGCACCTT  19564     ACTB      -0.20387  0.177982   -2.87144
> 
>  
> Why was the group of the first 10 probes not averaged by avereps?
> Any suggestions will be appreciated.
>  
> Thank you so much
>  
> Erika
>


