[R] permutation analysis with randomly chosen subsets of a matrix

David Winsemius dwinsemius at comcast.net
Sun Jan 31 21:53:18 CET 2010


On Jan 31, 2010, at 2:19 PM, Zoppoli, Gabriele (NIH/NCI) [G] wrote:

> Hello,
>
> here is the problem:
>
> I want to demonstrate that, on average, the Pearson correlations  
> within a specified subset of genes from a huge list (>18,000 columns)  
> are higher than those within any randomly chosen subset of that list.  
> I would therefore like to run a number of tests comparing that  
> specified subset with randomly chosen ones from the "mother" list.
>
> How could I do that? What would be an appropriate statistical test?

You should construct a small dataset that resembles your data and post  
it. In this question and in earlier ones you have used terms that have  
specific meanings in R ("list", "columns", "subset", "header"), but I  
don't think you have enough experience in R yet to use them for  
unambiguous communication with experienced users. For example, in the  
current question, it has no unambiguous meaning to say that  
"genes" have "correlations". Some measurements made on genes might,  
but you have not indicated what sort of measurements you performed.  
Use R expressions to show what the data look like. That removes  
ambiguities.
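
A minimal sketch of the kind of posting meant here, not something taken  
from the original question. It assumes the measurements are numeric  
expression values held in a matrix with samples in rows and genes in  
columns, and every name and size in it (expr, my_genes,  
mean_pairwise_cor, 20 samples, 200 genes) is made up for illustration.  
The resampling at the end is only one possible reading of "higher than  
randomly chosen subsets"; whether it is an appropriate test depends on  
what the real data are.

```r
set.seed(42)                       # make the toy data reproducible

n_samples <- 20                    # assumption: rows are samples
n_genes   <- 200                   # assumption: columns are genes (real data: >18,000)

# Toy expression matrix: samples in rows, genes in columns
expr <- matrix(rnorm(n_samples * n_genes), nrow = n_samples,
               dimnames = list(paste0("sample", seq_len(n_samples)),
                               paste0("gene", seq_len(n_genes))))

# Give the first 10 genes a shared signal so that they correlate.
# The vector has one value per sample and is recycled down each column.
my_genes <- paste0("gene", 1:10)
shared   <- rnorm(n_samples)
expr[, my_genes] <- expr[, my_genes] + 2 * shared

# Show the structure instead of describing it in words
str(expr)
expr[1:3, 1:5]

# Mean of the pairwise Pearson correlations among a set of columns
mean_pairwise_cor <- function(m, cols) {
  r <- cor(m[, cols])              # cor() uses Pearson by default
  mean(r[upper.tri(r)])            # each pair counted once, diagonal excluded
}

observed <- mean_pairwise_cor(expr, my_genes)

# Reference distribution: the same statistic for random gene sets
# of the same size, drawn from all columns
n_perm <- 1000
random_stats <- replicate(
  n_perm,
  mean_pairwise_cor(expr, sample(colnames(expr), length(my_genes)))
)

# One-sided empirical p-value; the +1 counts the observed set itself
p_value <- (sum(random_stats >= observed) + 1) / (n_perm + 1)
observed
p_value
```

With something like this attached, a reader can see at once whether  
"columns" means genes or samples and which numbers are being  
correlated, and can rerun the code to check a proposed answer.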

-- 
David.

>
> Thank you for your help!
>
>
> Gabriele Zoppoli, MD
> Ph.D. Fellow, Experimental and Clinical Oncology and Hematology,  
> University of Genova, Genova, Italy
> Guest Researcher, LMP, NCI, NIH, Bethesda MD
>
> Work: 301-451-8575
> Mobile: 301-204-5642
> Email: zoppolig at mail.nih.gov
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


