[R] Pearson correlation with randomization
David Winsemius
dwinsemius at comcast.net
Wed Jan 19 05:56:06 CET 2011
On Jan 18, 2011, at 11:23 PM, Brahmachary, Manisha wrote:
> Hello,
>
>
>
> I will be very obliged if someone can help me with this statistical R
> problem:
>
> I am trying to do a Pearson correlation on my datasets X and Y with a
> randomization test. My X and Y datasets are paired.
>
> 1. I want to randomize (rearrange) only my X dataset per row, while
> keeping my Y dataset as it is.
X <- X[sample(1:nrow(Y)), ]
>
> 2. Then calculate the correlation for this pair, and compare it to
> the true value of the correlation.
>
> 3. Repeat 1 and 2 maybe 100 times
You may want to look at the replicate function.
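As a rough sketch of how replicate might be used for this (x and y below are the first rows of the X and Y tables you posted, typed in by hand; the 1000 is an arbitrary choice, not a recommendation):

```r
set.seed(42)   # only so the sketch is reproducible

# first row of the posted X and Y tables
x <- c(1, 1, 0, 0, 1, 0, 2, 0)
y <- c(793.0831, 788.1814, 867.8504, 729.8321,
       816.852, 805.2114, 774.599, 854.6384)

r.obs <- cor(x, y)   # observed Pearson correlation for this pair

# replicate() re-evaluates its expression on every call, so each call
# draws a fresh permutation of x while y is left untouched
r.perm <- replicate(1000, cor(sample(x), y))

length(r.perm)   # one shuffled correlation per replication
```

The point is that the expression is re-run each time, so there is no need for an explicit inner loop or for storing the shuffled vectors.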
>
> 4. If your true p-value is greater than 95% of the random values,
> then you can reject the null hypothesis at p<0.05.
You won't have a very stable estimate of the 95th order statistic
with "maybe" 100 replications.
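One way to get a feel for that instability is to repeat the whole exercise several times and look at how much the estimated 95th percentile moves around. The sketch below does this for two replication counts; q95 is a made-up helper name, the counts 100 and 2000 and the 50 repeats are arbitrary, and x and y are again the first rows of the posted tables.

```r
set.seed(1)   # only so the sketch is reproducible

x <- c(1, 1, 0, 0, 1, 0, 2, 0)
y <- c(793.0831, 788.1814, 867.8504, 729.8321,
       816.852, 805.2114, 774.599, 854.6384)

# hypothetical helper: 95th percentile of B shuffled correlations
q95 <- function(B) {
  quantile(replicate(B, cor(sample(x), y)), 0.95, names = FALSE)
}

# repeat the whole exercise 50 times at each B and compare the spread
sd.100  <- sd(replicate(50, q95(100)))
sd.2000 <- sd(replicate(50, q95(2000)))

c(B100 = sd.100, B2000 = sd.2000)
```

I would expect the spread at the larger count to be noticeably smaller, but run it on your own rows before settling on a number of replications.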
--
David.
>
>
>
> I am stuck at the randomization step. I need some help in implementing
> the appropriate randomization step in my correlation.
>
> Below is my incomplete code. I will be very obliged if someone could
> help:
>
>
>
> X <- read.table("X.txt", as.is=T, header=T, row.names=1)
> Y <- read.table("Y.txt", as.is=T, header=T, row.names=1)
>
> X.mat <- as.matrix(X)
> Y.mat <- as.matrix(Y)
>
> Corrs <- cor.test(X.mat[1,], Y.mat[1,], alternative=c("greater"), method=c("pearson"))
>
> Corrs.rand <- list()
>
> for (i in 1:length(X.mat)){
>   for (j in 1:100){
>     # This does not seem to work correctly. How do I run the sample function 100 times for the same row?
>     SNP.rand <- sample(SNP.mat[i,], 56, replace = FALSE, prob = NULL)
>     Corrs.rand[[j]] <- cor.test(SNP.rand, CNV.mat[j,], alternative=c("greater"), method=c("pearson"))
>     # need to calculate how many times the true p-value > the random p-value
>   }
> }
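A few things in that loop look off to me: length() of a matrix counts every cell rather than rows, the shuffled row i is correlated with row j of the other matrix (j being the replication counter), and SNP.mat and CNV.mat are never defined (presumably they are X.mat and Y.mat). Below is a minimal sketch of one way to do it row by row, under those assumptions. perm.p is a made-up helper name, B = 1000 is arbitrary, the matrices are just the first three rows of the posted tables, and dropping NaN pairs and returning NA for constant rows are my choices, not something you asked for.

```r
set.seed(123)   # only so the sketch is reproducible

# first three rows of the posted X and Y tables, typed in by hand
X.mat <- rbind(c(1, 1, 0, 0, 1, 0, 2, 0),
               c(0, 0, 0, 0, 0, 0, 0, 0),
               c(2, 2, 2, 2, 1, 2, 1, 2))
Y.mat <- rbind(c(793.0831, 788.1814, 867.8504, 729.8321,
                 816.852, 805.2114, 774.599, 854.6384),
               c(12.8695, 4.312894, 10.69769, 5.872213,
                 13.793, 9.394133, 6.297553, 9.307943),
               c(699.7792, 826.9974, 795.641, 770.9376,
                 806.1241, 782.397, 817.1075, 859.7155))

# hypothetical helper: one-sided permutation p-value for one row pair
perm.p <- function(x, y, B = 1000) {
  ok <- complete.cases(x, y)   # drop pairs where either value is NaN/NA
  x <- x[ok]
  y <- y[ok]
  # the correlation is undefined for constant rows (e.g. all zeros)
  if (length(x) < 3 || sd(x) == 0 || sd(y) == 0) return(NA_real_)
  r.obs  <- cor(x, y)
  r.perm <- replicate(B, cor(sample(x), y))
  # share of shuffles at least as large as the observed value,
  # counting the observed arrangement as one of the shuffles
  (1 + sum(r.perm >= r.obs)) / (B + 1)
}

# pair row i of X with row i of Y, looping over rows rather than cells
p.vals <- sapply(seq_len(nrow(X.mat)),
                 function(i) perm.p(X.mat[i, ], Y.mat[i, ]))
p.vals
```

This compares correlations rather than p-values from cor.test, which amounts to the same ordering for a one-sided test on a fixed number of pairs and avoids calling cor.test inside the loop.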
>
>
>
> X dataset:
>
> #Probes  X10851  X12144  X12155  X11882  X10860  X12762  X12239  X12154
> 1        1       1       0       0       1       0       2       0
> 2        0       0       0       0       0       0       0       0
> 3        2       2       2       2       1       2       1       2
> 4        0       0       0       0       0       0       0       0
> 5        2       2       2       2       2       2       2       2
> 6        0       1       0       0       1       1       1       1
> 7        2       2       NaN     2       2       2       2       2
> 8        2       2       2       2       2       2       2       2
> 9        0       1       0       1       1       NaN     1       2
> 10       2       2       2       2       2       2       2       2
> 11       2       0       0       0       0       0       0       0
> 12       0       1       0       1       1       0       1       1
>
>
>
> Y dataset:
>
> Probes  X10851    X12144    X12155    X11882    X10860    X12762    X12239    X12154
> 1       793.0831  788.1814  867.8504  729.8321  816.852   805.2114  774.599   854.6384
> 2       12.8695   4.312894  10.69769  5.872213  13.793    9.394133  6.297553  9.307943
> 3       699.7792  826.9974  795.641   770.9376  806.1241  782.397   817.1075  859.7155
> 4       892.8217  869.0481  806.3387  812.0431  873.5565  794.4752  813.9587  814.8681
> 5       892.8217  869.0481  806.3387  812.0431  873.5565  794.4752  813.9587  814.8681
> 6       839.735   943.4456  950.7575  859.0208  894.246   853.5241  941.4842  913.0246
> 7       653.1272  751.5218  750.1758  737.3821  757.8486  758.2407  724.2186  770.8669
> 8       12.8695   4.312894  10.69769  5.872213  13.793    9.394133  6.297553  9.307943
> 9       839.735   943.4456  950.7575  859.0208  894.246   853.5241  941.4842  913.0246
> 10      653.1272  751.5218  750.1758  737.3821  757.8486  758.2407  724.2186  770.8669
> 11      653.1272  751.5218  750.1758  737.3821  757.8486  758.2407  724.2186  770.8669
> 12      839.735   943.4456  950.7575  859.0208  894.246   853.5241  941.4842  913.0246
>
>
>
>
>
>
>
> Thanks in advance
>
>
>
> Manisha
>
>
>
> Mount Sinai School of Medicine
>
> Icahn Medical Institute,
>
> 1425 Madison Avenue, Box 1498
>
> NY-10029, NEW-YORK, USA
>
>
>
>
David Winsemius, MD
West Hartford, CT