[R] Sampling in R

Mike Lawrence Mike.Lawrence at dal.ca
Tue Apr 21 13:32:46 CEST 2009


When you shuffle the observations independently, you are performing a
permutation test (though for this you only need to shuffle one side of
the pairs). When you sort the observations you are doing something
ridiculous that has no statistical meaning that I know.

I'm not very familiar with bootstrap CI's, but I think the idea is to
sample the PAIRS of data WITH replacement:
http://lmgtfy.com/?q=bootstrap+correlation

(first link is to a good overview by David Howell)

On Tue, Apr 21, 2009 at 7:25 AM, Seyit Ali Kayis <s_a_kayis at hotmail.com> wrote:
>
> Dear R users,
>
> I need to do sampling without replacement (bootstraps). I have two variables (Xvar, Yvar).
> I have a correlation from original data set cor(Xvar, Yvar)=0.6174221. I am doing 50000 sampling,
> and in each sampling  calculating correlations, saving, sorting and  getting 95% cutt off point (0.1351877).
> I am getting maximum value as 0.3507219 (much smaller than correlation of my original data).
> I repeated the sampling a  couple of time and none of them produced a correlation
> coefficient higher than my original data set. However, if I sort out my Xvar and Yvar and
> obtain correlation it is 0.9657125 which is much higher than correlation for my original data.
> I am doing sampling in another program and getting at least 1% higher correlation than mine.
> Now I am getting confused with sampling(random data) in R. My data and codes for the scenario above are below
>
>
> Xvar<-c(0.1818182,0.5384615,0.5535714,0.4680851,0.4545455,0.4385965,0.5185185,0.4035088,0.4901961,0.3650794,0.462963,0.4,0.56,0.3965517,0.4909091,
>        0.4716981,0.4310345,0.2,0.1509434,0.2647059,0.173913,0.1914894,0.1914894,0.1489362,0.1363636,0.2244898,0.2325581,0.1333333,0.1818182,0.1702128,
>        0.2173913,0.2380952,0.1632653,0.5614035,0.3396226,0.4909091,0.3770492,0.5,0.5185185,0.5,0.4666667,0.4464286,0.362069,0.4285714,0.4561404,
>        0.4736842,0.4545455,0.4166667,0.4181818,0.4590164,0.5166667,0.5423729,0.4833333,0.5454545,0.4393939,0.5172414,0.4098361,0.4745763,0.4754098,
>        0.5166667,0.5,0.4603175,0.42,0.4038462,0.4897959,0.3148148,0.3673469,0.4,0.4583333,0.3877551,0.4375,0.4117647,0.4313725,0.5333333,0.3962264,
>        0.3548387,0.5272727,0.4137931,0.3928571,0.4666667,0.4210526,0.4363636,0.4545455,0.4310345,0.4237288,0.4814815,0.4912281,0.4333333,0.4,0.4285714,
>        0.4516129,0.5090909,0.4464286,0.4642857,0.4166667,0.4098361,0.4909091,0.3809524,0.5272727,0.4814815,0.5254237,0.627451,0.5,0.5471698,0.5454545,
>        0.5925926,0.5769231,0.5818182,0.4444444,0.4915254,0.4727273,0.4107143,0.4285714,0.4310345,0.4237288,0.4285714,0.440678,0.4237288,0.4807692,
>        0.4150943,0.4615385,0.4107143,0.4814815,0.4074074,0.4210526,0.5263158,0.440678,0.4576271,0.5344828,0.5,0.5636364,0.4677419,0.5,0.5192308,
>        0.4642857,0.5090909,0.58,0.4482759,0.5098039,0.4035088,0.4210526,0.5098039,0.4385965,0.5283019,0.5471698,0.625,0.4310345,0.4912281,0.5283019,
>        0.4576271,0.5471698,0.4745763,0.4821429)
>
> Yvar<-c(0.2553191,0.4107143,0.5660377,0.3888889,0.3606557,0.2898551,0.3818182,0.4,0.4,0.3278689,0.2903226,0.4074074,0.4181818,0.3,0.2238806,0.3728814,
>        0.3709677,0.2307692,0.2830189,0.2244898,0.2142857,0.2131148,0.22,0.2258065,0.2321429,0.2,0.2264151,0.22,0.2115385,0.2459016,0.1166667,0.1785714,
>        0.2068966,0.6,0.4285714,0.3134328,0.4461538,0.3965517,0.4769231,0.6181818,0.4827586,0.3709677,0.3965517,0.4821429,0.4545455,0.359375,0.4576271,
>        0.4516129,0.5272727,0.4603175,0.4,0.4912281,0.5384615,0.5,0.4516129,0.4126984,0.4655172,0.5263158,0.4925373,0.358209,0.4285714,0.4920635,
>        0.4482759,0.3235294,0.4,0.4375,0.440678,0.3898305,0.35,0.4528302,0.58,0.4153846,0.3174603,0.5185185,0.3870968,0.2894737,0.3709677,0.369863,
>        0.3676471,0.3636364,0.3088235,0.328125,0.4032258,0.4084507,0.3188406,0.3636364,0.3823529,0.2816901,0.4722222,0.5,0.3521127,0.4393939,0.3787879,
>        0.453125,0.4324324,0.4057971,0.4545455,0.4492754,0.5,0.4098361,0.4067797,0.3666667,0.3928571,0.4285714,0.5,0.2923077,0.4561404,0.45,0.5538462,
>        0.4626866,0.4057971,0.3676471,0.5322581,0.5428571,0.375,0.4411765,0.4571429,0.4,0.3846154,0.3870968,0.4915254,0.530303,0.4375,0.4918033,0.4179104,
>        0.4032258,0.3606557,0.5178571,0.4848485,0.390625,0.375,0.4375,0.3666667,0.4,0.4477612,0.2571429,0.4032258,0.3382353,0.4814815,0.4090909,0.3548387,
>        0.4821429,0.5,0.557377,0.4333333,0.5454545,0.4590164,0.3943662,0.5076923,0.5,0.3283582,0.3676471,0.559322)
>
> my.cor<-cor(Xvar, Yvar)
> print(my.cor)
>
> nperm<-49999
> Perm.Cor<-NULL
>
> for (iperm in 1:nperm)  {
> XvarNew<-sample(Xvar, size=length(Xvar), replace=FALSE)
> YvarNew<-sample(Yvar, size=length(Yvar), replace=FALSE)
> perm.cor<-cor(XvarNew, YvarNew)
> Perm.Cor<-c(Perm.Cor, perm.cor)
>                        }
> print(max(Perm.Cor))
> XvarSorted<-sort(Xvar, decreasing=TRUE)
> YvarSorted<-sort(Yvar, decreasing=TRUE)
> max.cor<-cor(XvarSorted, YvarSorted)
> print(max.cor)
> if(mat.cor>0) Perm.Cor.Sorted<-sort(Perm.Cor, decreasing=TRUE)
> if(mat.cor<0) Perm.Cor.Sorted<-sort(Perm.Cor, decreasing=FALSE)
> T95<-Perm.Cor.Sorted[(nperm+1)*0.05]    # 95% treshold value
> T99<-Perm.Cor.Sorted[(nperm+1)*0.01]    # 99% treshold value
>
>
>
> I want to understand where I am making a mistake. Any comment is deeply appreciated.
>
> Kind Regards
>
> Seyit Ali
>
>
> ------------------------------------------------------------------------------------------------------------------
> Dr. Seyit Ali KAYIS
> Selcuk University
> Faculty of Agriculture
> Kampus, Konya, TURKEY
>
>            s_a_kayis at yahoo.com,    s_a_kayis at hotmail.com
> Tell: +90 332 223 2830  Mobile: +90 535 587 1139  Fax: +90 332 241 0108
>
>                   Greetings from Konya, TURKEY
>                http://www.ziraat.selcuk.edu.tr/skayis/
> ----------------------------------------------------------------------------------------------------------------------
>
>
>
>
>
>
>
> _________________________________________________________________
> Earning enough? Find out with SEEK Salary Survey
>
> %2Eco%2Enz%2F%3Ftracking%3Dsk%3Atl%3Asknzsal%3Amsnnz%3A0%3Ahottag%3Aearn%5Fenough&_t=757263783&_r=Seek_NZ_tagline&_m=EXT
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~




More information about the R-help mailing list