[R] doing 1000 permutations and doing test statistics distribution

Tue Feb 4 22:41:53 CET 2020

On 04/02/2020 4:28 p.m., Ana Marija wrote:
> I tired your code on this simplified data just for say 10 permutations:
> 
> dat <- read.table(text = "   code.1 code.2 code.3 code.4
> 1     82     93     NA     NA
> 2     15     85     93     NA
> 3     93     89     NA     NA
> 4     81     NA     NA     NA",
>                    header = TRUE, stringsAsFactors = FALSE)
> 
> dat2=data.matrix(dat)
> 
>> result<- lapply(seq_len(10), FUN = function(dat2){
> +     dat2 <- dat2[, sample.int(4)]
> +     print(colnames(dat2))
> + } )
> Error in dat2[, sample.int(4)] : incorrect number of dimensions

Yes, Bert did suggest that, but it's obviously wrong.  The argument to 
FUN is an element of seq_len(10), it's not the full dataset.  Try

result<- lapply(seq_len(10), FUN = function(i){
      dat <- dat2[, sample.int(4)]
      print(colnames(dat))
  } )

Duncan Murdoch

> 
> 
> On Tue, Feb 4, 2020 at 3:10 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>>
>> I am not going to do your programming for you. If the following doesn't suffice, maybe someone else will provide you something that will.
>>
>> m = your matrix
>>
>> code = your code that uses m
>>
>> your list of results <- lapply(seq_len(1000), FUN = function(m){
>>    m <- m[, sample.int(132)]
>>   code
>> } )
>>
>> or use an explicit for() loop to populate a list or vector with your results.
>>
>> Again, if I have misunderstood what you want to do, then clarify, and perhaps someone else will help.
>>
>> -- Bert
>>
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Tue, Feb 4, 2020 at 12:41 PM Ana Marija <sokovic.anamarija using gmail.com> wrote:
>>>
>>> Hi Bert,
>>>
>>> thanks for getting back to me. I have to permute those 132 columns
>>> 1000 times and perform the code given in the previous email.
>>>
>>> Can you please show me how you would do that in the loop? This is also
>>> a huge data set ...
>>>
>>> Thanks
>>> Ana
>>>
>>> On Tue, Feb 4, 2020 at 2:34 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>>>>
>>>> If you just want to permute columns of a matrix,
>>>>
>>>> ?sample
>>>>> sample.int(10)
>>>>   [1]  9  2 10  8  4  6  3  1  5  7
>>>>
>>>> and you can just use this as an index into the columns of your matrix, presumably within a loop of some sort.
>>>>
>>>> If I have misunderstood, just ignore.
>>>>
>>>> Cheers,
>>>> Bert
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Feb 4, 2020 at 12:23 PM Ana Marija <sokovic.anamarija using gmail.com> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I have a matrix
>>>>>> dim(dat)
>>>>> [1] 15568   132
>>>>>
>>>>> It looks like this:
>>>>>
>>>>>                     NoD_14381_norm.1 NoD_14381_norm.2 NoD_14381_norm.3
>>>>> NoD_14520_30mM.1 NoD_14520_30mM.2 NoD_14520_30mM.3
>>>>> Ku8QhfS0n_hIOABXuE             4.75             4.25             4.79
>>>>>             4.33             4.63             3.85
>>>>> Bx496XsFXiAlj.Eaeo             6.15             6.23             6.55
>>>>>             6.26             6.24             5.99
>>>>> W38p0ogk.wIBVRXllY             7.13             7.35             7.55
>>>>>             7.37             7.36             7.55
>>>>> QIBkqIS9LR5DfTlTS8             6.27             6.73             6.45
>>>>>             5.39             4.75             4.96
>>>>> BZKiEvS0eQ305U0v34             6.35             7.02             6.76
>>>>>             5.45             5.25             5.02
>>>>> 6TheVd.HiE1UF3lX6g             5.53             5.02             5.36
>>>>>             5.61             5.66             5.37
>>>>>
>>>>> So it is a matrix with gene names ex. Ku8QhfS0n_hIOABXuE, and subjects
>>>>> named ex. NoD_14381_norm.1
>>>>>
>>>>>
>>>>> How to do 1000 permutations of these 132 columns and on each created
>>>>> new permuted matrix perform this code:
>>>>>
>>>>> subject="all_replicate"
>>>>> targets<-readTargets(paste(PhenotypeDir,"hg_sg_",subject,"_target.txt", sep=''))
>>>>> Treat <- factor(targets$Treatment,levels=c("C","T"))
>>>>> Replicates <- factor(targets$rep)
>>>>> design <- model.matrix(~Replicates+Treat)
>>>>> corfit <- duplicateCorrelation(dat, block = targets$Subject)
>>>>> corfit$consensus.correlation
>>>>> fit <-lmFit(dat,design,block=targets$Subject,correlation=corfit$consensus.correlation)
>>>>> fit<-eBayes(fit)
>>>>> qval.cutoff=0.1; FC.cutoff=0.17
>>>>> y1=topTable(fit, coef="TreatT", n=nrow(genes),adjust.method="BH",genelist=genes)
>>>>>
>>>>> y1 for each iteration of permutation would  have P.Value column and
>>>>> these I would have plotted on the end to find the distribution of all
>>>>> p values generated in those 1000 permutations.
>>>>>
>>>>> Please advise,
>>>>> Ana
>>>>>
>>>>> ______________________________________________
>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
> 
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>