[R] doing 1000 permutations and doing test statistics distribution

Tue Feb 4 22:28:33 CET 2020

I tired your code on this simplified data just for say 10 permutations:

dat <- read.table(text = "   code.1 code.2 code.3 code.4
1     82     93     NA     NA
2     15     85     93     NA
3     93     89     NA     NA
4     81     NA     NA     NA",
                  header = TRUE, stringsAsFactors = FALSE)

dat2=data.matrix(dat)

> result<- lapply(seq_len(10), FUN = function(dat2){
+     dat2 <- dat2[, sample.int(4)]
+     print(colnames(dat2))
+ } )
Error in dat2[, sample.int(4)] : incorrect number of dimensions

On Tue, Feb 4, 2020 at 3:10 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>
> I am not going to do your programming for you. If the following doesn't suffice, maybe someone else will provide you something that will.
>
> m = your matrix
>
> code = your code that uses m
>
> your list of results <- lapply(seq_len(1000), FUN = function(m){
>   m <- m[, sample.int(132)]
>  code
> } )
>
> or use an explicit for() loop to populate a list or vector with your results.
>
> Again, if I have misunderstood what you want to do, then clarify, and perhaps someone else will help.
>
> -- Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Tue, Feb 4, 2020 at 12:41 PM Ana Marija <sokovic.anamarija using gmail.com> wrote:
>>
>> Hi Bert,
>>
>> thanks for getting back to me. I have to permute those 132 columns
>> 1000 times and perform the code given in the previous email.
>>
>> Can you please show me how you would do that in the loop? This is also
>> a huge data set ...
>>
>> Thanks
>> Ana
>>
>> On Tue, Feb 4, 2020 at 2:34 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>> >
>> > If you just want to permute columns of a matrix,
>> >
>> > ?sample
>> > > sample.int(10)
>> >  [1]  9  2 10  8  4  6  3  1  5  7
>> >
>> > and you can just use this as an index into the columns of your matrix, presumably within a loop of some sort.
>> >
>> > If I have misunderstood, just ignore.
>> >
>> > Cheers,
>> > Bert
>> >
>> >
>> >
>> >
>> > On Tue, Feb 4, 2020 at 12:23 PM Ana Marija <sokovic.anamarija using gmail.com> wrote:
>> >>
>> >> Hello,
>> >>
>> >> I have a matrix
>> >> > dim(dat)
>> >> [1] 15568   132
>> >>
>> >> It looks like this:
>> >>
>> >>                    NoD_14381_norm.1 NoD_14381_norm.2 NoD_14381_norm.3
>> >> NoD_14520_30mM.1 NoD_14520_30mM.2 NoD_14520_30mM.3
>> >> Ku8QhfS0n_hIOABXuE             4.75             4.25             4.79
>> >>            4.33             4.63             3.85
>> >> Bx496XsFXiAlj.Eaeo             6.15             6.23             6.55
>> >>            6.26             6.24             5.99
>> >> W38p0ogk.wIBVRXllY             7.13             7.35             7.55
>> >>            7.37             7.36             7.55
>> >> QIBkqIS9LR5DfTlTS8             6.27             6.73             6.45
>> >>            5.39             4.75             4.96
>> >> BZKiEvS0eQ305U0v34             6.35             7.02             6.76
>> >>            5.45             5.25             5.02
>> >> 6TheVd.HiE1UF3lX6g             5.53             5.02             5.36
>> >>            5.61             5.66             5.37
>> >>
>> >> So it is a matrix with gene names ex. Ku8QhfS0n_hIOABXuE, and subjects
>> >> named ex. NoD_14381_norm.1
>> >>
>> >>
>> >> How to do 1000 permutations of these 132 columns and on each created
>> >> new permuted matrix perform this code:
>> >>
>> >> subject="all_replicate"
>> >> targets<-readTargets(paste(PhenotypeDir,"hg_sg_",subject,"_target.txt", sep=''))
>> >> Treat <- factor(targets$Treatment,levels=c("C","T"))
>> >> Replicates <- factor(targets$rep)
>> >> design <- model.matrix(~Replicates+Treat)
>> >> corfit <- duplicateCorrelation(dat, block = targets$Subject)
>> >> corfit$consensus.correlation
>> >> fit <-lmFit(dat,design,block=targets$Subject,correlation=corfit$consensus.correlation)
>> >> fit<-eBayes(fit)
>> >> qval.cutoff=0.1; FC.cutoff=0.17
>> >> y1=topTable(fit, coef="TreatT", n=nrow(genes),adjust.method="BH",genelist=genes)
>> >>
>> >> y1 for each iteration of permutation would  have P.Value column and
>> >> these I would have plotted on the end to find the distribution of all
>> >> p values generated in those 1000 permutations.
>> >>
>> >> Please advise,
>> >> Ana
>> >>
>> >> ______________________________________________
>> >> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible code.