[R] Fwd: R apply() help -urgent

Tue May 11 10:04:35 CEST 2010

Set up a function that runs fisher.test on a 2x2 table, then pass it 
to apply() over the columns, as in the example below. The result is a 
list with names A to Z.

# set up a dummy data set with 100 rows
Cat<-LETTERS[sample(1:6,100, replace=TRUE)]
GL<-sample(1:6, 100, replace=TRUE)
dat<-matrix(sample(c(0,1),100*27, replace=TRUE), nrow=100)
colnames(dat)<-c(LETTERS[1:26],"pLoss")
data1<-data.frame(Cat, GL, dat)

# define function for fisher.test
ff<-function(x,y){
fisher.test(table(x,y))
}

# apply function to columns A to Z
results<-apply(data1[,LETTERS[1:26]],2, ff, y=data1[,"pLoss"])
# the results are in the form of a list with names A to Z
results$C
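
If you only want the p-values (or odds ratios) rather than the full 
test objects, something along the lines of the sketch below should 
work. It assumes the same column layout as the dummy data above; the 
names ff2, res2, pvals and odds are made up for illustration. Forcing 
both variables to factors with levels 0 and 1 is my own addition, not 
something the original data requires: without it, a column that 
happens to be all 0s (or all 1s) gives a table with only one row, and 
fisher.test() stops with an error.

```r
# dummy data as above (the seed is only there to make the example reproducible)
set.seed(1)
dat <- matrix(sample(c(0, 1), 100 * 27, replace = TRUE), nrow = 100)
colnames(dat) <- c(LETTERS[1:26], "pLoss")
data1 <- data.frame(dat)

# same idea as ff, but the table is always 2x2, even for a constant column
ff2 <- function(x, y) {
  fisher.test(table(factor(x, levels = c(0, 1)),
                    factor(y, levels = c(0, 1))))
}

# lapply over the data frame columns keeps the names A to Z
res2 <- lapply(data1[, LETTERS[1:26]], ff2, y = data1[, "pLoss"])

# collect p-values and odds ratio estimates into named vectors
pvals <- sapply(res2, function(r) r$p.value)
odds  <- sapply(res2, function(r) unname(r$estimate))

# one row per column A to Z
head(data.frame(p.value = pvals, odds.ratio = odds))
```

lapply() is used here instead of apply() only because it works on the 
data frame columns directly without converting to a matrix first; the 
apply() call above gives an equivalent list.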

On 19:59, Venkatesh Patel wrote:
> ---------- Forwarded message ----------
> From: Dr. Venkatesh<drvenki at liv.ac.uk>
> Date: Sun, May 9, 2010 at 4:55 AM
> Subject: R apply() help -urgent
> To: r-help at r-project.org
>
>
> I have a file with 4873 rows of 1s or 0s and has 26 alphabets (A-Z) as
> columns. the 27th column also has 1s and 0s but stands for a different
> variable (pLoss). columns 1 and 2 are not significant and hence let's ignore
> them for now.
>
> here is how the file looks
>
> Cat    GL  A   B   C   D   E   F   G   H   I   J   K   L   M   N   O   P   Q
>    R   S   T   U   V   W   X   Y   Z     pLoss
> H      5   0   0   0   0   0   0   0   1   0   0   0   0   0   0   0   0   0
>    0   0   0   0   0   0   0   0   0     1
> E      5   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0
>    0   0   0   0   0   0   0   0   0     1
> P      6   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   0
>    0   0   0   0   0   0   0   0   0     1
> P      5   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   0
>    0   0   0   0   0   0   0   0   0     1
> F      6   0   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0
>    0   0   0   0   0   0   0   0   0     1
> E      4   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0
>    0   0   0   0   0   0   0   0   0     1
> H      5   0   0   0   0   0   0   0   1   0   0   0   0   0   0   0   0   0
>    0   0   0   0   0   0   0   0   0     1
> J      4   0   0   0   0   0   0   0   0   0   1   0   0   0   0   0   0   0
>    0   0   0   0   0   0   0   0   0     1
> J      4   0   0   0   0   0   0   0   0   0   1   0   0   0   0   0   0   0
>    0   0   0   0   0   0   0   0   0     1
> E      5   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0
>    0   0   0   0   0   0   0   0   0     1
> S      6   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
>    0   1   0   0   0   0   0   0   0     1
> ..
> ..
> ..
> ..
> ..
> ..
>
> Alphabets A-Z stand for different categories of protein families and pLoss
> stands for their presence or absence in an animal.
>
> I intend to do Fisher's test for 26 individual 2X2 tables constructed from
> each of these alphabets vs pLoss.
>
> For example, here is what I did for alphabet A and then B and then C.... so
> on. (I have attached R-input.csv for your perusal)
>
> > data1<- read.table("R_input.csv", header = T)
> > datatable<- table(data1$A, data1$pLoss) #create a new datatable2 or 3
> >                      # with table(data1$B.. or  (data1$C.. and so on
> > datatable
>         0    1
>    0   31 4821
>    1    0   21
>
> now run the Fisher's test for these datatables one by one for the 26
> alphabets :(
>
> fisher.test(datatable), ... fisher.test(datatable2)...
>
> in this case, the task is just for 26 columns.. so I can do it manually.
>
> But I would like to do an automated extraction and fisher's test for all the
> columns.
>
> I tried reading the tutorials and trying a few examples. Can't really come up
> with anything sensible.
>
> How can I use apply() in this regard? Or is there any other way, a loop
> maybe, to solve this issue?
>
> Please help.
>
> Thanks a million in advance,
>
> Dr Venkatesh Patel
> School of Biological Sciences
> University of Liverpool
> United Kingdom
>
>
>