[R] help, please! matrix operations inside 3 nested loops

Fridolin smells_like_rock at gmx.net
Thu Aug 9 10:51:36 CEST 2012


thank you for your help.

my input data looks like this (tab separated):

Ind.nr.	Pop.nr.	scm266	rms1280	scm247	rms1107
1	101	305	318	222	135
1	101	305	318	231	135
2	101	305	313	999	96
2	101	305	321	999	130
3	101	305	324	231	135
3	101	305	324	231	135
4	101	305	313	230	126
4	101	305	313	230	135
6	101	305	313	231	135
6	101	305	321	231	135

It is a dataset of genetic marker alleles for single individuals.
The first row is the header; all following rows are individuals, and 2
consecutive rows make up 1 individual.
The first column is the individual's number, the second column is the number
of the population the individual comes from, and all following columns are
different genetic markers.
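
A minimal sketch for reproducing this layout without the file: the ten example rows can be read from a string. The name `beispiel` is hypothetical and not part of the script further down. Whether 999 is a missing-data code is not stated, so it is left as an ordinary value here.

```r
# Sketch: rebuild the ten example rows as a data frame.
# 'beispiel' is a hypothetical name used only for this illustration.
beispiel <- read.table(text = "Ind.nr. Pop.nr. scm266 rms1280 scm247 rms1107
1 101 305 318 222 135
1 101 305 318 231 135
2 101 305 313 999 96
2 101 305 321 999 130
3 101 305 324 231 135
3 101 305 324 231 135
4 101 305 313 230 126
4 101 305 313 230 135
6 101 305 313 231 135
6 101 305 321 231 135", header = TRUE)

nrow(beispiel)        # number of rows (2 per individual)
nrow(beispiel) %/% 2  # number of individuals in the example
```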

What I want to do with this data in R is to compare each individual with
each of the other individuals, allele-wise. There are five possibilities:
the two compared individuals share 4, 3, 2, 1 or 0 alleles of the currently
examined marker (= column). For each shared allele, the pair of individuals
gets 1 scoring point. For each pair of individuals, the scoring points are
summed over all markers.
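
The scoring rule for a single marker can be sketched as a small helper. This is an illustration under assumptions, not the code used below: `score_marker` is a hypothetical name, and it counts all four cross comparisons between the two alleles of one individual and the two alleles of the other, which is what the four `if` lines in the script do.

```r
# Hypothetical helper: points for one pair of individuals at one marker.
# a and b are length-2 vectors holding the two alleles of each individual.
# outer() builds the 2 x 2 table of cross comparisons; sum() counts matches.
# Note: if any allele is NA, the result is NA as well.
score_marker <- function(a, b) {
  sum(outer(a, b, "=="))
}

score_marker(c(305, 305), c(305, 305))  # all four comparisons match: 4
score_marker(c(318, 318), c(313, 321))  # nothing matches: 0
score_marker(c(222, 231), c(231, 231))  # second allele matches both: 2
```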


my code again, modified according to your suggestions:

#1) read in data:
daten<-read.table('K:/Analysen/STRUCTURE/test.txt', header=TRUE, sep="\t")
daten<-as.data.frame(daten)

#2) create empty matrix:
indxind<-matrix(0,nrow=617, ncol=617) 
indxind[1:20,1:19]

#3) compare cells to each other, score:
#for the whole dataset: s in 3:34, z1 in 1:617, z2 in 1:617
for (s in 3:6) {   #walks through the matrix column by column, starting at column 3
  for (z1 in 1:6) {  #for each current column, take one row (z1)...
    for (z2 in 1:6) {  #...and compare it to another row (z2) of the current column
      if (z1!=z2) {
        topf<-indxind[z1,z2]
        #actually, 2 rows make up 1 individual,
        #therefore i compare 2 rows
        #with another 2 rows
        if (daten[2*z1-1,s]==daten[2*z2-1,s]) topf<-topf+1
        if (daten[2*z1-1,s]==daten[2*z2,s]) topf<-topf+1
        if (daten[2*z1,s]==daten[2*z2-1,s]) topf<-topf+1
        if (daten[2*z1,s]==daten[2*z2,s]) topf<-topf+1
        indxind[z1,z2]<-topf
        indxind[z2,z1]<-topf
      }
      #print(c(s,z1,z2,indxind[1,2])) ##counts s, z1 and z2 properly, but gives always 8 for indxind[1,2]
    }
    #indxind[1:5,1:5] #empty matrix
  }
  #indxind[1:5,1:5] #empty matrix
}

#4) check:
indxind[1:5,1:5]
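
One observation about step 3: the loop visits every pair twice, once as (z1, z2) and once as (z2, z1), and writes both cells each time, so each pair's points are added twice. That would be consistent with the 8 reported for indxind[1,2] if the file starts with the example rows. Below is a minimal sketch of a variant that visits each unordered pair once. The names `daten_bsp`, `n_ind` and `score` are hypothetical, and only the four marker columns of the ten example rows are included.

```r
# Self-contained sketch on the ten example rows (marker columns only).
daten_bsp <- data.frame(
  scm266  = c(305, 305, 305, 305, 305, 305, 305, 305, 305, 305),
  rms1280 = c(318, 318, 313, 321, 324, 324, 313, 313, 313, 321),
  scm247  = c(222, 231, 999, 999, 231, 231, 230, 230, 231, 231),
  rms1107 = c(135, 135,  96, 130, 135, 135, 126, 135, 135, 135)
)

n_ind <- nrow(daten_bsp) %/% 2   # derive the number of individuals from the data
score <- matrix(0, nrow = n_ind, ncol = n_ind)

for (s in seq_len(ncol(daten_bsp))) {        # every marker column
  for (z1 in seq_len(n_ind - 1)) {
    for (z2 in (z1 + 1):n_ind) {             # each unordered pair only once
      a <- daten_bsp[c(2 * z1 - 1, 2 * z1), s]   # both alleles of individual z1
      b <- daten_bsp[c(2 * z2 - 1, 2 * z2), s]   # both alleles of individual z2
      score[z1, z2] <- score[z1, z2] + sum(outer(a, b, "=="))
      score[z2, z1] <- score[z1, z2]             # mirror into the lower triangle
    }
  }
}

score
```

Deriving `n_ind` from `nrow()` instead of hard-coding the loop bounds keeps the row indices 2*z1-1 and 2*z1 inside the data.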



@ Michael Weylandt: I've done my best with regard to the "big picture" of my
algorithm and the small reproducible example. I hope both are sufficient.
@ Petr Pikal-3: in this case there are only numerical values, but it's a
useful hint for my other code.
@ Petr Pikal-3 and Berend Hasselman: initializing indxind with 0's instead
of NAs helps; it fills something into indxind now. But it does the
calculation only for the first marker (column 3); afterwards I get an error:
Fehler in if (daten[2 * z1 - 1, s] == daten[2 * z2 - 1, s]) topf <- topf +  :
  Fehlender Wert, wo TRUE/FALSE nötig ist
In English:
Error in if (daten[2 * z1 - 1, s] == daten[2 * z2 - 1, s]) topf <- topf +  :
  missing value where TRUE/FALSE needed
Does this have something to do with the change to daten<-as.data.frame(daten)
in line 3 (instead of as.matrix before)?
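
For context on the message: `if` needs a single TRUE or FALSE, so this error appears whenever the comparison evaluates to NA. Two ways that can happen are sketched below, offered as possibilities rather than a diagnosis, since the full file is not shown: a cell holds NA, or a row index points past the last row of a data frame, which yields NA instead of an error. With only the ten example rows, z1 = 6 would address rows 11 and 12. The names `x` and `res` are hypothetical.

```r
# Sketch: how a comparison inside if() can end up as NA.
x <- data.frame(m = c(305, 305, 313, NA))

x[4, 1]   # a missing value stored in the data
x[5, 1]   # row 5 does not exist; a data frame returns NA here, not an error

# The comparison is NA, and if() cannot decide on NA:
res <- tryCatch(
  if (x[5, 1] == x[1, 1]) "equal" else "different",
  error = function(e) "error"
)
res

# One possible guard: isTRUE() treats NA as "no match".
isTRUE(x[5, 1] == x[1, 1])
isTRUE(x[2, 1] == x[1, 1])
```

Whether an NA should count as "no match" or be skipped entirely depends on how missing alleles are meant to be scored.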





