[R] help, please! matrix operations inside 3 nested loops
Fridolin
smells_like_rock at gmx.net
Thu Aug 9 10:51:36 CEST 2012
Thank you for your help.
My input data looks like this (tab separated):
Ind.nr.  Pop.nr.  scm266  rms1280  scm247  rms1107
1        101      305     318      222     135
1        101      305     318      231     135
2        101      305     313      999     96
2        101      305     321      999     130
3        101      305     324      231     135
3        101      305     324      231     135
4        101      305     313      230     126
4        101      305     313      230     135
6        101      305     313      231     135
6        101      305     321      231     135
It is a dataset of genetic marker alleles for single individuals.
The first row is the header; all following rows are individuals. Two rows
make up one individual.
The first column is the individual's number, the second column is the number of the
population the individual comes from, and all following columns are different
genetic markers.
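To try this without the original file, the sample above can be read from an inline string. This is only a sketch: the `text=` argument stands in for the real file path, and `n_ind` and `marker_cols` are names introduced here for illustration.

```r
# Sample rows from the post, pasted inline. They are separated by spaces
# here, whereas the real file is tab separated.
txt <- "Ind.nr. Pop.nr. scm266 rms1280 scm247 rms1107
1 101 305 318 222 135
1 101 305 318 231 135
2 101 305 313 999 96
2 101 305 321 999 130
3 101 305 324 231 135
3 101 305 324 231 135
4 101 305 313 230 126
4 101 305 313 230 135
6 101 305 313 231 135
6 101 305 321 231 135
"
daten <- read.table(text = txt, header = TRUE)

# Two consecutive rows make up one individual.
n_ind <- nrow(daten) / 2        # hypothetical name: number of individuals
marker_cols <- 3:ncol(daten)    # columns 1 and 2 are the identifiers

print(n_ind)
print(names(daten)[marker_cols])
```

Deriving the number of individuals from `nrow(daten)` keeps any later loop bounds in step with the data that was actually read.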
What I want to do with this data in R is to compare each individual with
each of the other individuals, allele-wise. There are five possibilities:
the two compared individuals share 4, 3, 2, 1 or 0 alleles of the currently
examined marker (= column). For each shared allele, this pair of individuals
gets 1 scoring point. For each pair of individuals, the scoring points
are summed over all markers.
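The scoring rule for one pair of individuals at one marker can be sketched as a small function. This assumes that "shared alleles" means the four pairwise comparisons between the two alleles of each individual; `shared_alleles` is a hypothetical helper name, not something from the original code.

```r
# Count matching allele pairs between two individuals at one marker.
# a and b are length-2 vectors holding the two alleles of each individual.
shared_alleles <- function(a, b) {
  # outer() builds the 2 x 2 table of all pairwise comparisons;
  # sum() counts the TRUE entries.
  sum(outer(a, b, "=="))
}

print(shared_alleles(c(305, 305), c(305, 305)))  # all four comparisons match
print(shared_alleles(c(318, 318), c(313, 321)))  # no comparison matches
print(shared_alleles(c(222, 231), c(231, 231)))  # allele 231 matches twice
```

Under this four-comparison reading the score can only be 0, 1, 2 or 4: if three of the comparisons match, the fourth one necessarily matches too.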
My code again, modified according to your suggestions:
#1) read in data:
daten<-read.table('K:/Analysen/STRUCTURE/test.txt', header=TRUE, sep="\t")
daten<-as.data.frame(daten)
#2) create empty matrix:
indxind<-matrix(0,nrow=617, ncol=617)
indxind[1:20,1:19]
#3) compare cells to each other, score:
#for the whole dataset: s in 3:34, z1 in 1:617, z2 in 1:617
for (s in 3:6) { #walks through the matrix column by column, starting at column 3
  for (z1 in 1:6) { #for each current column, take one row (z1)...
    for (z2 in 1:6) { #...and compare it to another row (z2) of the current column
      if (z1!=z2) {
        topf<-indxind[z1,z2]
        #actually, 2 rows make up 1 individual,
        #therefore i compare 2 rows with another 2 rows
        if (daten[2*z1-1,s]==daten[2*z2-1,s]) topf<-topf+1
        if (daten[2*z1-1,s]==daten[2*z2,s]) topf<-topf+1
        if (daten[2*z1,s]==daten[2*z2-1,s]) topf<-topf+1
        if (daten[2*z1,s]==daten[2*z2,s]) topf<-topf+1
        indxind[z1,z2]<-topf
        indxind[z2,z1]<-topf
      }
      #print(c(s,z1,z2,indxind[1,2])) ##counts s, z1 and z2 properly, but always gives 8 for indxind[1,2]
    }
    #indxind[1:5,1:5] #empty matrix
  }
  #indxind[1:5,1:5] #empty matrix
}
#4) check:
indxind[1:5,1:5]
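One thing worth checking in the loop above: every pair is visited twice (once as z1,z2 and once as z2,z1), and both cells are updated on each visit, so each match is added twice. That would be consistent with the 8 reported for indxind[1,2]. A minimal sketch that visits each pair only once, using a toy data frame with values taken from the sample; `n_ind` and `punkte` are names introduced here for illustration:

```r
# Toy data in the same layout as the post: two rows per individual,
# columns 1-2 are identifiers, columns 3 and up are markers.
daten <- data.frame(
  Ind.nr. = c(1, 1, 2, 2, 3, 3),
  Pop.nr. = 101,
  scm266  = c(305, 305, 305, 305, 305, 305),
  rms1280 = c(318, 318, 313, 321, 324, 324),
  scm247  = c(222, 231, 999, 999, 231, 231)
)
n_ind   <- nrow(daten) / 2                # number of individuals, taken from the data
indxind <- matrix(0, nrow = n_ind, ncol = n_ind)

for (s in 3:ncol(daten)) {
  for (z1 in 1:(n_ind - 1)) {
    for (z2 in (z1 + 1):n_ind) {          # z2 > z1: each pair is visited once
      a <- daten[c(2 * z1 - 1, 2 * z1), s]
      b <- daten[c(2 * z2 - 1, 2 * z2), s]
      punkte <- sum(outer(a, b, "=="))    # the four allele comparisons
      indxind[z1, z2] <- indxind[z1, z2] + punkte
      indxind[z2, z1] <- indxind[z1, z2]  # keep the matrix symmetric
    }
  }
}
print(indxind)
```

Taking the loop bounds from `nrow(daten)` and `ncol(daten)` also means the indices cannot run past the end of the data.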
@ Michael Weylandt: I've done my best with regard to the "big picture" of my
algorithm and the small reproducible example. I hope both are sufficient.
@ Petr Pikal-3: In this case there are only numerical values, but it's a
useful hint for my other code.
@ Petr Pikal-3 and Berend Hasselman: Initializing indxind with 0s instead
of NAs helps; it now fills something into indxind. But it does the calculation
only for the first marker (column 3); afterwards I get an error:
Fehler in if (daten[2 * z1 - 1, s] == daten[2 * z2 - 1, s]) topf <- topf + :
  Fehlender Wert, wo TRUE/FALSE nötig ist
In English:
Error in if (daten[2 * z1 - 1, s] == daten[2 * z2 - 1, s]) topf <- topf + :
  missing value where TRUE/FALSE needed
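The message means that the condition inside `if` evaluated to NA. One possible cause, judging only from the sample shown: it has 10 data rows, while z1 and z2 running to 6 ask for rows 11 and 12. A sketch of that mechanism on a toy data frame (values and the names `wert`, `meldung`, `ist_gleich` are illustrative):

```r
# Toy data frame with 4 rows, i.e. 2 individuals.
daten <- data.frame(marker = c(305, 305, 313, 321))

# Asking a data frame for a row past the end does not fail; it returns NA.
wert <- daten[5, 1]
print(wert)

# Comparing with NA gives NA, and if() cannot decide on NA.
meldung <- tryCatch(
  { if (wert == daten[1, 1]) "equal" else "different" },
  error = function(e) conditionMessage(e)
)
print(meldung)

# A guarded comparison that treats NA as "no match".
ist_gleich <- isTRUE(wert == daten[1, 1])
print(ist_gleich)
```

Genuine missing values in the marker columns of the full dataset would trigger the same message, so it may be worth checking both the loop bounds and `anyNA(daten)`.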
Does this have something to do with changing to daten<-as.data.frame(daten) in
line 3 (instead of as.matrix as before)?
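On the data frame question, a small check under the assumption that the file is read with read.table() as in step 1 (the inline two-column table is a stand-in for the real file):

```r
# read.table() already returns a data frame.
daten <- read.table(text = "a b\n1 2\n3 4", header = TRUE)
print(is.data.frame(daten))

# Wrapping it in as.data.frame() leaves it unchanged.
print(identical(daten, as.data.frame(daten)))

# A data frame and a matrix react differently to a row index past the end.
print(daten[3, 1])                       # data frame: NA
m <- as.matrix(daten)
res <- tryCatch(m[3, 1], error = function(e) "error")
print(res)                               # matrix: an error is raised
```

So the as.data.frame() call itself changes nothing, but compared with a matrix the same out-of-range index shows up as a different error.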
--
View this message in context: http://r.789695.n4.nabble.com/help-please-matrix-operations-inside-3-nested-loops-tp4639592p4639730.html
Sent from the R help mailing list archive at Nabble.com.