[R] help, please! matrix operations inside 3 nested loops
Petr PIKAL
petr.pikal at precheza.cz
Thu Aug 9 14:08:47 CEST 2012
Hi
> thank you for your help.
>
> my input data looks like this (tab separated):
>
> Ind.nr. Pop.nr. scm266 rms1280 scm247 rms1107
> 1 101 305 318 222 135
> 1 101 305 318 231 135
> 2 101 305 313 999 96
> 2 101 305 321 999 130
> 3 101 305 324 231 135
> 3 101 305 324 231 135
> 4 101 305 313 230 126
> 4 101 305 313 230 135
> 6 101 305 313 231 135
> 6 101 305 321 231 135
It is better to use dput(your.data) when sharing data. I am still confused,
but you can probably clarify things further.
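As a small illustration of why dput() is handy, the sketch below writes a tiny data frame out as R code and rebuilds it from that text. The object names (small, txt, restored) are made up for this example; the two values are taken from the first individual in the table above.

```r
# Two rows for one individual, copied from the posted table
small <- data.frame(Ind.nr. = c(1L, 1L), scm247 = c(222L, 231L))

# dput() prints R code that recreates the object; capture it as text,
# which is what one would paste into a mail
txt <- capture.output(dput(small))

# Anyone receiving that text can rebuild the same object
restored <- eval(parse(text = txt))
all.equal(small, restored)
```

The receiver gets the column types and row names exactly as they were, which a pasted table does not guarantee.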
>
> it is a dataset with genetic marker alleles for single individuals.
> the first row is the header, all following rows are individuals. 2 rows
> count for 1 individual.
> first column is the individual's number, second column is the number for the
> population the individual comes from, and all following columns are different
> genetic markers.
>
> what i want to do with this data in R, is to compare one individual with
> each of the other individuals, allele-wise. there are five possibilities:
> the two compared individuals share 4,3,2,1,0 alleles of the currently
> examined marker (=column). for each shared allele this pair of individuals
> shall get 1 scoring point. for each pair of individuals, all scoring points
> shall be summed up over all markers.
In those 2 rows for one individual, the genetic marker sometimes differs:
> test[1:2, "scm247"]
[1] 222 231
What do you want to do with them?
Based on your example,
> dput(test)
structure(list(Ind.nr. = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 6L,
6L), Pop.nr. = c(101L, 101L, 101L, 101L, 101L, 101L, 101L, 101L,
101L, 101L), scm266 = c(305L, 305L, 305L, 305L, 305L, 305L, 305L,
305L, 305L, 305L), rms1280 = c(318L, 318L, 313L, 321L, 324L,
324L, 313L, 313L, 313L, 321L), scm247 = c(222L, 231L, 999L, 999L,
231L, 231L, 230L, 230L, 231L, 231L), rms1107 = c(135L, 135L,
96L, 130L, 135L, 135L, 126L, 135L, 135L, 135L)), .Names = c("Ind.nr.",
"Pop.nr.", "scm266", "rms1280", "scm247", "rms1107"),
class = "data.frame", row.names = c(NA, -10L))
What is your desired result?
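If the desired result is a number of shared-allele points per pair of individuals, one possible reading of the rule can be sketched as follows. score_pair() is a hypothetical helper, not something from your code, and it assumes that individual i always occupies rows 2*i-1 and 2*i of the data frame.

```r
# Example data rebuilt from the dput() output in this post
test <- data.frame(
  Ind.nr. = rep(c(1L, 2L, 3L, 4L, 6L), each = 2),
  Pop.nr. = 101L,
  scm266  = 305L,
  rms1280 = c(318L, 318L, 313L, 321L, 324L, 324L, 313L, 313L, 313L, 321L),
  scm247  = c(222L, 231L, 999L, 999L, 231L, 231L, 230L, 230L, 231L, 231L),
  rms1107 = c(135L, 135L, 96L, 130L, 135L, 135L, 126L, 135L, 135L, 135L)
)

# Hypothetical helper: points for individuals i and j, summed over markers.
# Assumes individual k sits in rows 2*k-1 and 2*k.
score_pair <- function(dat, i, j, marker_cols) {
  rows_i <- c(2 * i - 1, 2 * i)
  rows_j <- c(2 * j - 1, 2 * j)
  total <- 0
  for (s in marker_cols) {
    a <- dat[rows_i, s]
    b <- dat[rows_j, s]
    # outer() builds the 2 x 2 table of all allele-vs-allele comparisons;
    # every TRUE counts as one point, NA comparisons are ignored
    total <- total + sum(outer(a, b, "=="), na.rm = TRUE)
  }
  total
}

score_pair(test, 1, 3, 3:6)  # 10 under this reading of the rule
```

Here the first and third individual get 4 + 0 + 2 + 4 = 10 points over the four markers, if every matching comparison counts as one point. Is that what you expect?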
Regards
Petr
>
>
> my code again, modified according to your suggestions:
>
> #1) read in data:
> daten<-read.table('K:/Analysen/STRUCTURE/test.txt', header=TRUE, sep="\t")
> daten<-as.data.frame(daten)
>
> #2) create empty matrix:
> indxind<-matrix(0,nrow=617, ncol=617)
> indxind[1:20,1:19]
>
> #3) compare cells to each other, score:
> #for the whole dataset: s in 3:34, z1 in 1:617, z2 in 1:617
> for (s in 3:6) {      #walks through the matrix column by column, starting at column 3
>   for (z1 in 1:6) {   #for each current column, take one row (z1)...
>     for (z2 in 1:6) { #...and compare it to another row (z2) of the current column
>       if (z1!=z2) {
>         topf<-indxind[z1,z2]
>         #actually, 2 rows make up 1 individual, therefore i compare 2 rows
>         #with another 2 rows
>         if (daten[2*z1-1,s]==daten[2*z2-1,s]) topf<-topf+1
>         if (daten[2*z1-1,s]==daten[2*z2,s]) topf<-topf+1
>         if (daten[2*z1,s]==daten[2*z2-1,s]) topf<-topf+1
>         if (daten[2*z1,s]==daten[2*z2,s]) topf<-topf+1
>         indxind[z1,z2]<-topf
>         indxind[z2,z1]<-topf
>       }
>       #print(c(s,z1,z2,indxind[1,2])) ##counts s, z1 and z2 properly, but gives always 8 for indxind[1,2]
>     }
>     #indxind[1:5,1:5] #empty matrix
>   }
>   #indxind[1:5,1:5] #empty matrix
> }
>
> #4) check:
> indxind[1:5,1:5]
>
>
>
> @ Michael Weylandt: i've done my best with regard to the "big picture" of my
> algorithm and the small reproducible example. i hope both are sufficient.
> @ Petr Pikal-3: in this case, there are only numerical values, but it's a
> useful hint for my other codes.
> @ Petr Pikal-3 and Berend Hasselman: initializing indxind with 0's instead
> of NAs helps, it fills something in indxind now. but it does the calculation
> only for the first marker (column 3), afterwards i get an error
> (German locale; R's English wording given here):
> Error in if (daten[2 * z1 - 1, s] == daten[2 * z2 - 1, s]) topf <- topf + :
>   missing value where TRUE/FALSE needed
> Has this something to do with the change to daten<-as.data.frame(daten) in
> line 3 (instead of as.matrix before)?
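This error appears whenever the condition inside if() evaluates to NA. One possible cause (an assumption, since the full file was not posted): indexing a data frame past its last row returns NA, whereas a matrix stops with "subscript out of bounds". The posted example has 10 rows, i.e. 5 individuals, so z1 in 1:6 would reach row 11. An NA in a marker cell would have the same effect. The sketch below uses the example data, derives the number of individuals from nrow(), and visits each pair only once. The names n_ind, res and hits are made up here.

```r
# Example data rebuilt from the dput() output in this post
test <- data.frame(
  Ind.nr. = rep(c(1L, 2L, 3L, 4L, 6L), each = 2),
  Pop.nr. = 101L,
  scm266  = 305L,
  rms1280 = c(318L, 318L, 313L, 321L, 324L, 324L, 313L, 313L, 313L, 321L),
  scm247  = c(222L, 231L, 999L, 999L, 231L, 231L, 230L, 230L, 231L, 231L),
  rms1107 = c(135L, 135L, 96L, 130L, 135L, 135L, 126L, 135L, 135L, 135L)
)

# Row 11 does not exist: a data frame returns NA here instead of an error
is.na(test[2 * 6 - 1, 3])

# Take the number of individuals from the data (2 rows per individual)
n_ind <- nrow(test) / 2
res <- matrix(0, nrow = n_ind, ncol = n_ind)

for (s in 3:6) {                       # marker columns of the example
  for (z1 in seq_len(n_ind - 1)) {
    for (z2 in (z1 + 1):n_ind) {       # each pair is visited exactly once
      a <- test[c(2 * z1 - 1, 2 * z1), s]
      b <- test[c(2 * z2 - 1, 2 * z2), s]
      # all four allele comparisons at once; NA comparisons are dropped
      hits <- sum(outer(a, b, "=="), na.rm = TRUE)
      res[z1, z2] <- res[z1, z2] + hits
      res[z2, z1] <- res[z1, z2]       # keep the matrix symmetric
    }
  }
}
res
```

In the posted loop every pair is visited twice, once as (z1,z2) and once as (z2,z1), and both visits add to the same two cells. That would be consistent with the 8 reported for indxind[1,2] after the first marker: 4 matching comparisons, counted twice.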