[R] Script for searching in a kinship matrix

ginger biino at igm.cnr.it
Fri Nov 9 19:30:05 CET 2012


Hi everybody!
In a case-control study I have already sampled cases, stratifying by sex
(0, 1) and age (<62y, >=62y). I need to sample a group of controls with the
same characteristics (which I can easily do) plus one more: the level of
relatedness. Controls should therefore be matched to cases on sex (2
strata), age (2 strata), and relatedness (below a certain level). In
particular, I need the controls to be as little related to the cases as
possible; for example, each control should have a kinship coefficient less
than 0.0156 (i.e. 1/64, as for second cousins) with its matched case. The
attached example data set (Example.txt) contains the sampled cases (20
cases: 5 subjects per stratum) and 200 possible controls that I have already
sampled, stratifying by age and sex (50 subjects per stratum).
What I do not know is how to solve the relatedness problem. I have already
computed the kinship coefficient matrix of the extended pedigree to which
the cases and controls in the example data belong. I am not providing it now
because it is a 4762 x 4762 matrix, and I do not know how to extract the
sub-matrix containing just the 220 subjects relevant to this example data.
As an example, the kinship matrix for 5 subjects (IDs: 51, 59, 119, 156
and 178) looks like this:
	51	59	119	156	178
51	0.500	0.000	0.000	0.000	0.000
59	0.000	0.500	0.250	0.000	0.250
119	0.000	0.250	0.500	0.000	0.250
156	0.000	0.000	0.000	0.500	0.000
178	0.000	0.250	0.250	0.000	0.500
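The sub-matrix extraction mentioned above could perhaps be done by indexing
the matrix with the subject IDs as character strings. This is only a minimal
sketch: it assumes the kinship matrix carries the subject IDs as row and
column names, and the vector `keep` is a hypothetical stand-in for the ID
column of Example.txt.

```r
# Toy 5 x 5 kinship matrix from the example above; dimnames hold subject IDs.
ids <- c("51", "59", "119", "156", "178")
kin <- matrix(c(0.500, 0.000, 0.000, 0.000, 0.000,
                0.000, 0.500, 0.250, 0.000, 0.250,
                0.000, 0.250, 0.500, 0.000, 0.250,
                0.000, 0.000, 0.000, 0.500, 0.000,
                0.000, 0.250, 0.250, 0.000, 0.500),
              nrow = 5, byrow = TRUE, dimnames = list(ids, ids))

# Hypothetical IDs to keep (in practice, the ID column of the data set).
keep <- c(59, 156, 178)

# Index by character so that IDs are matched against the dimnames
# rather than being interpreted as row/column positions.
keep_chr <- as.character(keep)
missing_ids <- setdiff(keep_chr, rownames(kin))
if (length(missing_ids) > 0) {
  stop("IDs not found in kinship matrix: ",
       paste(missing_ids, collapse = ", "))
}
sub_kin <- kin[keep_chr, keep_chr, drop = FALSE]
print(sub_kin)
```

Indexing with numeric IDs directly (e.g. kin[59, 59]) would select positions,
not subjects, which is why the IDs are converted to character first.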
In conclusion, I think I need a script that searches such a kinship matrix
for controls satisfying the relatedness condition and adds to my data set
(Example.txt) as many columns as the maximum number of controls satisfying
the condition for any one case. In the row of each case (rows with
disease=1), the new columns should contain the IDs of the matched controls
satisfying the condition case-control kinship < 0.0156, and a missing value
(such as NA) otherwise. The new columns will therefore contain values (ID
codes) only for cases, and missing values for controls.
Can anybody help me?
Ginger
Example.txt <http://r.789695.n4.nabble.com/file/n4649089/Example.txt>  
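One way the search described in this message might be approached is sketched
below. It is not a tested solution for the real data: the column names `id`,
`sex` and `agegrp`, the helper name `find_controls`, and the toy data frame
are all assumptions made for illustration (only `disease` is named in the
message), and numeric IDs are assumed. The kinship matrix is assumed to have
the subject IDs as row and column names.

```r
# Toy kinship matrix (the 5-subject example); dimnames hold subject IDs.
ids <- c("51", "59", "119", "156", "178")
kin <- matrix(c(0.500, 0.000, 0.000, 0.000, 0.000,
                0.000, 0.500, 0.250, 0.000, 0.250,
                0.000, 0.250, 0.500, 0.000, 0.250,
                0.000, 0.000, 0.000, 0.500, 0.000,
                0.000, 0.250, 0.250, 0.000, 0.500),
              nrow = 5, byrow = TRUE, dimnames = list(ids, ids))

# Hypothetical data set: two cases (51, 59) and three possible controls.
dat <- data.frame(id      = c(51, 59, 119, 156, 178),
                  sex     = c(0, 0, 0, 0, 1),
                  agegrp  = c(1, 1, 1, 1, 1),
                  disease = c(1, 1, 0, 0, 0))

# Hypothetical helper: for each case, list the IDs of all controls in the
# same sex/age stratum whose kinship with that case is below the threshold.
find_controls <- function(dat, kin, threshold = 0.0156) {
  id_chr  <- as.character(dat$id)
  is_case <- dat$disease == 1
  matches <- lapply(seq_len(nrow(dat)), function(i) {
    if (!is_case[i]) return(numeric(0))          # controls get no matches
    ok <- !is_case & dat$sex == dat$sex[i] & dat$agegrp == dat$agegrp[i]
    # Among same-stratum controls, keep those below the kinship threshold.
    ok[ok] <- kin[id_chr[i], id_chr[ok]] < threshold
    dat$id[ok]
  })
  # Pad every row with NA up to the largest number of matches found.
  n_max  <- max(lengths(matches), 1L)
  padded <- lapply(matches, function(m) c(m, rep(NA_real_, n_max - length(m))))
  out    <- do.call(rbind, padded)
  colnames(out) <- paste0("control", seq_len(n_max))
  cbind(dat, out)
}

res <- find_controls(dat, kin)
print(res)
```

In this sketch the same control can appear in the rows of several cases,
since it lists every eligible control per case rather than assigning each
control to a single case; a one-to-one assignment would need an extra step.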


