[R] R-help Digest, Vol 54, Issue 30

David Duffy David.Duffy at qimr.edu.au
Thu Aug 30 23:14:20 CEST 2007

Ron Crump wrote:
> Hi,
> I have a dataframe that contains pedigree information;
> that is individual, sire and dam identities as separate
> columns. It also has date of birth.
> These identifiers are not numeric, or not sequential.
> Obviously, an identifier can appear in one or two columns,
> depending on whether it was a parent or not. These should
> be consistent.
> Not all identifiers appear in the individual column - it
> is possible for a parent not to have its own record if its
> parents were not known.
> Missing parental (sire and/or dam) identifiers can occur.
> I need to export the data for use in another program that
> requires the pedigree to be coded as integers, increasing
> with date of birth (therefore sire and dam always have
> lower identifiers than their offspring) and with missing
> values coded as 0.
> How would I go about doing this?

You might look at http://www.qimr.edu.au/davidD/sib-pair.R,
specifically the read.pedigree() and wrlink() functions.  The former is not
very impressive speedwise -- I usually perform these tasks in
my Sib-pair (Fortran) program, which is on the same webpage.  It will order
the pedigree by generational position, so a DOB is not required to do the sort.

Terry Therneau's kinship package does that ordering, but doesn't include
output routines for the Linkage format.
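
For illustration only, ordering by generational position and recoding to
integers can be sketched in plain R as below.  This is a minimal sketch, not
the code in sib-pair.R or in the kinship package.  The column names id, sire
and dam, the example data and the function name recode_pedigree() are all
assumptions.  Parents without a record of their own are treated as founders
and given a row with both parents coded 0.

```r
# Hypothetical example data; the column names are an assumption.
ped <- data.frame(
  id   = c("C3", "A1", "B2", "D4"),
  sire = c("A1", NA,   "X9", "B2"),
  dam  = c("B2", NA,   NA,   "C3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: recode identifiers to integers so that every
# parent has a lower code than its offspring, with missing parents as 0.
recode_pedigree <- function(ped) {
  # Every identifier seen anywhere, including parents with no own record.
  all_ids <- c(ped$id, ped$sire, ped$dam)
  ids <- unique(all_ids[!is.na(all_ids)])
  sire <- ped$sire[match(ids, ped$id)]  # NA when unknown or unrecorded
  dam  <- ped$dam[match(ids, ped$id)]

  # Generation: founders get 0, others 1 + the larger parental generation.
  gen <- rep(NA_integer_, length(ids))
  names(gen) <- ids
  repeat {
    sg <- ifelse(is.na(sire), -1L, gen[sire])  # missing parent counts as -1
    dg <- ifelse(is.na(dam),  -1L, gen[dam])
    ready <- is.na(gen) & !is.na(sg) & !is.na(dg)
    if (!any(ready)) break
    gen[ready] <- pmax(sg, dg)[ready] + 1L
  }
  if (anyNA(gen)) stop("pedigree contains a loop")

  # Integer codes increase with generation; ties keep their input order.
  ord <- order(gen)
  code <- integer(length(ids))
  code[ord] <- seq_along(ord)
  names(code) <- ids

  sire_code <- unname(code[sire])
  dam_code  <- unname(code[dam])
  sire_code[is.na(sire_code)] <- 0L
  dam_code[is.na(dam_code)]   <- 0L

  out <- data.frame(orig = ids, id = unname(code),
                    sire = sire_code, dam = dam_code,
                    stringsAsFactors = FALSE)
  out <- out[order(out$id), ]
  rownames(out) <- NULL
  out
}

res <- recode_pedigree(ped)
```

The orig column keeps the original label next to its new code, so the
recoding can be checked or reversed.  The three integer columns could then be
written out with write.table(res[, c("id", "sire", "dam")], ...) in whatever
layout the other program expects.  If dates of birth are complete for every
identifier, sorting on them instead of the generation number would be an
alternative.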

David Duffy.

| David Duffy (MBBS PhD)                                         ,-_|\
| email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: -0101  /     *
| Epidemiology Unit, Queensland Institute of Medical Research   \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v
