[R] data.table/ifelse conditional new variable question

Jorge I Velez jorgeivanvelez at gmail.com
Sun Aug 17 03:31:11 CEST 2014


Dear Kate,

Try this:

# xs is the per-family list from my earlier message:
# xs <- with(x, split(x, Family.ID))
res <- do.call(rbind, lapply(xs, function(l){
  l$PID <- l$MID <- 0
  father <- with(l, Relationship == 'father')
  mother <- with(l, Relationship == 'mother')
  # siblings get the father's ID, or 0 when the family has no father
  if(sum(father) == 0)
    l$PID[l$Relationship == 'sibling'] <- 0
  else l$PID[l$Relationship == 'sibling'] <- l$Sample.ID[father]
  # same for the mother
  if(sum(mother) == 0)
    l$MID[l$Relationship == 'sibling'] <- 0
  else l$MID[l$Relationship == 'sibling'] <- l$Sample.ID[mother]
  l
}))

This assumes that when a parent is not available, the corresponding ID
(PID for the father, MID for the mother) is 0 for the siblings.
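
The same fallback to 0 can also be written with a small lookup helper, which
keeps the missing-parent check in one place. This is only a minimal sketch: it
assumes at most one father and one mother per family, and both the toy data
and the helper name parent_id are made up for illustration.

```r
# Toy data (hypothetical): family 1 is complete, family 2 has no father.
x <- data.frame(
  Family.ID    = c(1, 1, 1, 2, 2),
  Sample.ID    = c(10, 11, 12, 20, 21),
  Relationship = c("father", "mother", "sibling", "mother", "sibling"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: Sample.ID of the given parent within one family,
# or 0 when that parent is absent (avoids a zero-length replacement).
parent_id <- function(l, role) {
  id <- l$Sample.ID[l$Relationship == role]
  if (length(id) == 0) 0 else id[1]
}

res <- do.call(rbind, lapply(split(x, x$Family.ID), function(l) {
  sib <- l$Relationship == "sibling"
  # Only siblings receive parent IDs; parents themselves keep 0.
  l$PID <- ifelse(sib, parent_id(l, "father"), 0)
  l$MID <- ifelse(sib, parent_id(l, "mother"), 0)
  l
}))
res
```

In family 2 the sibling ends up with PID 0 and the mother's Sample.ID as MID,
which matches the convention described above.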

Best,
Jorge.-


On Sun, Aug 17, 2014 at 10:58 AM, Kate Ignatius <kate.ignatius at gmail.com>
wrote:

> Actually - I didn't check this before, but these are not all nuclear
> families (as I assumed they were).  That is, some don't have a father
> or don't have a mother.... Usually if this is the case PID or MID will
> become 0, respectively, for the child.  How can the code be edited to
> account for this?
>
> On Sat, Aug 16, 2014 at 8:02 PM, Kate Ignatius <kate.ignatius at gmail.com>
> wrote:
> > Thanks!
> >
> > I think I know what is being done here but not sure how to fix the
> > following error:
> >
> > Error in l$PID[l$Relationship == "sibling"] <- l$Sample.ID[father] :
> >   replacement has length zero
> >
> >
> >
> > On Sat, Aug 16, 2014 at 6:48 PM, Jorge I Velez <jorgeivanvelez at gmail.com>
> > wrote:
> >> Dear Kate,
> >>
> >> Assuming you have nuclear families, one option would be:
> >>
> >> x <- read.table(textConnection("Family.ID Sample.ID Relationship
> >> 14           62  sibling
> >> 14          94  father
> >> 14           63  sibling
> >> 14           59 mother
> >> 17         6004  father
> >> 17           6003 mother
> >> 17         6005   sibling
> >> 17         368   sibling
> >> 130           202 mother
> >> 130           203  father
> >> 130           204   sibling
> >> 130           205   sibling
> >> 130           206   sibling
> >> 222         9 mother
> >> 222         45  sibling
> >> 222         34  sibling
> >> 222         10  sibling
> >> 222         11  sibling
> >> 222         18  father"), header = TRUE)
> >> closeAllConnections()
> >>
> >> xs <- with(x, split(x, Family.ID))
> >> res <- do.call(rbind, lapply(xs, function(l){
> >> l$PID <- l$MID <- 0
> >> father <- with(l, Relationship == 'father')
> >> mother <- with(l, Relationship == 'mother')
> >> l$PID[l$Relationship == 'sibling'] <- l$Sample.ID[father]
> >> l$MID[l$Relationship == 'sibling'] <- l$Sample.ID[mother]
> >> l
> >> }))
> >> res
> >>
> >> HTH,
> >> Jorge.-
> >>
> >>
> >> Best regards,
> >> Jorge.-
> >>
> >>
> >>
> >> On Sun, Aug 17, 2014 at 5:42 AM, Kate Ignatius <kate.ignatius at gmail.com>
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I have a data.table question (as well as an ifelse query).
> >>>
> >>> I have a large list of families (the file has 935 individuals,
> >>> sorted by family, with families of varying sizes).  At the moment
> >>> the file has the columns:
> >>>
> >>> SampleID FamilyID Relationship
> >>>
> >>> To avoid making a pedigree file by hand, i.e. adding a
> >>> PaternalID and a MaternalID one by one, I want to write a script
> >>> that will quickly do this for me (I eventually want to run this
> >>> through a program such as plink).  Is there a way to use data.table
> >>> (maybe in conjunction with ifelse) to do this effectively?
> >>>
> >>> An example of the file is something like:
> >>>
> >>> Family.ID Sample.ID Relationship
> >>> 14           62  sibling
> >>> 14          94  father
> >>> 14           63  sibling
> >>> 14           59 mother
> >>> 17         6004  father
> >>> 17           6003 mother
> >>> 17         6005   sibling
> >>> 17         368   sibling
> >>> 130           202 mother
> >>> 130           203  father
> >>> 130           204   sibling
> >>> 130           205   sibling
> >>> 130           206   sibling
> >>> 222         9 mother
> >>> 222         45  sibling
> >>> 222         34  sibling
> >>> 222         10  sibling
> >>> 222         11  sibling
> >>> 222         18  father
> >>>
> >>> But the goal is to have a file like this:
> >>>
> >>> Family.ID Sample.ID Relationship PID MID
> >>> 14           62  sibling 94 59
> >>> 14          94  father 0 0
> >>> 14           63  sibling 94 59
> >>> 14           59 mother 0 0
> >>> 17         6004  father 0 0
> >>> 17           6003 mother 0 0
> >>> 17         6005   sibling 6004 6003
> >>> 17         368   sibling 6004 6003
> >>> 130           202 mother 0 0
> >>> 130           203  father 0 0
> >>> 130           204   sibling 203 202
> >>> 130           205   sibling 203 202
> >>> 130           206   sibling 203 202
> >>> 222         9 mother 0 0
> >>> 222         45  sibling 18 9
> >>> 222         34  sibling 18 9
> >>> 222         10  sibling 18 9
> >>> 222         11  sibling 18 9
> >>> 222         18  father 0 0
> >>>
> >>> I've tried searching for this but had no luck.  I'd greatly appreciate
> >>> any help, even if it's just a link to a great example/solution!
> >>>
> >>> Thanks!
> >>>
