[R] family

David Winsemius dwinsemius at comcast.net
Sat Nov 18 03:17:26 CET 2017


> On Nov 17, 2017, at 4:28 PM, Val <valkremk at gmail.com> wrote:
> 
> Hi all,
> I am reading a huge data set (12M rows) that contains family information:
> Offspring, Parent1 and Parent2.
> 
> Every Parent1 and Parent2 should also appear in the first column as an
> offspring, in a row placed before the rows of their own offspring. The
> parent fields (Parent1 and Parent2) of those rows should be set to zero
> if unknown. Also, the first column should be unique.
> 
> 
> Here is my sample data set and desired output.
> 
> 
> fam <- read.table(textConnection(" offspring  Parent1 Parent2
> Smith Alex1  Alexa
> Carla Alex1     0
> Jacky Smith   Abbot
> Jack  0       Jacky
> Almo  Jack    Carla
> "),header = TRUE)
> 
> 
> 
> desired output.
> Offspring Parent1 Parent2
> Alex1      0        0
> Alexa      0        0
> Abbot      0        0
> Smith    Alex1  Alexa
> Carla    Alex1      0
> Jacky    Smith   Abbot
> Jack       0     Jacky
> Almo     Jack    Carla

You might get useful ideas by looking at ?"%in%" and ?union (set operations):

> fam$Parent1[!fam$Parent1 %in% fam$offspring]
[1] "Alex1" "Alex1" "0"    
> fam$Parent2[!fam$Parent2 %in% fam$offspring]
[1] "Alexa" "0"     "Abbot"
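
One way the set operations could be combined into the requested table is sketched below. It is only a sketch under a few assumptions: the columns are read as character (stringsAsFactors = FALSE), "0" is the only code for an unknown parent, and the names parents, founders, founder_rows and result are made up for illustration. It puts the parents that have no row of their own at the top; it does not reorder the existing rows, so it relies on them already listing parents before their offspring, as in the sample.

```r
# Sample data from the question, read as character rather than factor.
fam <- read.table(text = "offspring Parent1 Parent2
Smith Alex1 Alexa
Carla Alex1 0
Jacky Smith Abbot
Jack  0     Jacky
Almo  Jack  Carla
", header = TRUE, stringsAsFactors = FALSE)

# Every distinct name used as a parent (Parent1 values first, then Parent2).
parents <- union(fam$Parent1, fam$Parent2)

# Parents with no row of their own; "0" (unknown) is excluded as well.
founders <- setdiff(parents, c(fam$offspring, "0"))

# One new row per such parent, with both of its parents set to unknown.
founder_rows <- data.frame(offspring = founders,
                           Parent1   = rep("0", length(founders)),
                           Parent2   = rep("0", length(founders)),
                           stringsAsFactors = FALSE)

# New rows first, then the original rows in their original order.
result <- rbind(founder_rows, fam)

# The first column is expected to be unique.
stopifnot(anyDuplicated(result$offspring) == 0)
```

All of these steps are vectorized, so they should scale reasonably to 12M rows, though that has not been tested here.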

David.
> 
> Thank you.
> 

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law
