[R] subsets

Keith Jewell k.jewell at campden.co.uk
Thu Jan 20 12:26:17 CET 2011


I don't think Ivan's solution meets the OP's needs.

I think you could do it using %in% and the appropriate logical operations, 
e.g.

aDF <- data.frame(id=c(1,2,2,2,3,3,4,4,4,5),
     diagnosis=c("ah", "ah", "ihd", "im", "ah", "stroke",
                 "ah", "ihd", "angina", "ihd"))
# 1. patients with both ah and ihd
aDF[with(aDF, (id %in% id[diagnosis=="ah"]) &
              (id %in% id[diagnosis=="ihd"])), ]
# 2. patients with ah but no ihd
aDF[with(aDF, (id %in% id[diagnosis=="ah"]) &
              !(id %in% id[diagnosis=="ihd"])), ]
# 3. patients with ihd but no ah
aDF[with(aDF, !(id %in% id[diagnosis=="ah"]) &
              (id %in% id[diagnosis=="ihd"])), ]

That starts to feel a bit fiddly for me. You might want to look at package 
sqldf.
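
If you would rather stay in base R, one way to make this less fiddly might be 
to wrap the %in% pattern in a small function. This is only a sketch: 
patients_with() and its arguments has and lacks are names made up for 
illustration, and it assumes the columns are called id and diagnosis, as in 
aDF.

```r
# Hypothetical helper (not part of any package): returns all rows for patients
# who have every diagnosis in `has` and none of the diagnoses in `lacks`.
# Assumes `data` has columns named `id` and `diagnosis`.
patients_with <- function(data, has, lacks = character(0)) {
  keep <- rep(TRUE, nrow(data))
  for (d in has) {
    # keep only ids that appear at least once with diagnosis d
    keep <- keep & data$id %in% data$id[data$diagnosis == d]
  }
  for (d in lacks) {
    # drop every id that appears at least once with diagnosis d
    keep <- keep & !(data$id %in% data$id[data$diagnosis == d])
  }
  data[keep, ]
}

aDF <- data.frame(id = c(1, 2, 2, 2, 3, 3, 4, 4, 4, 5),
                  diagnosis = c("ah", "ah", "ihd", "im", "ah", "stroke",
                                "ah", "ihd", "angina", "ihd"))

both     <- patients_with(aDF, has = c("ah", "ihd"))
ah_only  <- patients_with(aDF, has = "ah", lacks = "ihd")
ihd_only <- patients_with(aDF, has = "ihd", lacks = "ah")
```

Each result keeps all rows of the matching patients, including their other 
diagnoses (e.g. "im" for id 2), which is what the three aDF[...] lines do too.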

HTH

Keith J
--------------------------
"Ivan Calandra" <ivan.calandra at uni-hamburg.de> wrote in message 
news:4D37FBEA.5070100 at uni-hamburg.de...
Hi!

I think you should read the intro to R, as well as ?"[" and ?subset. It
should help you to understand.

Let's say your data is in a data.frame called df:
# 1. ah and ihd
## the "|" is the boolean OR (you want one OR the other). Note the last comma
df_ah_ihd <- df[df$diagnosis=="ah" | df$diagnosis=="ihd", ]

#2. ah
df_ah <- df[df$diagnosis=="ah", ]

#3. ihd
df_ihd <- df[df$diagnosis=="ihd", ]

You could do the same using subset() if you are more comfortable with that function.
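
For instance, the subset() versions might look roughly like this. The sample 
data below is only an assumed stand-in for df, built from the rows shown in 
the question. Note that these calls filter rows by diagnosis; they do not 
group the rows by patient.

```r
# Assumed sample data standing in for the poster's data.frame `df`
df <- data.frame(id = c(1, 2, 2, 2, 3, 3, 4, 4, 4, 5),
                 diagnosis = c("ah", "ah", "ihd", "im", "ah", "stroke",
                               "ah", "ihd", "angina", "ihd"))

# Rows whose diagnosis is "ah" OR "ihd" (same row-level filter as the "|" line)
df_ah_ihd <- subset(df, diagnosis == "ah" | diagnosis == "ihd")

# Rows whose diagnosis is "ah"
df_ah <- subset(df, diagnosis == "ah")

# Rows whose diagnosis is "ihd"
df_ihd <- subset(df, diagnosis == "ihd")
```

Inside subset() the column names can be used directly, so there is no need 
for df$ or for the trailing comma.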

HTH,
Ivan

On 1/20/2011 09:53, Den wrote:
> Dear R people
> Could you please help.
>
> Basically, there are two variables in my data set. Each patient ('id')
> may have one or more diseases ('diagnosis'). It looks like
>
> id diagnosis
> 1 ah
> 2 ah
> 2 ihd
> 2 im
> 3 ah
> 3 stroke
> 4 ah
> 4 ihd
> 4 angina
> 5 ihd
> ..............
> Q: How to make three data sets:
> 1. Patients with ah and ihd
> 2. Patients with ah but no ihd
> 3. Patients with ihd but no ah?
>
> If you have any ideas, could you just point me to what I should look for? Is it
> subset, or aggregate, or loops, or something else??? I am a bit lost. (F1
> F1 F1 !!!:)
> Thank you
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php


