[R] filtering out data

Thomas Lumley tlumley at u.washington.edu
Fri Aug 22 18:28:52 CEST 2008


On Fri, 22 Aug 2008, Kate Rohrbaugh wrote:

> Greetings,
>
> Apologies for such a simple question, but I have been trying to figure 
> this out for a few days now (I'm quite new to R), have read manuals, 
> list posts, etc., and not finding precisely what I need.  I am 
> accustomed to working in Stata, but I am currently working through the 
> multilevel package right now.  In Stata, I would include the statement 
> "if model1 == 1" at the end of my regression statement, but I can't seem 
> to find the correct syntax for doing the same thing in R.

You want the subset= argument to lm(), e.g. subset = model1 == 1, which 
fits the model using only the rows where that condition is TRUE.
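
A minimal sketch of how that might look, using the variable names from the 
post quoted below. The toy data frame is invented purely for illustration 
and is not the poster's data. Note that if() tests a single TRUE/FALSE 
value rather than filtering row by row, which is why the attempt quoted 
below only looked at the first element of the condition (from R 4.2.0 
onward that warning is an error).

```r
# Toy data invented for illustration; only the column names
# (dv, iv1, iv2, iv3, model1) are taken from the original post.
set.seed(1)
merged.dataset <- data.frame(
  dv     = rnorm(40),
  iv1    = rnorm(40),
  iv2    = rnorm(40),
  iv3    = rnorm(40),
  model1 = rep(c(1, 0), each = 20)
)

# subset= is evaluated inside `data`, so `model1` refers to the column.
# Only rows where the condition is TRUE are used in the fit.
model1.coeff <- lm(dv ~ iv1 + iv2 + iv3,
                   data = merged.dataset,
                   subset = model1 == 1)

# Equivalent alternative: filter the rows first, then fit.
model1.alt <- lm(dv ~ iv1 + iv2 + iv3,
                 data = merged.dataset[merged.dataset$model1 == 1, ])

nobs(model1.coeff)                               # number of rows used in the fit
all.equal(coef(model1.coeff), coef(model1.alt))  # both approaches agree
```

Either form avoids creating separate datasets: the same data frame can be 
reused with a different subset= condition for each model.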

 	-thomas

> I have successfully converted Stata datasets, merged, aggregated, run 
> lm, etc., but this simple task is evading me entirely.
>
> Here's what I have tried (and various iterations):
>
> if(merged.dataset$model1 == 1) model1.coeff <- lm(dv ~ iv1 + iv2 + iv3, data = merged.dataset)
> I get this:
> Warning message:
> In if (merged.dataset$model1 == 1) model1.coeff <- lm(dv ~ iv +  :
>  the condition has length > 1 and only the first element will be used
> What it seems to do is just run lm on the first occurrence of 
> model1==1, but then doesn't try it again.  I tried sorting so that the 
> rows with model1==1 appear first, but that doesn't seem to be a great 
> solution either.
>
> I'll just go back into Stata and create separate new datasets if I have to, but there HAS to be a more elegant way.
>
> Thank you for ANY feedback!
>
> Kate
>
> Kate Rohrbaugh
> Independent Project Analysis, Inc.
> 44426 Atwater Drive
> Ashburn, VA  20147
>
> office - 703.726.5465
> fax - 703.729.8301
> email - krohrbaugh at ipaglobal.com
> website - www.ipaglobal.com
>

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle


