[R] Eliminate cases in a subset of a dataframe

James W. MacDonald jmacdon at med.umich.edu
Mon Sep 14 19:02:49 CEST 2009


linmod2 <- update(linmod, data = subdata[-c(11,22,33),])
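
One caveat worth checking: the negative index above drops rows by position, whereas the case labels shown by plot(linmod) are normally the row names, which subset() carries over from the original data frame. If the two differ, matching on row names is probably safer. A minimal sketch with made-up data (the contents of `data` and the helper names `outliers` and `keep` are assumptions for illustration, not Holger's actual data):

```r
# Hypothetical stand-in for the original data frame
set.seed(1)
data <- data.frame(gender = rep(c(1, 1, 1, 2), 15), x = rnorm(60))
data$y <- 2 * data$x + rnorm(60)

subdata <- subset(data, gender == 1)
linmod <- lm(y ~ x, data = subdata)

# Treat the labels from the diagnostic plots as row names, not positions
outliers <- c("11", "22", "33")
keep <- !(rownames(subdata) %in% outliers)

# Refit the same model on the rows that remain
linmod2 <- update(linmod, data = subdata[keep, ])
```

Comparing rownames(subdata)[c(11, 22, 33)] with the labels from the plot shows quickly whether positions and row names agree for a given subset.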

Hollix wrote:
> Hi folks,
> 
> I created a subset of a dataframe (i.e., selected only men):
> 
> subdata <- subset(data,data$gender==1)
> 
> After a residual diagnostic of a regression analysis, I detected three
> outliers:
> 
> linmod <- lm(y ~ x, data=subdata)
> plot(linmod)
> 
> Say, the cases 11,22, and 33 were outliers.
> 
> Here comes the problem: When I want to exclude these three cases in a
> further regression analysis, 
> - for instance with linmod2 <- lm(y[-c(11,22,33)] ~ x[-c(11,22,33)],
> data=subdata) - it does not work.
> 
> I guess this has something to do with this strange "row.names"-vector which
> has been added to the dataframe when creating the subset. I find it very
> strange why R gives the case numbers in the diagnostics but then doesn't
> allow me to use these numbers for further exclusion. 
> 
> Can anybody tell me:
> 1. what this row.names vector is
> 2. How I can refer to cases after creating a subset (e.g., in order to
> exclude them).
> 
> Many thanks in advance,
> Best,
> Holger
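
On the first question, a short illustration of what the row.names vector holds after subsetting may help. This is only a sketch on a toy data frame (`df` and `men` are hypothetical names, not objects from the original post):

```r
# Toy data frame with automatic row names 1..5
df <- data.frame(gender = c(2, 1, 2, 1, 1), y = c(5, 3, 8, 1, 4))
men <- subset(df, gender == 1)

# The subset keeps the original row numbers as labels
rownames(men)

# A character index selects by row name, a numeric index by position
by_name <- men["4", ]
by_position <- men[2, ]

# Optionally reset the labels so that names and positions coincide again
men_reset <- men
rownames(men_reset) <- NULL
```

Resetting the row names right after subsetting is one way to make the case numbers in later diagnostics line up with positional indices.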

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826



