[R] subset data.frame with value != in all columns
Tim Howard
tghoward at gw.dec.state.ny.us
Thu Feb 3 20:57:58 CET 2005
I am trying to extract rows from a data.frame based on the
presence/absence of a single value in any column. I've figured out how
to get the positive matches, but the remainder (rows without this
value) eludes me. Mining the help pages and archives brought me,
frustratingly, very close, as you'll see below.
My goal: two data frames, one with -99 in at least one column in each
row, one with no occurrences of -99. I want to preserve rownames in
each.
My questions:
Is there a cleaner way to extract all rows containing a specified
value?
How can I extract all rows that don't have this value in any col?
#create dummy dataset
x <- data.frame(
c1=c(-99,-99,-99,4:10),
c2=1:10,
c3=c(1:3,-99,5:10),
c4=c(10:1),
c5=c(1:9,-99))
#extract data.frame of rows with -99 in them
for(i in 1:ncol(x))
{
y<-subset(x, x[,i]==-99, drop=FALSE);
ifelse(i==1, z<-y, z <- rbind(z,y));
}
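One possibly cleaner way to get the same rows, offered only as a sketch: compare the whole data frame against -99 at once and count the matches per row. It assumes the data contain no NA values (with NAs the comparison itself returns NA and would need extra handling). The name has99 is illustrative and not part of the original post.

```r
# same dummy dataset as above
x <- data.frame(
  c1 = c(-99, -99, -99, 4:10),
  c2 = 1:10,
  c3 = c(1:3, -99, 5:10),
  c4 = 10:1,
  c5 = c(1:9, -99))

# x == -99 is a logical matrix with the same shape as x;
# rowSums() counts the TRUE entries in each row
has99 <- rowSums(x == -99) > 0

# rows with at least one -99; row names are carried over
z <- x[has99, , drop = FALSE]
```

Unlike the loop with rbind, this returns each matching row once, even if -99 occurs in several of its columns.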
#various attempts to get rows not containing "-99":
# this attempt was to create, in "list", the exclusion formula for each
# column.
# Here, I couldn't get subset to recognize "list" as the correct type.
# e.g. it works if I paste the value of list in the subset command
{
for(i in 1:ncol(x)){
if(i==1)
list<-paste("x[",i,"]!=-99", sep="")
else
list<-paste(list," ", " & x[",i,"]!=-99", sep="")
}
y<-subset(x, list, drop=FALSE);
}
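A likely reason this attempt fails is that subset() wants a logical condition, whereas "list" holds a character string that merely looks like one. The sketch below shows one way such a string could be turned into a logical vector with parse() and eval(). The names cond and keep are illustrative (and avoid masking the base function list); this is a roundabout route shown only to explain the behaviour, not a recommendation.

```r
# same dummy dataset as above
x <- data.frame(
  c1 = c(-99, -99, -99, 4:10),
  c2 = 1:10,
  c3 = c(1:3, -99, 5:10),
  c4 = 10:1,
  c5 = c(1:9, -99))

# build "x[,1] != -99 & x[,2] != -99 & ..." in one call to paste()
cond <- paste("x[,", seq_len(ncol(x)), "] != -99", sep = "", collapse = " & ")

# parse() turns the string into an expression, eval() evaluates it;
# the result is one TRUE/FALSE per row
keep <- eval(parse(text = cond))

# subset() accepts the logical vector, not the string
y <- subset(x, keep)
```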
# this will do it for one col, but if I index more
# it returns all rows
y <- x[!(x[,3] %in% -99),]
# this also works for one col
y<-x[x[,1]!=-99,]
# but if I index more, I get extra rows of NAs
y<-x[x[,1:5]!=-99,]
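The extra NA rows in the last attempt probably arise because x[,1:5] != -99 is a 10 x 5 logical matrix, and a row index needs exactly one TRUE/FALSE per row. A minimal sketch of collapsing the matrix to one value per row, assuming no NA values in the data; the name keep is illustrative:

```r
# same dummy dataset as above
x <- data.frame(
  c1 = c(-99, -99, -99, 4:10),
  c2 = 1:10,
  c3 = c(1:3, -99, 5:10),
  c4 = 10:1,
  c5 = c(1:9, -99))

# TRUE only for rows where every column differs from -99
keep <- apply(x != -99, 1, all)

# rows with no -99 anywhere; row names are carried over
y <- x[keep, , drop = FALSE]

# the complement gives the rows that do contain -99
z <- x[!keep, , drop = FALSE]
```

Here y and z together account for every row of x exactly once.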
Thanks in advance.
Tim Howard
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.1
year     2004
month    11
day      15
language R