[R] subset data.frame with value != in all columns
Gabor Grothendieck
ggrothendieck at myway.com
Thu Feb 3 21:46:02 CET 2005
Do the -99 entries really mean NA? In that case, I think it
would be clearer to recode your data frame with NAs and then select
out the complete or incomplete rows:
x[x == -99] <- NA
x[complete.cases(x),] # or na.omit(x)
x[!complete.cases(x),]
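
A minimal end-to-end sketch of the recoding approach, run on the dummy data from the original question quoted below. It assumes that -99 really is a missing-value code; the names `clean` and `flagged` are only illustrative.

```r
# Dummy data from the original question
x <- data.frame(
  c1 = c(-99, -99, -99, 4:10),
  c2 = 1:10,
  c3 = c(1:3, -99, 5:10),
  c4 = 10:1,
  c5 = c(1:9, -99))

# Recode the sentinel value as NA (assumption: -99 means "missing")
x[x == -99] <- NA

# Split on completeness; the original row names are kept in both pieces
clean   <- x[complete.cases(x), ]    # rows with no NA
flagged <- x[!complete.cases(x), ]   # rows with at least one NA

rownames(clean)     # "5" "6" "7" "8" "9"
rownames(flagged)   # "1" "2" "3" "4" "10"
```

na.omit(x) returns the same rows as `clean`, but it also attaches an "na.action" attribute recording which rows were dropped.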
Tim Howard <tghoward <at> gw.dec.state.ny.us> writes:
:
: apply, of course, does the trick exceptionally well. Thank you,
: everyone, for the help.
:
: tim
:
: >>> Chuck Cleland <ccleland <at> optonline.net> 02/03/05 03:10PM >>>
: How about this?
:
: #extract data.frame of rows with -99 in them
:
: subset(x, apply(x, 1, function(x){any(x == -99)}))
:
: #extract data.frame of rows not containing -99 in them
:
: subset(x, apply(x, 1, function(x){all(x != -99)}))
:
: hope this helps,
:
: Chuck Cleland
:
: Tim Howard wrote:
: > I am trying to extract rows from a data.frame based on the
: > presence/absence of a single value in any column. I've figured out how
: > to get the positive matches, but the remainder (rows without this
: > value) eludes me. Mining the help pages and archives brought me,
: > frustratingly, very close, as you'll see below.
: >
: > My goal: two data frames, one with -99 in at least one column in each
: > row, one with no occurrences of -99. I want to preserve rownames in
: > each.
: >
: > My questions:
: > Is there a cleaner way to extract all rows containing a specified
: > value?
: > How can I extract all rows that don't have this value in any col?
: >
: > #create dummy dataset
: > x <- data.frame(
: > c1=c(-99,-99,-99,4:10),
: > c2=1:10,
: > c3=c(1:3,-99,5:10),
: > c4=c(10:1),
: > c5=c(1:9,-99))
: >
: > #extract data.frame of rows with -99 in them
: > for(i in 1:ncol(x))
: > {
: > y<-subset(x, x[,i]==-99, drop=FALSE);
: > ifelse(i==1, z<-y, z <- rbind(z,y));
: > }
: >
: > #various attempts to get rows not containing "-99":
: >
: > # this attempt was to create, in "list", the exclusion formula for each
: > # column.
: > # Here, I couldn't get subset to recognize "list" as the correct type.
: > # e.g. it works if I paste the value of list in the subset command
: > {
: > for(i in 1:ncol(x)){
: > if(i==1)
: > list<-paste("x[",i,"]!=-99", sep="")
: > else
: > list<-paste(list," ", " & x[",i,"]!=-99", sep="")
: > }
: > y<-subset(x, list, drop=FALSE);
: > }
: >
: > # this will do it for one col, but if I index more
: > # it returns all rows
: > y <- x[!(x[,3] %in% -99),]
: >
: > # this also works for one col
: > y<-x[x[,1]!=-99,]
: >
: > # but if I index more, I get extra rows of NAs
: > y<-x[x[,1:5]!=-99,]
: >
: > Thanks in advance.
: > Tim Howard
: >
: > platform i386-pc-mingw32
: > arch i386
: > os mingw32
: > system i386, mingw32
: > status
: > major 2
: > minor 0.1
: > year 2004
: > month 11
: > day 15
: > language R
: >
:
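
If the -99 values should stay as they are rather than become NA, a vectorized alternative to the apply() calls quoted above is to count matches per row with rowSums(). This is only a sketch on the same dummy data; `hit`, `with99` and `without99` are illustrative names, and it assumes the data frame contains no NAs to begin with.

```r
# Dummy data from the original question
x <- data.frame(
  c1 = c(-99, -99, -99, 4:10),
  c2 = 1:10,
  c3 = c(1:3, -99, 5:10),
  c4 = 10:1,
  c5 = c(1:9, -99))

# x == -99 is a 10 x 5 logical matrix; rowSums() counts the TRUEs in each row
hit <- rowSums(x == -99) > 0   # one TRUE/FALSE per row

with99    <- x[hit, ]    # rows containing -99 in at least one column
without99 <- x[!hit, ]   # rows with no -99 anywhere

rownames(with99)      # "1" "2" "3" "4" "10"
rownames(without99)   # "5" "6" "7" "8" "9"
```

As far as I can tell, the quoted attempt x[x[,1:5]!=-99,] returns extra NA rows because the whole 10 x 5 logical matrix is used as a row index of length 50, so positions beyond the tenth row have nothing to select. Reducing the matrix to one logical value per row first avoids that.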