[R] Indexing and partially replacing 99, 999 in data frames

Mark Hempelmann neo27 at rakers.de
Sat Nov 24 17:52:57 CET 2007


Dear WizaRds,

Unfortunately, I have been unable to replace the '99' and '999' entries in

library(UsingR)
attach(babies)

with NA, the marker for missing values, because sometimes a 99 entry is
indeed a correct value. Usually, or so I thought, NA can easily replace,
say, every 999 entry via

mymat[mymat==999] <- NA

in a matrix or data frame. Alas, the babies data set also includes 99
entries as true values. So, here is what I did:

#to remove all 999:
babies[babies==999] <- NA

However, when it comes to replacing the 99 only in columns nr. 10, 12 and
17, I have come to a complete stop. The corny idea of

babies$ht[babies$ht==99] <- NA
babies$dht[babies$dht==99] <- NA
babies$dwt[babies$dwt==99] <- NA

works, but seems to show that I have not really understood the art of
indexing, have I? The archives did not really offer enough insight for
me to solve the problem, I am ashamed.

I tried something with
babies[is.element(babies[,c(10,12,17)], 99)] <- NA # beeep, wrong or
babies[babies[,c(10,12,17)]==99] # no way, indeed.

detach(babies)

There must be a more intelligent and elegant solution.
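
For illustration, here is a minimal sketch of one way the three
column-wise assignments might be collapsed into a single indexed
assignment. It runs on a small invented data frame called toy, not on
babies itself: only the column names ht, dht and dwt come from the lines
above, the values are made up, and the columns are selected by name
because the positions 10, 12, 17 have not been verified here.

```r
# Toy stand-in for the babies data. The column names ht, dht and dwt are
# taken from the post; all values are invented for illustration.
toy <- data.frame(age = c(25, 99, 31),
                  ht  = c(64, 99, 66),
                  dht = c(99, 70, 71),
                  dwt = c(180, 99, 999))

# Columns in which 99 is a missing-value code rather than a real value.
cols <- c("ht", "dht", "dwt")

# toy[cols] == 99 is a logical matrix with the same shape as toy[cols],
# so it can index that sub-data-frame on the left-hand side.
toy[cols][toy[cols] == 99] <- NA

print(toy)
```

In this sketch the 99 in age is left alone because age is not listed in
cols, and the 999 in dwt is untouched because only 99 is compared.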

Also, what is the nr. of rows after I remove all NA entries? Easy example:

frog <- matrix(1:42, ncol=3)
frog[sample(42, 7)] <- NA

length(frog[!is.na(frog)])
# ok, but I want to know the nr of rows without NAs
dim(frog[!is.na(frog),]) #no
nrow(!is.na(frog)) # no
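
A sketch of how the number of rows without any NA might be counted for
the frog example. complete.cases() and na.omit() are from the stats
package and rowSums() from base R; the names n_complete, n_omit and
n_rowsums, and the set.seed() call that makes the sample reproducible,
are additions for this illustration.

```r
set.seed(1)                        # only to make the sample reproducible
frog <- matrix(1:42, ncol = 3)
frog[sample(42, 7)] <- NA

# complete.cases() returns one logical per row: TRUE if the row has no NA.
n_complete <- sum(complete.cases(frog))

# Equivalent: drop the incomplete rows first, then count what is left.
n_omit <- nrow(na.omit(frog))

# Equivalent with plain indexing: count the NAs per row, keep the zeros.
n_rowsums <- sum(rowSums(is.na(frog)) == 0)

c(n_complete, n_omit, n_rowsums)   # all three counts agree
```

The rows themselves, rather than their count, would be
frog[complete.cases(frog), , drop = FALSE].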


Thank you for your help and
Cheers
mark


