[R] Creating missing values.
Marc Feldesman
feldesmanm at pdx.edu
Sun Mar 24 18:12:31 CET 2002
I'm trying to figure out whether there is a simple one- or two-pass approach
to randomly creating missing values in a set of existing (complete)
data. For example, I want to randomly make 10% of the entries in the iris
dataset missing (i.e. NA). I don't want any case to have all of its values
missing, and I don't want any case to be missing the classification
variable. I can do this in about three passes, but I haven't figured out
whether there is an efficient way to do it in one or two passes through
the data.
My approach involves creating a dummy vector with a length equal to the
full length of the iris data (750 elements):
sample(1:10, 750, replace=TRUE). I then assigned all values of 2 to be 0
and all others to be 1. This left me with approximately 10% of the entries
flagged as "missing". I reshaped this into a 150 x 5 matrix. From here,
things were pretty straightforward.
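
For reference, the dummy-vector approach described above might look roughly
like the following in R. This is only a sketch: sample() is called with the
values first and the number of draws second, the names iris.na, draws, flag
and mask are made up for illustration, and the last steps (protecting the
Species column and applying the mask) are an assumption about what "pretty
straightforward" covers, since the post does not spell them out.

```r
# Sketch of the dummy-vector approach (illustrative names, not from the post)
set.seed(1)                                  # arbitrary seed, for reproducibility only
data(iris)
iris.na <- iris                              # work on a copy of the data

n     <- nrow(iris.na) * ncol(iris.na)       # 150 * 5 = 750 entries
draws <- sample(1:10, n, replace = TRUE)     # uniform draws from 1..10
flag  <- ifelse(draws == 2, 0, 1)            # 0 flags "missing", about 10% of entries
mask  <- matrix(flag, nrow = nrow(iris.na), ncol = ncol(iris.na))

# Assumed follow-up steps: never blank the classification variable,
# then set the flagged measurement cells to NA.
mask[, 5] <- 1
meas <- as.matrix(iris.na[, 1:4])
meas[mask[, 1:4] == 0] <- NA
iris.na[, 1:4] <- meas
```

Note that this sketch does not yet guard against a case losing all four
measurements; that would need one more check on the rows of mask.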
Is there any way to bypass the dummy vector and operate directly on a copy
of the original iris matrix, getting to the point above without the
intermediate steps?
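
To make the question concrete, one possible shape for such a direct approach
is sketched below. It is not taken from any reply, and it rests on two
assumptions: "10% of the entries" is read as roughly 10% of the four
measurement columns, and "all missing" is read as all four measurements
missing. The names iris.na, meas, miss and full are illustrative only.

```r
# Sketch: draw the missingness mask directly in the shape of the data
set.seed(42)                                 # arbitrary seed, for reproducibility only
data(iris)
iris.na <- iris                              # work on a copy
meas <- as.matrix(iris.na[, 1:4])            # measurements only; Species is never touched

# Logical mask with the same shape as meas: TRUE with probability 0.10
miss <- matrix(runif(length(meas)) < 0.10, nrow = nrow(meas))

# If a case would lose all four measurements, keep one of them at random
full <- which(rowSums(miss) == ncol(meas))
for (i in full) miss[i, sample(ncol(meas), 1)] <- FALSE

meas[miss] <- NA                             # apply the mask in one step
iris.na[, 1:4] <- meas
```

This yields approximately, not exactly, 10% missing cells. If exactly 60 of
the 600 measurement cells should be missing, the mask line could instead
index meas with sample(length(meas), 60), with the same row check applied
afterwards.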
Thanks.
=====================
Dr. Marc R. Feldesman
Professor and Chairman
Anthropology Department
Portland State University
1721 SW Broadway
Portland, Oregon 97201
email: feldesmanm at pdx.edu
phone: 503-725-3081
fax: 503-725-3905
http://web.pdx.edu/~h1mf
PGP Key Available On Request
======================