[R] Remove missings (quick question)
Marc Schwartz
marc_schwartz at me.com
Fri Nov 9 19:17:44 CET 2012
On Nov 9, 2012, at 11:23 AM, Bert Gunter <gunter.berton at gene.com> wrote:
> Marc et al.:
>
> On Fri, Nov 9, 2012 at 9:05 AM, Marc Schwartz <marc_schwartz at me.com> wrote:
>> On Nov 9, 2012, at 10:50 AM, Eiko Fried <torvon at gmail.com> wrote:
>>
>>> A colleague wrote the following syntax for me:
>>>
>>> D = read.csv("x.csv")
>>>
>>> ## Convert -999 to NA
>>> for (k in seq_len(ncol(D))) {
>>>   I = which(D[, k] == -999)
>>>   if (length(I) > 0) {
>>>     D[I, k] = NA
>>>   }
>>> }
>>>
>>> The dataset has many missing values. I am running several regressions on
>>> this dataset, and want to ensure every regression has the same subjects.
>>>
>>> Thus I want to drop subjects listwise for dependent variables y1-y9 and
>>> covariates x1-x5 (if data is missing on ANY of these variables, drop
>>> subject).
>>>
>>> How would I do this after running the syntax above?
>>>
>>> Thank you
>>
>>
>> Modify the initial read.csv() call to:
>>
>> D <- read.csv("x.csv", na.strings = "-999")
>>
>> That will convert all -999 values to NA on import, so that you don't have to post-process the data.
>>
>> See ?read.csv for more info.
>>
>> Once that is done, R's default behavior is to remove observations with any missing data (e.g., NA values)
>> when using modeling functions.
>
> This appears to be false. From ?lme (nlme package, nlme_3.1-105, R 2.15.2):
>
> "na.action
>
> a function that indicates what should happen when the data contain
> NAs. The default action (na.fail) causes lme to print an error message
> and terminate if there are any incomplete observations."
>
> Frankly, I doubt that there is any uniformity for practically any
> modeling options across the vast array of "modeling functions" in R
> and (even recommended?) packages.
>
> Cheers,
> Bert
Good point, Bert. That's what I get for over-generalizing... :-)
Thanks,
Marc
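
Since the default differs between functions (lm() consults options("na.action"), which is na.omit in a factory-fresh R session, while lme() defaults to na.fail as quoted above), it may be safer to state the NA handling explicitly in each call. A minimal sketch with made-up data; the data frame d and the names fit, n_used, res and failed are illustrative only:

```r
# Made-up data with one missing response and one missing covariate.
d <- data.frame(y = c(1.2, NA, 3.1, 4.0, 5.3),
                x = c(1, 2, NA, 4, 5))

# State the NA handling explicitly rather than relying on the default.
fit <- lm(y ~ x, data = d, na.action = na.omit)
n_used <- nobs(fit)  # number of rows actually used in the fit

# With na.fail, the same call stops with an error instead of dropping rows.
res <- try(lm(y ~ x, data = d, na.action = na.fail), silent = TRUE)
failed <- inherits(res, "try-error")
```

Checking nobs() after each fit is a cheap way to confirm that every regression used the same number of subjects.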
>>
>> Or you can pre-process using:
>>
>> D.New <- na.omit(D)
>>
>> and then use D.New for all of your subsequent analyses. See ?na.omit.
>>
>> Regards,
>>
>> Marc Schwartz
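
One caveat on na.omit(D): it drops a row if any column is NA, including columns that never enter a regression. To drop subjects only on the model variables, complete.cases() on those columns is one option. A minimal sketch, assuming an inline stand-in for x.csv; the CSV text, the extra column z, and the shortened variable list are made up for illustration (the original question uses y1-y9 and x1-x5):

```r
# Hypothetical stand-in for x.csv; -999 marks a missing value.
csv_text <- "id,y1,y2,x1,x2,z
1,3.1,2.0,5,1,-999
2,-999,1.5,4,0,7
3,2.7,1.9,-999,1,8
4,3.3,2.2,6,0,9"

# na.strings converts -999 to NA at import time.
D <- read.csv(text = csv_text, na.strings = "-999")

# Variables used in the regressions (shortened list for this sketch).
model_vars <- c("y1", "y2", "x1", "x2")

# Keep rows that are complete on the model variables only.
D.New <- D[complete.cases(D[, model_vars]), ]

# By contrast, na.omit() on the full data frame also drops row 1,
# where only the unused column z is missing.
D.All <- na.omit(D)
```

Using D.New for every regression then keeps the set of subjects identical across models, provided each model uses only variables listed in model_vars.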