[R] removing NA from a data frame
Petr PIKAL
petr.pikal at precheza.cz
Fri Jun 22 11:58:43 CEST 2012
Hi

Both na.omit and complete.cases work smoothly for me when NA is not a
valid level of the factor.
If NA is a valid level, as seems to be the case here, you need to reset
your factor levels so that NA is no longer a level:

ex10s$dg <- factor( ex10s$dg )

Both commands should work then.
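
A minimal sketch of the idea on a toy data frame. The names d, g and
d_chr are made up for illustration and are not from the original data.
The first part assumes NA entered the factor as a level rather than as
a missing value, so is.na() does not see it. The second part covers a
different assumption: that the level is the character string "NA",
which the quoted table() output would also be consistent with. In that
case factor() alone will not drop it and the level has to be set to NA
explicitly.

```r
# Toy data: 'g' is a factor in which NA is a *level*, not a missing value
d <- data.frame(id = 1:5,
                g  = factor(c("0", "1", NA, "5", NA), exclude = NULL))

levels(d$g)        # the level set includes NA
any(is.na(d$g))    # FALSE: the underlying integer codes are not missing
nrow(na.omit(d))   # 5: nothing is dropped

# Rebuild the factor; the default exclude = NA drops the NA level,
# so the affected elements become genuine missing values
d$g <- factor(d$g)
any(is.na(d$g))    # TRUE
d_clean <- na.omit(d)
nrow(d_clean)      # 3

# Variant (assumption): the level is the character string "NA"
d_chr <- data.frame(id = 1:5,
                    g  = factor(c("0", "1", "NA", "5", "NA")))
nrow(na.omit(d_chr))                             # 5: "NA" is ordinary text
levels(d_chr$g)[levels(d_chr$g) == "NA"] <- NA   # make it a real NA
d_chr_clean <- na.omit(d_chr)
nrow(d_chr_clean)                                # 3
```

Checking levels(ex10s$dg) and any(is.na(ex10s$dg)) first should show
which of the two situations applies.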
Regards
Petr
>
> Removing rows with NAs, using na.omit(), doesn't seem to be working
> for me.
>
> Dataset:
>
> > str ( ex10s )
>
> 'data.frame': 2189576 obs. of 5 variables:
>  $ LOPNR  : int 58 58 58 58 64 64 64 64 64 64 ...
>  $ DIAGNOS: Factor w/ 173 levels "F20","F200","F2000",..: 128 128 128 128 105 105 105 160 105 105 ...
>  $ X_DATE : int 20060821 20061207 20080102 20090904 20010327 20010925 20020307 20021007 20021007 20030320 ...
>  $ SOURCE : int 2 2 2 2 2 2 2 2 2 1 ...
>  $ dg     : Factor w/ 7 levels "0","1","2","3",..: 6 6 6 6 5 5 5 6 5 5 ...
>
> The only NAs are in the factor dg (put in by 'recode' from the car
> library; I'm trying to eliminate cases with particular factor levels)
>
> > table ( ex10s$dg )
>
>       0       1       2       3       4       5      NA
>    2851  271501   63112   98425  335593 1257299  160795
>
> So, I remove the rows with NAs, to a new dataframe ex10ss:
>
> > ex10ss<-na.omit(ex10s)
>
> Check all the NAs have been removed:
>
> > table(ex10ss$dg)
>
>       0       1       2       3       4       5      NA
>    2851  271501   63112   98425  335593 1257299  160795
>
> > dim(ex10s)
> [1] 2189576 5
> > dim(ex10ss)
> [1] 2189576 5
>
> Nothing seems to have changed. I want all the rows containing NA removed.
>
> I am clearly doing something wrong.
>
> The only alternative I could find is pretty similar:
> use <- complete.cases ( ex10s )
> ex10ss<-ex10s[use,]
> which leads to the same result.
>
>
> Stuart
>
>
> Dr Stuart John Leask DM FRCPsych MB Mchir
> Clinical Senior Lecturer and Honorary Consultant Psychiatrist
> Institute of Mental Health, Innovation Park
> Triumph Road, Nottingham, Notts. NG7 2TU. UK
> Tel. +44 115 82 30419 stuart.leask at nottingham.ac.uk
> Google 'Dr Stuart Leask'