[R] more than two NA value names in my data
Joshua Wiley
jwiley.psych at gmail.com
Thu Sep 30 21:57:33 CEST 2010
Hi,
You were on the right track with na.strings. From ?read.table:
na.strings: a character vector of strings which are to be interpreted
as ‘NA’ values. Blank fields are also considered to be
missing values in logical, integer, numeric and complex
fields.
So you can just do something like na.strings = c(".", "na",
"anotherthing"), with as many entries as you need. If you leave it at
the default (na.strings = "NA"), R will treat only the string NA as
missing; it is not going to mystically know which values are someone's
special term for missing and which are real values. Any column
containing "." or "na" would therefore be read in as character or
factor rather than numeric.
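
To make that concrete, here is a minimal sketch. The inline data and
the column names a, b, c are made up for illustration (your file will
differ), and the text= argument of read.table needs a reasonably recent
R; with a real file you would pass the file name instead.

```r
# Hypothetical stand-in for the .txt file: three columns, with ".",
# "na" and "NA" all used to mean "missing".
txt <- "a b c
1 . 3
na 5 6
7 8 NA"

# Default na.strings = "NA": only column c is read as numeric; columns
# a and b contain "na" / "." and end up as character (or factor).
d0 <- read.table(text = txt, header = TRUE)
sapply(d0, class)

# List every missing-value code. Note that "NA" has to be repeated here,
# because supplying na.strings replaces the default rather than adding to it.
d1 <- read.table(text = txt, header = TRUE,
                 na.strings = c(".", "na", "NA"))
sapply(d1, class)

# Means of all variables, skipping the missing values
colMeans(d1, na.rm = TRUE)
```

With the codes listed, every column comes in as numeric and na.rm = TRUE
takes care of the missing entries when you compute the means.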
You could also clean the data after reading it in and then convert
everything back to numeric, but it is easier to just specify what
indicates missing values.
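
For comparison, the clean-up-afterwards route might look roughly like
this. Again the data are invented, and missing_codes is just a name I
picked for the vector of codes you would want to treat as missing.

```r
# Same hypothetical data as a stand-in for the real file
txt <- "a b c
1 . 3
na 5 6
7 8 NA"

# Read without declaring the extra codes; keep strings as character
raw <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

# Codes to be treated as missing (assumed; adjust to your data)
missing_codes <- c(".", "na")

# Column by column: blank out the codes, then convert to numeric
clean <- as.data.frame(lapply(raw, function(x) {
  x <- as.character(x)
  x[x %in% missing_codes] <- NA
  as.numeric(x)
}))

sapply(clean, class)
colMeans(clean, na.rm = TRUE)
```

It works, but it is more typing, and any code you forget to list gets
turned into NA by as.numeric() with only a coercion warning.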
Hope that helps,
Josh
On Thu, Sep 30, 2010 at 10:10 AM, JoonGi <joongi at hanmail.net> wrote:
>
>
> my data(*.txt) has 1000 observations(numbers with no characters) of 5
> variables. quite simple.
>
> However, NA values are quite tricky.
>
> this observer used more than two names for NA values; "." and "na" and more.
>
>
> 1. If I don't want to manipulate this raw data at all, how can I read this
> table?
>
> (meaning, can I set more than two names for na.strings=" " in read.table()?)
>
>
> 2. What happens if I don't set any NA value names in read.table()?
>
> Does R read "."(dot) and "na" as NA value?
>
>
> 3. If Q1 is not possible, what is the best way to get the values I want?
>
> (let's say I want means of all variables. I give conditions on my mean()?)
>
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/