[R] recoding responses in a numeric variable
Jeff Newmiller
jdnewmil at dcn.davis.ca.us
Sun Jan 8 06:04:38 CET 2017
Please read the Posting Guide mentioned at the bottom of this and every
message. In particular, send your email in plain text format so we get to
see what you saw (the mailing list strips out HTML formatting in most
cases). Also, please make your examples reproducible: give all the steps
necessary to reproduce your output or error. Otherwise we have to guess
what you did, and if we guess wrong then our help is wasted. The code
below should be reproducible, for your benefit and for anyone else who
reads this.
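One common way to put data into a plain-text message is to paste the output of dput(). The following is only a minimal sketch: the object name example_df and its three rows are made up for illustration and are not your data.

```r
# hypothetical miniature data set, for illustration only
example_df <- data.frame( vn35 = c( "no entry", "big fear", "little fear" )
                        , stringsAsFactors = FALSE
                        )
# dput() prints R code that recreates the object;
# that printed text is what you would paste into your message
dput( example_df )
# round trip: evaluating the deparsed text rebuilds the object
rebuilt <- eval( parse( text = deparse( example_df ) ) )
all.equal( example_df, rebuilt )
```

Anyone reading the message can then assign the pasted structure(...) text to a name and start from the same data you have.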
#### code follows
# make believe data as though it was in a file
inputdata <-
"vn35
no entry
no entry
don't know
don't know
don't know
a lot of fear
a lot of fear
a lot of fear
a lot of fear
big fear
big fear
big fear
big fear
big fear
medium fear
medium fear
medium fear
medium fear
medium fear
medium fear
little fear
little fear
little fear
little fear
little fear
little fear
little fear
no fear at all
no fear at all
no fear at all
no fear at all
no fear at all
no fear at all
no fear at all
no fear at all
"
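# I am going to guess that you did something kind of like this
# (stringsAsFactors = TRUE was the default before R 4.0.0; it is
# spelled out here so the example behaves the same on newer versions)
gles_reduced <- read.csv( text=inputdata, stringsAsFactors = TRUE )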
# before you did the steps you gave us:
fear <- gles_reduced$vn35
levels(fear) # this doesn't change any of your data
table(fear, as.numeric(fear), exclude=NULL) # neither does this
# This now has a new level <NA> represented by a numeric value NA, and
# it is really not very useful to have both the value and the level be
# NA.
# In my opinion, the problem started when you let R automatically
# create a factor based on default settings. Let's try that again
# the right way:
# DON'T let R automatically create a factor column
gles_reduced <- read.csv( text=inputdata, stringsAsFactors = FALSE )
fear <- gles_reduced$vn35
# now fear is a vector of character strings
# replace semantic unknowns with NA
fear[ fear %in% c( "no entry", "don't know" ) ] <- NA
# define the levels in the order you want them from small to large
fearlvls <- c( "no fear at all"
             , "little fear"
             , "medium fear"
             , "big fear"
             , "a lot of fear"
             )
# explicitly create the factor with comparability
fear <- ordered( fear, levels=fearlvls )
table(fear, as.numeric( fear ) )
sum( is.na( fear ) )
which( "big fear" < fear ) # indexes of the ones that have
# a lot of fear
fear[ which( "big fear" < fear ) ] # see them
#### end of code
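As a side note on why an assignment like fear[fear=="big fear"]<-1 does not work on a factor, here is a minimal sketch. The factor f is made up and only stands in for your real column; the point is that a factor only accepts values that are already among its levels.

```r
# small made-up factor, standing in for gles_reduced$vn35
f <- factor( c( "big fear", "little fear", "big fear" ) )
# assigning a value that is not one of the levels gives a warning
# ("invalid factor level, NA generated") and stores NA instead
g <- f
g[ g == "big fear" ] <- 1
g
sum( is.na( g ) )
# as.integer() returns the underlying codes as a plain integer
# vector, which can then be recoded freely
codes <- as.integer( f )
codes
```

So the numbers have to live in a separate integer or numeric vector rather than being written back into the factor.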
Note that the levels are numbered from 1 to 5, not 0 to 4, because the
integer codes underlying a factor always start at 1. Fortunately the
stats functions in R know this, so you are better off not fighting the
convention. If you absolutely must have 0 to 4, then you need to do that
in an integer or numeric vector:
fearnums <- as.integer( fear ) - 1L
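To see what that subtraction does, here is a small sketch on made-up responses. The names resp, lvls, fear_demo and fearnums_demo are hypothetical and used only for this illustration.

```r
# made-up responses, a hypothetical stand-in for the real column
resp <- c( "big fear", "no fear at all", NA, "a lot of fear", "little fear" )
lvls <- c( "no fear at all"
         , "little fear"
         , "medium fear"
         , "big fear"
         , "a lot of fear"
         )
fear_demo <- ordered( resp, levels = lvls )
# zero-based scores: 0 = no fear at all ... 4 = a lot of fear
fearnums_demo <- as.integer( fear_demo ) - 1L
# NA stays NA, and the scores line up with the labels
data.frame( fear_demo, fearnums_demo )
```

Keeping the ordered factor alongside the integer scores makes it easy to check that larger numbers really do mean greater concern.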
On Sat, 7 Jan 2017, Licia Biotti wrote:
> Hello,
>
> I am working with a dataset in R studio, and I have created a numeric
> variable which I have called fear by using a factor variable (called vn35).
> Here is the piece of code:
> fear<-gles_reduced$vn35
> levels(fear)
> table(fear, as.numeric(fear), exclude=NULL)
>
> Then I have coded the levels "don't know" and "not specified" as NA
> fear[fear=="not specified"]<-NA
> fear[fear=="don't know"]<-NA
>
> This is how the table looks like:
>
> fear               3   4    5   6   7 <NA>
>   no entry         0   0    0   0   0    0
>   don't know       0   0    0   0   0    0
>   a lot of fear  412   0    0   0   0    0
>   big fear         0 883    0   0   0    0
>   medium fear      0   0 1350   0   0    0
>   little fear      0   0    0 920   0    0
>   no fear at all   0   0    0   0 305    0
>   <NA>             0   0    0   0   0   41
>
> I would like to code the remaining answers (a lot of fear, big fear, medium
> fear, little fear and no fear at all) with values from 0 to 4 (so that
> greater values indicate great concern)
> I tried this piece of code:
> fear[fear=="big fear"]<-1
> But it is not working,
> could you please help me?
> Thanks,
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k