[R] Antwort: Fw: Re: Subscripting problem with is.na()

G.Maubach at weinwolf.de G.Maubach at weinwolf.de
Mon Jun 27 10:45:12 CEST 2016


Hi David,
Hi Bert,

many thanks for the valuable discussion on NA in R (please see the extract 
below). I follow your argument for leaving NAs as they are most of the 
time. On special occasions, however, I want to replace NA with another 
value. To preserve this newly acquired knowledge, I wrote the following 
function:

-- cut --
t_replace_na <- function(dataset, variable, value) {
 if(inherits(dataset[[variable]], "factor") == TRUE) {
   dataset[variable] <- as.character(dataset[variable])
   print(class(dataset[variable]))
   dataset[, variable][is.na(dataset[, variable])] <- value
   dataset[variable] <- as.factor(dataset[variable])
   print(class(dataset[variable]))
 } else {
   dataset[, variable][is.na(dataset[, variable])] <- value
 }
 return(dataset)
}

ds_test <- data.frame(a=c(1,NA,2), b = rep(NA,3), c = c("A","b",NA),
                      stringsAsFactors = TRUE)  # the default before R 4.0.0
print(sapply(ds_test, class))

t_replace_na(ds_test, "a", value = -1)
t_replace_na(ds_test, "b", value = -2)
t_replace_na(ds_test, "c", value = -3)
-- cut --

Unfortunately, the if statement does not work as intended because the class 
reported inside the function is wrong. To find out what is going on, I did 
this:

-- cut --
test_class <- function(dataset, variable) {
  if(inherits(dataset[, variable], "factor") == TRUE) {
    return(c(class(dataset[variable]), TRUE))
  } else {
    return(c(class(dataset[variable]), FALSE))
  }
}

ds_test <- data.frame(a=c(1,NA,2), b = rep(NA,3), c = c("A","b",NA),
                      stringsAsFactors = TRUE)  # the default before R 4.0.0
print(sapply(ds_test, class))

# -- Test a --
class(ds_test[, "a"])
if(inherits(ds_test[, "a"], "factor")) {
  print(c(class(ds_test[, "a"]), "TRUE"))
} else {
  print(c(class(ds_test[, "a"]), "FALSE"))
}
test_class(ds_test, "a")
warning("'a' should be numeric NOT data.frame!")

# -- Test b --
if(inherits(ds_test[, "b"], "factor")) {
  print(c(class(ds_test[, "b"]), "TRUE"))
} else {
  print(c(class(ds_test[, "b"]), "FALSE"))
}
class(ds_test[, "b"])
test_class(ds_test, "b")
warning("'b' should be logical NOT data.frame!")

# -- Test c --
if(inherits(ds_test[, "c"], "factor")) {
  print(c(class(ds_test[, "c"]), "TRUE"))
} else {
  print(c(class(ds_test[, "c"]), "FALSE"))
}
class(ds_test[, "c"])
test_class(ds_test, "c")
warning("'c' should be factor NOT data.frame.
In addition data.frame != factor")
-- cut --
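
The checks above mix two subscripting forms: test_class() calls inherits() 
on dataset[, variable] but reports class(dataset[variable]). A minimal 
sketch of how the forms differ follows. It assumes stringsAsFactors = TRUE, 
which was the default when this was posted, so that column c is a factor.

```r
# Assumption: stringsAsFactors = TRUE reproduces the pre-4.0.0 default.
ds_test <- data.frame(a = c(1, NA, 2), b = rep(NA, 3), c = c("A", "b", NA),
                      stringsAsFactors = TRUE)

# List-style single brackets keep the data.frame wrapper.
cls_single <- class(ds_test["c"])      # "data.frame"

# Matrix-style subscripting of one column drops to the column vector.
cls_matrix <- class(ds_test[, "c"])    # "factor"

# Double brackets extract the column itself.
cls_double <- class(ds_test[["c"]])    # "factor"

# inherits() therefore answers differently for the two forms.
is_factor_single <- inherits(ds_test["c"], "factor")    # FALSE
is_factor_double <- inherits(ds_test[["c"]], "factor")  # TRUE

print(c(cls_single, cls_matrix, cls_double))
```

Under that reading, "data.frame" is reported because of the [variable] form 
and not because the call sits inside a function.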

Why do I get different results from the same call depending on whether it 
is inside or outside my own function definition?
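
One possible way around the problem is to extract the column with [[ ]] 
throughout, because as.character() applied to a one-column data.frame does 
not return the column's values as strings. The sketch below is only an 
illustration: the name t_replace_na_fixed is hypothetical, and 
stringsAsFactors = TRUE is assumed so that column c is a factor.

```r
# Hypothetical corrected variant; the name is not from the original post.
t_replace_na_fixed <- function(dataset, variable, value) {
  column <- dataset[[variable]]        # the column vector, not a data.frame
  if (is.factor(column)) {
    # Go through character so the replacement can become a new level.
    column <- as.character(column)
    column[is.na(column)] <- value
    column <- as.factor(column)
  } else {
    column[is.na(column)] <- value
  }
  dataset[[variable]] <- column
  dataset
}

# Assumption: stringsAsFactors = TRUE reproduces the pre-4.0.0 default.
ds_test <- data.frame(a = c(1, NA, 2), b = rep(NA, 3), c = c("A", "b", NA),
                      stringsAsFactors = TRUE)

res_a <- t_replace_na_fixed(ds_test, "a", value = -1)
res_b <- t_replace_na_fixed(ds_test, "b", value = -2)
res_c <- t_replace_na_fixed(ds_test, "c", value = -3)
print(sapply(res_c, class))
```

Note that the all-NA logical column b becomes numeric after the 
replacement, and that the value -3 ends up as the factor level "-3".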

Kind regards

Georg

--------------------------------

> Sent: Thursday, 23 June 2016, 21:14
> From: "David L Carlson" <dcarlson at tamu.edu>
> To: "Bert Gunter" <bgunter.4567 at gmail.com>
> Cc: "R Help" <r-help at r-project.org>
> Subject: Re: [R] Subscripting problem with is.na()
>
> Good point. I did not think about factors. Also your example raises 
> another issue since column c is logical, but gets silently converted to 
> numeric. This would seem to get the job done assuming the conversion is 
> intended for numeric columns only:
> 
> > test <- data.frame(a=c(1,NA,2), b = c("A","b",NA), c= rep(NA,3))
> > sapply(test, class)
>         a         b         c 
> "numeric"  "factor" "logical" 
> > num <- sapply(test, is.numeric)
> > test[, num][is.na(test[, num])] <- 0
> > test
>   a    b  c
> 1 1    A NA
> 2 0    b NA
> 3 2 <NA> NA
> 
> David C


