[R] Spark DataFrame: replace NULL cell by NA
Karim Mezhoud
kmezhoud @ending from gm@il@com
Sun Dec 9 22:06:51 CET 2018
Dear All,
## function to replace empty cells with NA
empty_as_na <- function(x){
  ## convert factors to character first, since ifelse() won't work with factors
  if("factor" %in% class(x)) x <- as.character(x)
  ifelse(as.character(x)!="", x, NA)
}
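For reference, a quick local check of what the function is expected to return. The input vectors are made up for illustration only, and the function is repeated so the snippet runs on its own:

```r
# Same function as above, repeated so this snippet is self-contained
empty_as_na <- function(x){
  if("factor" %in% class(x)) x <- as.character(x)
  ifelse(as.character(x) != "", x, NA)
}

# Hypothetical toy inputs (not from the clinical data)
chr <- c("a", "", "b")
fct <- factor(c("x", "", "y"))
num <- c(1, 2, 3)

res_chr <- empty_as_na(chr)  # empty string becomes NA
res_fct <- empty_as_na(fct)  # factor is converted to character first
res_num <- empty_as_na(num)  # numeric values pass through unchanged

print(res_chr)
print(res_fct)
print(res_num)
```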
## connect to a local Spark instance
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")
# load an example data frame that has empty cells (needs the cgdsr package; assumes `cgds` is an already created CGDS connection object)
clinicalData <- cgdsr::getClinicalData(cgds, "gbm_tcga_pub_all")
## copy to spark
clinicalData_tbl <- dplyr::copy_to(sc, clinicalData, overwrite = TRUE)
# This works (local data frame)
clinicalData %>% mutate_all(funs(empty_as_na))
# This does not work (Spark DataFrame)
clinicalData_tbl %>% mutate_all(funs(empty_as_na))
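A possible reason for the failure, as far as I understand it: sparklyr translates the dplyr pipeline to Spark SQL, and a user-defined R function such as empty_as_na has no SQL equivalent. A minimal sketch of one alternative is to inline an ifelse() expression, which dbplyr should be able to translate. This assumes dplyr >= 1.0 (for across()) with a matching dbplyr/sparklyr; the helper name empty_to_na_tbl and the toy data frame are hypothetical, and I have only checked the logic on a local data frame:

```r
library(dplyr)

# Hypothetical helper: uses only ifelse() and ==, no custom R function,
# so the same expression can run locally or be translated to SQL.
empty_to_na_tbl <- function(tbl) {
  tbl %>% mutate(across(everything(), ~ ifelse(.x == "", NA, .x)))
}

# Local check on a made-up data frame (character columns, not factors)
toy <- data.frame(a = c("x", "", "z"),
                  b = c(1, 2, 3),
                  stringsAsFactors = FALSE)
out <- empty_to_na_tbl(toy)
print(out)

# On the Spark table this would be called as follows
# (needs the Spark connection from above, so it is left commented out):
# clinicalData_tbl %>% empty_to_na_tbl()
```

Note that locally ifelse() returns the integer codes for factor columns, so factors would need to be converted to character first.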
Thanks,
Karim