[R] Problem with data conversion
    arinbasu@softhome.net 
       
    Sun Dec 14 13:19:47 CET 2003
    
    
  
Hi All: 
I came across the following problem while working with a dataset, and 
wondered whether I could find a solution here. 
My dataset consists of information on 402 individuals with the following five 
variables (age, sex, status = a binary variable with levels "case" or 
"control", mma, dma). 
During a data check, I found that in the raw data the data entry operator had 
mistakenly entered a "0" for one participant, so the levels now show 
> levels(status) 
[1] "0" "control" "case" 
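For illustration, one way a stray level like this might be handled directly in R is sketched below. The toy vector is made up and only stands in for the real status column, and treating the "0" entry as missing (rather than looking up the correct value) is an assumption.

```r
# Toy data standing in for the real 'status' column (values are made up)
status <- factor(c("case", "control", "0", "case", "control"))

# Treat the mis-entered "0" as missing
status_clean <- status
status_clean[status_clean == "0"] <- NA

# Rebuild the factor with only the two intended levels, in a chosen order
status_clean <- factor(status_clean, levels = c("control", "case"))

levels(status_clean)
table(status_clean, useNA = "ifany")
```

Listing "control" first makes it the reference level, which matters later if the factor is used as the response in a logistic regression.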
The variables mma and dma are actually numerical variables, but in the 
dataframe they are represented as "characters". I tried to change the type 
of the variables (from character to numeric) using the edit function (bringing 
up the data grid and making the changes there), but the changes were 
not saved. I tried 
mma1 <- as.numeric(mma) 
but I was not successful in converting mma from a character variable to a 
numeric variable. 
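A minimal sketch of how such a conversion usually behaves, on made-up values: whether mma is stored as a character vector or as a factor is not known from the above, so both cases are shown. The names mma_chr and mma_fac are hypothetical.

```r
# Made-up values; "n/a" stands in for any entry that is not a valid number
mma_chr <- c("1.5", "2.25", "n/a", "0.75")
mma_fac <- factor(mma_chr)

# Character to numeric: entries that cannot be parsed become NA (with a warning)
mma_num <- suppressWarnings(as.numeric(mma_chr))

# Factor to numeric: convert to character first, because
# as.numeric() applied directly to a factor returns the internal level codes
mma_num_from_fac <- suppressWarnings(as.numeric(as.character(mma_fac)))

# Show which raw entries failed to convert
mma_chr[is.na(mma_num)]
```

If the column lives inside a data frame, the converted vector has to be assigned back (for example to a column of that data frame) for the change to persist.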
So, to edit and "clean" the data, I exported the dataset as a text file to 
Epi Info 2002 (version 2, Windows). I used the following code: 
mysubset <- subset(workingdat, select = c(age,sex,status, mma, dma))
write.table(mysubset, file="mysubset.txt", sep="\t", col.names=NA) 
After I made changes in the variables using Epi Info (I created a new 
variable called "statusrec" containing values "case" and "control"), I 
exported the file as a ".rec" file (filename "mydata.rec"). I used the 
following code to read the file in R: 
require(foreign)
myData <- read.epiinfo("mydata.rec", read.deleted=NA) 
Now the problem is this: when I try to run a logistic regression, R 
returns the following error message: 
> glm(statusrec~mma, family=binomial(link=logit))
Error in model.frame(formula, rownames, variables, varnames, extras, 
extranames,  :
       invalid variable type 
I cannot figure out the solution. I want to run a logistic regression now 
with the variable statusrec (which is a binary variable containing values 
"case" and "control"), and another variable (say mma, which is now a 
numeric variable). What does the above error message mean and what could 
be a possible solution? 
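One plausible cause, offered only as a guess, is that statusrec was imported as a character vector rather than a factor, or that glm cannot find the variables because no data argument is given. The sketch below uses simulated data (the values are invented, and this myData is a stand-in, not the real file) to show checking the column types, converting the response to a factor, and passing the data frame to glm.

```r
# Simulated stand-in for the imported data; all values are invented
set.seed(1)
myData <- data.frame(
  statusrec = rep(c("case", "control"), each = 20),
  mma = c(rnorm(20, mean = 11, sd = 2), rnorm(20, mean = 10, sd = 2)),
  stringsAsFactors = FALSE
)

# Inspect how each column is stored before modelling
str(myData)

# Make the response a two-level factor; the first level ("control")
# is treated as failure, so the model describes the odds of "case"
myData$statusrec <- factor(myData$statusrec, levels = c("control", "case"))

# Tell glm where to find the variables via the data argument
fit <- glm(statusrec ~ mma, family = binomial(link = "logit"), data = myData)
summary(fit)
```

Running str() on the real imported object would show whether statusrec and mma have the types the model expects.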
Would greatly appreciate your insights and wisdom. 
 -Arin Basu
    
    