[R] Study Missing Data, differentiate between a percent within the valid answers and within the different missing answers

Ericka Lundström e at it.dk
Mon Mar 3 16:00:08 CET 2008

On Mon, 03 Mar 2008 22:02:17 +1300, James Reilly wrote
> On 3/3/08 8:21 PM, Ericka Lundström wrote:
> > I'm trying to migrate from SPSS to R, though I have some problems
> > with getting R to distinguish between the different kinds of
> > missing. ...
> > Is there a smart way in R to differentiate between missing and
> > valid and at the same time treat both the categories within
> > missing and valid as answers (like SPSS did above)?
>
> The Hmisc package has some support for special missing values,
> for instance when reading in SAS datasets using sas.get. I
> don't believe spss.get offers the same facility, though.
>
> You can define special missing values for a variable manually,
> which might seem a bit involved, but this could easily be
> automated. For your example, try:
>
> special <- dataFrame$TWO %in% c("?","X")
> attr(dataFrame$TWO, "special.miss") <-
>      list(codes=as.character(dataFrame$TWO[special]),
>      obs=(1:length(dataFrame$TWO))[special])
> class(dataFrame$TWO) <- c("factor", "special.miss")
> is.na(dataFrame$TWO) <- special
> # Then describe gives new percentages
> describe(dataFrame$TWO)
>
> dataFrame$TWO
>        n missing       ?       X  unique
>        3       4       2       2       2
>
> No (2, 67%), yes (1, 33%)

Dear James Reilly,

Thanks for your answer. Now I can get, or make, ‘metacategories’ for
my data, which is wonderful! I actually only needed two
‘metacategories’, though: one for missing answers and one for valid
answers. Anyhow, it looks like R is treating “X” and “?” as missing, or
as subcategories of missing.
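
To make the two 'metacategories' concrete, here is a minimal base R
sketch. It assumes the column holds the seven answers implied by the
counts quoted in this message, and that "?" and "X" are the only
missing codes; the names answers, missing_codes and meta are made up
for illustration and are not part of Hmisc.

```r
# Hypothetical reconstruction of the example column (7 answers).
answers <- c("?", "?", "No", "No", "X", "X", "yes")

# Assumption: "?" and "X" are the codes that count as missing.
missing_codes <- c("?", "X")

# Two metacategories: one for valid answers, one for missing answers.
meta <- factor(ifelse(answers %in% missing_codes, "missing", "valid"),
               levels = c("valid", "missing"))

# Count how many answers fall into each metacategory.
meta_counts <- table(meta)
print(meta_counts)
```

The original answer codes stay untouched in answers, so the detailed
categories remain available next to the two-level grouping.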

One thing I still need is for R to give me a percentage within the valid
answers (or unique) and a percentage overall. Is that in any way
possible? With special.miss I don’t get overall percentages; I only get
the distribution within n [No (2, 67%), yes (1, 33%)]. I don’t get an
overall percentage [? (2, 29%), No (2, 29%), X (2, 29%), yes (1, 14%)].
Hasn’t someone developed a package for this feature?
Karsten Mueller asked about this 10 years ago.
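
For reference, the two sets of percentages described above can be
worked out by hand in base R. This is only a sketch under assumptions:
the vector answers is a hypothetical reconstruction from the counts
quoted in this message, and "?" and "X" are taken to be the only
missing codes. It is not the Hmisc special.miss mechanism.

```r
# Hypothetical reconstruction of the example column (7 answers).
answers <- c("?", "?", "No", "No", "X", "X", "yes")

# Assumption: "?" and "X" are the codes that count as missing.
missing_codes <- c("?", "X")
is_missing <- answers %in% missing_codes

# Percent over all answers, with the missing codes kept as categories.
pct_overall <- round(100 * prop.table(table(answers)))

# Percent within the valid answers only.
pct_valid <- round(100 * prop.table(table(answers[!is_missing])))

print(pct_overall)
print(pct_valid)
```

With this input the overall percentages come out as 29, 29, 29 and 14,
and the valid-only percentages as 67 and 33, matching the figures
quoted above.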

I hope someone has the time to help me. And again, thanks to James
Reilly for his answer!

All the best

Ericka Lundström
