[R] Missing data (Na) and chi-square tests

Nordlund, Dan (DSHS/RDA) NordlDJ at dshs.wa.gov
Tue Oct 9 02:18:00 CEST 2012


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of David Winsemius
> Sent: Monday, October 08, 2012 4:37 PM
> To: Rerda
> Cc: r-help at r-project.org
> Subject: Re: [R] Missing data (Na) and chi-square tests
> 
> 
> On Oct 8, 2012, at 9:06 AM, Rerda wrote:
> 
> > Dear Rui and David
> > Thank you very much for taking your time to look at my problem.
> > However, I still cannot seem to figure it out.
> >
> > I think that you, David, are correct in your assumption of how my data is
> > structured. The data in the two columns that I need to cross-tabulate are
> > either 1 or 0.
> > I made a mistake in the formula that I sent to you. The one I use is:
> > data <- matrix(c(sum(!Variable[Group....==1]), sum(Variable[Group....==1]),
> > sum(!Variable[Group....==0]), sum(Variable[Group....==0])),2,2)
> 
> It hardly seems surprising that this is not working: there is no "Group...."
> column in the data. Furthermore, you would not need to use
> sum(!Variable[Group....==1]) as well as sum(Variable[Group....==1]). The
> `table` function will do that for you.
> 
> Given my personal memory of having a hypotensive episode as I stood at
> the counter of a pharmacy asking for epinephrine and a syringe to treat
> my urticarial reaction to shellfish, I picked these two:
> 
> > with( MyData, table( Rash = as.logical(Rash), Hypotension = as.logical(Hypotension) ) )
>        Hypotension
> Rash    FALSE TRUE
>   FALSE     3    7
>   TRUE      7    3
> 
> Best;
> David.
> 
> 
> >
> >
> > This is the output of >dput( head(MyData, 20) ):
> >
> > structure(list(Patient.nr = c(1L, 3L, 4L, 5L, 6L, 7L, 9L, 10L,
> > 11L, 12L, 13L, 14L, 15L, 16L, 19L, 20L, 21L, 22L, 23L, 24L),
> >    DAAC.... = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> >    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), TAA = c(0L, 0L, 0L,
> >    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> >    0L, 0L), Sex = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L,
> >    0L, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L), Alder = c(1L, 6L,
> >    3L, 3L, 6L, 6L, 2L, 6L, 5L, 2L, 6L, 4L, 6L, 2L, 2L, 5L, 3L,
> >    6L, 6L, 6L), Reak.1 = structure(c(2L, 1L, 2L, 2L, 1L, 2L,
> >    2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L), .Label = c("0",
> >    "1", "na"), class = "factor"), Reak.2 = structure(c(1L, 1L,
> >    1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L,
> >    2L, 1L, 1L), .Label = c("0", "2", "na"), class = "factor"),
> >    Reak.3 = structure(c(1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
> >    1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L), .Label = c("0",
> >    "3", "na"), class = "factor"), Reak.4 = structure(c(1L, 1L,
> >    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> >    1L, 1L, 1L), .Label = c("0", "4", "na"), class = "factor"),
> >    Tryptase = structure(c(1L, 1L, 1L, 3L, 1L, 3L, 1L, 2L, 2L,
> >    1L, 3L, 3L, 1L, 3L, 3L, 3L, 3L, 3L, 2L, 1L), .Label = c("0",
> >    "1", "na"), class = "factor"), Hypotension = c(0L, 1L, 0L,
> >    0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L,
> >    1L, 0L), Tachycardia = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> >    1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L), Br.spasm. = c(0L,
> >    1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L,
> >    0L, 0L, 0L, 0L), Angioedema = c(0L, 0L, 0L, 1L, 0L, 1L, 0L,
> >    0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L), Urticaria = c(0L,
> >    0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> >    0L, 0L, 0L, 0L), Flush. = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> >    0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), Rash = c(1L,
> >    0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L,
> >    1L, 1L, 0L, 1L), Pruritus = c(0L, 0L, 0L, 0L, 0L, 0L, 0L,
> >    1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 0L), Transf. = c(0L,
> >    0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> >    0L, 0L, 0L, 0L)), .Names = c("Patient.nr", "DAAC....", "TAA",
> > "Sex", "Alder", "Reak.1", "Reak.2", "Reak.3", "Reak.4", "Tryptase",
> > "Hypotension", "Tachycardia", "Br.spasm.", "Angioedema", "Urticaria",
> > "Flush.", "Rash", "Pruritus", "Transf."), row.names = c(NA, 20L
> > ), class = "data.frame")
> >
> > And I still can't get it to work.
> > Is it possible to put is.na(Variable) or something into my formula?
> >
> > I understand if it is too difficult to figure out.
> > Thank you very much
> >
> > Kind Regards Gerda
> >
> >

In addition, I wonder where the 'na' values are coming from. In R, shouldn't those values be NA if they are unknown?
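
One way to deal with this, as a minimal sketch rather than a definitive fix: the two vectors below are copied from the Tryptase and Hypotension columns of the dput() output quoted above, and the object names (tryptase, tryptase_num, tab, tab_na) are only illustrative. The idea is to turn the "na" level into a real NA before cross-tabulating. If the data were read from a text file (an assumption, since the thread does not say), passing na.strings = c("NA", "na") to read.table() or read.csv() should avoid the problem at import.

```r
# Tryptase as posted: a factor in which missing values were typed as "na".
tryptase <- factor(c("0", "0", "0", "na", "0", "na", "0", "1", "1", "0",
                     "na", "na", "0", "na", "na", "na", "na", "na", "1", "0"),
                   levels = c("0", "1", "na"))
hypotension <- c(0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 1L,
                 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L)

# Go through character so that the labels, not the internal factor codes,
# are converted; "na" becomes a genuine missing value.
tryp_chr <- as.character(tryptase)
tryp_chr[tryp_chr == "na"] <- NA
tryptase_num <- as.integer(tryp_chr)

# table() silently drops NA by default; useNA = "ifany" makes them visible.
tab    <- table(Tryptase = tryptase_num, Hypotension = hypotension)
tab_na <- table(Tryptase = tryptase_num, Hypotension = hypotension,
                useNA = "ifany")
print(tab_na)

# Test of association on the complete cases only.
print(fisher.test(tab))
```

chisq.test(tab) accepts the same 2 x 2 table, but with counts this small it will warn that the chi-squared approximation may be incorrect, which is why the sketch uses fisher.test() instead.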

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204



