[R] Percentages in contingency tables *warning trivial question*
Chuck Cleland
ccleland at optonline.net
Mon Dec 13 11:47:17 CET 2004
You might want to look at CrossTable() in the gmodels package of the
gregmisc bundle. For example:
> library(gmodels)
> sex <- as.factor(sample(c("Male", "Female"), 100, replace=TRUE))
> case <- as.factor(sample(c("Case", "Control"), 100, replace=TRUE))
> CrossTable(sex, case)
   Cell Contents
|-----------------|
|               N |
|   N / Row Total |
|   N / Col Total |
| N / Table Total |
|-----------------|

Total Observations in Table:  100

             | case
         sex |      Case |   Control | Row Total |
-------------|-----------|-----------|-----------|
      Female |        21 |        29 |        50 |
             |     0.420 |     0.580 |     0.500 |
             |     0.420 |     0.580 |           |
             |     0.210 |     0.290 |           |
-------------|-----------|-----------|-----------|
        Male |        29 |        21 |        50 |
             |     0.580 |     0.420 |     0.500 |
             |     0.580 |     0.420 |           |
             |     0.290 |     0.210 |           |
-------------|-----------|-----------|-----------|
Column Total |        50 |        50 |       100 |
             |     0.500 |     0.500 |           |
-------------|-----------|-----------|-----------|

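If you would rather stay in base R, something along the following lines may get close to the "N and %" layout asked about. This is only a sketch: the seed, the object names (n, pct, out) and the choice of column percentages are assumptions made for illustration, and the data are simulated in the same way as above.

```r
# Simulated data, as in the CrossTable() example (values are random).
set.seed(1)  # arbitrary seed, used only so the sketch is reproducible
sex  <- factor(sample(c("Male", "Female"), 100, replace = TRUE))
case <- factor(sample(c("Case", "Control"), 100, replace = TRUE))

# Counts with row and column totals; addmargins() labels the totals "Sum".
n <- addmargins(table(sex, case))

# Column percentages: divide every column by its own total (the "Sum" row).
pct <- 100 * sweep(n, 2, n["Sum", ], "/")

# Interleave the N and % columns in one matrix.
out <- matrix(NA_real_, nrow(n), 2 * ncol(n),
              dimnames = list(rownames(n),
                              paste(rep(colnames(n), each = 2), c("N", "%"))))
out[, seq(1, ncol(out), by = 2)] <- n
out[, seq(2, ncol(out), by = 2)] <- round(pct, 1)
out
```

Changing the margin in sweep() from 2 to 1, and dividing by n[, "Sum"] instead, would give row percentages.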
Rachel Pearce wrote:
> I hesitate to post this question in the light of recent threads, indeed
> I have hesitated for several weeks, however I have come to a full stop
> and really need some help if I am going to progress. I am a new user of
> R for medical statistics. I have attempted to read all the relevant
> documents, but would welcome any suggestions as to what I have missed.
>
> I am trying to construct "table 1" type tables (mostly contingency
> tables). I would like to include percentages, thus:
>
>              Cases    Controls       Total
>             N    %      N    %      N    %
> Total      50  100     50  100    100  100
>
>
> Sex: M     23   46     27   54     50   50
>
> etc...
>
> I hesitate even more to mention it here, but I am thinking of something
> along the lines of PROC TABULATE in SAS.
>
> The closest I have found in the documentation I have read so far is an
> example given in the help for "addmargins":
>
>     Bee <- sample( c("Hum","Buzz"), 177, replace=TRUE )
>     Sea <- sample( c("White","Black","Red","Dead"), 177, replace=TRUE )
>     ...
>     # Weird function needed to return the N when computing percentages
>     sqsm <- function( x ) sum( x )^2/100
>     B <- table(Sea, Bee)
>     round(sweep(addmargins(B, 1, list(list(All=sum, N=sqsm))), 2,
>           apply( B, 2, sum )/100, "/" ), 1)
>     round(sweep(addmargins(B, 2, list(list(All=sum, N=sqsm))), 1,
>           apply(B, 1, sum )/100, "/"), 1)
>
> .. Which introduced me to "sweep" and maybe could be extended to do
> what I want. But I don't like using mysterious "weird" functions.
>
> I recently found Paul Johnson's Rtips
> (http://www.ku.edu/~pauljohn/R/Rtips.html#6.1), which mentions the
> function prop.table; this is also close to what I want. But how can I
> show Ns and percentages in the same table?
>
> I wondered if there were a function which does this already. Or perhaps
> I should just write one for myself? Or should I not be trying to do this
> in R in the first place and go back to Excel (I no longer have access to
> SAS)? Please, NO! Or perhaps I am looking for the wrong thing in the
> manuals?
>
> I have followed recent advice to look at Frank E Harrell's detailed
> tabulation code, but this seems to produce many errors on my system and
> with my version of R (see below). I do not have access to LaTeX
> (apologies for incorrect typography). I can provide details of the
> errors if it turns out that the answer to my question is RTFM by Prof
> Harrell.
>
> I would like to add my two pennorth to the debate about "trivial"
> questions, of which I assume this is one. I believe that a very large
> amount of what is hard about learning R on one's own with documentation
> but without a real person, is a matter of vocabulary. I only found sweep
> and prop.table by chance since neither of them is indexed by words like
> "proportion" or "percentage" which is what I had been looking for.
> Similarly I still do not know exactly what "sweep" does, since I have
> never heard this verb used in a mathematical / statistical context, and
> the help on sweep states that what it does is sweep. I have experienced
> many similar examples in the last few weeks. This is not to say that
> there is anything wrong with the help on these functions nor with the
> help in general, but what R does not have is an extensive indexing
> system by synonyms and uses. It is largely for reasons like this, I
> believe, that trivial questions continue to be asked. If one does not
> know the name of the function to do "verb" and one has tried "verb" and
> the synonyms which spring to mind and drawn a blank, where to next?
>
> Another reason for difficulty is that while a function may exist to do
> something, it is sometimes hard to find the package where it is
> contained, e.g. Frank Harrell's functions seem to be in a package called
> Hmisc which is not listed in the drop-down box for "load package".
>
> System and version information:
>
> platform i386-pc-mingw32
> arch     i386
> os       mingw32
> system   i386, mingw32
> status
> major    2
> minor    0.1
> year     2004
> month    11
> day      15
> language R
>
> Rachel Pearce
>
> British Society of Blood and Marrow Transplantation
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
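On sweep(): as far as its help page goes, it combines an array with a vector of summary statistics along one margin using a binary function, so dividing each column by its column total is one typical use. prop.table() appears to wrap the same idea. A small sketch, using the counts implied by the "Sex: M" row quoted above (the female counts are inferred from the column totals of 50, and the object names are made up for illustration):

```r
# Counts implied by the quoted table: 23 male cases and 27 male controls,
# with 50 cases and 50 controls in total (female counts inferred).
counts <- matrix(c(23, 27, 27, 23), nrow = 2,
                 dimnames = list(sex = c("M", "F"),
                                 status = c("Cases", "Controls")))

# "Sweep out" the column totals: each entry is divided by its column's total.
col_totals <- colSums(counts)
by_sweep   <- sweep(counts, 2, col_totals, "/")

# prop.table() with margin = 2 should give the same column proportions.
by_prop <- prop.table(counts, margin = 2)
stopifnot(isTRUE(all.equal(by_sweep, by_prop, check.attributes = FALSE)))

# Column percentages, rounded to one decimal place.
round(100 * by_sweep, 1)
```

With margin 1 and rowSums(counts) in place of the column totals, the same call gives row proportions.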
--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 452-1424 (M, W, F)
fax: (917) 438-0894