[R] relationship between two discrete variables
Thomas W Blackwell
tblackw at umich.edu
Tue Nov 11 04:48:43 CET 2003
This situation seems like an obvious candidate for a log-linear model.
See the book MASS (Venables and Ripley, "Modern Applied Statistics with S")
for details. They're beyond the scope of this list.
Or try help.search("log-linear").
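
A minimal sketch of such a fit, offered as an illustration rather than a definitive analysis. The counts are copied from the table in the quoted message below; the object names (counts, tab, fit, p_value, pearson_resid) are made up for this example. loglin() from the stats package fits the independence model "result + position", and a large likelihood-ratio statistic relative to its degrees of freedom would suggest that the score distribution differs between positions. Several cells are small, so the chi-squared approximation should be read with caution.

```r
# Counts from the quoted message: rows are results 0-4, columns are positions 1-8.
counts <- matrix(c( 3,  3,  2,  2,  0,  3,  3,  0,
                   11,  4,  6,  7,  7,  3,  3,  5,
                   38, 37, 32, 38, 31, 21, 23, 27,
                   51, 66, 54, 66, 57, 37, 58, 56,
                    3,  1,  3,  0,  1,  0,  1,  1),
                 nrow = 5, byrow = TRUE,
                 dimnames = list(result = 0:4, position = 1:8))
tab <- as.table(counts)

# Independence model: fit the two one-way margins only (no interaction term).
fit <- loglin(tab, margin = list(1, 2), fit = TRUE, print = FALSE)

# Likelihood-ratio test of independence between result and position.
p_value <- pchisq(fit$lrt, df = fit$df, lower.tail = FALSE)

# Pearson residuals show which cells depart most from independence,
# e.g. whether position 1 has more low scores than expected.
pearson_resid <- (tab - fit$fit) / sqrt(fit$fit)
print(round(pearson_resid, 2))
```

The same independence model can also be written as a Poisson glm() on the long-format counts; loglin() is used here only because it takes the table directly.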
(and ... can you find a way to break lines when sending your email ?)
- tom blackwell - u michigan medical school - ann arbor -
On Tue, 11 Nov 2003, Paul Sorenson wrote:
> I want to investigate possible relationships between two discrete variables. I have tried a few things but figured you guys might be able to point me at some purpose-built functions.
> Our scientists score results of tests which are performed in, let's say, 8 positions. The scores are assigned a value of 1, 2, 3 or 4. I want to know if there is a correlation between the test results and the position. The scientists have a feeling that position 1 does not score as high as the others.
> Not all 8 positions are always used, so the frequency of all test results can be substantially biased towards the first position. Here is an example dataset (not very biased) resulting from table(result, position):
>      1  2  3  4  5  6  7  8
>   0  3  3  2  2  0  3  3  0
>   1 11  4  6  7  7  3  3  5
>   2 38 37 32 38 31 21 23 27
>   3 51 66 54 66 57 37 58 56
>   4  3  1  3  0  1  0  1  1
> Because the test results are highly quantized, the boxplots I tried all looked pretty much the same.
> The bias means that stacked barplots aren't that useful for visualising the data. With a bit of data processing I guess I could normalise the total frequencies of each test position.
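
One possible way to do that normalisation, sketched under the assumption that the counts are held in a matrix like the table above (the names counts, col_prop, scores and mean_score are hypothetical). prop.table() with margin = 2 rescales each position's column to sum to 1, so positions with different numbers of tests become comparable; the last lines compute the mean score per position from the same counts.

```r
# Counts from the table above: rows are results 0-4, columns are positions 1-8.
counts <- matrix(c( 3,  3,  2,  2,  0,  3,  3,  0,
                   11,  4,  6,  7,  7,  3,  3,  5,
                   38, 37, 32, 38, 31, 21, 23, 27,
                   51, 66, 54, 66, 57, 37, 58, 56,
                    3,  1,  3,  0,  1,  0,  1,  1),
                 nrow = 5, byrow = TRUE,
                 dimnames = list(result = 0:4, position = 1:8))

# Proportion of each result within each position (every column sums to 1).
col_prop <- prop.table(counts, margin = 2)
print(round(col_prop, 2))

# Mean score per position: weight each result value by its count.
scores <- as.numeric(rownames(counts))
mean_score <- colSums(counts * scores) / colSums(counts)
print(round(mean_score, 2))
```

Passing col_prop to barplot() would then give stacked bars of equal height, one per position.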
> I also tried a correlation between the two variables. The answer is non-zero, but I am not sure that any relationship between the two variables would be monotonic. (BTW, cor() gives me the correlation coefficient; how do I get the "confidence" of the coefficient?)
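
On the "confidence" question, cor.test() in the stats package reports a p-value alongside the estimate, and for the Pearson coefficient also a confidence interval. A rough sketch follows, assuming the raw observations can be rebuilt from the table above; the names counts, long, result, position, ct and ct_kendall are invented for the example. Because the scores are ordinal with many ties, a rank-based method such as Kendall's tau may be the more defensible choice, and neither test detects a non-monotonic relationship.

```r
# Counts from the table above: rows are results 0-4, columns are positions 1-8.
counts <- matrix(c( 3,  3,  2,  2,  0,  3,  3,  0,
                   11,  4,  6,  7,  7,  3,  3,  5,
                   38, 37, 32, 38, 31, 21, 23, 27,
                   51, 66, 54, 66, 57, 37, 58, 56,
                    3,  1,  3,  0,  1,  0,  1,  1),
                 nrow = 5, byrow = TRUE,
                 dimnames = list(result = 0:4, position = 1:8))

# Expand the table back to one row per test.
long <- as.data.frame(as.table(counts))
long <- long[rep(seq_len(nrow(long)), long$Freq), ]
result   <- as.numeric(as.character(long$result))
position <- as.numeric(as.character(long$position))

# Pearson correlation with a 95% confidence interval and p-value.
ct <- cor.test(result, position)
print(ct$estimate)
print(ct$conf.int)

# Kendall's tau; exact = FALSE because the data contain many ties.
ct_kendall <- cor.test(result, position, method = "kendall", exact = FALSE)
print(ct_kendall$p.value)
```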
> Maybe I am overlooking the obvious, like just averaging the scores.