[R] relationship between two discrete variables
Paul Sorenson
Paul.Sorenson at vision-bio.com
Tue Nov 11 00:10:50 CET 2003
I want to investigate possible relationships between two discrete variables. I have tried a few things but figured you guys might be able to point me at some purpose-built functions.
Our scientists score the results of tests which are performed in, let's say, 8 positions. The scores are assigned a value of 1, 2, 3 or 4. I want to know if there is a correlation between the test results and the position. The scientists have a feeling that position 1 does not score as high as the others.
Not all 8 positions are always used, so the frequency of all test results can be substantially biased towards the first position. Here is an example dataset (not very biased) resulting from table(result, position):
        position
result    1   2   3   4   5   6   7   8
     0    3   3   2   2   0   3   3   0
     1   11   4   6   7   7   3   3   5
     2   38  37  32  38  31  21  23  27
     3   51  66  54  66  57  37  58  56
     4    3   1   3   0   1   0   1   1
Because the test results are highly quantized, the boxplots I tried all looked pretty much the same.
The bias means that stacked barplots aren't that useful for visualising the data. With a bit of data processing I guess I could normalise the total frequencies of each test position.
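A minimal sketch of that normalisation, assuming the counts above are typed in by hand (the names counts, tab and props are only illustrative). prop.table() with margin = 2 divides each column by its total, so every position sums to 1 and the stacked bars become comparable regardless of how often a position was used:

```r
# Counts copied from the table(result, position) output above
counts <- matrix(
  c( 3,  3,  2,  2,  0,  3,  3,  0,
    11,  4,  6,  7,  7,  3,  3,  5,
    38, 37, 32, 38, 31, 21, 23, 27,
    51, 66, 54, 66, 57, 37, 58, 56,
     3,  1,  3,  0,  1,  0,  1,  1),
  nrow = 5, byrow = TRUE,
  dimnames = list(result = as.character(0:4),
                  position = as.character(1:8))
)
tab <- as.table(counts)

# Proportion of each result within each position (columns sum to 1)
props <- prop.table(tab, margin = 2)
print(round(props, 3))

# Stacked bars now all have height 1, one bar per position
barplot(props, xlab = "position", ylab = "proportion of results",
        legend.text = TRUE)
```

With real data the same thing should work directly on the table, i.e. prop.table(table(result, position), margin = 2).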
I also tried a correlation between the two variables. The answer is non-zero, but I am not sure that any relationship between the two variables would be monotonic. (BTW, cor() gives me the correlation coefficient; how do I get the "confidence" of the coefficient?)
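One possible way to get at the "confidence" is cor.test() from the stats package, which reports a p-value and, for the Pearson coefficient, a confidence interval. The sketch below is an assumption-laden illustration: it rebuilds one row per observation from the counts above (the names counts, obs, pearson and kendall are made up for the example), and both tests treat position as an ordered number, so neither addresses a non-monotonic relationship.

```r
# Counts copied from the table(result, position) output above
counts <- matrix(
  c( 3,  3,  2,  2,  0,  3,  3,  0,
    11,  4,  6,  7,  7,  3,  3,  5,
    38, 37, 32, 38, 31, 21, 23, 27,
    51, 66, 54, 66, 57, 37, 58, 56,
     3,  1,  3,  0,  1,  0,  1,  1),
  nrow = 5, byrow = TRUE,
  dimnames = list(result = as.character(0:4),
                  position = as.character(1:8))
)

# Expand the table back to one row per scored test
df  <- as.data.frame(as.table(counts))          # columns: result, position, Freq
obs <- df[rep(seq_len(nrow(df)), df$Freq), c("result", "position")]
obs$result   <- as.numeric(as.character(obs$result))
obs$position <- as.numeric(as.character(obs$position))

# Pearson correlation: estimate, p-value and 95% confidence interval
pearson <- cor.test(obs$result, obs$position)
print(pearson$estimate)
print(pearson$conf.int)

# Rank-based alternative that copes better with the coarse scores;
# exact = FALSE because the data are heavily tied
kendall <- cor.test(obs$result, obs$position,
                    method = "kendall", exact = FALSE)
print(kendall$p.value)
```

With the raw vectors available, the expansion step is unnecessary and cor.test(result, position) can be called directly.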
Maybe I am overlooking the obvious, like just averaging the scores.
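A rough sketch of the averaging idea, assuming the scores can be treated as numbers and using the counts above (counts, scores and mean_score are illustrative names). Each position's mean is the count-weighted average of the score values, so unequal usage of the positions does not distort the comparison:

```r
# Counts copied from the table(result, position) output above
counts <- matrix(
  c( 3,  3,  2,  2,  0,  3,  3,  0,
    11,  4,  6,  7,  7,  3,  3,  5,
    38, 37, 32, 38, 31, 21, 23, 27,
    51, 66, 54, 66, 57, 37, 58, 56,
     3,  1,  3,  0,  1,  0,  1,  1),
  nrow = 5, byrow = TRUE,
  dimnames = list(result = as.character(0:4),
                  position = as.character(1:8))
)

# Score value attached to each row of the table
scores <- as.numeric(rownames(counts))

# Weighted mean score per position: sum(score * count) / sum(count).
# counts * scores recycles scores down each column, i.e. row-wise.
mean_score <- colSums(counts * scores) / colSums(counts)
print(round(mean_score, 3))

# Position 1 against all other positions pooled together
others <- rowSums(counts[, -1])
print(c(position1 = unname(mean_score["1"]),
        others    = sum(others * scores) / sum(others)))
```

With the raw vectors, tapply(result, position, mean) should give the same per-position means.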
cheers