[R] scatterplot and correlation for weird data format
Jim Lemon
jim at bitwrit.com.au
Tue Feb 17 12:02:24 CET 2009
William Simpson wrote:
> I have data in a format like this:
>
> name ssex sex view num rating rt
> ahl4 f m f 56 -108 2246
> ahl4 f m f 74 85 1444
> ahl4 f m f 52 151 1595
> ahl4 f m f 85 1 1447
> ahl4 f m f 53 46 1716
> ahl4 f m f 37 145 1276
> ahl4 f m f 50 98 1465
> ahl4 f m f 51 -26 1322
> ahl4 f m f 38 -97 1790
> ahl4 f m f 14 -158 865
> ...
> ahl4 f m p 43 -136 1669
> ahl4 f m p 10 -59 808
> ahl4 f m p 67 -111 1279
> ahl4 f m p 85 -86 994
> ahl4 f m p 100 134 1337
> ahl4 f m p 76 56 665
> ahl4 f m p 51 -49 594
> ahl4 f m p 33 -118 505
> ahl4 f m p 49 -156 1283
> ...
> and so on for many subjects (name)
>
> I would like to do a scatterplot of the rating given by each subject
> (with identifier "name") for the frontal (view=="f") and profile
> (view=="p") views of each face (each face has an identifier "num").
> I'd like to find the correlation as well.
> For each subject, since there are 100 faces, there will be 100 points
> on the scatterplot. I would just lump all the subjects' data together
> for the plot and correlation I think (unless somebody tells me I
> should do each subject separately).
>
> I'm stumped on how to do this. Thanks very much for any help!
>
Hi Bill,
The first thing that comes to mind is a variation on count.overplot (in
the plotrix package), a function that displays the number of overplotted
points within a given tolerance rather than a blur of separate symbols.
The difficulty in your case would be separating the various categories
of experimental stimuli. You could use, say, "F" and "P" as suffixes for
the counts to indicate orientation, color to indicate sex of face, a
male/female symbol for sex of respondent, and so on. The drawback is
that you end up with a plot that is difficult to interpret, as each
entry (of which there will still be many) must be decoded by the viewer.
If you think this is worth pursuing, email me and I will try to outline
a way to do it.
Another, perhaps simpler way is to define a summary score for each
subject for each class of face and plot that.
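As a rough sketch of how that might look in base R (not a tested
solution for your actual file): the toy data frame "faces" below is
invented and only mimics the columns name, view, num and rating from
your example, and using the mean as the summary score is just one
possible choice. It also pairs the frontal and profile ratings of each
face within each subject, which gives the pooled and per-subject
correlations you asked about.

```r
# Toy data in the same layout as the post (values are made up)
set.seed(1)
faces <- expand.grid(num = 1:5, view = c("f", "p"),
                     name = c("ahl4", "subj2"),
                     stringsAsFactors = FALSE)
faces$rating <- round(rnorm(nrow(faces), mean = 0, sd = 100))

# Pair each subject's frontal and profile rating of the same face
frontal <- faces[faces$view == "f", c("name", "num", "rating")]
profile <- faces[faces$view == "p", c("name", "num", "rating")]
paired <- merge(frontal, profile, by = c("name", "num"),
                suffixes = c(".f", ".p"))

# Pooled scatterplot and correlation across all subjects
plot(paired$rating.f, paired$rating.p,
     xlab = "Frontal rating", ylab = "Profile rating")
pooled_r <- cor(paired$rating.f, paired$rating.p)

# Correlation computed separately for each subject
per_subject_r <- sapply(split(paired, paired$name),
                        function(d) cor(d$rating.f, d$rating.p))

# One possible summary score: mean rating per subject and view
summary_scores <- aggregate(rating ~ name + view, data = faces,
                            FUN = mean)
```

With the real data you would read the file with read.table(...,
header = TRUE) in place of the toy data frame; pooling all subjects
ignores between-subject differences, so comparing pooled_r with
per_subject_r is a quick check on whether lumping is reasonable.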
Jim