[R] stat problem: scaling of random subset of survey?

context grey mobygeek at yahoo.com
Sat May 6 03:01:11 CEST 2006

```Hello,

I'm curious whether anyone has encountered a version of this problem
(and its solution), which involves finding a consistent set of scales
for subsets of survey data.

The goal is to obtain people's ratings of the pairwise similarity of a
large number of items, on a 1..5 scale for example:
How similar is object A to B    on a 1..5 scale ___
How similar is object A to C    on a 1..5 scale ___
etc.

Because there are many items, there are N(N-1)/2 pairs, so it is not
practical to show every pair to everyone. Showing people the pairs
corresponding to random subsets of the objects seems desirable.

The problem is that a particular random subset might by chance contain
objects whose pairs would all be rated "5" if one were to see the entire
dataset. When rating pairs from this subset, a person effectively uses
a different 1..5 scale.

If we ensure that each pair of people sees some data in common, then
one can think about obtaining a set of scales, one for each person,
that makes the scores of the commonly rated data as similar as
possible, summed across all pairs of people.

In more detail, person 1 ranks pairs 11, 15, 101, ...; person 2 ranks
pairs 2, 15, 73, ... Put these data in the rows of a matrix R (ranking),
with zeros for the data not ranked, and seek a diagonal matrix S that
contains a scale for each row and maximizes the similarity of the
elements in common (#15 in this case). This is underdetermined, but we
can require that Tr(S'S) = 1, since the overall scale is not important,
only the ranking.

That's the description... please let me know if you know of any
similar problems.

Thank you.

```
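
One way to make the problem above concrete is as a least-squares fit. This is an assumption, not something the post states: choose per-person scales s_i (the diagonal of S) to minimize the sum, over every pair of people (i, j) and every item k that both rated, of (s_i R_ik - s_j R_jk)^2, subject to Tr(S'S) = 1. That objective is a quadratic form s'Ms, so a minimizer is the eigenvector of M with the smallest eigenvalue. The Python sketch below uses only the standard library; the function name `fit_scales`, the use of power iteration, and the toy data are illustrative choices, and zeros mark unrated pairs as in the post.

```python
import math

def fit_scales(R, iters=5000, tol=1e-12):
    """Return one scale per row of R (hypothetical helper, not from the post).

    R[i][k] is person i's score for pair k; 0 means "not rated".
    Minimizes sum over people i<j and shared pairs k of
    (s_i*R[i][k] - s_j*R[j][k])**2 subject to sum(s_i**2) == 1.
    """
    n, m = len(R), len(R[0])
    # Build M so that the objective equals s' M s.
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(m):
                a, b = R[i][k], R[j][k]
                if a and b:  # both people rated pair k
                    M[i][i] += a * a
                    M[j][j] += b * b
                    M[i][j] -= a * b
                    M[j][i] -= a * b
    # Gershgorin bound: every eigenvalue of M is <= c.
    c = max(sum(abs(x) for x in row) for row in M)
    if c == 0:
        raise ValueError("no two people share a rated pair")
    # Power iteration on (c*I - M) converges to the eigenvector of M
    # with the smallest eigenvalue; entries stay non-negative because
    # c*I - M has no negative entries.
    s = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        t = [c * s[i] - sum(M[i][j] * s[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in t))
        t = [x / norm for x in t]
        done = max(abs(x - y) for x, y in zip(s, t)) < tol
        s = t
        if done:
            break
    return s

# Toy data (made up): each person reports the same underlying scores
# multiplied by a personal gain, and sees only 4 of the 6 pairs.
# The values are not confined to 1..5; only the structure matters here.
true_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 2.0]
gains = [1.0, 2.0, 0.5]
seen = [{0, 1, 2, 3}, {2, 3, 4, 5}, {0, 1, 4, 5}]
R = [[true_scores[k] * gains[i] if k in seen[i] else 0.0 for k in range(6)]
     for i in range(3)]
s = fit_scales(R)
```

In this toy setup every person overlaps with every other, and the recovered scales come out proportional to 1/gain, so the rescaled rows agree on the shared pairs. If the overlap graph between people were disconnected, the scales of the separate groups could not be tied together by this fit.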