[R] Varying results of sammon(), for the same data set
Ole Edsberg
edsberg at stud.ntnu.no
Mon Jan 30 09:39:15 CET 2006
Hello,
I have a data set on which I run the sammon algorithm as follows:
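library(MASS)
# distance matrix read from a plain-text file
data = read.table('problemforr.dat')
# classical MDS solution, used as the starting configuration
y = cmdscale(data, add=TRUE)
# Sammon mapping started from the classical MDS points
s = sammon(data, y$points)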
(In case it should be relevant, I make the data available at
http://idi.ntnu.no/~edsberg/problemforr.dat)
With R 2.2.1 on Debian Sid I always get one of two solutions (stress
1.74288 after 10 iterations or stress 1.33629 after 9 iterations). I
always get the same result within the same R session, even if I read
the data again. With R 2.2.0 on SunOS 5.9 I always get the same result
(stress 0.13186 after 74 iterations).
I understand that the sammon algorithm is very sensitive to even tiny
variations in the starting point, but the observed behaviour seems
strange to me. Differences between machines could perhaps be explained
by floating-point portability issues, but not differences on the same
machine, nor the fact that I get the same result within the same R
session.
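The sensitivity to the starting point could be probed with a small experiment along the following lines. This is only a sketch under assumptions: it uses the eurodist distances that ship with R as a stand-in (problemforr.dat is not used), run_from is a hypothetical helper, and the jitter size is an arbitrary choice. On a well-behaved data set all runs may agree, so identical stresses here would not rule out the effect for other data.

```r
library(MASS)

# Stand-in data (assumption): road distances shipped with R,
# not the problemforr.dat file from the post.
d <- eurodist

# Classical MDS solution as the reference starting configuration.
y0 <- cmdscale(d, k = 2)

# Hypothetical helper: final stress of sammon() from a given start.
run_from <- function(start) sammon(d, y = start, trace = FALSE)$stress

set.seed(1)  # makes the jitter itself reproducible
jitter_sd <- 1e-6 * sd(as.vector(y0))  # arbitrary, very small perturbation

stress_ref <- run_from(y0)
stress_jit <- replicate(5, {
  noise <- matrix(rnorm(length(y0), sd = jitter_sd), nrow = nrow(y0))
  run_from(y0 + noise)
})

# Compare the reference stress with the stresses from jittered starts.
print(stress_ref)
print(stress_jit)
```

If the jittered runs land on clearly different stress values, tiny differences in the cmdscale() output would be enough to explain different sammon() solutions.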
I read in the documentation
(http://stat.ethz.ch/R-manual/R-patched/library/MASS/html/sammon.html)
that "Further, since the configuration is only determined up to
rotations and reflections (by convention the centroid is at the
origin), the result can vary considerably from machine to machine."
This doesn't make sense to me. If the data and the algorithm are the
same, the result should be the same. What differences between machines
do they refer to here? Floating-point issues?
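The "rotations and reflections" part of that sentence can at least be checked directly: rotating or reflecting a configuration leaves all inter-point distances, and hence the stress, unchanged. The sketch below assumes the usual definition of Sammon stress (sum of (d - dhat)^2 / d over pairs, divided by the sum of d); sammon_stress is a hypothetical helper written for this illustration, not a MASS function, and eurodist again stands in for the real data.

```r
library(MASS)

d <- eurodist                     # stand-in data (assumption)
fit <- sammon(d, trace = FALSE)   # default start: cmdscale(d, k = 2)

# Hypothetical helper: Sammon stress of configuration `pts`
# against the target distances `d`, using the usual definition.
sammon_stress <- function(d, pts) {
  d <- as.vector(as.dist(d))      # target distances, one per pair
  dhat <- as.vector(dist(pts))    # distances in the fitted configuration
  sum((d - dhat)^2 / d) / sum(d)
}

theta <- pi / 3
rot <- matrix(c(cos(theta), sin(theta),
                -sin(theta), cos(theta)), nrow = 2)  # rotation by theta
refl <- diag(c(1, -1))                               # reflection of the second axis

s0    <- sammon_stress(d, fit$points)
s_rot <- sammon_stress(d, fit$points %*% rot)
s_ref <- sammon_stress(d, fit$points %*% refl)

# The three values should agree up to floating-point rounding.
print(c(original = s0, rotated = s_rot, reflected = s_ref))
```

So two machines could return configurations that look quite different when plotted yet have the same stress. That would not explain different stress values, which is the case observed here.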
I must admit that I am a beginner, both in R and in statistics. I'm
very curious about the cause of this strangeness. Does anybody have an
explanation?
Best Regards,
Ole Edsberg