[R] Advice for working with Sammon's Projection on image data

B. Bogart bbogart at sfu.ca
Sat May 31 18:54:49 CEST 2008


Hello all,

I'm working on a project that uses a SOM to generate feature maps from
image data. Each image is 100x75 pixels in RGBA, so 30,000 elements
per image.

I want to examine the structure of the pixel-by-pixel Euclidean
distances between my images (where each image is a point in
30,000-dimensional space).

Sammon's projection seems appropriate for this, though I'm a bit
concerned that 400 images with 30,000 dimensions may be too large for the
algorithm. I'm planning on only publishing B+W images, so I could throw
away the colour channels to make each image 7,500 dimensions.
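
A minimal sketch of the colour-to-greyscale reduction, under assumptions
that are not in the post: the data sit in a matrix with one row per image,
and the 30,000 columns are laid out as all R values, then all G, then all
B, then all A (a hypothetical ordering; the real file may interleave
channels). The values are random stand-ins, and averaging R, G and B is
only one of several possible greyscale conversions. Note that sammon() in
MASS takes the pairwise distances as input, so the object it works on has
one entry per pair of images, however many pixels each image has.

```r
# Stand-in data: 5 images, 100x75 pixels, 4 channels, values in [0, 1].
# ASSUMED layout per row: all R values, then all G, then all B, then all A.
set.seed(1)
n_img <- 5
n_pix <- 100 * 75
imgs_rgba <- matrix(runif(n_img * n_pix * 4), nrow = n_img)

# Split the columns into channel blocks and average R, G and B.
# The alpha block is simply dropped.
r <- imgs_rgba[, 1:n_pix]
g <- imgs_rgba[, n_pix + 1:n_pix]
b <- imgs_rgba[, 2 * n_pix + 1:n_pix]
imgs_grey <- (r + g + b) / 3

dim(imgs_grey)          # 5 rows (images), 7500 columns (pixels)

# Euclidean distances between images; dist() works on rows.
d <- dist(imgs_grey)
length(d)               # n_img * (n_img - 1) / 2 pairwise distances
```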

Also, I'm not sure how to structure my data to make it easiest to use
with Sammon's projection. A 30,000-element vector for each image
occurs to me first, but I'm not sure what the best data format is for
that. I'm currently using two layouts. The raw file has 30,000
variables and one observation per image. This is converted to a data
frame where each pixel is an observation, with variables for its
position within the image and for the image that contains it; that
data frame is used for ggplot2 tile plotting.
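
For illustration, here is a sketch of how the wide layout (one row per
image, one column per element) could feed Sammon's projection. It assumes
the MASS package, whose sammon() function accepts the output of dist().
The matrix `imgs` is random stand-in data at a reduced size; with the real
file, something like as.matrix() on the one-observation-per-image data
frame would presumably take its place.

```r
library(MASS)  # provides sammon()

# Stand-in data: 40 "images" of 300 elements each
# (the real case would be 400 rows by 30,000 columns).
set.seed(42)
n_img <- 40
n_elem <- 300
imgs <- matrix(runif(n_img * n_elem), nrow = n_img,
               dimnames = list(paste0("img", seq_len(n_img)), NULL))

# Pairwise Euclidean distances between rows, i.e. between images.
d <- dist(imgs)

# Sammon mapping to 2 dimensions; by default the starting
# configuration comes from classical scaling, cmdscale().
proj <- sammon(d, k = 2, trace = FALSE)

head(proj$points)   # one 2-D coordinate pair per image
proj$stress         # final stress of the mapping (lower is better)
```

plot(proj$points) would then draw the map. Note that sammon() stops with
an error if two images are identical, because zero distances are not
allowed.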

What I am attempting to do is examine the topology of my test data in
order to compare it to the topology of the SOM weights trained on that
data. The projections should be similar if topology is being preserved,
correct?
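
One possible way to make the two projections directly comparable is to
project the data and the SOM weights together, since two separate Sammon
maps can differ by an arbitrary rotation or reflection. The sketch below
is only an illustration of that idea. `imgs` is random stand-in data, and
`codes` is a hypothetical stand-in for the SOM weight vectors, built from
group means so that the example runs without a SOM package; real weights
would come from the trained map.

```r
library(MASS)  # provides sammon()

set.seed(7)
imgs <- matrix(runif(60 * 200), nrow = 60)   # stand-in image data

# HYPOTHETICAL stand-in for SOM weights: means of 9 arbitrary groups
# of images. Replace with the trained map's weight vectors.
grp <- rep(1:9, length.out = nrow(imgs))
codes <- rowsum(imgs, grp) / as.vector(table(grp))

# Project data and weights together so they share one coordinate system.
both <- rbind(imgs, codes)
proj <- sammon(dist(both), k = 2, trace = FALSE)

# Split the joint projection back into its two parts.
is_code <- c(rep(FALSE, nrow(imgs)), rep(TRUE, nrow(codes)))
data_xy <- proj$points[!is_code, ]
code_xy <- proj$points[is_code, ]
```

plot(data_xy) followed by points(code_xy, pch = 19) would overlay the
weights on the data, showing whether the weights spread over the same
regions as the images.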

Any advice appreciated.

Thanks,
B. Bogart


