[R-sig-Geo] Classification of attribute table

ONKELINX, Thierry Thierry.ONKELINX at inbo.be
Mon May 11 15:49:25 CEST 2009

Dear Wesley,

Have a look at k-means clustering. That will allow you to divide the data
points into a given number of clusters without any other user input.
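
As a minimal sketch, assuming the attribute table is available as a data
frame, kmeans() from the stats package could be used along these lines.
The data frame `segs` and its simulated values are hypothetical stand-ins
for the real attribute table; the column names follow the variables listed
in the thread. Standardising with scale() is an assumption made here so
that large-valued columns do not dominate the distances.

```r
# Simulated stand-in for the real attribute table (hypothetical data):
# 1916 segments, one column per summary statistic named in the thread.
set.seed(1)
n <- 1916
vars <- c("mean", "min", "max", "median", "var", "stdev", "kurtosis")
segs <- as.data.frame(matrix(rnorm(n * length(vars)), nrow = n,
                             dimnames = list(NULL, vars)))

# Standardise the variables, then ask k-means for three clusters.
# nstart = 25 repeats the random initialisation and keeps the best fit.
km <- kmeans(scale(segs[, vars]), centers = 3, iter.max = 50, nstart = 25)

# Add the cluster number (1, 2 or 3) as a new column of the table.
segs$class <- km$cluster
table(segs$class)
```

The cluster numbers are arbitrary labels: k-means does not know which
cluster is canopy, non-canopy or ground, so that mapping still has to be
decided by inspecting the cluster centres (km$centers).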



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be 

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-----Original message-----
From: r-sig-geo-bounces at stat.math.ethz.ch
[mailto:r-sig-geo-bounces at stat.math.ethz.ch] On behalf of Wesley Roberts
Sent: Monday 11 May 2009 14:36
To: Dan Putler
CC: r-sig-geo at stat.math.ethz.ch
Subject: Re: [R-sig-Geo] Classification of attribute table

Hi Dan,

Thanks for the advice. I want to classify my data into three classes
(canopy, non-canopy and ground) based on six input variables. The input
variables are mean, min, max, median, var, stdev, and kurtosis of
spatially co-incident spectra associated with each segment. I have 1916
cases and the data are formatted like an ESRI attribute table, each row
corresponds to one particular segment,
      mean  min  max  median  var  stdev  kurtosis
2        values extracted from the imagery

I would thus like to classify the segments into three classes and
essentially add an additional column to the attribute table with values
1, 2, and 3 denoting the class of the particular segment. Ideally the
classification must be un-supervised as the whole procedure should be as
automatic as possible with limited input from the user. Initially I
wanted to use lda (MASS) but it required training classes. 

An alternative option is to use the hypothesis that segments with
brighter spectra are more likely to come from tree crowns and thus just
subset / select the segments which fall, for example, above the 90th
percentile and label those as tree crowns.
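
A minimal sketch of that percentile rule, assuming "brightness" is
measured by the mean column and that segments above the 90th percentile
are the ones to label. Both choices, the data frame `segs` and the
simulated values are assumptions for illustration only.

```r
# Simulated stand-in for the mean brightness of 1916 segments
# (hypothetical data).
set.seed(1)
segs <- data.frame(mean = runif(1916, min = 0, max = 255))

# 90th percentile of mean brightness across all segments.
cutoff <- quantile(segs$mean, probs = 0.90)

# Flag segments brighter than the cutoff as candidate tree crowns.
segs$crown <- segs$mean > cutoff
sum(segs$crown)
```

The threshold (0.90) is the one user-supplied value in this approach and
can be changed through the probs argument.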

Many thanks,

Wesley Roberts MSc.
Researcher: Earth Observation (Ecosystems)
Natural Resources and the Environment
Tel: +27 (21) 888-2490
Fax: +27 (21) 888-2693

"To know the road ahead, ask those coming back."
- Chinese proverb

>>> Dan Putler <dan.putler at sauder.ubc.ca> 05/07/09 6:13 PM >>> 
Hi Wesley,

Is this a classification problem or a clustering problem? Specifically, is
the ultimate goal to predict what segment a new polygon belongs in, or
are you trying to form 3 segments to begin with based on the six
measures you have available? If it is the latter, it is a cluster
analysis problem rather than a classification problem, and you'll want
to look at the Cluster Analysis and Finite Mixture Models task view at


On Thu, 2009-05-07 at 14:58 +0200, Wesley Roberts wrote:
> Dear R-sig-geo users,
> I have the output of a watershed segmentation in vector format
> (shapefile) which has its attribute table populated with statistics
> regarding spectral reflectance of each polygon object. The attribute
> data was sourced from a geographically co-incident aerial photograph. I
> would now like to classify the segments using the attribute data. This
> seems like an easy task but I am struggling to find a suitable method. I
> have looked at 'lda' and 'qda' in the MASS package but the selection of
> an appropriate model using 'cv1EMtrain' takes a really long time. In
> essence all I want to do is classify the 6 variable data set into 3
> classes with the class for each case recorded in the attribute table.
> Any advice or suggestions would be greatly appreciated.
> Many thanks and kind regards,
> Wesley
> Wesley Roberts MSc.
> Researcher: Earth Observation (Ecosystems)
> Natural Resources and the Environment
> Tel: +27 (21) 888-2490
> Fax: +27 (21) 888-2693
> "To know the road ahead, ask those coming back."
> - Chinese proverb
Dan Putler
Sauder School of Business
University of British Columbia
