[R] SVM classification based on pairwise distance matrix

Martin Tomko martin.tomko at geo.uzh.ch
Fri Oct 22 13:01:01 CEST 2010


Hi Steve,
thanks a lot, I will have a look at the kernel approach, which looks 
promising. I will first have to study the theory behind it before I use it, 
I guess.
Cheers
M.

On 10/21/2010 5:42 PM, Steve Lianoglou wrote:
> Hi,
>
> On Thu, Oct 21, 2010 at 9:42 AM, Martin Tomko <martin.tomko at geo.uzh.ch> wrote:
>    
>> Dear all,
>> I am exploring the possibilities for automated classification of my
>> data. I have successfully used KNN, but was thinking about looking at
>> SVM (which I have not used before).
>> I have a pairwise distance matrix of training observations which are
>> classified in set classes, and a distance matrix of new observations to
>> the training ones.
>>      
> It seems to me that since you have some pairwise distance metric, your
> original data is in some "vector form".
>
> Why not just try using your original data (forget the pairwise
> distance for now) and try a few different kernels for the svm, such as
> a linear kernel or an RBF/Gaussian kernel?
>
>    
>> Is it possible to use distance matrices for SVM, and if yes, which
>> package would do so (e1071 ? ).
>>      
> I guess you can think of a "kernel matrix" as something like a
> distance matrix -- actually, it's more like a similarity matrix.
>
> I don't recall if e1071 allows you to use a kernel matrix as input, but
> I'm pretty sure the svm functions from kernlab do. It was a pain to
> use, though.
>
> But anyway -- don't use your distance matrix :-)
>
>    
>> I have little experience with SVM, and I had the impression that it is
>> a/ usually used with data that have observations in terms of a number of
>> variables (hence, not pairwise distances);
>>      
> With the exception of "plugging in" a kernel matrix (which was
> calculated from data in its original feature space) that's pretty much
> correct.
>
>    
>> b/ it is not well suited for large multidimensional spaces (I have a
>> distance matrix of 200*200 observations, a part of this could be used as
>> training data, but still, we are looking at say 50 distances per
>> observation).
>>      
> But your distance matrix isn't really the same multidimensional space
> your data lives in, right?
>
> Anyway, like I said before, try the SVM on your original data with
> some different kernels. I think the RBF kernel should be closest in
> spirit to your distance matrix, and will likely perform better than
> your kNN ;-).
>
> Hope that helps,
> -steve
>
>
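
To make the relation between a distance matrix and a kernel (similarity) matrix discussed above more concrete, here is a minimal sketch. It is written in Python with only the standard library, purely to illustrate the arithmetic; it is not code from this thread and not the API of e1071 or kernlab. The function names, the toy data, and the value of gamma are all assumptions made for the example. It assumes the distances are Euclidean distances between the original feature vectors; for other distance measures the resulting matrix is not guaranteed to be a valid (positive semi-definite) kernel.

```python
import math


def euclidean_distance_matrix(points):
    """Pairwise Euclidean distances between equal-length feature vectors."""
    return [[math.dist(p, q) for q in points] for p in points]


def rbf_kernel_from_distances(dist, gamma):
    """Turn distances into similarities: K[i][j] = exp(-gamma * dist[i][j]**2)."""
    return [[math.exp(-gamma * d * d) for d in row] for row in dist]


# Hypothetical toy data: three observations with two variables each.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
gamma = 0.5  # assumed kernel width; in practice this would be tuned

# Distance 0 maps to similarity 1; larger distances decay towards 0.
D = euclidean_distance_matrix(points)
K = rbf_kernel_from_distances(D, gamma)

for row in K:
    print(["%.3f" % value for value in row])
```

The same transformation could be applied to the matrix of distances from new observations to the training ones, using the same gamma, to obtain the similarities an SVM with a precomputed kernel would need at prediction time. Whether this beats running the SVM on the original variables is something only a comparison on the actual data can show.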


