[R] svm of e1071 package

Steve Lianoglou mailinglist.honeypot at gmail.com
Tue Apr 6 17:35:13 CEST 2010


Hi,

On Tue, Apr 6, 2010 at 8:07 AM, Shyamasree Saha [shs] <shs at aber.ac.uk> wrote:
> Hello List,
>
> I am having a lot of trouble using the svm() function in the e1071
> package. I have 4 GB of data that I want to use to train an SVM. I am
> using the Amazon cloud, and my Amazon Machine Image (AMI) has 34.2 GB
> of memory. My R process was killed several times when I tried to use
> the full 4 GB of data, so I am now using a subset of only 1.4 GB, and
> I remove all unnecessary objects before calling svm().
>
> I have monitored the memory consumption: before I call svm(), my AMI
> has 25 GB of free memory. After calling svm(), the free memory keeps
> going down until only 1.7 GB is left, and R gives me an error saying
> that it cannot create a vector of size 3.4 GB. Of course R cannot
> create the vector if there is not enough memory, but my question is:
> how is the svm() function eating up that 25 GB of memory? Is there
> anything I can do to solve this, or is it a problem in the e1071
> package? By "problem in the e1071 package" I mean: does svm() in e1071
> normally consume that much memory? If it really does, then I have to
> think of some other way to train the SVM. If 34 GB of RAM is not
> enough for 1.4 GB of data, then I am in trouble. Amazon offers a
> maximum of 68.4 GB of RAM.

I think we need more info regarding your problem.

I'm guessing the answer must be yes since you're chewing up all that
memory, but are you sure you're running R in 64-bit mode? What do
you get when you type the following in the R console:

R> .Machine$sizeof.pointer ## it should be 8

* What type of kernel are you using? Have you tried different ones?
* Are you doing classification or regression?
* Is your data/feature matrix sparse? If so, are you passing libsvm a
SparseM matrix?
* Have you tried playing with some of the params in the svm call, like
the values for tolerance, epsilon, cost/nu/etc.?
* Try an even smaller subset of your data (< 1.4 GB)
* What is the dimensionality of your X matrix -- how many examples,
and how many features does each example have?
* Include sessionInfo() -- we don't know what version of R/e1071 etc.
* There is a kernlab package that also implements SVMs; try that.
* You can also try to precompute a kernel matrix and send that into
kernlab's ksvm function, maybe that helps?
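
Several of the questions above can be answered with a few lines of base
R. The following is only a sketch: the matrix X here is simulated and
stands in for your real feature matrix, and the sizes are made up. It
reports the dimensions, the memory taken by the data itself, the share
of zero entries (a hint as to whether a sparse representation is worth
it), and how large a dense n x n matrix of doubles would be, which is
what a fully precomputed kernel matrix over n examples amounts to.

```r
# Hypothetical stand-in for the real feature matrix; replace with your own X.
set.seed(1)
X <- matrix(rnorm(2000 * 50), nrow = 2000, ncol = 50)

n <- nrow(X)  # number of examples
p <- ncol(X)  # number of features per example

# Memory held by the data itself, in bytes
data_bytes <- as.numeric(object.size(X))

# A dense n x n matrix of doubles (e.g. a precomputed kernel matrix)
# needs 8 bytes per entry, so it grows quadratically with n.
kernel_bytes <- 8 * n^2

# Fraction of zero entries; values near 1 suggest a sparse representation
zero_fraction <- mean(X == 0)

cat(sprintf("examples: %d, features: %d\n", n, p))
cat(sprintf("data: %.1f MB, dense n x n matrix: %.1f MB\n",
            data_bytes / 2^20, kernel_bytes / 2^20))
cat(sprintf("zero fraction: %.3f\n", zero_fraction))
cat(sprintf("pointer size: %d bytes\n", .Machine$sizeof.pointer))
```

Because the n x n term is quadratic in the number of examples, it is
worth running this arithmetic on your real n before precomputing any
kernel matrix.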

Don't know, lots of things ... and you didn't provide any code, so
it's hard to figure out what's up.
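
Something along these lines would make it much easier to help. It is
only a sketch and assumes e1071 is installed: the data are simulated,
and the linear kernel, cost = 1 and the subset sizes are arbitrary
choices for illustration, not recommendations. The idea is to fit on
increasingly large subsets and see how run time and the number of
support vectors grow before committing to the full data set.

```r
library(e1071)  # assumes the e1071 package is installed

# Simulated two-class data standing in for the real data set (hypothetical).
set.seed(42)
n <- 400
X <- matrix(rnorm(n * 10), nrow = n, ncol = 10)
y <- factor(ifelse(X[, 1] + X[, 2] > 0, "pos", "neg"))

# Fit on increasingly large subsets; record elapsed time and model size.
sizes <- c(100, 200, 400)
results <- data.frame(n = sizes, seconds = NA_real_, n_sv = NA_integer_)
for (i in seq_along(sizes)) {
  idx <- seq_len(sizes[i])
  elapsed <- system.time(
    fit <- svm(X[idx, ], y[idx], kernel = "linear", cost = 1)
  )["elapsed"]
  results$seconds[i] <- elapsed
  results$n_sv[i] <- fit$tot.nSV  # total number of support vectors
}
print(results)

# Version information to include with the report
print(sessionInfo())
```

Posting the equivalent of this with your actual svm() call, together
with the printed table and the sessionInfo() output, would show where
things start to blow up.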

If your problem is really too huge, there are other SVM
implementations you might consider looking into, such as Pegasos SVM,
LIBLINEAR, SVM^perf, etc., depending on the problem you're trying to
solve.

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


