[R] bigkmeans not parallel

huazi0204 lishuliu at gmail.com
Thu Feb 2 23:22:01 CET 2012


I'm using bigkmeans from the 'biganalytics' package to cluster a 60,000 by
600,000 matrix on an 8-core Linux VM.
I have registered a parallel backend with
>registerDoMC()

I then checked how many workers were registered with
>getDoParWorkers()
It returns 8, which is the number of cores on my machine.
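
A further check on the backend itself can be sketched as below. It is a minimal sketch, assuming the foreach and doMC packages are installed; the 2 workers and 8 tasks are arbitrary illustration values, not my real setup. It asks foreach which backend is registered and collects the process ID of whichever worker handled each task.

```r
library(foreach)
library(doMC)

# Assumption: 2 workers, chosen only to keep the example small
registerDoMC(cores = 2)

registered <- getDoParRegistered()  # TRUE if a parallel backend is registered
backend    <- getDoParName()        # name of the registered backend
workers    <- getDoParWorkers()     # number of workers foreach will use

# Each task returns the PID of the process that executed it
pids <- foreach(i = 1:8, .combine = c) %dopar% Sys.getpid()

# Number of distinct processes that actually did work
n_procs <- length(unique(pids))
print(list(backend = backend, workers = workers, n_procs = n_procs))
```

If n_procs is greater than 1, the tasks were spread over several processes, so the backend is working independently of bigkmeans.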

I then ran the test below, whose results show a speedup from running in
parallel.
check <- function(n) {
  for (i in 1:1000) {
    sme <- matrix(rnorm(100), 10, 10)
    solve(sme)
  }
}
times <- 100     # times to run the loop
system.time(x <- foreach(j=1:times ) %dopar% check(j))
user  system elapsed
-----        ------       4
system.time(x <- foreach(j=1:times ) %do% check(j))
user  system elapsed
-----        -------      16

But when I run bigkmeans on my data
>ans <- bigkmeans(data,200,nstart=5,iter.max=20)
I see only one R process in the system monitor, and only one CPU shows high usage.
I guess it is not really running in parallel.

I also tried doSNOW, although it is meant for clusters of several machines.
>cl <- makeCluster(8,type="SOCK")
>registerDoSNOW(cl)
>ans <- bigkmeans(data,200,nstart = 30)
There are 8 R processes, but only 1 is running.
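
For comparison, the behaviour I expected can be sketched with plain stats::kmeans: one k-means run per random start, distributed over the foreach workers, keeping the fit with the smallest total within-cluster sum of squares. This is only a sketch under assumptions (the small random stand-in data, 2 workers, 4 starts and 3 centers are all arbitrary), and it says nothing about how bigkmeans is implemented internally.

```r
library(foreach)
library(doMC)

# Assumption: 2 workers, chosen only to keep the example small
registerDoMC(cores = 2)

# Small stand-in data (1000 rows, 2 columns), not my real matrix
set.seed(1)
x <- matrix(rnorm(2000), ncol = 2)

# One independent k-means run per random start, each handled by a worker
fits <- foreach(s = 1:4) %dopar% {
  set.seed(s)  # a different seed per start gives different initial centers
  stats::kmeans(x, centers = 3, iter.max = 20, nstart = 1)
}

# Keep the start with the smallest total within-cluster sum of squares
wss  <- sapply(fits, function(f) f$tot.withinss)
best <- fits[[which.min(wss)]]
```

With this pattern the number of busy R processes should match the number of workers for as long as there are starts left to run.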


Have I misconfigured something? Or does bigkmeans not
support parallel execution?


Thanks in advance for any advice.

Regards,
Lishu
