[R] Multi-GPU "Yinyang" K-means and K-nn for R

Thu Feb 23 17:05:32 CET 2017

Hi Vadim,

I would be happy to explore helping you out with this.  I am quite active
in development for GPU use in R.  You can see my work on my github (
https://github.com/cdeterman) and the group I created for additional
packages in development (https://github.com/gpuRcore).  I believe it would
be best to take this conversation off list, though.  If you would
like to discuss this further, please email me separately.

Kind regards,
Charles

On Thu, Feb 23, 2017 at 4:37 AM, Vadim Markovtsev <vadim at sourced.tech>
wrote:

> ¡Hola!
>
> This is to announce that [kmcuda](https://github.com/src-d/kmcuda) has
> obtained native R bindings, and to ask for help with CRAN packaging.
> kmcuda is my child: an efficient GPGPU (CUDA) library to do K-means
> and K-nn on as much data as fits into memory. It supports running on
> multiple GPUs simultaneously, angular distance metric, Yinyang
> refinement, float16 (well, not in R for sure), K-means++ and AFK-MC2
> initialization. I am thinking about Minibatch in the near future.
>
> Usage example:
>
>     dyn.load("libKMCUDA.so")
>     samples <- replicate(4, runif(16000))
>     result = .External("kmeans_cuda", samples, 50, tolerance=0.01,
>                                  seed=777, verbosity=1)
>     print(result$centroids)
>     print(result$assignments[1:10])
>
> This library only supports Linux and macOS at the moment. Windows
> port is welcome.
>
> I knew pretty much nothing about R a week ago, so I would be glad to
> hear your suggestions. Besides, I've never published anything to CRAN
> and it will take some time for me to design a full package following
> the guidelines and rules. It would be awesome if somebody is willing to
> help! Packaging CUDA+OpenMP code for R seems to be a special kind of
> fun, and this fun doubles on macOS, where you need a specific
> combination of two different clang compilers to make it work.
>
> Besides, I have a question which prevents me from sleeping at night:
> how is R able to support matrices with dimensions larger than
> INT32_MAX if the only integer type in the C API is int (32-bit signed
> on Linux)? Even getting the dimensions with INTEGER() automatically
> leads to overflow.
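
On the dimension question: as far as I know, each dimension of an R matrix is still stored as a 32-bit int, so a single dimension cannot exceed 2^31 - 1. What can exceed it is the total number of elements: since R 3.0.0, long vectors are supported on 64-bit platforms, and the C API exposes their length as R_xlen_t through XLENGTH(). The overflow therefore tends to appear when nrow * ncol is computed in int. Below is a minimal standalone C sketch of that arithmetic. It does not use the R headers, and total_elements is a hypothetical helper name, not part of the R API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the R API): compute the element count
   of a matrix whose dimensions each fit in a 32-bit int. Widening to
   64 bits before multiplying avoids the signed overflow that
   `nrow * ncol` would hit in plain int arithmetic. */
static int64_t total_elements(int nrow, int ncol) {
    assert(nrow >= 0 && ncol >= 0);   /* dimensions are never negative */
    return (int64_t)nrow * (int64_t)ncol;
}
```

In real binding code the same idea would presumably mean reading the two dimensions as int but keeping the total length and any flat index in R_xlen_t (or another 64-bit type).
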
> --
> Best regards,
>
> Vadim Markovtsev
> Lead Machine Learning Engineer || source{d} / sourced.tech / Madrid
> StackOverflow: 69708/markhor | GitHub: vmarkovtsev | data.world:
> vmarkovtsev