[Rd] question about Makeconf and nvcc/CUDA

Hodgess, Erin HodgessE at uhd.edu
Thu Jul 18 22:12:00 CEST 2013


Wow!

This is just amazing!

Thanks so much.  I didn't realize how intense this is.


________________________________________
From: Duncan Temple Lang [dtemplelang at ucdavis.edu]
Sent: Thursday, July 18, 2013 3:08 PM
To: Hodgess, Erin
Cc: r-devel at r-project.org
Subject: Re: [Rd] question about Makeconf and nvcc/CUDA

Hi Erin

See the code below.

Basically, I have created a new routine that you will
call from R as

 .C("cuda4", 1L, "5")

where 1L is the number of arguments you are passing and "5" is the character vector of arguments.

We are using .C() here for simplicity. For other cases involving data, .Call() would be better.

That cuda4 routines is now not name-mangled, and has the correct parameter types and return type
to be called via the .C().
The good thing is this has nothing to do with CUDA, but just calling C++ routines from R.


This illustrates that there are complexities here with different devices, languages, etc.
This is one of the reasons a high-level interface to calling kernels is simpler and
more flexible.

The following R only code invokes the kernel on actual data we have in R (x).

# Put the square_array routine only in a file named, erinHodgess.cu
# add extern "C" before the square_array routine

  # generate the PTX code
f = nvcc("erinHodgess.cu")

  # load the PTX code
mod = loadModule(f)

  # Invoke the PTX code
x = rnorm(100000)
ans = .gpu(mod$square_array, x, length(x), gridBy = x)

all.equal(x*x, ans)


  D.

////////////////////////////////


#include <stdio.h>
#include <cuda.h>

// Kernel that executes on the CUDA device
extern "C"
__global__ void square_array(float *a, int N)
{
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx<N) a[idx] = a[idx] * a[idx];
}

// main routine that executes on the host
void stuff(int argc, char **argv)
{
  float *a_h, *a_d;  // Pointer to host & device arrays
  int N = atoi(argv[1]);
//  const int N = 10;   Number of elements in arrays
  size_t size = N * sizeof(float);
  a_h = (float *)malloc(size);        // Allocate array on host
  cudaMalloc((void **) &a_d, size);   // Allocate array on device
  // Initialize host array and copy it to CUDA device
  for (int i=0; i<N; i++) a_h[i] = (float)i;
  cudaMemcpy(a_d, a_h, size, cudaMemcpyHostToDevice);
  // Do calculation on device:
  int block_size = 4;
  int n_blocks = N/block_size + (N%block_size == 0 ? 0:1);
  square_array <<< n_blocks, block_size >>> (a_d, N);
  // Retrieve result from device and store it in host array
  cudaMemcpy(a_h, a_d, sizeof(float)*N, cudaMemcpyDeviceToHost);
  // Print results
  for (int i=0; i<N; i++) printf("%d %f\n", i, a_h[i]);
  // Cleanup
  free(a_h); cudaFree(a_d);
}

extern "C"
void
cuda4(int *nels, char **els)
{
   stuff(*nels, els);
}


On 7/18/13 12:46 PM, Hodgess, Erin wrote:
> Hi again.
>
> I put in the extern statement.
>
> However, I'm not sure what you mean by changing the signature, please.
>
> I changed out the routine name from "stuff" to "cuda4" in the cuda4.cu program.
>
> Still getting:
>
>> library(cudasize)
> library(cudasize)
>
>>
>
>> .C("cuda4",as.character(5))
> .C("cuda4",as.character(5))
> Error in .C("cuda4", as.character(5)) :
>   C symbol name "cuda4" not in load table
> Execution halted
>
> c:\Program Files\R\R-3.0.1\bin\i386>
>
> I still want to keep at it.
>
> Thanks so much for all of your help!!!
>
> Erin
>
> ________________________________________
> From: Duncan Temple Lang [dtemplelang at ucdavis.edu]
> Sent: Thursday, July 18, 2013 1:09 PM
> To: Hodgess, Erin
> Cc: r-devel at r-project.org
> Subject: Re: [Rd] question about Makeconf and nvcc/CUDA
>
> Hi Erin
>
> It appears that there is only one file in src that is compile, i.e. cuda4.cu.
>
> That does not contain a routine named cuda4, which is what you are trying to invoke
> via the .Call("cuda4") expression.
>
> Instead, it contains two routines - one kernel square_array which runs
> on the GPU, and the routine stuff.
>
> To be able to call the stuff() routine from R, you would have to
> a)  add
>       extern "C"
>     before its definition/declaration so that its name is not mangled by nvcc
>
> b)  change its signature so that it can be invoked via a .Call() or alternatively
>     a .C().
>
>  I would create another routine that acts as an intermediate routine
> called from R that then calls stuff().
>
> This assumes that what you showed us of cuda4.cu is all there is in it.
>    D.
>
> On 7/18/13 11:01 AM, Hodgess, Erin wrote:
>> Hi again:
>>
>> Here is another problem that I am having.  Hope this will be the last one.  I really want to see if I can put it together.  Sorry for belaboring the issue.
>>
>> Well, here is my story:
>>
>> c:\Program Files\R\R-3.0.1\bin\i386>R CMD build cudasize
>> R CMD build cudasize
>> * checking for file 'cudasize/DESCRIPTION' ... OK
>> * preparing 'cudasize':
>> * checking DESCRIPTION meta-information ... OK
>> * cleaning src
>> Warning: C:/Users/erin/AppData/Local/Temp/RtmpOKsfga/Rbuild22e066bf13fb/cudasize/man/f1.Rd:33: unexpected '{'
>> Warning: C:/Users/erin/AppData/Local/Temp/RtmpOKsfga/Rbuild22e066bf13fb/cudasize/man/f1.Rd:37: unexpected '}'
>> * checking for LF line-endings in source and make files
>> * checking for empty or unneeded directories
>> * building 'cudasize_1.0.tar.gz'
>>
>>
>> c:\Program Files\R\R-3.0.1\bin\i386>R CMD INSTALL --no-multiarch cudasize_1.0.tar.gz
>> R CMD INSTALL --no-multiarch cudasize_1.0.tar.gz
>> * installing to library 'c:/myRlib'
>> * installing *source* package 'cudasize' ...
>> ** libs
>> nvcc -m32 --shared -o cuda4.dll cuda4.cu
>> cuda4.cu
>> tmpxft_000012fc_00000000-5_cuda4.cudafe1.gpu
>> tmpxft_000012fc_00000000-10_cuda4.cudafe2.gpu
>> cuda4.cu
>> tmpxft_000012fc_00000000-5_cuda4.cudafe1.cpp
>> tmpxft_000012fc_00000000-15_cuda4.ii
>> installing to c:/myRlib/cudasize/libs/i386
>> ** R
>> ** preparing package for lazy loading
>> ** help
>> Warning: C:/Users/erin/AppData/Local/Temp/RtmpKSm696/R.INSTALL7fc517f4e58/cudasize/man/f1.Rd:33: unexpected '{'
>> Warning: C:/Users/erin/AppData/Local/Temp/RtmpKSm696/R.INSTALL7fc517f4e58/cudasize/man/f1.Rd:37: unexpected '}'
>> Warning: C:/Users/erin/AppData/Local/Temp/RtmpKSm696/R.INSTALL7fc517f4e58/cudasize/man/f1.Rd:31: All text must be in a section
>> Warning: C:/Users/erin/AppData/Local/Temp/RtmpKSm696/R.INSTALL7fc517f4e58/cudasize/man/f1.Rd:32: All text must be in a section
>> Warning: C:/Users/erin/AppData/Local/Temp/RtmpKSm696/R.INSTALL7fc517f4e58/cudasize/man/f1.Rd:34: All text must be in a section
>> Warning: C:/Users/erin/AppData/Local/Temp/RtmpKSm696/R.INSTALL7fc517f4e58/cudasize/man/f1.Rd:35: All text must be in a section
>> Warning: C:/Users/erin/AppData/Local/Temp/RtmpKSm696/R.INSTALL7fc517f4e58/cudasize/man/f1.Rd:36: All text must be in a section
>> *** installing help indices
>> ** building package indices
>> ** testing if installed package can be loaded
>> * DONE (cudasize)
>>
>> c:\Program Files\R\R-3.0.1\bin\i386>R --vanilla
>> R --vanilla
>>
>> R version 3.0.1 (2013-05-16) -- "Good Sport"
>> Copyright (C) 2013 The R Foundation for Statistical Computing
>> Platform: i386-w64-mingw32/i386 (32-bit)
>>
>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>> You are welcome to redistribute it under certain conditions.
>> Type 'license()' or 'licence()' for distribution details.
>>
>> R is a collaborative project with many contributors.
>> Type 'contributors()' for more information and
>> 'citation()' on how to cite R or R packages in publications.
>>
>> Type 'demo()' for some demos, 'help()' for on-line help, or
>> 'help.start()' for an HTML browser interface to help.
>> Type 'q()' to quit R.
>>
>>> library(cudasize)
>> library(cudasize)
>>> f1
>> f1
>> function (x)
>> {
>>     y <- .Call("cuda4", as.character(x))
>>     return(y)
>> }
>> <environment: namespace:cudasize>
>>> f1(2)
>> f1(2)
>> Error in .Call("cuda4", as.character(x)) :
>>   "cuda4" not resolved from current namespace (cudasize)
>> Calls: f1 -> .Call
>> Execution halted
>>
>> Here is the Makevars.win file
>>
>> cuda4.dll:
>>       nvcc -m32 --shared -o cuda4.dll cuda4.cu
>>
>> And finally, the program itself:
>>
>>
>> #include <stdio.h>
>> #include <cuda.h>
>>
>> // Kernel that executes on the CUDA device
>> __global__ void square_array(float *a, int N)
>> {
>>   int idx = blockIdx.x * blockDim.x + threadIdx.x;
>>   if (idx<N) a[idx] = a[idx] * a[idx];
>> }
>>
>> // main routine that executes on the host
>> void  stuff(int argc, char **argv)
>> {
>>   float *a_h, *a_d;  // Pointer to host & device arrays
>>   int N = atoi(argv[1]);
>> //  const int N = 10;   Number of elements in arrays
>>   size_t size = N * sizeof(float);
>>   a_h = (float *)malloc(size);        // Allocate array on host
>>   cudaMalloc((void **) &a_d, size);   // Allocate array on device
>>   // Initialize host array and copy it to CUDA device
>>   for (int i=0; i<N; i++) a_h[i] = (float)i;
>>   cudaMemcpy(a_d, a_h, size, cudaMemcpyHostToDevice);
>>   // Do calculation on device:
>>   int block_size = 4;
>>   int n_blocks = N/block_size + (N%block_size == 0 ? 0:1);
>>   square_array <<< n_blocks, block_size >>> (a_d, N);
>>   // Retrieve result from device and store it in host array
>>   cudaMemcpy(a_h, a_d, sizeof(float)*N, cudaMemcpyDeviceToHost);
>>   // Print results
>>   for (int i=0; i<N; i++) printf("%d %f\n", i, a_h[i]);
>>   // Cleanup
>>   free(a_h); cudaFree(a_d);
>> }
>>
>> Any suggestions would be appreciated.
>>
>> Thanks,
>> Erin
>>
>> ________________________________________
>> From: r-devel-bounces at r-project.org [r-devel-bounces at r-project.org] on behalf of Duncan Temple Lang [dtemplelang at ucdavis.edu]
>> Sent: Thursday, July 18, 2013 9:16 AM
>> To: r-devel at r-project.org
>> Subject: Re: [Rd] question about Makeconf and nvcc/CUDA
>>
>> Hi Erin
>>
>>  Glad you are making progress on this with Brian's help.
>>
>>  I thought I would mention a different approach that might save you some
>> programming time and actually make the code more flexible at the same time.
>> Basically, in a high-level language like R, it is nice to keep the code
>> calling a GPU kernel also high-level rather than writing code in C/C++
>> to do the transfer of data to and from the CPU and GPU.
>>
>> Simon's OpenCL package (on CRAN) and the new RCUDA package (www.omegahat.org/RCUDA)
>> allow us to load and invoke kernels directly from R.
>> These provide more flexibility for invoking GPU kernels from R than fixed C/C++ code.
>>
>> RCUDA provides an interface to almost all of the CUDA API and so allows us to transfer
>> values from R to the GPU and back (in different ways), invoke kernels asynchronously, etc.
>> I haven't built the package on Windows yet, but may be able to get to that in the next
>> few days.
>>
>> With a Windows binary of RCUDA (or OpenCL), you would be able to skip nvcc in your
>> package by  compiling the kernel code generically so that it contains code for GPUs
>> with different  capabilities (e.g. sm_20, sm_30, sm_35).
>>
>> In the last few days, I have also managed to compile very simple R code
>> directly in R to PTX code that we can load and invoke using RCUDA.
>> The compilation uses the Rllvm and RLLVMCompile packages. So in the
>> future, I expect we will be able to compile simple R functions
>> to native and PTX code.
>>
>>  D.
>>
>> On 7/17/13 11:45 PM, Hodgess, Erin wrote:
>>> Dear R development:
>>>
>>>
>>>
>>> I'm not sure if this is the appropriate list, but it's a start.
>>>
>>>
>>>
>>> I would like to put together a package which contains a CUDA program on Windows 7.  I believe that it has to do with the Makeconf file in the etc directory.
>>>
>>>
>>>
>>> But when I just use the nvcc with the shared option, I can use the dyn.load command, but when I use the is.loaded function, it shows FALSE.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Here are the results of the check command:
>>>
>>>
>>>
>>> c:\PROGRA~1\R\R-3.0.1\bin\i386>R CMD check cudasize
>>> R CMD check cudasize
>>> * using log directory 'c:/PROGRA~1/R/R-3.0.1/bin/i386/cudasize.Rcheck'
>>> * using R version 3.0.1 (2013-05-16)
>>> * using platform: i386-w64-mingw32 (32-bit)
>>> * using session charset: ISO8859-1
>>> * checking for file 'cudasize/DESCRIPTION' ... OK
>>> * checking extension type ... Package
>>> * this is package 'cudasize' version '1.0'
>>> * checking package namespace information ... OK
>>> * checking package dependencies ... OK
>>> * checking if this is a source package ... OK
>>> * checking if there is a namespace ... OK
>>> * checking for executable files ... OK
>>> * checking for hidden files and directories ... OK
>>> * checking for portable file names ... OK
>>> * checking whether package 'cudasize' can be installed ... ERROR
>>> Installation failed.
>>> See 'c:/PROGRA~1/R/R-3.0.1/bin/i386/cudasize.Rcheck/00install.out' for details.
>>>
>>> And the 00install.out file:
>>>
>>> * installing *source* package 'cudasize' ...
>>>
>>> ** libs
>>>
>>>
>>>
>>> *** arch - i386
>>>
>>> cygwin warning:
>>>   MS-DOS style path detected: c:/PROGRA~1/R/R-30~1.1/etc/i386/Makeconf
>>>   Preferred POSIX equivalent is: /cygdrive/c/PROGRA~1/R/R-30~1.1/etc/i386/Makeconf
>>>   CYGWIN environment variable option "nodosfilewarning" turns off this warning.
>>>   Consult the user's guide for more details about POSIX paths:
>>>     http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
>>> cygwin warning:
>>>   MS-DOS style path detected: c:/PROGRA~1/R/R-30~1.1/etc/i386/Makeconf
>>>   Preferred POSIX equivalent is: /cygdrive/c/PROGRA~1/R/R-30~1.1/etc/i386/Makeconf
>>>   CYGWIN environment variable option "nodosfilewarning" turns off this warning.
>>>   Consult the user's guide for more details about POSIX paths:
>>>     http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
>>> make: `symbols.rds' is up to date.
>>> ERROR: compilation failed for package 'cudasize'
>>>
>>> * removing 'c:/PROGRA~1/R/R-3.0.1/bin/i386/cudasize.Rcheck/cudasize'
>>>
>>>
>>>
>>> I've been experimenting with the Makeconf file, but to no avail.
>>>
>>>
>>>
>>> Does anyone have any suggestions, please?
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Erin
>>>
>>>
>>>
>>>       [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>



More information about the R-devel mailing list