[R-sig-hpc] multicore with functions calling an exe/.sh file

Mauricio Zambrano-Bigiarini mauricio.zambrano at jrc.ec.europa.eu
Fri Jun 24 17:51:38 CEST 2011


Thanks Steve for your answer.

Stephen Weston wrote:
> If you're executing an external program from R using mclapply, and that
> program is creating output files, then you need to arrange for that
> program to create the output files with different names, or to create them
> in different directories.  

Yes, I'm running an external program with mclapply, and I thought there 
might be some package that manages this automatically, e.g. by giving 
each worker its own copy of the working process (and all its files) in 
local memory, but that was probably just too much imagination :)


> Usually that would be done with a command line
> argument to the program.  If it doesn't have such an argument, the best
> thing to do is add one.

I think this may be the only way to solve my problem. However, I also 
have the additional issue that each process needs to modify the same 
positions in the same input files at the same time, so I should also add 
an argument for the location of the input files.
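For instance, if the model did accept such arguments, each call could 
point it to its own input/output directories. A minimal sketch (the 
`--indir`/`--outdir` flags and the `model.sh` name are hypothetical, 
since my model has no such options):

```r
library(parallel)

## Build the command line for run 'i', pointing the (hypothetical) model
## to per-run input and output directories
build.cmd <- function(i, base.drty) {
  in.drty  <- file.path(base.drty, paste("in",  i, sep = "-"))
  out.drty <- file.path(base.drty, paste("out", i, sep = "-"))
  paste("./model.sh --indir", in.drty, "--outdir", out.drty)
}

## Each worker would then simply run its own command, e.g.:
## mclapply(1:4, function(i) system(build.cmd(i, "/home/zambrhe/S090-test")))
```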

> If you can't do that, you could try executing them
> from different directories by executing setwd in the parallel function
> before executing the external program.  Unfortunately, that could cause
> other problems, especially if the program is reading data files from the
> current working directory.  But hopefully, the program supports an option
> to change the output file name or directory.


The external program I'm running now doesn't have an option for setting 
the location of the input/output files. So probably the fastest solution 
is to create as many copies of the input files as the number of cores I 
want to use, and then to re-direct (in some way) each process to a 
different directory, each one storing all the necessary input files.

I wanted to avoid that solution, but I think it's the only way to go 
forward...
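Something along these lines, for example (the directory layout and the 
`model.sh` launcher are hypothetical; the helper just clones the 
template directory so each worker operates on its own private copy):

```r
library(parallel)  # 'mclapply'; in R 2.13 it lived in the 'multicore' package

## Give worker 'i' a private copy of the model directory (inputs, exe, .sh)
make.worker.copy <- function(i, template.drty) {
  work.drty <- file.path(tempdir(), paste("worker", i, sep = "-"))
  dir.create(work.drty, recursive = TRUE, showWarnings = FALSE)
  file.copy(list.files(template.drty, full.names = TRUE), work.drty)
  work.drty
}

## Run one parameter set inside its private directory
run.one <- function(i, template.drty) {
  work.drty <- make.worker.copy(i, template.drty)
  old.wd <- setwd(work.drty)
  on.exit(setwd(old.wd))
  ## ...edit the input files for parameter set 'i' here...
  system("./model.sh")  # hypothetical launcher; reads/writes only this copy
  ## ...read and return the goodness-of-fit value the model wrote...
}

## results <- mclapply(1:4, run.one, template.drty = "/home/zambrhe/S090-test")
```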


> 
> - Steve
> 
> P.S.  Why is a program running on a Linux machine using a file with
> the path "Z:\home\zambrhe\S090-test\file.cio"?

Good question.
The directory that holds the input/output/exe/.sh files is:

model.drty <- "/home/zambrhe/S090-test"

and that is the argument I pass to R for executing my script.

However, my '/home/zambrhe/' directory is on a network drive, and 
probably R sees it as "Z:\home\zambrhe\S090-test\file.cio" for some reason...



Thank you very much again.

Mauricio

-- 
=======================================================
FLOODS Action
Land Management and Natural Hazards Unit
Institute for Environment and Sustainability (IES)
European Commission, Joint Research Centre (JRC)
=======================================================
DISCLAIMER:
"The views expressed are purely those of the writer
and may not in any circumstances be regarded as stating
an official position of the European Commission."
=======================================================
Linux user #454569 -- Ubuntu user #17469
=======================================================
"Don't wish for less problems, wish for more skills.
Don't wish it were easier, wish you were better."
(Jim Rohn)


> 
> 
> On Fri, Jun 24, 2011 at 5:12 AM, Mauricio Zambrano-Bigiarini
> <mauricio.zambrano at jrc.ec.europa.eu> wrote:
>> Dear List,
>>
>> I'm just doing my first trials with HPC, and I would like to ask your
>> opinion regarding the following issue.
>>
>> In R 2.13.0, I have an optimization algorithm for hydrological models, which
>> internally runs the .exe/.sh file of the model, and then computes and writes
>> into a file the results of different parameter values.
>>
>> So far this algorithm runs only in a sequential mode, i.e., trying different
>> parameter values one after another, and I would like to parallelize it.
>>
>> My first attempt was using the multicore library, and changing the existing
>> 'lapply' loop for an 'mclapply' one. However, when I run the optimization
>> algorithm, I got several error messages:
>>
>> forrtl: Sharing violation
>> forrtl: Sharing violation
>> forrtl: Sharing violation
>>
>> forrtl: severe (30): open failure, unit 1, file
>> Z:\home\zambrhe\S090-test\file.cio
>>
>> which are due to the fact that all 4 processes are trying to access the
>> same .exe/.sh file and to modify the same input files at the same time.
>>
>>
>> Is there any way to use multicore for parallelizing this type of
>> optimization function or should I move to some master/slave option ?
>>
>>
>> Thanks in advance for any help.
>>
>>
>> Mauricio Zambrano-Bigiarini
>>
>>
>> PS,
>> sessionInfo():
>> R version 2.13.0 (2011-04-13)
>> Platform: i386-redhat-linux-gnu (32-bit)
>>
>> --
>> =======================================================
>> FLOODS Action
>> Land Management and Natural Hazards Unit
>> Institute for Environment and Sustainability (IES)
>> European Commission, Joint Research Centre (JRC)
>> TP 261, Via Enrico Fermi 2749, 21027 Ispra (VA), Italy
>> webinfo    : http://floods.jrc.ec.europa.eu/
>> work-phone : (+39)-(0332)-789588
>> work-fax   : (+39)-(0332)-786653
>> =======================================================
>> DISCLAIMER:\ "The views expressed are purely those of th...{{dropped:11}}
>>
>> _______________________________________________
>> R-sig-hpc mailing list
>> R-sig-hpc at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>>
>
