[R] many datasets run with one R script in a computer cluster
dwinsemius at comcast.net
Fri Oct 8 20:30:14 CEST 2010
On Oct 8, 2010, at 12:33 PM, Martin Hughes wrote:
> Hello Everyone
> I have an R script (and a source file in which I keep my functions)
> that I need to run on 70 data sets (each consisting of a pair of
> I wish to run these data sets in a computer cluster that is run by
> my uni (HOWEVER they cannot help me with this problem but say it is
> the cluster is clever enough that if I set my data up as follows:
> within one folder called 'work' there are 70 subfolders, each of which
> contains a pair of files, each pair of files having a unique first
> part, e.g. CottonEA05 as in the example text below)
> then if I have one R script to run the analysis within the main
> folder, it will open each subfolder, run the R script and output the
> results into that subfolder.
> The problem is that this R script needs to have some kind of
> wildcard element, so that, for example, in the script below R will
> replace CottonEA05 with whatever the unique identifier is for the
> particular subfolder it's looking through, e.g. change it to
> Martin_M_STAGE.txt or bananas_M_STAGE.txt etc.
> Can R do this? I.e. can it look at a file title, change the file name
> within the script to match that file title, and then run
> the analysis?
It can certainly read a directory and return the file names into a
vector. And you can certainly do sub() on that vector to keep only the
leading characters before the first occurrence of a given character.
(list.files() also has pattern matching facilities through its second
argument, pattern.)
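As a small illustration of the pattern argument of list.files(), here is a sketch that builds a throwaway directory and lists only the files matching a regular expression. The file names are made up, and the "_M_STAGE.txt" suffix is an assumption borrowed from the example names in the question.

```r
# Throwaway directory with a few made-up files (hypothetical names).
tmp <- tempfile("listfiles_demo")
dir.create(tmp)
invisible(file.create(file.path(
  tmp, c("CottonEA05_M_STAGE.txt", "CottonEA05_notes.csv", "readme.md")
)))

# 'pattern' is a regular expression, not a shell-style wildcard,
# so the dot is escaped and the match is anchored at the end with $.
stage_files <- list.files(tmp, pattern = "_M_STAGE\\.txt$")
print(stage_files)
```

Because pattern is a regular expression, something like "*.txt" would not behave the way a shell glob does; glob2rx() in the utils package can convert a glob if that style is preferred.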
This reads the files in my working directory and then returns only the
characters before the first period:
> filist <- list.files()
> str(filist)
 chr [1:295] "_train_1.dat" "~Show.Dot.Files.txt" ...
> first <- sub("\\..+$","", filist)
> str(first)
 chr [1:295] "_train_1" "~Show" "~UCONN" "2001VBTANB" ...
Was that what you were asking?
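For the per-subfolder case described in the question, the same two steps could be wrapped up roughly as follows. This is only a sketch under assumptions: dataset_id() is a hypothetical helper, not something from the original script, and it assumes each subfolder holds exactly one file ending in "_M_STAGE.txt" (the suffix seen in the question's example names).

```r
# Hypothetical helper: return the dataset identifier for a folder,
# assuming exactly one file in it matches the suffix pattern.
dataset_id <- function(dir = ".", pattern = "_M_STAGE\\.txt$") {
  hits <- list.files(dir, pattern = pattern)
  if (length(hits) != 1L) {
    stop("expected exactly one matching file, found ", length(hits))
  }
  sub(pattern, "", hits)  # drop the suffix, keep the identifier
}

# Demonstration in a throwaway folder standing in for one subfolder of 'work'.
demo_dir <- tempfile("work_sub")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, "CottonEA05_M_STAGE.txt")))

id <- dataset_id(demo_dir)

# Rebuild the file name from the identifier instead of hard-coding it.
stage_file <- file.path(demo_dir, paste0(id, "_M_STAGE.txt"))
print(id)
print(file.exists(stage_file))
```

If the cluster starts the script with the subfolder as the working directory (an assumption about that setup), calling dataset_id() with its default dir = "." would pick up the identifier without any edits to the script between data sets.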
> OR do I have to use another programme that does that?
> #"CottonEA05" what is different for each dataset
> Martin Hughes
> MPhil/PhD Research in Biology
> Rm 1.07, 4south
> University of Bath
> Department of Biology and Biochemistry
> Bath BA2 7AY
> Tel: 01225 385 437
> M.Hughes at bath.ac.uk
David Winsemius, MD
West Hartford, CT