[R] Scripting in R -- pattern matching, logic, system calls, the works!

Mon Sep 15 18:29:18 CEST 2008

I'm very new to R, so this might be a very simple question.  First I'll lay out
the hierarchy of my directories and my goals.

I have, say, 5 directories named "Coverage_(some number)", and in each one of
these I have text files named "Length_(some number)", each of which contains
say 30 numbers.  The Length files are numbered in increments of 5 from 0 to
100 (Length_0, Length_5, Length_10, ...).  Each file is to be averaged, with
the average as the y-value and the length as the x-value in a linear
regression.

What I want to do is write a script that looks in each of the Coverage
directories, reads in each of the files, takes the means, and plots them in
the form I specified above.  The catch is: what if I only want to plot, say,
Length_20 through Length_50?  And what command/method is best for a linear
regression?
I've looked at lm(), but have not gotten it to work properly.
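
For the range selection, here is a minimal sketch of what I have in mind. It assumes the file names are exactly "Length_<number>" with no extension; the vector `files` is made up for illustration. The idea is to pull the number out of each name and filter on it numerically rather than trying to encode the range in a regex.

```r
# Hypothetical file names, standing in for what dir() would return
files <- paste0("Length_", seq(0, 100, by = 5))

# Strip the "Length_" prefix and convert the remainder to a number
len <- as.numeric(sub("^Length_", "", files))

# Keep only the files whose length lies in the wanted range
wanted <- files[len >= 20 & len <= 50]
print(wanted)
```

Changing the two bounds is then enough to pick a different range.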

Below is some of the code I've put together:

topdir="~"

setwd(topdir)

### Took this function from a friend, so I'm not sure what it's doing besides
### grep-ing a directory?
ll<-function(string)
{
	grep(string,dir(),value=T)
}
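
As far as I can tell, the helper just returns the entries of the current directory whose names match the regex. Below is a small sketch of that reading, run in a scratch directory with made-up names. Base R's `list.files()` with its `pattern` argument should give the same result without a helper.

```r
# Scratch directory with a few made-up entries
tmp <- file.path(tempdir(), "ll_demo")
dir.create(tmp, showWarnings = FALSE)
for (d in c("Coverage_10", "Coverage_25", "Coverage_5", "notes"))
  dir.create(file.path(tmp, d), showWarnings = FALSE)

# The helper from above: dir() lists names, grep(value = TRUE) keeps matches
ll <- function(string) grep(string, dir(), value = TRUE)

old <- setwd(tmp)
a <- ll("Coverage_[1-9][0-9]$")
b <- list.files(pattern = "Coverage_[1-9][0-9]$")
setwd(old)

print(a)  # Coverage_5 is excluded: the regex requires two digits
```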

### I believe this is looking for all files of form below
subdir = ll("Coverage_[1-9][0-9]$")

### A for loop iterating through each of the sub directories.
for (i in subdir)
{      
        # Not sure what this line is doing, as I found it on the internet in a
        # similar function
	setwd(paste(getwd(),i,sep="/"))
        #This makes a vector of all the file names
        filelist=ll("Length_")

Can I use a regex or logic to keep only the entries of filelist that I want?
And can I then get the mean of each Length_* file and store the results in a
matrix (length x mean)?

Finally, how do I do a linear regression on this?
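
A rough sketch of the mean step, assuming each Length file holds plain whitespace-separated numbers that `scan()` can read. The temporary directory and random values only stand in for one Coverage directory and are not my data; the column names `len` and `avg` are placeholders.

```r
# Build a stand-in for one Coverage directory (made-up data)
covdir <- file.path(tempdir(), "Coverage_10")
dir.create(covdir, showWarnings = FALSE)
set.seed(1)
for (len in seq(0, 100, by = 5)) {
  writeLines(as.character(rnorm(30, mean = len)),
             file.path(covdir, paste0("Length_", len)))
}

# List the Length files with full paths, so no setwd() is needed
paths <- list.files(covdir, pattern = "^Length_[0-9]+$", full.names = TRUE)

# One row per file: the length parsed from the name, and the file's mean
res <- data.frame(
  len = as.numeric(sub("^Length_", "", basename(paths))),
  avg = sapply(paths, function(p) mean(scan(p, quiet = TRUE)))
)
res <- res[order(res$len), ]
rownames(res) <- NULL
print(res)
```

A data frame seemed more convenient than a matrix here, since it can be passed straight to modelling functions.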
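
For the regression itself, a minimal sketch assuming the means are already in a data frame with columns `len` and `avg` (the names and the simulated values are my own placeholders). Base R's `lm()` fits the line, its `subset` argument restricts the fit to a range of lengths, and `abline()` draws the fitted line over the scatter plot.

```r
# Placeholder data: one mean per length, simulated for illustration only
set.seed(1)
d <- data.frame(len = seq(0, 100, by = 5))
d$avg <- 2 + 0.5 * d$len + rnorm(nrow(d))

# Fit avg as a linear function of len, using only lengths 20 to 50
fit <- lm(avg ~ len, data = d, subset = len >= 20 & len <= 50)
print(coef(fit))

# Plot the points used in the fit and overlay the fitted line
sub_d <- d[d$len >= 20 & d$len <= 50, ]
plot(sub_d$len, sub_d$avg, xlab = "Length", ylab = "Mean")
abline(fit)
```

`summary(fit)` would additionally report standard errors and R-squared.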
