[R] Scripting in R -- pattern matching, logic, system calls, the works!
bioinformatics_guy
wwwhitener at gmail.com
Mon Sep 15 18:29:18 CEST 2008
I'm very new to R, so this might be a very simple question. First I'll lay
out my directory hierarchy and my goals.
I have, say, 5 directories named "Coverage_(some number)", and each one of
these contains text files named "Length_(some number)", each holding about
30 numbers. The Length files go up in steps of 5 from 0 to 100 (Length_0,
Length_5, Length_10, Length_15, Length_20, ...). Each file is to be averaged,
where the average is the y-value and the length is the x-value in a linear
regression.
What I want to do is write a script that looks in each of the coverage
directories, reads in each of the files, takes the means, and plots them in
the form I specified above. The catch is: what if I only want to plot, say,
Length_20 through Length_50? And what command/method is best for a linear
regression? I've looked at lm(), but have not gotten it to work properly.
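For the regression part, here is a minimal sketch of a least-squares fit with R's lm(). The numbers are made up purely for illustration, and the column names len and avg are assumptions, not anything taken from the real files:

```r
# Made-up example values: one row per Length_ file (hypothetical numbers).
dat <- data.frame(
  len = c(20, 25, 30, 35, 40, 45, 50),
  avg = c(4.1, 5.0, 6.2, 6.9, 8.1, 9.0, 9.8)
)

# Fit avg = intercept + slope * len by ordinary least squares.
fit <- lm(avg ~ len, data = dat)
print(coef(fit))   # intercept and slope

# Scatter plot of the means with the fitted line drawn on top.
plot(dat$len, dat$avg, xlab = "Length", ylab = "Mean")
abline(fit)
```

summary(fit) would additionally report standard errors and R-squared for the same model.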
Below is some of the code I've put together:
topdir="~"
setwd(topdir)
### Took this function from a friend, so I'm not sure what it's doing besides
### grep-ing a directory?
ll<-function(string)
{
grep(string,dir(),value=TRUE)
}
### I believe this is looking for all directories of the form below
subdir = ll("Coverage_[1-9][0-9]$")
### A for loop iterating through each of the subdirectories.
for (i in subdir)
{
# Not sure what this line is doing, as I found it on the internet in a
# similar function
setwd(paste(getwd(),i,sep="/"))
# This makes a vector of all the file names
filelist=ll("Length_")
# Go back up so the next pass starts from topdir again
setwd(topdir)
}
Can I use a regex or a logical test to keep only the entries of filelist
that I want? Can I then get the mean of each Length_* file and store the
results in a matrix (length x mean)? And finally, how do I do a linear
regression on this?
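One possible way to put these steps together, as a rough sketch rather than a definitive answer. It assumes the files are named exactly Length_<digits> with no extension and contain whitespace-separated numbers. The helper name length_means, its lo/hi arguments, and the throwaway demo data in a temporary directory are all hypothetical and only there to make the example self-contained:

```r
# Hypothetical helper: mean of every Length_<n> file in one directory,
# keeping only lengths between lo and hi (inclusive).
length_means <- function(covdir, lo = 0, hi = 100) {
  # Full paths of files named exactly Length_<digits> (assumed naming).
  files <- list.files(covdir, pattern = "^Length_[0-9]+$", full.names = TRUE)
  # Pull the number out of each file name and filter on it.
  len <- as.numeric(sub("^Length_", "", basename(files)))
  keep <- len >= lo & len <= hi
  files <- files[keep]
  len <- len[keep]
  # scan() reads whitespace-separated numbers; one mean per file.
  avg <- vapply(files, function(f) mean(scan(f, quiet = TRUE)), numeric(1))
  out <- data.frame(len = len, avg = unname(avg))
  out[order(out$len), ]
}

# Demo on synthetic data written to a temporary directory.
top <- tempfile("demo_")
for (cov in c(10, 20)) {
  d <- file.path(top, paste0("Coverage_", cov))
  dir.create(d, recursive = TRUE)
  for (n in seq(0, 100, by = 5)) {
    writeLines(as.character(n + cov + (1:30) / 10),
               file.path(d, paste0("Length_", n)))
  }
}

# Same directory pattern as in the post, but without changing the
# working directory.
covdirs <- list.files(top, pattern = "^Coverage_[1-9][0-9]$", full.names = TRUE)
results <- lapply(covdirs, length_means, lo = 20, hi = 50)
names(results) <- basename(covdirs)

# One regression per coverage directory.
fits <- lapply(results, function(d) lm(avg ~ len, data = d))
print(results[["Coverage_10"]])
```

Working with full paths from list.files() avoids setwd() inside the loop, and filtering on the extracted number is easier to read than encoding the range 20-50 in a regular expression.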