[R] Manipulating text files

Gabor Grothendieck ggrothendieck at gmail.com
Sun Apr 25 18:35:26 CEST 2010


Try this.  First we read in the lines using readLines. (We use
textConnection here to keep it self contained but you can read it from
the file as shown in the commented out portion.)

Using strapply we match the regular expression against each input
line.  The first parenthesized portion matches the number (\\S+ means
one or more non-space characters); we then match a single space, and
the second parenthesized portion matches the rest (.* means anything).
The two captured strings are passed as the two arguments of the
function given in formula notation.  Since the arguments are not
explicitly listed, the free variables no and text are assumed to be
the arguments, in the order they appear.  The result is a data frame
with a numeric column, no, holding the number and a text column, text,
holding the text.
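
To see what the two parenthesized groups capture, here is a minimal
base R sketch on a single line.  It uses regexec and regmatches rather
than strapply, and the variable names x and m are only illustrative.

```r
# One sample line in the same "number, space, description" layout.
x <- "1.08000 Phytoplankton Growth Temperature Coefficient for Group 1"

# regexec() locates the full match and each parenthesized group;
# regmatches() extracts the corresponding substrings.
m <- regmatches(x, regexec("(\\S+) (.*)", x))[[1]]

no   <- as.numeric(m[2])  # first group: the leading number
text <- m[3]              # second group: everything after the first space
```

Here m[1] is the whole match, so the two groups are m[2] and m[3].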


Lines <- "2.50000 Phytoplankton Maximum Growth Rate Constant @20 øC for Group 1 (1/day)
1.08000 Phytoplankton Growth Temperature Coefficient for Group 1
0.200000 Phytoplankton Respiration Rate Constant @20 øC for Group 1 (1/day)
1.04500 Phytoplankton Respiration Temperature Coefficient for Group 1
3.000000E-02 Phytoplankton Death Rate Constant (Non-Zoo Predation) for Group 1 (1/day)
0.000000 Phytoplankton Zooplankton Grazing Rate Constant for Group 1 (1/day)
2.500000E-02 Phytoplankton Half-Saturation Constant for N Uptake for Group 1 (mg N/L)"

library(gsubfn) # http://gsubfn.googlecode.com

# L <- readLines("myfile")
L <- readLines(textConnection(Lines))

out <- strapply(L, "(\\S+) (.*)",
	~ data.frame(no = as.numeric(no), text, stringsAsFactors = FALSE),
	simplify = rbind)

An alternative to the last statement using only sub is:

out <- data.frame(no = as.numeric(sub(" .*", "", L)),
     text = sub("^[^ ]+ ", "", L), stringsAsFactors = FALSE)

Either of the above out <- statements produces:

> out
     no    text
[1,] 2.5   "Phytoplankton Maximum Growth Rate Constant @20 øC for Group 1 (1/day)"
[2,] 1.08  "Phytoplankton Growth Temperature Coefficient for Group 1"
[3,] 0.2   "Phytoplankton Respiration Rate Constant @20 øC for Group 1 (1/day)"
[4,] 1.045 "Phytoplankton Respiration Temperature Coefficient for Group 1"
[5,] 0.03  "Phytoplankton Death Rate Constant (Non-Zoo Predation) for Group 1 (1/day)"
[6,] 0     "Phytoplankton Zooplankton Grazing Rate Constant for Group 1 (1/day)"
[7,] 0.025 "Phytoplankton Half-Saturation Constant for N Uptake for Group 1 (mg N/L)"
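
For step (1) of the question quoted below, a rough sketch of writing
new values back into the file might look like this.  The helper
set_params is hypothetical, the lines are located by matching their
description with grep, and the five-decimal fixed format is an
assumption about what the model accepts.  It works on a temporary file
so the real input file is not touched.

```r
# Hypothetical helper: overwrite the leading number on the lines in idx.
set_params <- function(lines, idx, values) {
  stopifnot(length(idx) == length(values))
  # Drop the old leading number, keeping the space and the description.
  rest <- sub("^\\S+", "", lines[idx])
  # Assumed format: fixed notation with five decimals.
  lines[idx] <- paste0(sprintf("%.5f", values), rest)
  lines
}

# Demonstration on a temporary file standing in for the model input.
inp <- tempfile(fileext = ".inp")
writeLines(c("2.50000 Phytoplankton Maximum Growth Rate Constant",
             "1.08000 Phytoplankton Growth Temperature Coefficient"), inp)

txt <- readLines(inp)

# Find each parameter's line by its description (fixed = TRUE: no regex).
idx <- c(grep("Maximum Growth Rate Constant", txt, fixed = TRUE),
         grep("Growth Temperature Coefficient", txt, fixed = TRUE))

# Insert the new values and write the file back out.
txt <- set_params(txt, idx, values = c(3.2, 1.05))
writeLines(txt, inp)
```

Inside the function passed to DEoptim, the same readLines, set_params
and writeLines calls could be run on the real input file, with the
argument x supplying the values.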


On Sun, Apr 25, 2010 at 12:09 PM, galen kaufman
<leavealetter1 at hotmail.com> wrote:
>
> Dear R Community,
>
> I am trying to optimize a water quality model that I am using.  Based on conversations with others more familiar with what I am doing, I plan to implement DEoptim to do this. The water quality model is interfaced through a GUI. I have the input file necessary to alter parameters and run the model as a text file.
>
> To do the optimization I have figured out the general procedure but I need some help on the specific methods and commands that may be helpful.
>
> Here is an example of a section of the input file:
>
> ….
> 2.50000 Phytoplankton Maximum Growth Rate Constant @20 øC for Group 1 (1/day)
> 1.08000 Phytoplankton Growth Temperature Coefficient for Group 1
> 0.200000 Phytoplankton Respiration Rate Constant @20 øC for Group 1 (1/day)
> 1.04500 Phytoplankton Respiration Temperature Coefficient for Group 1
> 3.000000E-02 Phytoplankton Death Rate Constant (Non-Zoo Predation) for Group 1 (1/day)
> 0.000000 Phytoplankton Zooplankton Grazing Rate Constant for Group 1 (1/day)
> 2.500000E-02 Phytoplankton Half-Saturation Constant for N Uptake for Group 1 (mg N/L)
> ….
>
> The first number on each line is the parameter value that I am interested in.
>
> I have familiarized myself with DEoptim but need more guidance on writing the function that will run the water quality model and feed DEoptim.
>
> In general I need to write a function to: (1) open and write over the model input file, inserting parameter values from DEoptim into the specific lines where they belong in the input file, (2) run the input file in the water quality model, (3) read the output files, (4) select values of interest in the output, (5) use values of interest in the equation to be optimized, then the result of the equation will be what is minimized in DEoptim. Steps 2-5 I am reasonably familiar with. I am not as familiar with how I would work with what I describe below for step (1).
>
> (1) (a) open the input file, (b) find the lines in the input file to insert parameter values into (values used in DEoptim optimization between “lower” and “upper”), (c) insert those parameter values into the input file in the proper spot, and then run this altered input file (step 2). Is this the right way to start?
>
> #function
> optimfxn<- function (x) {
>
> #(1)(a) would I do something like this?
>
> txt<-readLines("inp_File.inp")
>
> #From here I need help finishing step 1. I have not worked much with text files in R. I have read a lot of help files but am not getting anywhere.
>
> #Do I have to use textConnection() to write to the text file?
>
> #How do I incorporate the parameter values selected in each iteration of DEOptim into the input file?
>
> Here is an example of what I would be running in DEoptim:
>
> optimization<-DEoptim( fn=optimfxn, lower=c(phyt_g_rt_coef=1.01000, phyt_mx_g_rt=0.98000),upper=c(phyt_g_rt_coef=1.20000, phyt_mx_g_rt=4.44000))
>
> Any help you could provide would be very much appreciated.  Thank you in advance.
>
> Galen
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>


