[R] one way to write scripts in R

Jason E. Aten j.e.aten at gmail.com
Mon Mar 29 20:40:50 CEST 2010


Okay, I'll try again with .txt extension.  Thanks David.

On Mon, Mar 29, 2010 at 12:50 PM, David Winsemius <dwinsemius at comcast.net>wrote:

> I would have made it through the mail-server had you given it an extension
> of .txt but not so with the .rsh extension.
>
>
>
> On Mar 29, 2010, at 12:31 PM, Jason E. Aten wrote:
>
>  Thanks Gabor. I didn't realize you could.  Here is the scriptdemo.rsh
>>  file
>> as a text attachment, in case the line wraps made it hard to read/use.
>>
>> - Jason
>>
>> On Mon, Mar 29, 2010 at 11:19 AM, Gabor Grothendieck <
>> ggrothendieck at gmail.com> wrote:
>>
>>  Thanks.
>>>
>>> You might want to repost it as a text attachment since many of the
>>> lines wrapped around.
>>>
>>> Another more permanent possibility would be to put it on the R wiki at
>>> http://rwiki.sciviews.org/doku.php
>>>
>>> Note that the gsubfn package has a facility for quasi-perl type string
>>> interpolation as well. Just preface any function with fn$ and the
>>> facility is applied to the arguments of the function (subject to
>>> certain heuristics which determine which args to apply it to).
>>>
>>>  library(gsubfn)
>>>> today <- format(Sys.Date())
>>>> show <- list()
>>>> show$syntax <- 43
>>>> Sys.setenv(AN_ENV_VAR="greetings (I'm an env var!)")
>>>>
>>>> fn$cat('Getting `Sys.getenv("AN_ENV_VAR")` from the environment, on
>>>>
>>> $today,
>>> + `show$syntax` is also possible.\n')
>>> Getting greetings (I'm an env var!) from the environment, on 2010-03-29,
>>> 43 is also possible.
>>>
>>>
>>>
>>> On Mon, Mar 29, 2010 at 11:41 AM, Jason E. Aten <j.e.aten at gmail.com>
>>> wrote:
>>>
>>>> Dear R users,
>>>>
>>>> A colleague of mine asked me how to write a script (an executable text
>>>>
>>> file
>>>
>>>> containing R code) in R. After I showed
>>>> him, he said that after extensive searching of the R archives, he had
>>>> not
>>>> found anything like these techniques.
>>>>
>>>> He suggested that I share these methods to enable others to leverage R
>>>> as
>>>>
>>> a
>>>
>>>> better alternative to bash/perl scripts.
>>>>
>>>> So in the interest of giving back to the R community, and with all
>>>>
>>> humility,
>>>
>>>> I offer the
>>>> following small demonstration of one method for creating scripts of R
>>>>
>>> code
>>>
>>>> that are
>>>> executable from the (at least Linux) command line.
>>>>
>>>> I don't make any warrantees that this will work for you, but if it helps
>>>> somebody at least
>>>> get starting utilizing R effectively in scripts, then great!
>>>>
>>>> Best regards,
>>>>
>>>> Jason
>>>>
>>>> --
>>>> Jason E. Aten, Ph.D.
>>>>
>>>>
>>>> # file: scriptdemo.rsh
>>>>
>>>>
>>>  ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>


-- 
Jason E. Aten, Ph.D.
(310) 429-4566 cell
-------------- next part --------------
#!/bin/bash
exec R --vanilla -q --slave -e "source(file=pipe(\"tail -n +4 $0\"))" --args $@
#debug: exec R --vanilla --verbose -e "source(file=pipe(\"tail -n +4 $0\"))" --args $@
### The above line starts R and then reads in this script, starting at line 4:
#
# scriptdemo.rsh : a simple filter script to demostrate how to write a script in R that
#                reads stdin and utilizes command line argv. Also shows how to do
#                bash scripting like variable substitution. Really just syntactic
#                sugar. But sugar can be sweet.
#
# NB: Only tested on Linux, YMMV, and you may have to adapt to your OS.

# 1st point of note: notice the exec R invocation above, with the pipe and tail combo.
#   This file becomes the program read into R.

  pp=function(...) paste(sep="",...)
  script="scriptdemo.rsh"
  usage=pp(script,": put help info here")

  argv = commandArgs(trailingOnly=TRUE)


  # --help
  if(any(argv=="--help")) {
    cat(usage)
    quit(save="no",status=0)
  }

# 2nd point of note: this is how to read stdin inside a script:
#

  # slurp in all the input
  r=readLines("stdin")

  bad=grep("^#",r) # remove comments

  # write out lines that didn't start with #
  cat(r[setdiff(1:length(r),bad)],sep="\n")


# 3rd point of note: if you want nice bash shell scripting string substitution and backticking
#  you can use my ppp() function. Note it's not well vectorized at the moment, so it will expect
#  variables that are substituted from the environment to be of length 1.
#  A bit hackish in places (sure the |@|@| and 34HERE43 stuff makes me wince), but it gets the job done,
#  as it's meant as a proof of concept.

##########################
# utility functions leading up to final definition of ppp() : shell scripting like facilities for R
# Skip down to the end of this file to see what ppp() does, example output is there.
##########################

# delete one trailing whitespace
chomp=function(x) {
   n=nchar(x)
   a=substr(x,n,n)
   w=which(a==" " | a == "\n" | a=="\t")
   if (length(w)) {
     x[w]=substr(x[w],1,n[w]-1)
   }
   x
}

# delete one leading whitespace
prechomp=function(x) {
   n=nchar(x)
   a=substr(x,1,1)
   w=which(a==" " | a == "\n" | a == "\t")
   if (length(w)) {
     x[w]=substr(x[w],2,n[w])
   }   
   x
}


# eliminate whitespace leading/trailing from a string
trim=function(x) {
   y=chomp(x)
   while(any(y!=x)) {
     x=y
     y=chomp(x)
   }

   y=prechomp(x)
   while(any(y!=x)) {
     x=y
     y=prechomp(x)
   }
   
  x
}

strsplit2=function(x,split,...) {
    # detect trailing split : and add "" afterwards, so we know if it was there.
    a=strsplit(pp(x,"|@|@|"),split,...)
    lapply(a,function(x) gsub("|@|@|","",x,fixed=TRUE))
}

strsplit3=function(x,split,keepsplit=FALSE,...) {
    if (keepsplit) {
      repstring="34HERE43"
      if (length(grep(repstring,x))) { stop(pp(repstring, " repstring already found. Arg! Aborting"))  } # sanity check
      # note where we want to split, using \\1 backref to keep the original
      a=gsub(pattern=pp("(",split,")") ,replacement=pp(repstring,"\\1"),x)
    } else {
       a=x
       repstring=split
    }
    b=strsplit2(a,repstring,...) # split, keeping the original delimiters
}

pp=function(...) paste(...,sep="") # pp() must be defined in outermost scope for ppp() to work

replacer=function(s,begin.string="${",end.string="}",keepend=FALSE,require.end=TRUE) {

     translate.env=function(x) {
       if (exists(x)) return(x)
       a=Sys.getenv(x)
       if (a!="") return(pp("\"",a,"\""))
       x
     }
  
     parts=strsplit2(s,begin.string,fixed=T)[[1]]
     if (length(parts) < 2 || all(parts=="")) return(s)
     if (any(trim(parts[-1])=="")) {
        warning(pp("ppp::replacer(): found begin.string '",begin.string,"' in '",s,"' but had empty/blankspace/end of string following it."))
        return(s)
     }

     collap=c()
     collap[1]=parts[1]
     for (i in 2:length(parts)) {
       tmp=strsplit3(parts[i],end.string,keepsplit=keepend)[[1]]
       if (length(tmp)==1) {
         if (require.end) {
           warning(pp("ppp::replacer(): could not find end.string '",end.string,"' in string '",s,"' and require.end=TRUE, so karping."))
           collap[(i-1)*2]=parts[i]
           collap[(i-1)*2+1]=""
         } else {
           collap[(i-1)*2]=pp("\",",translate.env(tmp[1]),",\"") # allow newline to terminate as well, if end not required
           collap[(i-1)*2+1]=""
         }
       } else {
         collap[(i-1)*2]=pp("\",",translate.env(tmp[1]),",\"")
         # collect the rest of parts[i] following tmp[1] and the end.string (assumes end.string is only ever length 1)
         collap[(i-1)*2+1]=substr(parts[i],nchar(tmp[1])+1+(1-as.numeric(keepend)),nchar(parts[i]))
       }
     }
     text=pp("pp(\"",pp(collap,collapse=""),"\")")
     
     # sys.frame(-2) is necessary to get definitions from calling function before where we were defined.
     if (sys.nframe() > 1) {
       ftext=eval(parse(text=text),envir=sys.frame(-2))
     } else {
       ftext=eval(parse(text=text))
     }
     ftext
}

pp=function(...) paste(...,sep="") # must be defined in outermost scope for ppp() to work

# shell like string interpolation... ppp("fill in ${myvar} here after `hostname` is $myvar")
ppp=function(...) {

  sa=paste(sep="",...)
  res=c()
  for (j in 1:length(sa)) {
     s=sa[j]
     s2=replacer(s,"${","}")
     terminators="\t| |\\.|`|\\$|\\{|\\}|\\(|\\)|<|>|\\|"
     s3=replacer(s2,"$",terminators,keepend=TRUE,require.end=FALSE) # !require.end allows end of line termination
     res[j]=s3
  }

  do.sys.expecting.output=function(cmd) {
    got=pp(system(intern=T,cmd),collapse="\n")
    if (got=="") die("do.sys() on '",cmd,"' returned no output.")
    got
  }

  # now check for backtick system call requests as well, *after* variable substitution is all finished.
  bt=grep("`",res)
  if(length(bt)) {
     sa=res[bt]
     for (j in 1:length(sa)) {
         s=sa[j]
         parts=strsplit2(s,"`",fixed=T)[[1]]
         if (length(parts) < 3) { res[bt[j]]=s; next; }
  
         collap=c()
         collap[1]=parts[1]
         for (i in seq(2,length(parts)-1,2)) {
            cmd=parts[i]
            collap[i]=do.sys.expecting.output(cmd)
            collap[i+1]=parts[i+1]
         }
         text=pp(collap,collapse="")
         res[bt[j]]=text
       } # for j
     } #end if length(bt)
  res
}
 


#
# now demonstrate the use of ppp() in a scripting context:
#

today="date"
month=3
year=2010

show=list()
show$syntax = 43

Sys.setenv(AN_ENV_VAR="greetings (I'm an env var!)")

demo=ppp("Getting $AN_ENV_VAR from the environment, on `$today`, substituting ${show$syntax} in named lists is also possible. `cal $month $year| head` ")

cat("here's the demo output\n")
cat(demo,sep="\n")


## # output of this demo script when run, show how to use stdin and ppp()
##
##
## me at host:~/uns/bin$ cat ~/tmp/test | scriptdemo.rsh
## not comment 1
## not comment 2
## not comment 3
## not comment 4
## not comment 5
## here's the demo output
## Getting greetings (I'm an env var!) from the environment, on Mon Mar 29 10:23:49 CDT 2010, substituting 43 in named lists is also possible.      March 2010     
## Su Mo Tu We Th Fr Sa
##     1  2  3  4  5  6
##  7  8  9 10 11 12 13
## 14 15 16 17 18 19 20
## 21 22 23 24 25 26 27
## 28 29 30 31
## 
## me at host:~/uns/bin$ cat ~/tmp/test
## # comment 1
## not comment 1
## not comment 2
## # comment 2
## not comment 3
## not comment 4
## not comment 5
## # comment 3


More information about the R-help mailing list