[R] Application design.
Joe W. Byers
ecjbosu at aol.com
Wed Jul 23 14:45:45 CEST 2014
On 07/21/2014 09:24 PM, John McKown wrote:
> I'm designing an R based application for my boss. It's not much, but
> it might save him some time. What it will be doing is reading data
> from an MS-SQL database and creating a number of graphs. At present,
> he must log into one server to run a vendor application to display the
> data in a grid. He then cuts this data and pastes it into an Excel
> spreadsheet. He then generates some graphs in Excel. Which he then
> cuts and pastes into a Power Point presentation. Which is the end
> result for distribution to others up the food chain.
>
> What I would like to do is read the MS-SQL data base using RODBC and
> create the graphs using ggplot2 instead of using Excel. I may end up
> being told to create an Excel file as well.
>
> My real question is organizing the R programs to do this. Basically
> what I was thinking of was a "master" program. It does the ODBC work
> and fetches the data into one, or more, data.frames. I was then
> thinking that it would be better to have separate source files for
> each graph produced. I would use the source() function in the "master"
> R program to load & execute each one in order. Is this a decent
> origination? Or would it be better for each "create a graph" R file to
> really just define a unique function which the "master" program would
> then invoke? I guess this latter would be a good way to keep the
> workspace "clean" since all the variables in the functinon would "go
> away" when the function ended.
>
> I guess what I'm asking is how others organize the R applications. Oh,
> I plan for this to be run by my boss by double clicking on the
> "master" R source file, which I will associate with the Rscript
> program in Windows. Yes, this is Windows based <sigh/>.
>
> Appreciate your thoughts. Especially if I'm really off track.
>
John,
Your original plan is simple and straight forward, and is similar to
many of the processes that I have automated using R. I use both RJDBC
and RODBC to connect to a database. I have wrapper JDBC and ODBC.Pull
scripts that are wrappers to the other libraries that accept the
connection parameters and a sql string. These will return a data.frame,
and if needed set the mode/class of the columns according to the
database meta data. You can create a master job that call a function to
pull the data you need, calls plotting functions, and generates graphs,
tabular reports, and as suggested Excel files. I use xlsx to build
Excel reports. I use hwriter to generate html reports that I publish.
I have a small function that loads when I start R call setSources().
This allows me load my custom code from different directories like using
library() or require().
Keep it simple until you are comfortable with your results.
1. Master Job script (loaded at startup)
2. Methods for each sub process
data pull, plots, etc.
3. An email to or other notification that the process is completed (if
you want it to run and have your boss log into a share point or other BI
delivery portal).
Good luck
More information about the R-help
mailing list