[R] help with algorithm

Paul Hiemstra paul.hiemstra at knmi.nl
Mon Aug 1 13:32:05 CEST 2011


 On 07/31/2011 05:57 PM, r student wrote:
> I'm wondering if anyone can give some basic advice about how to approach a
> specific task in R.
>
> I'm new to R but have used SAS for many years, and while I can muscle
> through a lot of the code details, I'm unsure of a few things.
>
>
> Specific questions:
>
> If I have to perform a set of actions on a group of files, should I use a
> loop (I feel like I've heard people say to avoid looping in R)?

Hi,

Looping over several files is best done using the apply family of
functions. I especially use the llply, ldply and ddply functions from
the plyr package a lot for processing. An example of looping over files
and recombining the results would look something like:

library(plyr)

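# full.names = TRUE returns complete paths, so read.table can find the
# files regardless of the current working directory
listoffiles = list.files("/where/the/files/are", full.names = TRUE)
combinedResult = ldply(listoffiles, function(filename) {
    bla = read.table(filename)
    # ... now maybe do some stuff with it ...
    # Placeholder result; replace with e.g. summary stats of bla
    result = data.frame(file = basename(filename), rows = nrow(bla))
    return(result) # Note that result must be a data.frame
})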

ldply calls the function once per file and automatically combines the
returned data.frames into a single data.frame. It can take some time to
get the hang of these things, but I love working with them when
processing data.
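
For comparison, the same read-summarise-combine pattern can be sketched
in base R with lapply and do.call(rbind, ...). This is only a minimal
sketch: the temporary files, the column x and the summary columns are
made up for illustration and are not from the original question.

```r
# Create three small example files in a temporary directory (made-up data)
dir <- file.path(tempdir(), "example_files")
dir.create(dir, showWarnings = FALSE)
for (i in 1:3) {
  write.table(data.frame(x = seq_len(i * 2)),
              file = file.path(dir, paste0("file", i, ".txt")),
              row.names = FALSE)
}

# full.names = TRUE returns paths that read.table can open directly
files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)

# One small data.frame of summary statistics per file
perFile <- lapply(files, function(filename) {
  bla <- read.table(filename, header = TRUE)
  data.frame(file = basename(filename), n = nrow(bla), mean_x = mean(bla$x))
})

# Stack the per-file results into a single data.frame
combined <- do.call(rbind, perFile)
```

combined then holds one row per input file, which is the same shape
ldply would return for this function.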

> How to get means for "by" groups and subset a files based on those (subset
> highest and lowest groups)?  (I can do this in multiple steps* but wonder
> what the best, "R way" is to do this.)

When your data.frame is called dat and has the form:

value    by
1           A
5           A
3           B
etc

You can use ddply like this to get the mean value per category in 'by':

ddply(dat, .(by), summarise, m = mean(value))
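
To then subset the groups with the highest and lowest mean, something
along the following lines might work. This is a sketch under
assumptions: the toy values are made up, and aggregate from base R is
used so the example runs without plyr (the ddply call above gives the
same per-group means).

```r
# Toy data in the same shape as 'dat' above (values are made up)
dat <- data.frame(value = c(1, 5, 3, 7, 10, 12),
                  by    = c("A", "A", "B", "B", "C", "C"))

# Mean of 'value' per level of 'by'
means <- aggregate(value ~ by, data = dat, FUN = mean)

# Names of the groups with the lowest and the highest mean
lowest  <- as.character(means$by[which.min(means$value)])
highest <- as.character(means$by[which.max(means$value)])

# Keep only the rows that belong to those two groups
extremes <- dat[dat$by %in% c(lowest, highest), ]
```

This avoids merging the means back into the data and sorting; only the
group labels are needed to do the subsetting.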

> How to draw cutoff lines at specific points on density plots?
>
> How to create a matrix of plots?  (Take 4 separate plots and put them into a
> single graphic.)

For a matrix of plots I really like the ggplot2 package. It can draw
several plots in one graphic using its faceting syntax, so there is no
need to subdivide the canvas manually or to keep the axes of the plots
equal by hand. Take a look at the ggplot2 website, specifically at the
examples given for the facet_wrap and facet_grid functions.
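
A minimal sketch of how both plotting questions could be handled in
ggplot2: geom_vline draws the cutoff lines on the density plot and
facet_wrap lays the panels out in a grid. The data and the cutoff
positions (-1 and 1) are made up for illustration.

```r
library(ggplot2)

# Made-up data: four groups of 100 random values each
set.seed(1)
dat <- data.frame(value = rnorm(400),
                  by    = rep(c("A", "B", "C", "D"), each = 100))

p <- ggplot(dat, aes(x = value)) +
  geom_density() +                                  # one density curve per panel
  geom_vline(xintercept = c(-1, 1),
             linetype = "dashed") +                 # cutoff lines (arbitrary positions)
  facet_wrap(~ by, ncol = 2)                        # 2 x 2 matrix of panels

print(p)
```

Each level of by gets its own panel, and the panels share the same
axes by default.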

cheers,
Paul

>
> * Get group means, add means back to file, sort by mean, take first and last
> groups
>
>
>
> Feel free to excoriate me if I'm asking for too much help.  If possible
> though, a few words of advice (loops are the best way, just use the "main"
> parameter to combine plots) would be lovely if you can provide.
>
>
>
> Thanks!


-- 
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770


