[R] Do you use R for data manipulation?
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Wed May 6 14:25:39 CEST 2009
I second what Zeljko wrote. In addition, see the data manipulation
section in Chapter 4 of
Zeljko Vrba wrote:
> Sorry for replying to the wrong person; I lost the original email.
>> Farrel Buchinsky wrote:
>>> Is R an appropriate tool for data manipulation and data reshaping and data
>>> organizing? I think so, but someone who recently joined our group thinks
>>> not. The new recruit believes that Python or another language is a far better
>>> tool for developing data manipulation scripts that can then be used by
>>> several members of our research group. Her assessment is that R is useful
>>> only when it comes to data analysis and working with statistical models.
> I personally started to use R because I got tired of manually writing scripts
> for data manipulation and processing. The argument of your new recruit smells
> of ignorance and resistance to learning something new. Ask her _how_ she
> assessed R, how much time she spent on her assessment, and whether she
> actually tried to run it and perform some concrete simple tasks.
> (Yes, R is somewhat "different", it has a steep learning curve, but the effort
> of learning it is worth it. And yes, R can be used in the same way as any
> other scripting language, i.e., it is not restricted to interactive work.)
> Take a look at the plyr and reshape packages (http://had.co.nz/); I have a hunch
> that they would have saved me a lot of headaches had I found out about them
> earlier :)
> I would also recommend investing in Phil Spector's book "Data Manipulation with
> R"; it will get you started much faster.
> I also find R's image files very convenient for sharing data (and code!) in a
> very compact format (single file, portable across architectures). When you
> quit your R session and choose to save the workspace, all the variables and
> functions get saved in the image file, which you can take with you (or send to
> somebody else). Start R again, load the image into a new session, and continue
> from where you left off. You won't get this kind of automatic persistence in
> any scripting language out of the box.
>>> So what do you think:
>>> 1) R is a phenomenally powerful and flexible tool, and since you are going to
>>> do analyses in R, you might as well use it to read data in, merge it, and
>>> reshape it to whatever you need.
>>> 2) Are you crazy? Nobody in their right mind uses R to pipe the data around
>>> their lab and assemble it for analysis.
> I'd go with 1). R also has interfaces to databases through RODBC, so you
> do not have to go through several conversions when you're about to process or
> plot data in R.
Frank E Harrell Jr, Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University