[R] Do you use R for data manipulation?
Warren Young
warren at etr-usa.com
Wed May 13 04:48:01 CEST 2009
Farrel Buchinsky wrote:
> Is R an appropriate tool for data manipulation and data reshaping and data
> organizing? I think so but someone who recently joined our group thinks not.
> The new recruit believes that python or another language is a far better
> tool for developing data manipulation scripts that can be then used by
> several members of our research group. Her assessment is that R is useful
> only when it comes to data analysis and working with statistical models.
It's hard to shift people's individual preferences, but impressive
objective comparisons are easy to come by. Ask her how many lines it
would take to do this trivial R task in Python:
    data <- read.csv('original-data.csv')
    write.csv(data * 10, 'scaled-data.csv')
R's ability to do something to an entire data structure -- or a slice of
it, or some other subset -- in a single operation is very useful when
cleaning up data for presentation and analysis. Also point out how easy
it is to get data *out* of R, as above, not just into it, so you can
then hack on it in Python, if that's the better language for further
manipulation.
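To make the "slice or some other subset" point concrete, here is a minimal sketch. The data frame and its column names (id, height, weight) are made up for illustration and stand in for whatever original-data.csv really holds; the output goes to a temporary file rather than a real path.

```r
# Hypothetical stand-in for the contents of original-data.csv
data <- data.frame(id     = 1:4,
                   height = c(1.5, 1.7, 1.6, 1.8),
                   weight = c(60, 72, 65, 80))

# Whole structure in one operation (scales every column, id included)
scaled <- data * 10

# A slice: scale only the measurement columns, leave id alone
cols <- c("height", "weight")
partial <- data
partial[cols] <- partial[cols] * 10

# Some other subset: only the rows that meet a condition
heavy <- data[data$weight > 70, ]

# Getting the result back *out* of R for use elsewhere
out <- tempfile(fileext = ".csv")
write.csv(partial, out, row.names = FALSE)
```

Note that data * 10 only makes sense when every column is numeric, which is one reason the column-slice form is often the one you actually want on real data.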
If she gives you static about how a few more lines are no big deal,
remind her that bug count tends to rise with line count, all else
being equal. That observation has been around since the 1970s.
While making your points, remember that she has a good one, too: R is
not the only good language out there. You should learn Python while
she's learning R.