[R] Do you use R for data manipulation?

kostas savvidis ksavvidis at gmail.com
Fri May 8 05:34:49 CEST 2009

>> Farrel Buchinsky wrote:
>>> Is R an appropriate tool for data manipulation and data reshaping and data
>>> organizing? I think so but someone who recently joined our group thinks
>>> not.
>>> The new recruit believes that python or another language is a far better
>>> tool for developing data manipulation scripts that can be then used by
>>> several members of our research group. Her assessment is that R is useful
>>> only when it comes to data analysis and working with statistical models.

If the project data is complex and heterogeneous, I use an SQL database
for manipulating it. Ideally, your data should be entered into the
database at the point of creation; if not, you are bound to be
using Python, Perl, Java, bash, etc. programs to input the stuff.
PostgreSQL is a very good choice these days; there are even
generators available that automatically produce web forms for data
entry and viewing, for those who have to use the web or can't be
bothered with SQL.

But once the data is in an SQL database, it is immensely more usable and
can be manipulated in a natural way (what they used to call the
data-centric way). From R, it is trivial to get it into a data frame,
with the column names generated automatically.
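
As a minimal sketch of that last step, assuming the DBI and RSQLite
packages are installed: an in-memory SQLite database stands in here for
a real server such as PostgreSQL (which would need its own DBI driver
and connection details), and the table and column names are invented
purely for illustration.

```r
# Sketch only: in-memory SQLite stands in for a real database server.
# The tables "subjects" and "measurements" are hypothetical examples.
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Made-up project data: subjects and their repeated measurements.
dbWriteTable(con, "subjects",
             data.frame(subject_id = 1:3,
                        site = c("A", "A", "B")))
dbWriteTable(con, "measurements",
             data.frame(subject_id = c(1L, 1L, 2L, 3L, 3L),
                        value = c(5.1, 4.9, 6.2, 5.8, 6.0)))

# Join and summarise in SQL; dbGetQuery() returns a data frame whose
# column names are taken from the query result.
per_site <- dbGetQuery(con, "
  SELECT s.site       AS site,
         COUNT(*)     AS n_obs,
         AVG(m.value) AS mean_value
  FROM measurements AS m
  JOIN subjects AS s ON s.subject_id = m.subject_id
  GROUP BY s.site
  ORDER BY s.site
")

dbDisconnect(con)
print(per_site)
```

The joining and aggregation happen in the database, so R only receives
the already-shaped result and can go straight to analysis.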

Kostas Savvidis
Nanjing University

PS: Perl, Python, and so on are definitely not to be pushed onto
everybody by the "expert" in the lab. But perhaps SQL is, especially
if you would like a web interface to your data.

Histion Partners LP
Nanjing University

+8625 8622 8040 (h)
+86 13451 911 944 (m)
