[R] Relational Databases or XML?

Keith Alan Chamberlain Keith.Chamberlain at Colorado.EDU
Thu Apr 10 22:14:07 CEST 2008

Dear R-Help,

I am writing a paper for an R course on large-file support in R using scan(), relational databases (RDBs), and XML. I have never used SQL or hierarchical document formats such as XML (except where they occur without user interaction), and my program offers little background in RDBs or XML. I have looked for a working example pitched at the novice's novice, read many postings, read the R Data Import/Export manual several times, and read the descriptions of the packages RODBC, DBI, and XML, among others. I understand that RDBs are (or at least are assumed to be) widely used in the R community. I have not been able to put all of the pieces together, but if RDB use really is that widespread, it should be easy for someone to fill me in and correct my understanding where necessary.

For a cross-platform solution (PC/OSX at least, or in part) my questions/problems are about what preliminary steps are needed to get an SQL or XML query "to work" in R to begin with, what the appropriate data-file formats are, and how to convert to them if starting out with data in, say, a delimited ASCII text file. Very basic examples should suffice, say, a table with 20 random observations, a grouping variable with 2 levels, and a factor with 2 levels.

## untested code
## write.table() takes the data first and the file name second
write.table(data.frame(Subj=c(rep(1,10),rep(2,10)),
                       block=rep(c(rep(-1,5),rep(1,5)),2),
                       obs=rnorm(20,0,1)),
            file="junk.txt")
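
To make the scan() part of the question concrete, here is a minimal sketch of the round trip I have in mind for the plain-text case, using base R only. It assumes the table is written without row names, so that every line after the header holds exactly three numbers. The temporary file location and the names junk_file, cols, back and sub1 are made up for illustration.

```r
## sketch: write the example table, then read it back with scan()
set.seed(1)
junk <- data.frame(Subj  = rep(1:2, each = 10),
                   block = rep(rep(c(-1, 1), each = 5), times = 2),
                   obs   = rnorm(20, 0, 1))

junk_file <- file.path(tempdir(), "junk.txt")   # hypothetical location
write.table(junk, file = junk_file, row.names = FALSE)

## with a list for 'what', scan() returns one component per column;
## skip = 1 drops the header line
cols <- scan(junk_file, what = list(Subj = 0, block = 0, obs = 0),
             skip = 1, quiet = TRUE)
back <- as.data.frame(cols)

## the "query": keep only subject 1 in block 1
sub1 <- subset(back, Subj == 1 & block == 1)
```

Here the whole file is read before subsetting, which is exactly what I would like to avoid for large files, hence the questions below.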


1- What is the minimum set of non-R components needed to support SQL or XML functionality, and which of them may need to be installed separately?

2- Which R packages need to be installed, at a minimum (ideally the same ones on PC and Mac, or as close to that as possible)?

3- I keep seeing references to connections of a given name "if previously setup". What kind of setup is needed outside of R, if any?

4- What steps are needed in R to then connect to a file and import a subset based on a query?

5- Do I then use standard R routines (e.g. write()) to export as a DB, or an RDB/XML specific function?
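
For questions 4 and 5, the workflow I am asking about might look roughly like the sketch below. It is pieced together from package descriptions and assumes the DBI and RSQLite packages are installed. The database file junk.sqlite and the table name junk are made up, and I do not know whether this is the recommended route.

```r
## sketch only; assumes install.packages(c("DBI", "RSQLite")) has been run
library(DBI)

junk <- data.frame(Subj  = rep(1:2, each = 10),
                   block = rep(rep(c(-1, 1), each = 5), times = 2),
                   obs   = rnorm(20, 0, 1))

db_file <- file.path(tempdir(), "junk.sqlite")  # hypothetical file name
con <- dbConnect(RSQLite::SQLite(), db_file)    # creates the file if absent

## question 5: export the data frame as a database table
dbWriteTable(con, "junk", junk, overwrite = TRUE)

## question 4: import only the rows that match a query
sub1 <- dbGetQuery(con,
  "SELECT Subj, block, obs FROM junk WHERE Subj = 1 AND block = 1")

dbDisconnect(con)
```

If this is about right, only the five matching rows come back into R as a data frame, and the rest of the table stays on disk.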

KeithC. [U.S]

