[R] Relational Databases or XML?
Doran, Harold
HDoran at air.org
Thu Apr 10 22:28:41 CEST 2008
I'm not sure it is possible to parse an XML file in R directly. Well, I
guess it's *possible*, but it may not be the best way to do it. ElementTree
in Python is an easy-to-use parser that you might use to first parse
your XML file (or other hierarchically structured data), organize it
any way you want, and then bring those data into R for subsequent
analysis.
In fact, I have recently done just this. I have another statistical
program that outputs data as an XML file. So, I wrote a Python program
that parses that XML file, pulls out the data of interest into a text
file, and then I bring those data into R for analysis.
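A minimal sketch of this kind of workflow, under stated assumptions: the XML layout below (the `results` and `record` elements, the `id` attribute, the `group` and `score` fields) and the output file name `records.txt` are hypothetical, made up for illustration, and are not the format of the program mentioned above.

```python
import csv
import xml.etree.ElementTree as ET

# Hypothetical XML; the real file's structure is not shown in this thread.
SAMPLE_XML = """<results>
  <record id="1"><group>a</group><score>0.52</score></record>
  <record id="2"><group>b</group><score>-1.30</score></record>
  <record id="3"><group>a</group><score>0.07</score></record>
</results>"""


def xml_to_rows(xml_text):
    """Return one [id, group, score] row per <record> element."""
    root = ET.fromstring(xml_text)
    rows = []
    for rec in root.iter("record"):
        rows.append([rec.get("id"), rec.findtext("group"), rec.findtext("score")])
    return rows


def write_rows(rows, path):
    """Write rows to a tab-delimited text file with a header line."""
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow(["id", "group", "score"])
        writer.writerows(rows)


rows = xml_to_rows(SAMPLE_XML)
write_rows(rows, "records.txt")
```

The resulting tab-delimited file can then be read into R with something like read.delim("records.txt").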
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Keith Alan
> Chamberlain
> Sent: Thursday, April 10, 2008 4:14 PM
> To: r-help at r-project.org
> Subject: [R] Relational Databases or XML?
>
> Dear R-Help,
>
> I am working on a paper in an R course for large file support
> in R using scan(), relational databases, and XML. I have
> never used SQL or hierarchical document formats such as XML
> (except where it occurs without user interaction), and
> knowledge in RDBs and XML is lacking in my program. I have
> tried finding a working example for the novice's novice on the
> topic, read many postings, the R Data Import/Export manual several
> times, and descriptions of packages RODBC, DBI, XML, among
> others. I understand that RDBs are (assumed at least) used
> widely among the R community. I have not been able to put all
> of the pieces together, but assuming that RDB use is actually
> quite widespread, it should be quite easy to fill me in
> and/or correct my understanding where necessary.
>
> For a cross-platform solution (PC/OSX at least, or in part)
> my questions/problems are about what preliminary steps are
> needed to get an SQL or XML query "to work" in R to begin
> with, what the appropriate data-file formats are, and how to
> convert to them if starting out with data in, say, a
> delimited ASCII text file. Very basic examples should
> suffice, say, a table with 20 random observations, a grouping
> variable with 2 levels, and a factor with 2 levels.
>
> ## untested code
> set.seed(1024)
> ## write.table() takes the data first and the file name second
> write.table(data.frame(Subj=c(rep(1,10),rep(2,10)),
>                        block=rep(c(rep(-1,5),rep(1,5)),2),
>                        obs=rnorm(20,0,1)),
>             file="junk.txt")
>
> Specifically,
>
> 1- what are the minimum required non-R components that are
> needed to support SQL or XML functionality, which may or may
> not need to be installed?
>
> 2- what R packages need to be installed, at a minimum (also
> as a cross-PC/Mac solution if possible or at least as much as
> possible)?
>
> 3- I keep seeing reference to connections of a given name "if
> previously setup". What kind of setup is needed outside of R, if any?
>
> 4- what steps are needed in R to then connect to a file and
> import a subset based on a query?
>
> 5- Do I then use standard R routines (e.g. write()) to export
> as a DB, or an RDB/XML specific function?
>
> Sincerely,
> KeithC. [U.S]
>
> 1/k^c