[R] Relational Databases or XML?

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Apr 10 22:49:24 CEST 2008


On Thu, 10 Apr 2008, Doran, Harold wrote:

> Well, I guess it is possible with the XML package on CRAN. But it seems
> there is no Windows binary (yet).

There certainly is -- did you try installing it from CRAN (extras)?
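
For what it is worth, a minimal sketch of reading XML directly in R with
the XML package might look like the following. It assumes the package is
already installed, and the document and its element names (study, obs,
subj, block, value) are invented purely for illustration.

```r
## Sketch only: assumes the XML package is installed, e.g. via
## install.packages("XML"). The XML content below is made up.
library(XML)

txt <- '<study>
  <obs><subj>1</subj><block>-1</block><value>0.25</value></obs>
  <obs><subj>1</subj><block>1</block><value>-1.10</value></obs>
  <obs><subj>2</subj><block>-1</block><value>0.73</value></obs>
</study>'

## Parse from a character string rather than from a file
doc <- xmlParse(txt, asText = TRUE)

## XPath query: contents of <value> for subject 1 only
vals <- as.numeric(xpathSApply(doc, "//obs[subj='1']/value", xmlValue))

## Flatten every <obs> node into one row of a data frame
dat <- xmlToDataFrame(nodes = getNodeSet(doc, "//obs"),
                      stringsAsFactors = FALSE)
dat[] <- lapply(dat, as.numeric)   # columns arrive as character
```

To read from a file instead, pass the file name to xmlParse() and drop
asText = TRUE.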


>> -----Original Message-----
>> From: r-help-bounces at r-project.org
>> [mailto:r-help-bounces at r-project.org] On Behalf Of Doran, Harold
>> Sent: Thursday, April 10, 2008 4:29 PM
>> To: Keith Alan Chamberlain; r-help at r-project.org
>> Subject: Re: [R] Relational Databases or XML?
>>
>> I'm not sure it is possible to parse an XML file in R
>> directly. Well, I guess it's *possible*, but may not be the
>> best way to do it. ElementTree in Python is an easy-to-use
>> parser that you might use to first parse your XML file (or
>> other hierarchically structured data), organize it any way
>> you want, and then bring those data into R for subsequent analysis.
>>
>> In fact, I have recently done just this. I have another
>> statistical program that outputs data as an XML file. So, I
>> wrote a Python program that parses that XML file, pulls out
>> the data of interest into a text file, and then I bring those
>> data into R for analysis.
>>
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org
>>> [mailto:r-help-bounces at r-project.org] On Behalf Of Keith Alan
>>> Chamberlain
>>> Sent: Thursday, April 10, 2008 4:14 PM
>>> To: r-help at r-project.org
>>> Subject: [R] Relational Databases or XML?
>>>
>>> Dear R-Help,
>>>
>>> I am working on a paper in an R course for large file support in R
>>> using scan(), relational databases, and XML. I have never used SQL or
>>> hierarchical document formats such as XML (except where it occurs
>>> without user interaction), and knowledge in RDBs and XML is lacking in
>>> my program. I have tried finding a working example for the
>>> novice's novice on the topic, read many postings, the R-data I/O manual
>>> several times, and descriptions of packages RODBC, DBI, XML, among
>>> others. I understand that RDBs are (assumed at least) used widely
>>> among the R community. I have not been able to put all of the pieces
>>> together, but assuming that RDB use is actually quite widespread, it
>>> should be quite easy to fill me in and/or correct my understanding
>>> where necessary.
>>>
>>> For a cross-platform solution (PC/OSX at least, or in part) my
>>> questions/problems are about what preliminary steps are needed to get
>>> an SQL or XML query "to work" in R to begin with, what the appropriate
>>> data-file formats are, and how to convert to them if starting out with
>>> data in, say, a delimited ASCII text file. Very basic examples should
>>> suffice, say, a table with 20 random observations, a grouping variable
>>> with 2 levels, and a factor with 2 levels.
>>>
>>> ## untested code
>>> set.seed(1024)
>>> ## write.table() takes the data first, then the file name
>>> write.table(data.frame(Subj=c(rep(1,10),rep(2,10)),
>>>                        block=rep(c(rep(-1,5),rep(1,5)),2),
>>>                        obs=rnorm(20,0,1)),
>>>             file="junk.txt")
>>>
>>> Specifically,
>>>
>>> 1- what are the minimum required non-R components that are needed to
>>> support SQL or XML functionality, which may or may not need to be
>>> installed?
>>>
>>> 2- what R packages need to be installed, at a minimum (also as a
>>> cross-PC/Mac solution if possible or at least as much as
>>> possible)?
>>>
>>> 3- I keep seeing reference to connections of a given name "if
>>> previously setup". What kind of setup is needed outside of R, if any?
>>>
>>> 4- what steps are needed in R to then connect to a file and import a
>>> subset based on a query?
>>>
>>> 5- Do I then use standard R routines (e.g. write()) to export as a DB,
>>> or an RDB/XML specific function?
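
One possible route through the five questions above, sketched with DBI and
RSQLite. That choice of backend is an assumption, not something named in
the thread; RODBC or another DBI driver would follow a similar pattern.
SQLite is embedded in the RSQLite package, so this route should need no
separate database server and no previously set-up connection. The table
name "junk" and the temporary file names are made up for illustration.

```r
## Sketch only: assumes the DBI and RSQLite packages are installed.
library(DBI)

set.seed(1024)
dat <- data.frame(Subj  = rep(1:2, each = 10),
                  block = rep(rep(c(-1, 1), each = 5), 2),
                  obs   = rnorm(20))

## Start from a delimited ASCII text file, as in the question
txt <- tempfile(fileext = ".txt")
write.table(dat, file = txt, sep = "\t", row.names = FALSE)

## Connect to a file-based database (created if it does not exist)
dbfile <- tempfile(fileext = ".sqlite")
con <- dbConnect(RSQLite::SQLite(), dbname = dbfile)

## Export: read the text file back and store it as a database table
dbWriteTable(con, "junk", read.delim(txt))

## Import a subset based on an SQL query
sub <- dbGetQuery(con,
                  "SELECT Subj, obs FROM junk WHERE block = 1 AND Subj = 2")

dbDisconnect(con)
```

In this sketch the export to a database goes through a database-specific
function, dbWriteTable(), rather than write(), and the query result comes
back as an ordinary data frame.
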
>>>
>>> Sincerely,
>>> KeithC. [U.S]
>>>
>>> 1/k^c
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


