[Rd] Creating XML document extremely slow

Milan Bouchet-Valat nalimilan at club.fr
Fri Feb 10 18:43:05 CET 2012


On Friday, 10 February 2012 at 17:36 +0100, Titus von der Malsburg
wrote:
> On Fri, Feb 10, 2012 at 2:10 PM, Milan Bouchet-Valat <nalimilan at club.fr> wrote:
> > On Friday, 10 February 2012 at 13:18 +0100, Titus von der Malsburg
> > wrote:
> > Just a guess, but I'd try creating all 'Marker' nodes first, storing
> > them in a 'markers' list, and then calling addChildren(markernode,
> > kids=markers).
> 
> A good guess.  I changed the code according to your suggestion and it
> reduced the processing time from ~25 to ~3 seconds.  Better but still
> ridiculously slow.  When I generate the same XML document by
> concatenating pre-fabricated strings, as suggested by Friedrich, the
> whole process takes just 10 ms according to system.time.
Doesn't sound so bad to me. I don't think you'll find a use case where
3s will really be a problem.

From what Rprof() says, xmlNode() doesn't seem to do anything obviously
wrong. It's just that you're calling it 500 times, so there's some
overhead. You'd need a vectorized version that would handle all the data
in one go, i.e. create all the children from the values of x and y, and
then add them to their respective parents, in one function call.

Actually, if you look at what xmlNode() does, you'll see that it can
easily be re-implemented to do this. Though, since your data is simple,
it might be as easy to write the output by hand as you said. The real
benefit of libxml2 appears when you create complex documents, and even
more when you need to parse them: the hardest part to implement is error
checking and finding nodes in a structure you don't perfectly know in
advance.
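
To illustrate the "write the output by hand" route in a vectorized way,
here is a minimal sketch in base R. It is not the code from this thread:
the sample values, the 'Markers' parent element and the x/y attribute
names are assumptions made for illustration only. The point is that
sprintf() formats all the children in a single call instead of building
one node per call.

```r
# Hypothetical data: the real x and y values are not shown in this thread.
x <- c(1.5, 2.5, 3.5)
y <- c(10, 20, 30)

# sprintf() is vectorized, so one call formats every 'Marker' element.
# The attribute names x and y are assumed for this example.
markers <- sprintf('  <Marker x="%s" y="%s"/>', x, y)

# Wrap the children in an assumed 'Markers' parent and join into one string.
doc <- paste(c("<Markers>", markers, "</Markers>"), collapse = "\n")
cat(doc, "\n")
```

Note that this does no escaping, so it is only safe when the values
cannot contain characters such as &, < or ". That is the kind of checking
you get for free from a real XML library.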


Cheers


