On Mon, Jun 4, 2012 at 3:38 PM, Daróczi Gergely <gergely@snowl.net> wrote:

>
>
> On Sun, Jun 3, 2012 at 2:32 PM, Michael Lawrence <
> lawrence.michael@gene.com> wrote:
>
>>
>>
>> On Sun, Jun 3, 2012 at 3:13 AM, Daróczi Gergely <gergely@snowl.net> wrote:
>>
>>>
>>>
>>> On Sun, Jun 3, 2012 at 6:53 AM, Michael Lawrence <
>>> lawrence.michael@gene.com> wrote:
>>>
>>>>
>>>>
>>>> On Sat, Jun 2, 2012 at 1:01 PM, Daróczi Gergely <gergely@snowl.net> wrote:
>>>>
>>>>> Dear Michael and other list members,
>>>>>
>>>>> I have not written to this list before, as I am still trying to learn
>>>>> ESS properly rather than having something useful to say, but I might
>>>>> have some alternative solutions/POC examples for your question. I
>>>>> apologise in advance for the long mail and the many links!
>>>>>
>>>>> I am currently developing an R package called pander<https://github.com/daroczig/pander>:
>>>>> a writer for Pandoc<http://johnmacfarlane.net/pandoc/>'s markdown
>>>>> (compatible with traditional markdown, with a bunch of extra features),
>>>>> which works like knitr with a brew-like syntax (among some other ways).
>>>>> The main advantage of this alternative way of generating reports is that
>>>>> you do not have to indicate in your tags in any way that a chunk will
>>>>> generate an image/plot, as the function used (evals<https://github.com/aL3xa/rapport/blob/master/R/eval.R>
>>>>> from the rapport <http://rapport-package.info/> package, developed by
>>>>> Aleksandar Blagotic and me) auto-recognizes that a graphics device was
>>>>> touched, renders the plot to a png file named something like
>>>>> "report-name-ID.png", and adds a link to it in the generated markdown
>>>>> document in place of the R code.
>>>>>
>>>>
>>>> This evals() function sounds very similar to the evaluate package that
>>>> knitr uses, except evaluate() seems like it should be called something more
>>>> like repl(), since it prints results instead of returning them directly.
>>>> What's missing for, say, integration with ESS and Sweave, would be a parser
>>>> for the block that would delegate to evals().
>>>>
>>>
>>> evals() does depend on evaluate() too ATM, to grab R results properly in
>>> not-so-clean R blocks (e.g. you return something in an R block but later
>>> do something else which would overwrite the returned R object), but it
>>> does more than that. It does not only print() the results (like
>>> evaluate) but returns the resulting (raw) R object along with some other
>>> fields (errors, warnings etc.). It also auto-renders plots (without the
>>> need to e.g. print lattice/ggplot2 plots) to image files with
>>> controllable resolution, even at high resolution.
>>>
>>>
>> Taking this off the ESS list, since it's a bit of a tangent. I guess it
>> would be nice to have a function that would *only* evaluate, even for the
>> ggplot2/lattice plot objects. One could then have a pander() method for the
>> plot classes that would print them.
>>
>
> You are right.
> I have created some functions and suggested key-bindings to use in ESS:
> http://daroczig.github.com/pander/#ess
> Please note that my experience with Lisp is really limited, but those
> functions seem to work well in my tests.
>
> I am seriously considering implementing what you suggested in "pander":
> adding an extra option to "evals", something like "grab.images", set to
> TRUE by default. With that it would be really easy to evaluate code chunks
> without printing, and to add some more "pander" methods.
>

Cool, I'll check it out.


>
>
>>
>>
>>> Unfortunately I am just learning Emacs/ESS, so that integration is
>>> beyond my knowledge, but I will try to think about it, thanks a lot! I
>>> really appreciate your kind feedback.
>>>
>>>
>>>>
>>>>
>>>>> Of course all generated images could be pdf, jpg etc.: it's just an
>>>>> option you set after loading the packages. And the resulting markdown can
>>>>> easily be converted to HTML/pdf/docx/odt etc. with a function call, but
>>>>> it might be better to check out some examples<http://daroczig.github.com/pander/#examples_link>.
>>>>>
>>>>> I am currently moving some parts of rapport into pander, as at the
>>>>> moment the requirements to run my functions are: install Pandoc and two R
>>>>> packages from GitHub<http://daroczig.github.com/pander/#installation_link>,
>>>>> which is painful. Once pander reaches its first final release (in a few
>>>>> days), it can hopefully go to CRAN (you can find rapport there, but
>>>>> unfortunately only a quite old build).
>>>>>
>>>>>
>>>> Do you think you could make an R package that solely interfaces with
>>>> pandoc and which embeds pandoc within itself (this would depend on the
>>>> license)? That avoids the system dependency.
>>>>
>>>
>>> I pushed some changes to "pander" last night which drop the "rapport"
>>> dependency. Now "pander" imports only from "brew" and "evaluate".
>>> Unfortunately I am not sure how I could integrate Pandoc into the package
>>> as sources: it is written in Haskell, so "cabal" would have to be
>>> installed on the user's computer to compile it, which is a huge
>>> dependency. I was thinking of adding a compiled version of Pandoc (with
>>> sources, based on the GPL license), but as far as I know that is against
>>> the "writing R packages" guidance.
>>>
>>>
>> Well, one option that is allowed would be to download the binary and
>> install it the first time the user loads the package, or tries to use
>> pandoc.
>>
>
> That would be really cool indeed, thanks for the suggestion. But I find it
> quite frustrating to parse Pandoc's download page for the current version
> (without adding e.g. the XML package to the package depends/imports), not
> to mention the different Linux distributions.
>
>
Linux is another story. I was just thinking about Windows and Mac. You just
need a direct link to a tested version. Getting the latest version would be
a liability in terms of consistency. The version of the package would be
tied to the pandoc version.
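
To make that concrete, here is a rough sketch of a first-use download in R.
Everything named in it is an assumption for illustration: the version
string, the example.org URL, the file extensions, and the function names
pandoc_binary_url() and ensure_pandoc() are placeholders, not real pandoc
download locations or pander functions.

```r
# Sketch only: version, host and file names are placeholders.
pinned_version <- "1.9.x"                      # placeholder, not a tested release
base_url       <- "https://example.org/pandoc" # placeholder host

# Build the direct link for the given platform (Windows and Mac only).
pandoc_binary_url <- function(sysname = Sys.info()[["sysname"]]) {
  ext <- switch(sysname,
                Windows = "exe",
                Darwin  = "dmg",
                stop("no pinned binary for platform: ", sysname))
  sprintf("%s/pandoc-%s.%s", base_url, pinned_version, ext)
}

# Fetch the pinned installer only if pandoc is not already on the PATH.
ensure_pandoc <- function(destdir = tempdir()) {
  if (nzchar(Sys.which("pandoc"))) return(invisible(NULL))
  url  <- pandoc_binary_url()
  dest <- file.path(destdir, basename(url))
  download.file(url, dest, mode = "wb")
  invisible(dest)  # running the installer is left out of this sketch
}
```

Calling something like ensure_pandoc() from the first function that needs
pandoc, rather than at load time, would keep the download out of the way of
users who only use the pure-R parts.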


> In short: I've added an "INSTALL" file to the package, and users are
> pointed to "Pandoc's install page" if a function which depends on Pandoc
> cannot find that great software. And I may not have stated this before,
> but only the exporting (pdf/html/docx etc.) features of "pander" really
> require Pandoc; everything else is done solely in R (like converting R
> objects to markdown, brewing documents etc.).
>
>>
>>
>>> After all: "pander" now depends on two quite common packages available
>>> from CRAN and on Pandoc, which can be installed quite easily on multiple
>>> platforms <http://johnmacfarlane.net/pandoc/installing.html>. Once
>>> "pander" gets to CRAN (hopefully in a few days), the installation process
>>> should not be so problematic. I hope :)
>>>
>>>
>>>>
>>>>
>>>>> But back to your mail, you asked: "It is of course currently possible
>>>>> to evaluate the *body* of a block, but what about the other parameters of
>>>>> the block with side effects?"
>>>>> It is IMHO.
>>>>>
>>>>> First, I would suggest using the above cited "evals" function, as it
>>>>> evaluates the R command, grabs the returned R object and anything
>>>>> written to stdout, renders the (optionally) generated image to disk with
>>>>> a controllable file name and options (like resolution etc.), grabs all
>>>>> messages/warnings/errors raised during evaluation, and returns all this
>>>>> in a structured list.
>>>>>
>>>>> Alex and I have developed a package around this (or rather: we created
>>>>> this function to serve as one of the backends of our package), which
>>>>> differs quite a bit from "traditional reproducible research" strategies:
>>>>> there we are dealing with templates <http://rapport-package.info/#templates>
>>>>> that can be run on any dataset (like a crosstab of two specified
>>>>> variables, a correlation matrix etc.).
>>>>> This package uses a brew-like syntax too, and all R code blocks found
>>>>> are passed to "evals", which grabs the values listed above. Besides the
>>>>> fact that you can export that list of report blocks to markdown/html/pdf
>>>>> etc. with the help of Pandoc, you still have the structured list of the
>>>>> evaluated report as an R object with meta-information (template used,
>>>>> description, author, proc.time etc.), the call (optionally with the
>>>>> dataset used attached) and each report element (text, the R code run,
>>>>> results as R objects, messages/errors etc., stdout and the paths of
>>>>> generated images).
>>>>>
>>>>> There are many ways to deal with that list; in the above cited
>>>>> packages we print to the console, write a markdown file (quite a wide
>>>>> variety of R objects are automatically transformed to markdown) or
>>>>> export to other formats.
>>>>>
>>>>>
>>>> This automatic, high-level transformation of R objects to markdown is
>>>> something that I have been seeking in a report generator. The simple
>>>> print()ing of R objects does not often integrate well with the document.
>>>>
>>>
>>> That transformation of R objects to markdown could IMHO be used in
>>> "knitr" too, by adding "pander" as a hook. I will add some examples to my
>>> docs soon. But I am puzzled at the moment (without any tests, so it's a
>>> rather theoretical question ATM) about the situation where "knitr"
>>> generates images. Not sure how I should deal with that.
>>>
>>
>> Right, the knitr hooks extend knitr; they do not override its behavior.
>> Preventing the printing is something that has to happen in the core.
>>
>
> I will check it out in detail. If the "knitr" hook for images provides the
> generated image's file name/path, then passing that to "pander.image"
> would solve this somehow. But I have no idea right now; I will check it
> out in a few days.
>
>
>>
>>
>>>
>>>
>>>>
>>>> Another thing I've been wanting is a way to output objects in multiple
>>>> ways, separate from the report itself. Typical side effects would be:
>>>> generation of an R data package with the result as a dataset, or storage of
>>>> the result in some other database. It would be cool to be able to specify,
>>>> on a per-block basis, the output driver (or a list of them). For example,
>>>> your pander() function could be a dual-dispatch S4 generic, dispatching on
>>>> both the object to export, and an object representing the target.
>>>>
>>>
>>> Thanks, this sounds really promising!
>>> And it could be done without too much work, as far as I can see now.
>>> "evals()" has a hook option, which has to be specified on the command
>>> line ATM, but I am now thinking of setting it automatically based on some
>>> parameters of the run environment. I have to think about this in detail.
>>>
>>>
>>>>
>>>> Thanks a lot for pointing out your package,
>>>> Michael
>>>>
>>>>
>>> Thank you for the inspiring ideas and suggestions, and also for your
>>> supportive feedback!
>>>
>>> Best,
>>> Gergely
>>>
>>>
>>>>
>>>>> I hope this could be useful/inspiring.
>>>>>
>>>>> Best,
>>>>> Gergely
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Jun 2, 2012 at 6:37 PM, Michael Lawrence <
>>>>> lawrence.michael@gene.com> wrote:
>>>>>
>>>>>> Interesting to hear the chatter about knitr today. Related to this is
>>>>>> the desire to evaluate Sweave, knitr, etc. reports incrementally and
>>>>>> interactively. It is of course currently possible to evaluate the
>>>>>> *body* of a block, but what about the other parameters of the block
>>>>>> with side effects? This is essentially a generalization of the report
>>>>>> generator to the notion of "annotated code blocks".
>>>>>>
>>>>>> My primary use case is interactive data analysis using a literate
>>>>>> programming document instead of a simple script. This is often a
>>>>>> convenient way to work, because it's easy to record thoughts, plans,
>>>>>> observations and the code in a single document, and then the final
>>>>>> result can be generated as a report. The analysis itself is an
>>>>>> iterative and interactive process, so continually regenerating a
>>>>>> report, even with caching, is not very efficient or convenient.
>>>>>>
>>>>>> Here are some concrete benefits:
>>>>>>
>>>>>> Figure files could be generated outside of the report generation,
>>>>>> i.e., when fig=TRUE. I am constantly having to write pdf()/dev.off()
>>>>>> around my code blocks. Utilities like ggsave() help a little in some
>>>>>> cases, but using the code block name to name the figure automatically
>>>>>> would be even more convenient.
>>>>>>
>>>>>> Caching support, as implemented by knitr or Seth's weaver package,
>>>>>> could be useful for saving intermediate results, for either passing
>>>>>> to a colleague or for resuming later.
>>>>>>
>>>>>> The extensibility of knitr opens up additional possibilities.
>>>>>>
>>>>>> These annotated meta blocks effectively separate the accidental
>>>>>> issues of saving and distributing results from the analysis itself.
>>>>>> What do you guys think? Are code blocks only for generating reports,
>>>>>> or could we use them in other ways?
>>>>>>
>>>>>> Michael
>>>>>>
>>>>>> ______________________________________________
>>>>>> ESS-help@r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/ess-help
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
