[R-pkg-devel] r-quantities seeking feedback

Iñaki Úcar i.ucar86 at gmail.com
Sat Oct 7 13:29:18 CEST 2017


2017-10-06 22:38 GMT+02:00 Bill Denney <bill at denney.ws>:
> Hi Iñaki and David,
>
> I fully see the need for a standardized unit package, and I understand the need for propagation of errors (though I'm in the opposite camp to David: I usually need unit tracking and conversion and rarely need error propagation, though that's because my error propagation is often nonlinear and sometimes not normally distributed, so I have to do it myself).

I plan to extend 'errors' to also support arbitrary distributions and
Monte Carlo (MC) propagation methods. There are already excellent
packages doing
this, but unlike with 'errors', you need a separate workflow to
propagate the uncertainty. I believe they could be integrated as
backends for 'errors'.
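
To make the MC idea concrete, here is a minimal base-R sketch of what
such a backend would have to do. It is only an illustration: the helper
name propagate_mc() is hypothetical and is not part of 'errors', and
the inputs are assumed to be independent and normally distributed.

```r
# Monte Carlo propagation of uncertainty through a nonlinear function.
# propagate_mc() is a hypothetical helper, not part of 'errors'.
propagate_mc <- function(f, means, sds, n = 1e5) {
  # Draw n samples per input, assuming independent normal inputs
  samples <- Map(function(m, s) rnorm(n, mean = m, sd = s), means, sds)
  # Push every sampled input combination through f
  out <- do.call(f, unname(samples))
  # Summarise the output distribution
  c(mean = mean(out), sd = sd(out))
}

set.seed(42)
# y = x1 * exp(x2), with x1 = 10 +/- 0.5 and x2 = 0.2 +/- 0.05
res <- propagate_mc(function(x1, x2) x1 * exp(x2),
                    means = c(10, 0.2), sds = c(0.5, 0.05))
print(res)
```

Non-normal inputs would only need a different sampler in place of
rnorm(); the rest of the sketch stays the same.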

> I agree with David in that: error propagation and unit tracking and conversion are different with partially-overlapping audiences.  But, I agree with Iñaki that there is a need for a consistent framework that can handle both.
>
> The reason for the need of a consistent framework is that if we have two separate packages that handle both they generally will be unaware of each other and may not play nicely together (ref the recent discussion on tibbles not always playing nicely with code expecting data.frames).  I think that three packages should generally be the goal:
> 1) One that handles units
> 2) One that handles error propagation
> 3) One that uses the other two to handle both units and error propagation

Yep, that's exactly our intent.

> One component that I didn't see in the discussion of your proposal is the extension of both libraries.
>
> For units, it should be possible to connect any set of units to any other set of units with a new conversion (e.g. mass and molar units could be connected with a molecular weight).  And, it should be possible to have multiple unit systems that can manage separate sets of rules (often an extension of a basic set of rules), and these should be possible to connect together.  The example for me again is with molecular weights, I may have molecule 1 that has a molecular weight of 100 g/mole and molecule 2 with a molecular weight of 200 g/mole; I would need to be able to store those at the same time without the system confusing the two.  And, I would also need to store the rule that 2 count of molecule 1 make 1 count of molecule 2.  (FYI, parts of this are in https://github.com/pacificclimate/Rudunits2/pull/9 )

I'm not sure how much discussion should be dedicated in the proposal
to the feature extension of both libraries, because many issues and
needs have yet to be identified. We are in conversations with David
Flater, author of reference [3] in the proposal, and he raised very
interesting points too regarding units. For example, operations with
counting units: if you have 2 pixels * 2 pixels, you want 4 pixels as
output, and not 4 pixels^2.
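
A toy sketch of that behaviour, using a small S3 class written only for
this example (the quantity() constructor and its 'counting' flag are
hypothetical; this is not how 'units' represents anything):

```r
# Toy S3 class: a number tagged with a unit symbol and an exponent.
# 'counting' is a hypothetical flag marking units that do not square.
quantity <- function(value, unit, power = 1, counting = FALSE) {
  structure(list(value = value, unit = unit, power = power,
                 counting = counting), class = "quantity")
}

"*.quantity" <- function(e1, e2) {
  stopifnot(identical(e1$unit, e2$unit))  # same-unit products only
  # Counting units keep exponent 1; dimensional units add exponents
  power <- if (e1$counting) 1 else e1$power + e2$power
  quantity(e1$value * e2$value, e1$unit, power, e1$counting)
}

format.quantity <- function(x, ...) {
  suffix <- if (x$power == 1) "" else paste0("^", x$power)
  paste0(x$value, " ", x$unit, suffix)
}

px <- quantity(2, "pixel", counting = TRUE)
m  <- quantity(2, "m")
format(px * px)  # "4 pixel"
format(m * m)    # "4 m^2"
```

Whether a per-unit flag is the right design is exactly the kind of
question that is still open.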

> For both units and error propagation, these will need to work with general functions in packages that do not explicitly support the new packages.  As an example, lm, glm, gls, etc. (along with thousands of other functions) are unlikely to be modified to support the packages.  There should be some mechanism to make a simple wrapper function that looks at the input and understands how to map the output. Such as:
>
> lm_quantities <- function(...) {
> # look at the LHS of the formula argument, and apply any maths required to determine the units of the LHS.
> # call lm normally
> # assign units and/or error propagation to the result of the lm call
> }
>
> That would have to be repeated for any other function of interest.  Straightforward examples that are part of the recommended libraries would hopefully be covered, and other library authors should have a simple way of assessing what the right units and error measures are to add it to their own libraries (optionally).

This, on the other hand, is not about new features, but about general
compatibility, and I agree it should be further discussed in the
proposal. I'll add some discussion along these lines.
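
As a starting point for that discussion, the wrapper idea quoted above
could look roughly like the sketch below. It rests on several
assumptions made only for this example: units are plain strings kept in
a "unit" attribute on each column (not how 'units' or 'errors' store
their metadata), every term in the formula is a bare column name, and
no maths is applied on the LHS.

```r
# Hypothetical wrapper: fit with lm() on bare numbers, then label the
# coefficients with units derived from the response and predictors.
lm_quantities <- function(formula, data) {
  fit <- lm(formula, data = data)       # call lm normally
  vars <- all.vars(formula)
  y_unit <- attr(data[[vars[1]]], "unit")  # unit of the LHS
  # Intercept carries the LHS unit; a slope carries LHS unit / RHS unit
  coef_units <- vapply(names(coef(fit)), function(term) {
    if (term == "(Intercept)") return(y_unit)
    paste0(y_unit, "/", attr(data[[term]], "unit"))
  }, character(1))
  # Standard errors share the units of their coefficients
  se <- summary(fit)$coefficients[, "Std. Error"]
  list(fit = fit, coef_units = coef_units, se = se)
}

d <- data.frame(time = 1:10)
d$dist <- 3 * d$time + 1 + rep(c(0.1, -0.1), 5)
attr(d$time, "unit") <- "s"
attr(d$dist, "unit") <- "m"
res <- lm_quantities(dist ~ time, d)
res$coef_units
```

Transformed terms such as log(y) or I(x^2) would need real unit
arithmetic on the formula, which this sketch does not attempt.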

Thank you very much, Bill. This feedback is very useful.

Iñaki


