Re: [R-pkg-devel] r-quantities seeking feedback

2017-10-07 Thread Iñaki Úcar
2017-10-06 22:28 GMT+02:00 David Hugh-Jones:
> Many measurements have no unit, but some uncertainty - e.g. the b and se
> from an arbitrary regression. Can you give specific examples of the
> advantages from binding these packages tightly together?

As Duncan already pointed out, the units of b and se from an arbitrary
regression depend on the units of your variables. Integrating both
packages gives you the advantages of each one within a single workflow,
the same one you would use with bare numbers.
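
To illustrate the first point, here is a minimal base-R sketch. The data
are made up for illustration, and the variable names are hypothetical. It
fits the same regression twice, with the predictor in kg and then in g,
and compares the slope and its standard error.

```r
# Made-up data: a length in metres measured against a mass in kilograms.
mass_kg  <- c(1.0, 2.0, 3.0, 4.0, 5.0)
length_m <- c(2.1, 3.9, 6.2, 7.8, 10.1)

fit_kg <- lm(length_m ~ mass_kg)

# The same masses expressed in grams.
mass_g <- mass_kg * 1000
fit_g  <- lm(length_m ~ mass_g)

# Slope (b) and its standard error (se) for each fit.
b_kg <- coef(summary(fit_kg))["mass_kg", c("Estimate", "Std. Error")]
b_g  <- coef(summary(fit_g))["mass_g",  c("Estimate", "Std. Error")]

# Both b and se are implicitly in m/kg (or m/g): rescaling the predictor
# by a factor of 1000 rescales both of them by 1/1000.
ratio <- b_kg / b_g
print(ratio)
```

The fit itself knows nothing about units, so the factor of 1000 has to be
tracked by hand. That bookkeeping is what a units-aware workflow would
take over.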

It seems that you are already aware of the advantages of automatic
error propagation. As for the units package, it is very useful for
painless conversion of units. A conversion from kg to g is elementary,
but others require more care, for example J to eV, or N.m-1 to
dyn.cm-1. In electromagnetism, it is very common to work with the CGS
unit system, and automatic conversion to and from SI comes in
handy.
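
As a rough sketch of those conversions, assuming the 'units' package is
installed together with its udunits database: the example below passes
unit names as strings via mode = "standard" and spells out "dyne", on the
assumption that the database knows the unit under that name.

```r
library(units)

# kg -> g: the elementary case.
mass   <- set_units(1.5, "kg", mode = "standard")
mass_g <- set_units(mass, "g", mode = "standard")   # 1500 g

# J -> eV: the conversion factor comes from the unit database.
energy    <- set_units(1, "J", mode = "standard")
energy_eV <- set_units(energy, "eV", mode = "standard")

# N/m -> dyne/cm (SI to CGS): 1 N = 1e5 dyne and 1 m = 100 cm.
tension     <- set_units(1, "N/m", mode = "standard")
tension_cgs <- set_units(tension, "dyne/cm", mode = "standard")  # 1000 dyne/cm

# A conversion between incompatible units is an error, not a silent bug.
bad <- try(set_units(mass, "m", mode = "standard"), silent = TRUE)

print(mass_g)
print(energy_eV)
print(tension_cgs)
print(inherits(bad, "try-error"))
```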

If you are not persuaded already, we can also talk about the Mars
Climate Orbiter, a robotic space probe launched by NASA in 1998, which
disintegrated in Mars' upper atmosphere because of a computation with
the wrong units.

Iñaki

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [R-pkg-devel] r-quantities seeking feedback

2017-10-07 Thread Iñaki Úcar
2017-10-06 22:38 GMT+02:00 Bill Denney:
> Hi Iñaki and David,
>
> I fully see the need in a standardized unit package, and I understand the 
> need for propagation of errors (though I'm in the opposite camp to David 
> where I usually need unit tracking and conversion and rarely need error 
> propagation-- though that's because my error propagation is often nonlinear 
> and sometimes not normally distributed, so I have to do it myself).

I plan to extend 'errors' to also support arbitrary distributions and
MC propagation methods. There are already excellent packages that do
this but, unlike 'errors', they require a separate workflow to
propagate the uncertainty. I believe they could be integrated as
backends for 'errors'.
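
To show why the nonlinear case matters, here is a minimal base-R sketch
comparing first-order (Taylor) propagation with Monte Carlo propagation
through f(x) = exp(x). The numbers are illustrative, the input is assumed
to be normally distributed, and this is not how 'errors' is implemented.

```r
set.seed(42)

# A measured value with its standard uncertainty (made-up numbers).
x_mean <- 2.0
x_sd   <- 0.5

f <- function(x) exp(x)

# First-order Taylor propagation: sd(f) is approximated by |f'(x)| * sd(x).
# For f = exp, the derivative is exp(x) itself.
taylor_sd <- abs(exp(x_mean)) * x_sd

# Monte Carlo propagation: sample the input, push the samples through f,
# and summarise the resulting distribution.
n       <- 1e5
samples <- f(rnorm(n, mean = x_mean, sd = x_sd))
mc_mean <- mean(samples)
mc_sd   <- sd(samples)

print(c(taylor_sd = taylor_sd, mc_sd = mc_sd, mc_mean = mc_mean))
```

With an uncertainty this large relative to the curvature of f, the Monte
Carlo spread comes out wider than the linearised estimate, and the output
distribution is skewed rather than normal.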

> I agree with David in that: error propagation and unit tracking and 
> conversion are different with partially-overlapping audiences.  But, I agree 
> with Iñaki that there is a need for a consistent framework that can handle 
> both.
>
> The reason for the need of a consistent framework is that if we have two 
> separate packages that handle both they generally will be unaware of each 
> other and may not play nicely together (ref the recent discussion on tibbles 
> not always playing nicely with code expecting data.frames).  I think that 
> three packages should generally be the goal:
> 1) One that handles units
> 2) One that handles error propagation
> 3) One that uses the other two to handle both units and error propagation

Yep, that's exactly our intent.

> The component that I didn't see in your discussion of your proposal is
> extension of both libraries.
>
> For units, it should be possible to connect any set of units to any other set 
> of units with a new conversion (e.g. mass and molar units could be connected 
> with a molecular weight).  And, it should be possible to have multiple unit 
> systems that can manage separate sets of rules (often an extension of a basic 
> set of rules), and these should be possible to connect together.  The example 
> for me again is with molecular weights, I may have molecule 1 that has a 
> molecular weight of 100 g/mole and molecule 2 with a molecular weight of 200 
> g/mole; I would need to be able to store those at the same time without the 
> system confusing the two.  And, I would also need to store the rule that 2
> count of molecule 1 make 1 count of molecule 2.  (FYI, parts of this are in 
> https://github.com/pacificclimate/Rudunits2/pull/9 )

I'm not sure how much of the proposal should be dedicated to
extending the features of both libraries, because many issues and
needs have yet to be identified. We are in conversation with David
Flater, author of reference [3] in the proposal, and he raised very
interesting points regarding units too. For example, operations with
counting units: if you have 2 pixels * 2 pixels, you want 4 pixels as
output, and not 4 pixels^2.

> For both units and error propagation, these will need to work with general 
> functions in packages that do not explicitly support the new packages.  As an 
> example, the lm, glm, gls, etc. (along with thousands of other) functions are 
> unlikely to be modified for support of the packages).  There should be some 
> mechanism to make a simple wrapper function that looks at the input and 
> understands how to map the output. Such as:
>
> lm_quantities <- function(...) {
>   # look at the LHS of the formula argument, and apply any maths required
>   # to determine the units of the LHS.
>   # call lm normally
>   # assign units and/or error propagation to the result of the lm call
> }
>
> That would have to be repeated for any other function of interest.  
> Straight-forward examples that are part of the recommended libraries would 
> hopefully be covered, and other library authors should have a simple way of 
> assessing what the right units and error measures are to add it to their own 
> libraries (optionally).

This, on the other hand, is not about new features but about general
compatibility, and I agree it should be discussed further in the
proposal. I'll add some discussion along these lines.
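
As a starting point for that discussion, a wrapper of this kind could be
sketched as follows. This is only an illustration, assuming the 'units'
package is installed. It is simplified to one response and one predictor
passed as vectors rather than a formula, and it ignores error
propagation. The signature and the returned list are hypothetical.

```r
library(units)

# Hypothetical wrapper: remember the units, fit on bare numbers, and
# re-attach the units to the coefficients afterwards.
lm_quantities <- function(y, x) {
  stopifnot(inherits(y, "units"), inherits(x, "units"))

  # Record the units of both variables as strings, e.g. "m" and "kg".
  y_unit <- deparse_unit(y)
  x_unit <- deparse_unit(x)

  # Call lm normally on the unit-free values.
  fit <- lm(drop_units(y) ~ drop_units(x))
  b   <- unname(coef(fit))

  # The intercept has the units of y; the slope has units of y per unit x.
  list(
    fit       = fit,
    intercept = set_units(b[1], y_unit, mode = "standard"),
    slope     = set_units(b[2], y_unit, mode = "standard") /
                set_units(1, x_unit, mode = "standard")
  )
}

# Usage with made-up data.
y   <- set_units(c(2.1, 3.9, 6.2, 7.8, 10.1), "m", mode = "standard")
x   <- set_units(c(1, 2, 3, 4, 5), "kg", mode = "standard")
res <- lm_quantities(y, x)
print(res$slope)

# Because the slope carries units, it can be converted like any other value.
slope_mm <- set_units(res$slope, "mm/kg", mode = "standard")
print(slope_mm)
```

A general version would have to parse the formula and work out the units
of arbitrary expressions on both sides, which is the hard part.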

Thank you very much, Bill. This feedback is very useful.

Iñaki


Re: [R-pkg-devel] r-quantities seeking feedback

2017-10-07 Thread David Hugh-Jones
Hi Iñaki,

OK, it sounds like we have no practical disagreement: you're planning to
keep separate packages and then have a third one for integration. That will
be fine for people like me who don't necessarily want to specify units for
our regressions. I look forward to seeing this!

Cheers,
David

On 7 October 2017 at 13:00, Iñaki Úcar wrote:

> 2017-10-06 22:28 GMT+02:00 David Hugh-Jones:
> > Many measurements have no unit, but some uncertainty - e.g. the b and se
> > from an arbitrary regression. Can you give specific examples of the
> > advantages from binding these packages tightly together?
>
> As Duncan already pointed out, the units of b and se from an arbitrary
> regression depend on the units of your variables. Integrating both
> packages gives you the advantages of each one within a single workflow,
> the same one you would use with bare numbers.
>
> It seems that you are already aware of the advantages of automatic
> error propagation. As for the units package, it is very useful for
> painless conversion of units. A conversion from kg to g is elementary,
> but others require more care, for example J to eV, or N.m-1 to
> dyn.cm-1. In electromagnetism, it is very common to work with the CGS
> unit system, and automatic conversion to and from SI comes in
> handy.
>
> If you are not persuaded already, we can also talk about the Mars
> Climate Orbiter, a robotic space probe launched by NASA in 1998, which
> disintegrated in Mars' upper atmosphere because of a computation with
> the wrong units.
>
> Iñaki
>
