Re: [math][dbcp][pool] Apachecon NA (Austin)

Gilles Mon, 05 Jan 2015 09:15:42 -0800

On Mon, 5 Jan 2015 16:15:34 +0000, sebb wrote:

On 5 January 2015 at 15:54, Phil Steitz <phil.ste...@gmail.com> wrote:

On 1/5/15 5:39 AM, Gilles wrote:

Hi.


On Sun, 04 Jan 2015 14:37:41 -0700, Phil Steitz wrote:

On 1/4/15 5:58 AM, Gilles wrote:

Hello.

On Sun, 04 Jan 2015 12:09:35 +0100, Luc Maisonobe wrote:

Hi all,

On 04/01/2015 02:07, Gilles wrote:

On Fri, 02 Jan 2015 14:45:15 -0700, Phil Steitz wrote:

I am thinking about submitting a proposal or two for Austin.  I
could update / extend the pool/dbcp talk I did last year or try a
[math] talk.  I would love to have company developing and / or
presenting either of these.  Is anyone else interested in working on
a talk on either of these?  Any suggestions on content?

For [math] I have always wanted to do a high level overview followed
by some real world examples.  It would be great to make the examples
part a community effort.


It reminded me that I had yet to improve one toy example in the
"src/userguide/java/org/apache/commons/math3/userguide" section
of the repository.
It also occurred to me that I don't know how to compile and run
the applications stored there. :(
Is there a maven incantation to do so?


No, as Maven does not know about this directory (and should not,
IMHO).


I think that we should have some way to
1. automatically compile its content (so that we can ensure that the
   source tree does not contain any non-compilable stuff) and
2. run selected classes (so that users easily see CM code at work)

It looks like it should not be too difficult (for an experienced
maven user, which I'm not) to create a profile for doing just that.
[I've seen there is an "exec" plugin that could do (2), but I
couldn't find where one can specify an alternate source for
compilation.]

For [pool] / [dbcp] I did the boring part last year - summary of
changes in the 2.x versions, migration, etc. - so this year I could
focus on examples and best practices.  Again, a great thing to work
together on.

Another crazy idea I have had is a talk on how hard it is to design
stable APIs, using [math] as an example.


Is it really hard?  Isn't it rather that some developers just lack
the willpower to support less than ideal APIs? :-}


Yes, it is hard.


I wanted to stress exactly what you expand below, i.e. that needs are
changing, and that if we want to combine all the qualities of good
code (a.o. code that evolves with the developers' community, with
the users' community, with the language's state-of-the-art, with the
computers' power, etc.) we have to modify the APIs; it is a never
ending, but creative, task.

The alternative is, as I wrote above, to stick with less-than-ideal
APIs, and _that_ is not hard; but it is a dead end.


If you make good choices initially, it does not have to be a dead
end.  That's the challenge.


Getting it right from the outset is indeed hard.
In my own, obviously limited, programming experience, it _never_
occurred.


I agree it's hard.  I think we have actually succeeded in a few
places.  For example, the PRNG framework has been stable since first
release.  Admittedly, the core interface is just extracted from
java.util.Random, but the abstract superclass and framework for
the generators has worked very well.  Another example is the
distributions package.  Other than fiddling with sampling, the core
there has been stable since 1.x (10+ years now).  We have added /
extended capabilities and many, many distributions; but very little
incompatible change.
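
The arrangement described above (a small core interface modelled on
java.util.Random, plus an abstract superclass that derives the other
methods from a single primitive) can be sketched roughly as follows.
This is only an illustration under assumptions: UniformSource,
AbstractUniformSource and ToyLcgSource are hypothetical names, not the
[math] classes, and the toy generator is not meant for real use.

```java
/** Hypothetical core interface: a small subset of what java.util.Random offers. */
interface UniformSource {
    double nextDouble();      // uniform in [0, 1)
    int nextInt(int bound);   // uniform in [0, bound)
    boolean nextBoolean();
}

/** Hypothetical abstract superclass: derives every method from nextDouble(). */
abstract class AbstractUniformSource implements UniformSource {
    @Override
    public int nextInt(int bound) {
        if (bound <= 0) {
            throw new IllegalArgumentException("bound must be positive");
        }
        return (int) (nextDouble() * bound);
    }

    @Override
    public boolean nextBoolean() {
        return nextDouble() < 0.5;
    }
}

/** Toy generator (illustration only): concrete classes supply one primitive. */
final class ToyLcgSource extends AbstractUniformSource {
    private long state;

    ToyLcgSource(long seed) {
        state = seed;
    }

    @Override
    public double nextDouble() {
        // 48-bit linear congruential step, same constants as java.util.Random.
        state = (state * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);
        // Keep the top 26 bits and scale them into [0, 1).
        return (state >>> 22) / (double) (1 << 26);
    }
}

final class UniformSourceDemo {
    public static void main(String[] args) {
        UniformSource rng = new ToyLcgSource(42L);
        System.out.println(rng.nextDouble());
        System.out.println(rng.nextInt(10));
        System.out.println(rng.nextBoolean());
    }
}
```

With this shape, a new generator only has to implement nextDouble(),
and client code written against the interface does not change when
generators are added.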


Since, as Luc mentioned, development in CM (almost?) is always the
result of some definite (and more or less urgent) need of a single
developer, it looks pretty impossible to assume that we can get it
right from the outset. That is, unless we change the policy "that
whoever does the job...": it should rather become that whoever
does the job must show ("prove") that there is no better way to
implement the proposed functionality, which, needless to say, would
put all development to a final rest!


Personally I think that when we add things - and more importantly
when we consider changing existing APIs - we do need to think about
extensibility and how to model the problem.  The benefit of having a
community is that it does not have to be "one developer solving his
/ her problem."  The itch can certainly come from one developer, but
the solution belongs to the community.

Some of our APIs have been pretty
stable.


IMHO, some would benefit from being changed. I mean that some are
not stable because they are the best that can be, but because the
emphasis is on stability.

The problem with constantly changing APIs is they become
effectively worthless for real world use.


It depends.  Strictly it is not true, as I've mentioned several
times: people who are happy with some version "A.b" do not have
to change anything, ever.  Furthermore they can benefit from
new features (and refactoring) in version "C.d" by using another
JAR along the old one.

The sole problem here is that we don't want to maintain old
versions.

Even for me, as a *math
developer*, I have some of my own hacked / forked / semi-patched
versions of [math] things because I don't have time to refactor and
retest all of the code that uses now compat-broken stuff.


According to the above, that's because of your decision to not
use older CM libraries.


No, I had to hack / backport patches to get bug fixes.


That problem stems from lacking human resources.

I do not see that this problem should prevent making progress
in newer versions.
You seem to want to trade one kind of developers' work for
another.

Nothing prevents some of us from maintaining an old branch, making
backports, and releasing fully compatible versions... Just the
lack of time, I guess. :-}

I am not
advocating that we don't ever change APIs - just that we be
conservative in doing so so that users can count on some level of
stability.


I'm of the opinion that it should not be at the expense of the
other qualities of "good code".


Well, we disagree there.  Having a thing of beauty for us to look at
is way, way less valuable than a well-maintained (meaning bugs get
fixed) library with strong functionality and a stable API.  *That*
is what is valuable for users and what I personally am interested in
working on.

Somehow R does this pretty well.
The last comment is what I was thinking about exploring when I said
that modeling math algorithms using OO constructs is tricky.  Maybe
they are just much smarter than us (to some extent, I am sure that
is true); but I think R also has the advantage that it is really a
pretty much procedural setup (I know, R is weirdly OO in its own
way; but the public API is really pretty flat, procedural).


[I don't know/use "R".]
They got it right before CM did.  So I'm advocating that we remain
free to do changes until we get it right. ;-)

The other thing that I was thinking about in this area is basically
what the whole field of numerical analysis is about - the difference
between naive modelling of mathematical objects in the "natural" way
and what you need to do to get stable and accurate results.  Also,
the tradeoff between "correctly" handling corner cases and
extensions beyond what is well-defined mathematically and
performance.  The Complex class illustrates all of these things.
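
A small sketch of the "natural" versus numerically careful contrast,
using the modulus of a complex number. The class and method names
(ComplexAbsDemo, naiveAbs, scaledAbs) are hypothetical and this is not
the [math] Complex implementation; it only assumes the textbook
formula and the usual rescaling trick.

```java
final class ComplexAbsDemo {

    /** The "natural" formula: sqrt(re^2 + im^2). Overflows for large inputs. */
    static double naiveAbs(double re, double im) {
        return Math.sqrt(re * re + im * im);
    }

    /**
     * Rescaled formula: factor out the larger component so that the
     * squared quantity never exceeds 1. Corner cases (NaN, infinities)
     * are deliberately not handled in this sketch.
     */
    static double scaledAbs(double re, double im) {
        double a = Math.abs(re);
        double b = Math.abs(im);
        if (a == 0.0) {
            return b;
        }
        if (b == 0.0) {
            return a;
        }
        double big = Math.max(a, b);
        double small = Math.min(a, b);
        double q = small / big;
        return big * Math.sqrt(1.0 + q * q);
    }

    public static void main(String[] args) {
        double re = 3e200;
        double im = 4e200;
        // re * re exceeds Double.MAX_VALUE, so the naive result is infinite.
        System.out.println(naiveAbs(re, im));
        // The rescaled version stays finite.
        System.out.println(scaledAbs(re, im));
        // The JDK's Math.hypot also avoids the intermediate overflow.
        System.out.println(Math.hypot(re, im));
    }
}
```

The rescaled version costs a division and a few branches more than
the naive one, and a complete version would still need explicit
handling of NaN and infinite components, which is the kind of
accuracy / corner-case / performance tradeoff mentioned above.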


We can only observe that CM lacks the human resources to dig into
these things...

That talk would also call
out some of the special challenges that you run into modelling
mathematical objects using OO constructs.

That would be interesting.
Are mathematical concepts really more special to model than other
concepts?

No, the problem is that low level reusable components intended to be
used by lots of different users for lots of different needs are
difficult to set up. You have to meet conflicting needs and the
developers do not know in advance how their code will be used.

That is the core problem of API design, which we all three agree is
"hard."  As mentioned above, I really think that math presents some
special challenges, mostly having to do with the fact that "natural"
representations of things sometimes lead to both bad numerics and
bad extensibility.


I'm, really, curious.  Could you provide example(s)?


The linear package - and its history - is full of examples.   We
started out naively implementing things using the "natural"
representation of algebraic objects as arrays, etc and the result
was bad numerics, lots of excessive creation / copy operations, etc.

That's what I wrote below: "trying to implement generic algorithms
to solve as many practical problems as possible".

Thus: it is good to have generic, reusable, code (e.g. just from a
maintenance POV), but at some point, it will conflict with
unexpected usage. Hence API change will be in order.

In many cases, things start with "someone scratching an itch" because
open-source developers are the first users of their stuff. The
resulting API is biased towards this first need. Low level reusable
components developers have to make lots of efforts to design
something clean enough to anticipate other uses. Even experienced
low level reusable components developers don't succeed in this part.


I totally agree and did not say otherwise.

The wrong way would be to have duplicate code all over the place to
take care of each new usage. Since we try to avoid that, we'll need
to refactor when the "genericity" in one direction conflicts with
unanticipated usage.

I rather think that the issues arise from trying to sort out the
general from the particular, trying to implement generic algorithms
to solve as many practical problems as possible.


Perhaps, but it is really an important need for reusable
components.


From a development POV, I think so too.
For black-box users, it is not. [They don't care about what's in the
box as long as it does the job; and "no duplicate code" is, for
instance, not a requirement.  But this is a short-term view, since
maintenance will suffer and loss of quality will ensue.]

Another problem is that not moving forward (typically still staying
in the 3.X series instead of starting 4.0) creates additional
constraints. Trying to patch something wrong is much more difficult
than rewriting it, and sometimes it is even completely impossible if
wrong design choices cannot be changed.

+1

This quite naturally leads to design decisions that may be challenged
by the appearance of unforeseen cases (or better programming
skills).


Exactly. However, I prefer to say that here is the problem, and it is
a challenge we have to consider rather than saying it is something we
don't want to cope with so we ignore the problem and don't care about
users apart from our own needs.


So, we perfectly agree: i.e. we do not have the willpower to maintain
old cruft! :-)

Might be a little painful
to develop, but also maybe a little cathartic ;)
It would certainly be useful to understand where the pain really
comes from.


There is an old joke I have heard numerous times in different
activities (including outside of software development): remove
the user and everything gets way better.


Sure (including us!).
But my question was real: we cannot make the user disappear, nor
all the reasons why we write programs, so we have to let the APIs
evolve.  Why is there such pain in doing so?


Basically because constant compat breaks screw users.


Cf. above. I really do not understand.
[As Luc said, we (developers) should not go into contorted ways in
order to fix bad design decisions.]

The win-win
is obviously to do the hard thinking, compromising and testing to
get to stable APIs that we can extend without breaking existing
APIs.
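
One way "extend without breaking" can look in practice is sketched
below. It assumes Java 8 default methods are available, and all the
names (Density1D, logDensity, UnitUniformDensity) are hypothetical,
chosen for illustration rather than taken from [math].

```java
/** Hypothetical interface as first released: a single method. */
interface Density1D {
    double density(double x);

    /**
     * Capability added in a later release. The default body means that
     * classes written against the first release still compile and run
     * unchanged (assumes Java 8 or later).
     */
    default double logDensity(double x) {
        return Math.log(density(x));
    }
}

/** "Old" user code: written before logDensity existed, never modified. */
final class UnitUniformDensity implements Density1D {
    @Override
    public double density(double x) {
        return (x >= 0.0 && x <= 1.0) ? 1.0 : 0.0;
    }
}

final class CompatDemo {
    public static void main(String[] args) {
        Density1D u = new UnitUniformDensity();
        System.out.println(u.density(0.5));
        // The old implementation picks up the new method for free.
        System.out.println(u.logDensity(0.5));
    }
}
```

Implementations that can compute the logarithm more accurately are
free to override logDensity; the ones that do nothing keep working,
which is the property being argued for here.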


That is the ideal, no denial.
In reality, it is putting the bar too high for most contributors.

So here is the contradiction between letting unproven (design-wise)
code in, but then forbidding its demise because of stability.
IMO, this is an unsatisfying proposition.


Satisfying or not, it is our challenge.  Given that, as you
correctly state above, we "lack the human resources" to keep
supporting old versions, if we keep piling on incompatible API
breaks, our users are left with either unmaintained or unstable
code, neither of which is "satisfying" for them.  For me personally,
satisfying the practical needs of users is the reason that I
contribute here.  That is why I will always try to find ways to
extend APIs, add capabilities, fix bugs in a way that preserves
backward compatibility.


Strongly agree.

When looking at the impact of a change, it's important to remember
that there are many more consumers than there are producers (Commons
devs).
So if a change makes it easier for Commons devs, but harder for
consumers, then it is probably a bad change as it increases the total
workload.

This sometimes means that Commons devs have to do extra work.

There is a very different trade-off when working on in-house or
personal code.
In that case, the overall workload may be reduced by ignoring
downstream compatibility in some cases.


This is FLOSS; developers should not be left behind on the
"free" (as in "libre") part: I'm against extra work for
developers if the consumers did not voice their concerns here.

Don't assume that I do not care to try and keep compatibility.
But, as Luc also said (thanks for the support), it does not
make sense to try and fix things when the design is shown to
be short of new expectations.

I'd be keen to read what you have to say about "those who do
the work get to decide"?
If people are annoyed with the evolving API, they have to
bring it here.  IMHO, it's not fair that some of us bar
changes on the a-priori that unspecified people, out there,
might not be happy with them.
If they care about compatibility, they should provide the
necessary help needed to keep alternative branches alive...


Regards,
Gilles


