Re: [math][dbcp][pool] Apachecon NA (Austin)

sebb Mon, 05 Jan 2015 08:18:37 -0800

On 5 January 2015 at 15:54, Phil Steitz <phil.ste...@gmail.com> wrote:
> On 1/5/15 5:39 AM, Gilles wrote:
>> Hi.
>>
>> On Sun, 04 Jan 2015 14:37:41 -0700, Phil Steitz wrote:
>>> On 1/4/15 5:58 AM, Gilles wrote:
>>>> Hello.
>>>>
>>>> On Sun, 04 Jan 2015 12:09:35 +0100, Luc Maisonobe wrote:
>>>>> Hi all,
>>>>>
>>>>> On 04/01/2015 02:07, Gilles wrote:
>>>>>> On Fri, 02 Jan 2015 14:45:15 -0700, Phil Steitz wrote:
>>>>>>> I am thinking about submitting a proposal or two for Austin.  I
>>>>>>> could update / extend the pool/dbcp talk I did last year or
>>>>>>> try a
>>>>>>> [math] talk.  I would love to have company developing and / or
>>>>>>> presenting either of these.  Is anyone else interested in
>>>>>>> working on
>>>>>>> a talk on either of these?  Any suggestions on content?
>>>>>>>
>>>>>>> For [math] I have always wanted to do a high level overview
>>>>>>> followed
>>>>>>> by some real world examples.  It would be great to make the
>>>>>>> examples
>>>>>>> part a community effort.
>>>>>>
>>>>>> It reminded me that I had yet to improve one toy example in the
>>>>>> "src/userguide/java/org/apache/commons/math3/userguide" section
>>>>>> of the repository.
>>>>>> It also occurred to me that I don't know how to compile and run
>>>>>> the applications stored there. :(
>>>>>> Is there a maven incantation to do so?
>>>>>
>>>>> No as maven does not know about this directory (and should not
>>>>> IMHO).
>>>>
>>>> I think that we should have some way to
>>>> 1. automatically compile its content (so that we can ensure that
>>>> the
>>>>    source tree does not contain any non-compilable stuff) and
>>>> 2. run selected classes (so that users easily see CM code at work)
>>>>
>>>> It looks like it should not be too difficult (for an experienced
>>>> maven user, which I'm not) to create a profile for doing just that.
>>>> [I've seen there is an "exec" plugin that could do (2), but I
>>>> couldn't find where one can specify an alternate source for
>>>> compilation.]
>>>>
>>>>>>
>>>>>>>
>>>>>>> For [pool] / [dbcp] I did the boring part last year - summary of
>>>>>>> changes in the 2.x versions, migration, etc. - so this year I
>>>>>>> could
>>>>>>> focus on examples and best practices.  Again, a great thing to
>>>>>>> work
>>>>>>> together on.
>>>>>>>
>>>>>>> Another crazy idea I have had is a talk on how hard it is to
>>>>>>> design
>>>>>>> stable APIs, using [math] as an example.
>>>>>>
>>>>>> Is it really hard?  Isn't it rather that some developers just
>>>>>> lack
>>>>>> the willpower to support less than ideal APIs? :-}
>>>>>
>>>>> Yes, it is hard.
>>>>
>>>> I wanted to stress exactly what you expand below, i.e. that
>>>> needs are
>>>> changing, and that if we want to combine all the qualities of good
>>>> code (a.o. code that evolves with the developers' community, with
>>>> the users' community, with the language's state-of-the-art, with
>>>> the
>>>> computers' power, etc.) we have to modify the APIs; it is a never
>>>> ending, but creative, task.
>>>> The alternative is, as I wrote above, to stick with less-than-ideal
>>>> APIs, and _that_ is not hard; but it is a dead end.
>>>
>>> If you make good choices initially, it does not have to be a dead
>>> end.  That's the challenge.
>>
>> Getting it right from the outset is indeed hard.
>> In my own, obviously limited, programming experience, it _never_
>> occurred.
>
> I agree it's hard.  I think we have actually succeeded in a few
> places.  For example, the PRNG framework has been stable since first
> release.  Admittedly, the core interface is just extracted from
> java.util.Random, but the abstract superclass and framework for
> the generators has worked very well.  Another example is the
> distributions package.  Other than fiddling with sampling, the core
> there has been stable since 1.x (10+ years now).  We have added /
> extended capabilities and many, many distributions; but very little
> incompatible change.
>
>>
>> Since, as Luc mentioned, development in CM is (almost?) always the
>> result of some definite (and more or less urgent) need of a single
>> developer, it looks pretty impossible to assume that we can get it
>> right from the outset. That is, unless we change the policy "that
>> whoever does the job...": it should rather become that whoever
>> does the job must show ("prove") that there is no better way to
>> implement the proposed functionality, which, needless to say, would
>> put all development to a final rest!
>
> Personally I think that when we add things - and more importantly
> when we consider changing existing APIs - we do need to think about
> extensibility and how to model the problem.  The benefit of having a
> community is that it does not have to be "one developer solving his
> / her problem."  The itch can certainly come from one developer, but
> the solution belongs to the community.
>>
>>> Some of our APIs have been pretty
>>> stable.
>>
>> IMHO, some would benefit from being changed. I mean that some are
>> not stable because they are the best that can be, but because the
>> emphasis is on stability.
>>
>>> The problem with constantly changing APIs is they become
>>> effectively worthless for real world use.
>>
>> It depends.  Strictly it is not true, as I've mentioned several
>> times: people who are happy with some version "A.b" do not have
>> to change anything, ever.  Furthermore they can benefit from
>> new features (and refactoring) in version "C.d" by using another
>> JAR alongside the old one.
>>
>> The sole problem here is that we don't want to maintain old
>> versions.
>>
>>> Even for me, as a *math
>>> developer*, I have some of my own hacked / forked / semi-patched
>>> versions of [math] things because I don't have time to refactor and
>>> retest all of the code that uses now compat-broken stuff.
>>
>> According to the above, that's because of your decision to not
>> use older CM libraries.
>
> No, I had to hack / backport patches to get bug fixes.
>>
>>> I am not
>>> advocating that we don't ever change APIs - just that we be
>>> conservative in doing so so that users can count on some level of
>>> stability.
>>
>> I'm of the opinion that it should not be at the expense of the
>> other qualities of "good code".
>
> Well, we disagree there.  Having a thing of beauty for us to look at
> is way, way less valuable than a well-maintained (meaning bugs get
> fixed) library with strong functionality and a stable API.  *That*
> is what is valuable for users and what I personally am interested in
> working on.
>
>>
>>> Somehow R does this pretty well.
>>>
>>> The last comment is what I was thinking about exploring when I said
>>> that modeling math algorithms using OO constructs is tricky.  Maybe
>>> they are just much smarter than us (to some extent, I am sure that
>>> is true); but I think R also has the advantage that it is really a
>>> pretty much procedural setup (I know, R is weirdly OO in its own
>>> way; but the public API is really pretty flat, procedural).
>>
>> [I don't know/use "R".]
>> They got it right before CM did.  So I'm advocating that we remain
>> free to do changes until we get it right. ;-)
>>
>>> The other thing that I was thinking about in this area is basically
>>> what the whole field of numerical analysis is about - the difference
>>> between naive modelling of mathematical objects in the "natural" way
>>> and what you need to do to get stable and accurate results.  Also,
>>> the tradeoff between "correctly" handling corner cases and
>>> extensions beyond what is well-defined mathematically and
>>> performance.  The Complex class illustrates all of these things.
>>
>> We can only observe that CM lacks the human resources to dig into
>> these things...
>>
>>>>>>
>>>>>>> That talk would also call
>>>>>>> out some of the special challenges that you run into modelling
>>>>>>> mathematical objects using OO constructs.
>>>>>>
>>>>>> That would be interesting.
>>>>>> Are mathematical concepts really more special to model than other
>>>>>> concepts?
>>>>>
>>>>> No, the problem is that low level reusable components intended
>>>>> to be
>>>>> used by lots of different users for lots of different needs are
>>>>> difficult to set up. You have to meet conflicting needs and the
>>>>> developers do not know in advance how their code will be used.
>>>
>>> That is the core problem of API design, which we all three agree is
>>> "hard."  As mentioned above, I really think that math presents some
>>> special challenges, mostly having to do with the fact that "natural"
>>> representations of things sometimes lead to both bad numerics and
>>> bad extensibility.
>>
>> I'm, really, curious.  Could you provide example(s)?
>
> The linear package - and its history - is full of examples.   We
> started out naively implementing things using the "natural"
> representation of algebraic objects as arrays, etc and the result
> was bad numerics, lots of excessive creation / copy operations, etc.
>>
>>>> That's what I wrote below: "trying to implement generic algorithms
>>>> to solve as many practical problems as possible".
>>>> Thus: it is good to have generic, reusable, code (e.g. just from a
>>>> maintenance POV), but at some point it will conflict with
>>>> unexpected usage. Hence API change will be in order.
>>>>
>>>>> In many cases, things start with "someone scratching an itch"
>>>>> because
>>>>> open-source developers are the first users of their stuff. The
>>>>> resulting
>>>>> API is biased towards this first need. Low level reusable
>>>>> components
>>>>> developers have to make lots of efforts to design something clean
>>>>> enough to anticipate other uses. Even experienced low level
>>>>> reusable
>>>>> components developers don't succeed in this part.
>>>>
>>>> I totally agree and did not say otherwise.
>>>> The wrong way would be to have duplicate code all over the place to
>>>> take care of each new usage. Since we try to avoid that, we'll need
>>>> to refactor when the "genericity" in one direction conflicts with
>>>> unanticipated usage.
>>>>
>>>>>
>>>>>> I rather think that the issues arise from trying to sort out the
>>>>>> general from the particular, trying to implement generic
>>>>>> algorithms
>>>>>> to solve as many practical problems as possible.
>>>>>
>>>>> Perhaps, but it is really an important need for reusable
>>>>> components.
>>>>
>>>> From a development POV, I think so too.
>>>> For black-box users, it is not. [They don't care about what's in
>>>> the
>>>> box as long as it does the job; and "no duplicate code" is, for
>>>> instance, not a requirement.  But this is a short-term view, since
>>>> maintenance will suffer and loss of quality will ensue.]
>>>>
>>>>> Another problem is that not moving forward (typically still
>>>>> staying
>>>>> in the 3.X series instead of starting 4.0) creates additional
>>>>> constraints. Trying to patch something wrong is much more
>>>>> difficult
>>>>> than rewriting it, and sometimes it is even completely
>>>>> impossible if
>>>>> wrong design choices cannot be changed.
>>>>
>>>> +1
>>>>
>>>>>> This quite naturally leads to design decisions that may be
>>>>>> challenged
>>>>>> by the appearance of unforeseen cases (or better programming
>>>>>> skills).
>>>>>
>>>>> Exactly. However, I prefer to say that here is the problem, and
>>>>> it is
>>>>> a challenge we have to consider rather than saying it is
>>>>> something we
>>>>> don't want to cope with so we ignore the problem and don't care
>>>>> about
>>>>> users apart from our own needs.
>>>>
>>>> So, we perfectly agree: i.e. we do not have the willpower to
>>>> maintain
>>>> old cruft! :-)
>>>>
>>>>>>> Might be a little painful
>>>>>>> to develop, but also maybe a little cathartic ;)
>>>>>>
>>>>>> It would certainly be useful to understand where the pain really
>>>>>> comes from.
>>>>>
>>>>> There is an old joke I have heard numerous times in different
>>>>> activities (including outside of software development): remove
>>>>> the user and everything gets way better.
>>>>
>>>> Sure (including us!).
>>>> But my question was real: we cannot make the user disappear, nor
>>>> all the reasons why we write programs, so we have to let the APIs
>>>> evolve.  Why is there such pain in doing so?
>>>
>>> Basically because constant compat breaks screw users.
>>
>> Cf. above. I really do not understand.
>> [As Luc said, we (developers) should not go into contorted ways in
>> order to fix bad design decisions.]
>>
>>> The win-win
>>> is obviously to do the hard thinking, compromising and testing to
>>> get to stable APIs that we can extend without breaking existing
>>> APIs.
>>
>> That is the ideal, no denial.
>> In reality, it is putting the bar too high for most contributors.
>>
>> So here is the contradiction between letting unproven (design-wise)
>> code in, but then forbidding its demise because of stability.
>> IMO, this is an unsatisfying proposition.
>
> Satisfying or not, it is our challenge.  Given that, as you
> correctly state above, we "lack the human resources" to keep
> supporting old versions, if we keep piling on incompatible API
> breaks, our users are left with either unmaintained or unstable
> code, neither of which is "satisfying" for them.  For me personally,
> satisfying the practical needs of users is the reason that I
> contribute here.  That is why I will always try to find ways to
> extend APIs, add capabilities, fix bugs in a way that preserves
> backward compatibility.


Strongly agree.

When looking at the impact of a change, it's important to remember
that there are many more consumers than there are producers (Commons
devs).
So if a change makes it easier for Commons devs, but harder for
consumers, then it is probably a bad change as it increases the total
workload.

This sometimes means that Commons devs have to do extra work.

There is a very different trade-off when working on in-house or personal code.
In that case, the overall workload may be reduced by ignoring
downstream compatibility in some cases.
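
As a rough illustration of the "extra work" on the producer side, here
is a minimal sketch of adding a capability without breaking existing
consumers.  It assumes Java 8 default methods are available.  The names
Statistic, Mean and CompatSketch are hypothetical and are not taken
from any Commons component.

```java
import java.util.Arrays;

// Hypothetical demo driver: old and new calls work side by side.
class CompatSketch {
    public static void main(String[] args) {
        Statistic mean = new Mean();
        double[] data = {1, 2, 3, 6};
        // Call that existed in the first release; unchanged.
        System.out.println(mean.evaluate(data));
        // Capability added later; Mean needed no modification.
        System.out.println(mean.evaluate(data, 1, 2));
    }
}

// Hypothetical published interface.  The first release is assumed to
// have declared only evaluate(double[]).
interface Statistic {
    double evaluate(double[] values);

    // Added in a later release.  Because it has a default body,
    // implementations written against the first release still
    // compile and link without changes.
    default double evaluate(double[] values, int begin, int length) {
        return evaluate(Arrays.copyOfRange(values, begin, begin + length));
    }
}

// Stands in for consumer code written against the first release.
class Mean implements Statistic {
    @Override
    public double evaluate(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }
}
```

The producer pays for this (a copying default that may be slower than a
hand-written override), but none of the consumers has to touch their
code.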

> Phil
>>
>>
>> Regards,
>> Gilles
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> For additional commands, e-mail: dev-h...@commons.apache.org
>>
>>
>
>
>
>

