Hello team,
I'd like to propose including the *fix [1]* for *GEODE-7079 [2]* in release
1.10.0.
Long story short: a *NullPointerException* can be continuously thrown
and flood the member's logs if a serial event processor (either
*async-event-queue* or *gateway-sender*) starts processing events fr
+1
On Thu, Aug 15, 2019 at 5:30 AM Ju@N wrote:
Juan,
From your explanation, it seems this issue is pre-existing and not
critical. Could we possibly hold this for 1.11?
--Udo
On 8/15/19 5:29 AM, Ju@N wrote:
+1
On Thu, Aug 15, 2019 at 9:54 AM John Blum wrote:
Hello Udo,
Even if it is an existing issue, I'd still consider it critical for those
cases in which there are unprocessed events on the persistent queue after a
restart and the region takes a long time to recover... you can actually see
millions of *NPEs* flooding the member's logs.
My two cents anyway, i
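To make the failure mode concrete, here is a minimal, hypothetical Java sketch (the class and member names are illustrative only, not Geode's actual dispatcher code) of the pattern being described: a serial dispatcher that catches and logs a NullPointerException for every queued event while the region it depends on has not finished recovering, so a large persistent queue replayed after a restart turns into a log flood.

// Hypothetical sketch, not Geode code: one log entry per failed event.
import java.util.List;
import java.util.logging.Logger;

public class SerialDispatcherSketch {
  private static final Logger logger = Logger.getLogger("dispatcher");

  private Object region;            // stays null until region recovery completes
  private final List<String> queue; // events recovered from the persistent queue

  SerialDispatcherSketch(List<String> recoveredEvents) {
    this.queue = recoveredEvents;
  }

  void dispatchQueuedEvents() {
    for (String event : queue) {
      try {
        // Throws NullPointerException for every event while 'region' is null...
        region.hashCode();
      } catch (RuntimeException e) {
        // ...and each failure is logged, so millions of queued events mean
        // millions of identical NPE entries in the member's log.
        logger.warning("Failed to dispatch " + event + ": " + e);
      }
    }
  }

  public static void main(String[] args) {
    new SerialDispatcherSketch(List.of("event-1", "event-2", "event-3"))
        .dispatchQueuedEvents();
  }
}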
+1
Agreed on fixing this. It's impossible for a user to discover they have hit an
edge case that we fail to support until they are in prod and restart.
On Thu, Aug 15, 2019 at 10:09 AM Juan José Ramos wrote:
+1
On Thu, Aug 15, 2019 at 10:15 AM Alexander Murmann wrote:
Seems everyone is in favor of including a /*non-critical*/ fix to an
already cut branch of a potential release...
Am I missing something?
Why cut a release at all... just have a perpetual cycle of fixes added
to develop and users can choose which nightly snapshot build they would
want to use.
While we can’t fix *all known bugs*, I think where we do have a fix for an
important issue we should think hard about the cost of not including that in a
release.
IMO, the fixed-time approach to releases means that we *start* the release
effort (including stabilization and bug fixing if needed) at a fixed time, and
*finish* when we believe the quality of the release branch is sufficient.
In this specific case, how long has this issue been in the product? When did
we first see it? That would give me a lot more context in gauging the
“criticality” of this. Juan, can you share that information?
To Udo’s point, with every change we check in, we add some risk of instability
or at
Whilst I agree with "*finish* when we believe the quality of the release
branch is sufficient", I disagree that we should cut a branch and then
continue to patch that branch with non-critical fixes; i.e., this issue
has been around for a while and has no adverse side effects. Issues like
GEODE-708
+1 to merging Juan's fix for GEODE-7079. I've seen systems taken down by
rapidly filling up the logs in the past; this does seem to be a critical
fix from the perspective of system stability.
Also, here is the change, which doesn't seem particularly risky to me:
- ConflationKey key = new
@Dan, I'm not disputing that logs filling up with NPEs could bring a system
down with limited disk space, or could swallow important log messages
that would be helpful in root-causing issues...
I'm merely raising the question of why this bug fix should receive
priority inclusion. It has been around
I'm changing my vote to +1 on this issue.
The ONLY reason I'm changing my vote is to add to the cleanliness of the
code in the release. I do 100% disagree with the continual scope creep
that we have been incurring on this release branch.
--Udo
On 8/15/19 12:34 PM, Dan Smith wrote:
It sounds like there is consensus on adding this fix. Could someone please
cherry-pick this for me?
Thanks,
Aaron
> On Aug 14, 2019, at 1:13 PM, Udo Kohlmeyer wrote:
>
> @Aaron,Kirk - thank you for the clarification.
>
> +1 to include the fix, as reverting GEODE-7001 would be more effort :)
>
This is a fix for a problem where a member that has lost quorum does not
detect it and does not shut down. The fix is small and has been
extensively tested. The fix also addresses the possibility of a member
being kicked out of the cluster when it is only late in delivering a
heartbeat (i.e.,
Because someone will ask: can we be proactive in these requests by identifying
whether the issue being fixed was introduced in Geode 1.10 or is a pre-existing
condition?
-jake
> On Aug 15, 2019, at 2:09 PM, Bruce Schuchardt wrote:
Testing in the past week hit this problem 9 times and it was identified
as a new issue.
On 8/15/19 2:23 PM, Jacob Barrett wrote:
You should be able to do the cherry-pick on your fork and then open a PR
against the release branch.
> On Aug 15, 2019, at 2:04 PM, Aaron Lindsey wrote:
Normally, cherry-picking to the release branch is the release manager's job
(Dick in this case) [1]. He asked me to help out while he was on vacation,
so I will go ahead and cherry-pick it over.
I kinda like the process Jake proposed, though - creating a PR against the
release branch. My only concern
Looking at the Geode ticket number, it seems this problem has
resurfaced, as it appears to have been addressed in 1.7.0 already.
My concern is, do we know WHAT caused it to resurface? Or was this
issue always dormant and only recently resurfaced? Without understanding
why we are seeing "fixed" is
I have the cherry-pick ready to push or file a PR. Let me know what you
prefer...
On Thu, Aug 15, 2019 at 3:01 PM Dan Smith wrote:
@kirk - go ahead and push it.
-Dan
On Thu, Aug 15, 2019 at 3:13 PM Kirk Lund wrote:
In this case, it was another change in 1.10 (one that decreased the
amount of time we try to connect to unreachable alert listeners) that
caused this problem to resurface. This decrease allowed availability
checks to proceed faster than they used to. This allowed an availability
check to pass
Just a reminder that our many sun.misc.* warnings are drowning out real
warnings...
We are adding new build warnings, which makes me sad. This one was added
recently:
/Users/klund/dev/geode3/geode-core/src/main/java/org/apache/geode/internal/cache/tier/sockets/AcceptorImpl.java:1075:
warning: unr
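For context on why that noise matters: javac prints an "internal proprietary API" warning for every use of a sun.misc.* class, which is what drowns out newly introduced warnings like the one above. A minimal, hypothetical sketch (not Geode code) that reproduces that class of warning when compiled with JDK 8:

// Hypothetical sketch, not Geode code. Compiling this with javac (JDK 8)
// prints something like:
//   warning: Unsafe is internal proprietary API and may be removed in a future release
// for each reference to sun.misc.Unsafe.
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeWarningSketch {
  public static void main(String[] args) throws Exception {
    Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
    theUnsafe.setAccessible(true);
    Unsafe unsafe = (Unsafe) theUnsafe.get(null);
    System.out.println("page size = " + unsafe.pageSize());
  }
}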
Done!
On Thu, Aug 15, 2019 at 3:21 PM Dan Smith wrote:
On that note, I’ve had a PR open to address all the API warnings for some time
now. Would love a review.
https://github.com/apache/geode/pull/3872
> On Aug 15, 2019, at 3:59 PM, Kirk Lund wrote: