Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561
Fair point, although my hope is that we continually and incrementally improve our code base on the /develop branch. Yes, there is an increased cost of back porting critical changes to support branches but the tradeoff is worth it IMHO. Anthony > On Jan 25, 2022, at 11:18 PM, Owen Nichols wrote: > > Also to consider, having a large refactor on develop but not support/1.15 > will increase backporting pain, as many cherry-picks will have merge > conflicts that have to be manually "un-refactored".
Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561
Owen, I really appreciate your point about the increased cost of backports by the branches diverging like this. I do wonder how high the cost will be in practice, given that AFAIK most of these changes limit themselves to a single line. From: Owen Nichols Sent: Tuesday, January 25, 2022 20:18 To: dev@geode.apache.org Subject: Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561 Even a small change can have subtle but important effects only discovered after a long time, so leaning on commit-size as a proxy for risk may only serve to create a false sense of security. Also to consider, having a large refactor on develop but not support/1.15 will increase backporting pain, as many cherry-picks will have merge conflicts that have to be manually "un-refactored". On 1/25/22, 5:09 PM, "Alexander Murmann" wrote: Hi everyone, Last week we discussed to cut the 1.15 release branch. I would like to propose that we cut the branch from last week's SHA 8f7193c827ee3198ae374101221c02039c70a561. The following commit is a very large refactor. Nothing obvious seems wrong with that change, but given that we frequently only discover very subtle, but important changes to Geode after a long time, I think that this would allow us to reduce some risk for 1.15 and its future users and give this large change some time to proof itself on the develop branch. I'd love to avoid that work entirely, but am concerned that we only might find out about problems a few weeks from now or worse, after we shipped. Another option might be to branch from head and revert the change on the release branch. I am uncertain which approach will proof less work.
Upgrade scenario: FunctionAttributes update
Hi everyone, During our testing, we have the following scenario: * We start up a cluster and a geode-native client. * Load a FunctionService jar into the cluster. * Execute normal flow of operations (function execs, puts, gets, ...) * After 5 mins, we update the FunctionService jar, which changes some of the function attributes. * From this point on, geode-native client receives an error from the server stating that the function exec failed due to function attributes mismatch. Obviously, this case is not supported by geode-native, as it would require updating the function attributes when this error is received. I am not sure if this is supported in the Java client, I've been looking, and I haven't seen that this mechanism is supported in the Java client, but maybe I am wrong. My questions are: * Have you encountered a scenario like this? * Do you know if function attributes update is not supported on purpose, to avoid an issue? Or it's just that nobody tried this upgrade scenario before? Thanks for all! /Mario.
Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561
I think we will probably always have this struggle around the time we cut a new release branch and then need to decide what changes on develop to also bring to the release branch. I'm okay with us not including the big cleanup in 1.15 since it is not something we plan on backporting to other support branches and if we had cut the 1.15 release branch right before this change, I don't think we would have approved it to be back ported to the 1.15 release branch. I share Owen's concern that not including it might case extra work in resolving conflicts in every change to develop after it that we also want to include in 1.15. We will not know if this is true until we try cherry-picking those revisions. So I'm unsure what the best way of not including it is. I'm okay with us cutting the branch from the revision before the big commit and then working through the list of commits after it and deciding which need to be in the 1.15 release. From: Alexander Murmann Sent: Wednesday, January 26, 2022 7:21 AM To: dev@geode.apache.org Subject: Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561 Owen, I really appreciate your point about the increased cost of backports by the branches diverging like this. I do wonder how high the cost will be in practice, given that AFAIK most of these changes limit themselves to a single line. From: Owen Nichols Sent: Tuesday, January 25, 2022 20:18 To: dev@geode.apache.org Subject: Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561 Even a small change can have subtle but important effects only discovered after a long time, so leaning on commit-size as a proxy for risk may only serve to create a false sense of security. Also to consider, having a large refactor on develop but not support/1.15 will increase backporting pain, as many cherry-picks will have merge conflicts that have to be manually "un-refactored". On 1/25/22, 5:09 PM, "Alexander Murmann" wrote: Hi everyone, Last week we discussed to cut the 1.15 release branch. I would like to propose that we cut the branch from last week's SHA 8f7193c827ee3198ae374101221c02039c70a561. The following commit is a very large refactor. Nothing obvious seems wrong with that change, but given that we frequently only discover very subtle, but important changes to Geode after a long time, I think that this would allow us to reduce some risk for 1.15 and its future users and give this large change some time to proof itself on the develop branch. I'd love to avoid that work entirely, but am concerned that we only might find out about problems a few weeks from now or worse, after we shipped. Another option might be to branch from head and revert the change on the release branch. I am uncertain which approach will proof less work.
Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561
First, I think I would suggest that we have someone cut a branch as suggested and see how long it actually takes. Second, I would suggest we define a norm if we want to avoid this in the future. Third, I don't really like the risk of having this in, but I have only heard about performance changes in our performance testing. Is there a specific defect? I looked at the code changes (not all 3000 files but a chunk) before I approved them. The changes were mostly dealing with warnings like unboxing etc. Given that these types of changes are lower risk individually, though obviously of concern en masse, I would like to see a bug or something before we decide to increase the work load by branching before the change. I understand that we are nervous about creating a release, but let's get some data (bugs) to justify the effort. Thanks, Mark On 1/26/22, 7:21 AM, "Alexander Murmann" wrote: Owen, I really appreciate your point about the increased cost of backports by the branches diverging like this. I do wonder how high the cost will be in practice, given that AFAIK most of these changes limit themselves to a single line. From: Owen Nichols Sent: Tuesday, January 25, 2022 20:18 To: dev@geode.apache.org Subject: Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561 Even a small change can have subtle but important effects only discovered after a long time, so leaning on commit-size as a proxy for risk may only serve to create a false sense of security. Also to consider, having a large refactor on develop but not support/1.15 will increase backporting pain, as many cherry-picks will have merge conflicts that have to be manually "un-refactored". On 1/25/22, 5:09 PM, "Alexander Murmann" wrote: Hi everyone, Last week we discussed to cut the 1.15 release branch. I would like to propose that we cut the branch from last week's SHA 8f7193c827ee3198ae374101221c02039c70a561. The following commit is a very large refactor. Nothing obvious seems wrong with that change, but given that we frequently only discover very subtle, but important changes to Geode after a long time, I think that this would allow us to reduce some risk for 1.15 and its future users and give this large change some time to proof itself on the develop branch. I'd love to avoid that work entirely, but am concerned that we only might find out about problems a few weeks from now or worse, after we shipped. Another option might be to branch from head and revert the change on the release branch. I am uncertain which approach will proof less work.
Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561
BTW, just to clarify, when I officially proposed cutting the branch, I hadn't intended to volunteer as release manager this round. That said, it's important to branch at a point we're confident about. Bisecting a 3K-file change is potentially... complicated. If there's confidence we can track down the issue(s) quickly, a later branch point would be nice. If not... probably better to branch at a "not known bad" point. On 1/26/22, 1:03 PM, "Mark Hanson" wrote: First, I think I would suggest that we have someone cut a branch as suggested and see how long it actually takes. Second, I would suggest we define a norm if we want to avoid this in the future. Third, I don't really like the risk of having this in, but I have only heard about performance changes in our performance testing. Is there a specific defect? I looked at the code changes (not all 3000 files but a chunk) before I approved them. The changes were mostly dealing with warnings like unboxing etc. Given that these types of changes are lower risk individually, though obviously of concern en masse, I would like to see a bug or something before we decide to increase the work load by branching before the change. I understand that we are nervous about creating a release, but let's get some data (bugs) to justify the effort. Thanks, Mark On 1/26/22, 7:21 AM, "Alexander Murmann" wrote: Owen, I really appreciate your point about the increased cost of backports by the branches diverging like this. I do wonder how high the cost will be in practice, given that AFAIK most of these changes limit themselves to a single line. From: Owen Nichols Sent: Tuesday, January 25, 2022 20:18 To: dev@geode.apache.org Subject: Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561 Even a small change can have subtle but important effects only discovered after a long time, so leaning on commit-size as a proxy for risk may only serve to create a false sense of security. Also to consider, having a large refactor on develop but not support/1.15 will increase backporting pain, as many cherry-picks will have merge conflicts that have to be manually "un-refactored". On 1/25/22, 5:09 PM, "Alexander Murmann" wrote: Hi everyone, Last week we discussed to cut the 1.15 release branch. I would like to propose that we cut the branch from last week's SHA 8f7193c827ee3198ae374101221c02039c70a561. The following commit is a very large refactor. Nothing obvious seems wrong with that change, but given that we frequently only discover very subtle, but important changes to Geode after a long time, I think that this would allow us to reduce some risk for 1.15 and its future users and give this large change some time to proof itself on the develop branch. I'd love to avoid that work entirely, but am concerned that we only might find out about problems a few weeks from now or worse, after we shipped. Another option might be to branch from head and revert the change on the release branch. I am uncertain which approach will proof less work.
Recent CI outage - resolved
This morning we had an unexpected problem with Concourse. The database volume filled up and locked everything up. The database volume has been increased in size and everything should be functioning properly again. It's highly likely that a few jobs failed because of this and we're looking into it. -Sean.
Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561
To clarify what I think might be a misunderstanding here, there is zero evidence of any kind that the large warning clean-up PR has introduced any issues, performance-related or otherwise. For my two cents on cutting the branch at the commit before the big change, I'm of the same opinion as Robert, that we should cut the branch at HEAD and revert only if it's determined to be necessary. The changes were broad, but the nature of the changes were benign and low-risk, being simple, automated refactors of trivial things like removing unnecessary casts or adding "final" to variables. I understand the argument that we often don't see all the consequences of a change until some time after it's been merged, but that argument applies to every change that gets made, and historically, new features and bug fixes have a far higher change of introducing problems than automated refactors. The choice of the warning clean-up change seems based almost entirely on the fact that it touched a lot of files, not in the actual risk associated with those changes. From: Raymond Ingles Sent: Wednesday, January 26, 2022 10:16 AM To: dev@geode.apache.org Subject: Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561 BTW, just to clarify, when I officially proposed cutting the branch, I hadn't intended to volunteer as release manager this round. That said, it's important to branch at a point we're confident about. Bisecting a 3K-file change is potentially... complicated. If there's confidence we can track down the issue(s) quickly, a later branch point would be nice. If not... probably better to branch at a "not known bad" point. On 1/26/22, 1:03 PM, "Mark Hanson" wrote: First, I think I would suggest that we have someone cut a branch as suggested and see how long it actually takes. Second, I would suggest we define a norm if we want to avoid this in the future. Third, I don't really like the risk of having this in, but I have only heard about performance changes in our performance testing. Is there a specific defect? I looked at the code changes (not all 3000 files but a chunk) before I approved them. The changes were mostly dealing with warnings like unboxing etc. Given that these types of changes are lower risk individually, though obviously of concern en masse, I would like to see a bug or something before we decide to increase the work load by branching before the change. I understand that we are nervous about creating a release, but let's get some data (bugs) to justify the effort. Thanks, Mark On 1/26/22, 7:21 AM, "Alexander Murmann" wrote: Owen, I really appreciate your point about the increased cost of backports by the branches diverging like this. I do wonder how high the cost will be in practice, given that AFAIK most of these changes limit themselves to a single line. From: Owen Nichols Sent: Tuesday, January 25, 2022 20:18 To: dev@geode.apache.org Subject: Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561 Even a small change can have subtle but important effects only discovered after a long time, so leaning on commit-size as a proxy for risk may only serve to create a false sense of security. Also to consider, having a large refactor on develop but not support/1.15 will increase backporting pain, as many cherry-picks will have merge conflicts that have to be manually "un-refactored". On 1/25/22, 5:09 PM, "Alexander Murmann" wrote: Hi everyone, Last week we discussed to cut the 1.15 release branch. I would like to propose that we cut the branch from last week's SHA 8f7193c827ee3198ae374101221c02039c70a561. The following commit is a very large refactor. Nothing obvious seems wrong with that change, but given that we frequently only discover very subtle, but important changes to Geode after a long time, I think that this would allow us to reduce some risk for 1.15 and its future users and give this large change some time to proof itself on the develop branch. I'd love to avoid that work entirely, but am concerned that we only might find out about problems a few weeks from now or worse, after we shipped. Another option might be to branch from head and revert the change on the release branch. I am uncertain which approach will proof less work.
Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561
To be clear, I agree with Donal and Robert. On 1/26/22, 10:44 AM, "Donal Evans" wrote: To clarify what I think might be a misunderstanding here, there is zero evidence of any kind that the large warning clean-up PR has introduced any issues, performance-related or otherwise. For my two cents on cutting the branch at the commit before the big change, I'm of the same opinion as Robert, that we should cut the branch at HEAD and revert only if it's determined to be necessary. The changes were broad, but the nature of the changes were benign and low-risk, being simple, automated refactors of trivial things like removing unnecessary casts or adding "final" to variables. I understand the argument that we often don't see all the consequences of a change until some time after it's been merged, but that argument applies to every change that gets made, and historically, new features and bug fixes have a far higher change of introducing problems than automated refactors. The choice of the warning clean-up change seems based almost entirely on the fact that it touched a lot of files, not in the actual risk associated with those changes. From: Raymond Ingles Sent: Wednesday, January 26, 2022 10:16 AM To: dev@geode.apache.org Subject: Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561 BTW, just to clarify, when I officially proposed cutting the branch, I hadn't intended to volunteer as release manager this round. That said, it's important to branch at a point we're confident about. Bisecting a 3K-file change is potentially... complicated. If there's confidence we can track down the issue(s) quickly, a later branch point would be nice. If not... probably better to branch at a "not known bad" point. On 1/26/22, 1:03 PM, "Mark Hanson" wrote: First, I think I would suggest that we have someone cut a branch as suggested and see how long it actually takes. Second, I would suggest we define a norm if we want to avoid this in the future. Third, I don't really like the risk of having this in, but I have only heard about performance changes in our performance testing. Is there a specific defect? I looked at the code changes (not all 3000 files but a chunk) before I approved them. The changes were mostly dealing with warnings like unboxing etc. Given that these types of changes are lower risk individually, though obviously of concern en masse, I would like to see a bug or something before we decide to increase the work load by branching before the change. I understand that we are nervous about creating a release, but let's get some data (bugs) to justify the effort. Thanks, Mark On 1/26/22, 7:21 AM, "Alexander Murmann" wrote: Owen, I really appreciate your point about the increased cost of backports by the branches diverging like this. I do wonder how high the cost will be in practice, given that AFAIK most of these changes limit themselves to a single line. From: Owen Nichols Sent: Tuesday, January 25, 2022 20:18 To: dev@geode.apache.org Subject: Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561 Even a small change can have subtle but important effects only discovered after a long time, so leaning on commit-size as a proxy for risk may only serve to create a false sense of security. Also to consider, having a large refactor on develop but not support/1.15 will increase backporting pain, as many cherry-picks will have merge conflicts that have to be manually "un-refactored". On 1/25/22, 5:09 PM, "Alexander Murmann" wrote: Hi everyone, Last week we discussed to cut the 1.15 release branch. I would like to propose that we cut the branch from last week's SHA 8f7193c827ee3198ae374101221c02039c70a561. The following commit is a very large refactor. Nothing obvious seems wrong with that change, but given that we frequently only discover very subtle, but important changes to Geode after a long time, I think that this would allow us to reduce some risk for 1.15 and its future users and give this large change some time to proof itself on the develop branch. I'd love to avoid that work entirely, but am concerned that we only might find out about problems a few weeks from now or worse, after we shipped. Another option might be to branch from head and revert the change on the release branch. I am uncertain which approach will proof less work.
Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561
Thanks Ray for proposing this branch and Alexander for suggesting a branchpoint. I will volunteer as Release Manager. ** The support/1.15 branch has now been cut from the SHA proposed in this thread and develop is now 1.16. ** Please use your best judgement in determining what to backport. PRs are welcome, but optional! Committers, please merge your own PRs. If your backport will take some while, you can add the "blocks-1.15.0" label in Jira to let others know. Please do raise any concerns on the dev list and/or via the blocks-1.15.0 label. Please do test the branch as much as possible to make this the best release of Geode. -Owen On 1/26/22, 10:16 AM, "Raymond Ingles" wrote: BTW, just to clarify, when I officially proposed cutting the branch, I hadn't intended to volunteer as release manager this round. That said, it's important to branch at a point we're confident about. Bisecting a 3K-file change is potentially... complicated. If there's confidence we can track down the issue(s) quickly, a later branch point would be nice. If not... probably better to branch at a "not known bad" point. On 1/26/22, 1:03 PM, "Mark Hanson" wrote: First, I think I would suggest that we have someone cut a branch as suggested and see how long it actually takes. Second, I would suggest we define a norm if we want to avoid this in the future. Third, I don't really like the risk of having this in, but I have only heard about performance changes in our performance testing. Is there a specific defect? I looked at the code changes (not all 3000 files but a chunk) before I approved them. The changes were mostly dealing with warnings like unboxing etc. Given that these types of changes are lower risk individually, though obviously of concern en masse, I would like to see a bug or something before we decide to increase the work load by branching before the change. I understand that we are nervous about creating a release, but let's get some data (bugs) to justify the effort. Thanks, Mark On 1/26/22, 7:21 AM, "Alexander Murmann" wrote: Owen, I really appreciate your point about the increased cost of backports by the branches diverging like this. I do wonder how high the cost will be in practice, given that AFAIK most of these changes limit themselves to a single line. From: Owen Nichols Sent: Tuesday, January 25, 2022 20:18 To: dev@geode.apache.org Subject: Re: Proposal: Cut 1.15 release branch from SHA 8f7193c827ee3198ae374101221c02039c70a561 Even a small change can have subtle but important effects only discovered after a long time, so leaning on commit-size as a proxy for risk may only serve to create a false sense of security. Also to consider, having a large refactor on develop but not support/1.15 will increase backporting pain, as many cherry-picks will have merge conflicts that have to be manually "un-refactored". On 1/25/22, 5:09 PM, "Alexander Murmann" wrote: Hi everyone, Last week we discussed to cut the 1.15 release branch. I would like to propose that we cut the branch from last week's SHA 8f7193c827ee3198ae374101221c02039c70a561. The following commit is a very large refactor. Nothing obvious seems wrong with that change, but given that we frequently only discover very subtle, but important changes to Geode after a long time, I think that this would allow us to reduce some risk for 1.15 and its future users and give this large change some time to proof itself on the develop branch. I'd love to avoid that work entirely, but am concerned that we only might find out about problems a few weeks from now or worse, after we shipped. Another option might be to branch from head and revert the change on the release branch. I am uncertain which approach will proof less work.
Re: Upgrade scenario: FunctionAttributes update
Mario, I tried changing optimizeForWrite from true to false in java and the function execution breaks with this cause: Caused by: org.apache.geode.cache.client.ServerOperationException: remote server on localhost: Function attributes at client and server don't match: UpdateTradeFunction at org.apache.geode.cache.client.internal.ExecuteRegionFunctionSingleHopOp$ExecuteRegionFunctionSingleHopOpImpl.processResponse(ExecuteRegionFunctionSingleHopOp.java:372) at org.apache.geode.cache.client.internal.AbstractOp.processResponse(AbstractOp.java:234) at ... at org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnServer(OpExecutorImpl.java:343) at org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:312) at org.apache.geode.cache.client.internal.PoolImpl.executeOn(PoolImpl.java:848) at org.apache.geode.cache.client.internal.SingleHopOperationCallable.call(SingleHopOperationCallable.java:48) There is an internal java API in AbstractExecution called removeFunctionAttributes that works-around this issue. I'm not sure I see the same API in NC, but I could be missing it. } catch (FunctionException e) { if (e.getMessage().contains("Function attributes at client and server don't match")) { ((AbstractExecution) execution).removeFunctionAttributes("UpdateTradeFunction"); // Retry function here or in loop } else { handleException(e); } } I'm not sure this was done on purpose. It seems like the product could clear the cached attributes and retry once internally before throwing the exception. It looks like changing optimizeForWrite or isHA would be fine, but changing hasResult would require changing your client read the result (or not). Barry From: Mario Salazar de Torres Sent: Wednesday, January 26, 2022 9:11 AM To: dev@geode.apache.org Subject: Upgrade scenario: FunctionAttributes update Hi everyone, During our testing, we have the following scenario: * We start up a cluster and a geode-native client. * Load a FunctionService jar into the cluster. * Execute normal flow of operations (function execs, puts, gets, ...) * After 5 mins, we update the FunctionService jar, which changes some of the function attributes. * From this point on, geode-native client receives an error from the server stating that the function exec failed due to function attributes mismatch. Obviously, this case is not supported by geode-native, as it would require updating the function attributes when this error is received. I am not sure if this is supported in the Java client, I've been looking, and I haven't seen that this mechanism is supported in the Java client, but maybe I am wrong. My questions are: * Have you encountered a scenario like this? * Do you know if function attributes update is not supported on purpose, to avoid an issue? Or it's just that nobody tried this upgrade scenario before? Thanks for all! /Mario.
Re: Upgrade scenario: FunctionAttributes update
Hi Barry, Many thanks for dedicating some time testing this. As you confirmed the scenario also fails using the Java API, so I'll create the PRs with the solution for both Java and C++ API. Thanks! /Mario From: Barry Oglesby Sent: Wednesday, January 26, 2022 10:28 PM To: dev@geode.apache.org Subject: Re: Upgrade scenario: FunctionAttributes update Mario, I tried changing optimizeForWrite from true to false in java and the function execution breaks with this cause: Caused by: org.apache.geode.cache.client.ServerOperationException: remote server on localhost: Function attributes at client and server don't match: UpdateTradeFunction at org.apache.geode.cache.client.internal.ExecuteRegionFunctionSingleHopOp$ExecuteRegionFunctionSingleHopOpImpl.processResponse(ExecuteRegionFunctionSingleHopOp.java:372) at org.apache.geode.cache.client.internal.AbstractOp.processResponse(AbstractOp.java:234) at ... at org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnServer(OpExecutorImpl.java:343) at org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:312) at org.apache.geode.cache.client.internal.PoolImpl.executeOn(PoolImpl.java:848) at org.apache.geode.cache.client.internal.SingleHopOperationCallable.call(SingleHopOperationCallable.java:48) There is an internal java API in AbstractExecution called removeFunctionAttributes that works-around this issue. I'm not sure I see the same API in NC, but I could be missing it. } catch (FunctionException e) { if (e.getMessage().contains("Function attributes at client and server don't match")) { ((AbstractExecution) execution).removeFunctionAttributes("UpdateTradeFunction"); // Retry function here or in loop } else { handleException(e); } } I'm not sure this was done on purpose. It seems like the product could clear the cached attributes and retry once internally before throwing the exception. It looks like changing optimizeForWrite or isHA would be fine, but changing hasResult would require changing your client read the result (or not). Barry From: Mario Salazar de Torres Sent: Wednesday, January 26, 2022 9:11 AM To: dev@geode.apache.org Subject: Upgrade scenario: FunctionAttributes update Hi everyone, During our testing, we have the following scenario: * We start up a cluster and a geode-native client. * Load a FunctionService jar into the cluster. * Execute normal flow of operations (function execs, puts, gets, ...) * After 5 mins, we update the FunctionService jar, which changes some of the function attributes. * From this point on, geode-native client receives an error from the server stating that the function exec failed due to function attributes mismatch. Obviously, this case is not supported by geode-native, as it would require updating the function attributes when this error is received. I am not sure if this is supported in the Java client, I've been looking, and I haven't seen that this mechanism is supported in the Java client, but maybe I am wrong. My questions are: * Have you encountered a scenario like this? * Do you know if function attributes update is not supported on purpose, to avoid an issue? Or it's just that nobody tried this upgrade scenario before? Thanks for all! /Mario.
Re: Question regarding geode thread priorities
Hi BRs/Jakov, I'm familiar with most of these threads, and these ones I know of do not spawn more than one thread total. Most of these are quite old, possibly predating Executors in Java. I doubt using max priority is important for these threads, but you should probably do some perf testing if you want to remove setMaxPriority. I recommend using https://github.com/apache/geode-benchmarks as well as writing targeted JMH micro-benchmarks. Cheers, Kirk On Mon, Jan 24, 2022 at 3:50 AM Jakov Varenina wrote: > Hi community, > > We have came across to some code in geode that prioritizes some of the > threads using > > https://cr.openjdk.java.net/~iris/se/11/latestSpec/api/java.base/java/lang/Thread.html#setPriority(int). > > Below you can find links to code. > > > https://github.com/apache/geode/blob/41eb49989f25607acfcbf9ac5afe3d4c0721bb35/geode-core/src/main/java/org/apache/geode/internal/statistics/HostStatSampler.java#L304 > > > https://github.com/apache/geode/blob/41eb49989f25607acfcbf9ac5afe3d4c0721bb35/geode-core/src/main/java/org/apache/geode/internal/cache/control/OffHeapMemoryMonitor.java#L92 > > > https://github.com/apache/geode/blob/d79a3c78eab96a9e760db07fa42580e61586b9c5/geode-core/src/main/java/org/apache/geode/internal/cache/control/InternalResourceManager.java#L147 > > > https://github.com/apache/geode/blob/a5bd36f9fa787d3a71c6e6efafed5a7b0fe52d2b/geode-core/src/main/java/org/apache/geode/internal/tcp/TCPConduit.java#L343 > > Just to add that every new thread inherits parent thread priority, so > that means that there will be more thread with max priority in addition > to the above threads. Does somebody know why this is set for these > particular threads? > > Additionally, in multiple online resources it is indicated that these > priories are not taken into the account by the Linux scheduler unless > additional parameters in JVM are set (UseThreadPriorities and > ThreadPriorityPolicy), please check links for more information's: > > https://github.com/openjdk/jdk/blob/jdk8-b120/hotspot/src/share/vm/runtime/globals.hpp#L3369,L3392 > and > > https://github.com/openjdk/jdk/blob/jdk8-b120/hotspot/src/os/linux/vm/os_linux.cpp#L3961,L3966 > > Are these priorities that are set in Apache Geode code crucial and > should be enabled for better performance, or they shouldn't be used? > Also did I maybe miss something and these priorities are somehow used > even without setting mentioned JVM parameters? > > Any help on this topic is welcome and sorry for bothering! > > BRs/Jakov > >
Re: Question regarding geode thread priorities
PS: Sorry if I didn't realize what BRs is :D On Wed, Jan 26, 2022 at 3:39 PM Kirk Lund wrote: > Hi BRs/Jakov, > > I'm familiar with most of these threads, and these ones I know of do not > spawn more than one thread total. Most of these are quite old, possibly > predating Executors in Java. I doubt using max priority is important for > these threads, but you should probably do some perf testing if you want to > remove setMaxPriority. I recommend using > https://github.com/apache/geode-benchmarks as well as writing targeted > JMH micro-benchmarks. > > Cheers, > Kirk > > On Mon, Jan 24, 2022 at 3:50 AM Jakov Varenina > wrote: > >> Hi community, >> >> We have came across to some code in geode that prioritizes some of the >> threads using >> >> https://cr.openjdk.java.net/~iris/se/11/latestSpec/api/java.base/java/lang/Thread.html#setPriority(int). >> >> Below you can find links to code. >> >> >> https://github.com/apache/geode/blob/41eb49989f25607acfcbf9ac5afe3d4c0721bb35/geode-core/src/main/java/org/apache/geode/internal/statistics/HostStatSampler.java#L304 >> >> >> https://github.com/apache/geode/blob/41eb49989f25607acfcbf9ac5afe3d4c0721bb35/geode-core/src/main/java/org/apache/geode/internal/cache/control/OffHeapMemoryMonitor.java#L92 >> >> >> https://github.com/apache/geode/blob/d79a3c78eab96a9e760db07fa42580e61586b9c5/geode-core/src/main/java/org/apache/geode/internal/cache/control/InternalResourceManager.java#L147 >> >> >> https://github.com/apache/geode/blob/a5bd36f9fa787d3a71c6e6efafed5a7b0fe52d2b/geode-core/src/main/java/org/apache/geode/internal/tcp/TCPConduit.java#L343 >> >> Just to add that every new thread inherits parent thread priority, so >> that means that there will be more thread with max priority in addition >> to the above threads. Does somebody know why this is set for these >> particular threads? >> >> Additionally, in multiple online resources it is indicated that these >> priories are not taken into the account by the Linux scheduler unless >> additional parameters in JVM are set (UseThreadPriorities and >> ThreadPriorityPolicy), please check links for more information's: >> >> https://github.com/openjdk/jdk/blob/jdk8-b120/hotspot/src/share/vm/runtime/globals.hpp#L3369,L3392 >> and >> >> https://github.com/openjdk/jdk/blob/jdk8-b120/hotspot/src/os/linux/vm/os_linux.cpp#L3961,L3966 >> >> Are these priorities that are set in Apache Geode code crucial and >> should be enabled for better performance, or they shouldn't be used? >> Also did I maybe miss something and these priorities are somehow used >> even without setting mentioned JVM parameters? >> >> Any help on this topic is welcome and sorry for bothering! >> >> BRs/Jakov >> >>