Re: [Discuss] Cutting Geode 1.14
Owen, we cannot plan an updated timeline till we have shipped 1.13. If the
goal is to avoid overlap between 1.13 and 1.14 and give folks a break, we
need to know when 1.13 ships.

On Mon, Jul 20, 2020 at 4:57 PM Owen Nichols wrote:

> Current schedule for cutting support branches is:
> Aug 3, 2020: cut support/1.14
> Nov 2, 2020: cut support/1.15
> Feb 1, 2021: cut support/1.16
> May 3, 2021: cut support/1.17
>
> I have a hard time understanding how anyone can trust that our schedule is
> "time-based" if we can change the dates at the last minute. But, I am
> reminded that this is only a discussion thread at this point, not a
> proposal yet. If it does become a proposal, I would just like to see
> concrete dates proposed for the next few branch cuts. Any change to the
> 1.14 branch cut date almost certainly must affect subsequent branch cut
> dates.
>
> I agree that there needs to be some downtime in between release cycles.
> Given that we came up with the quarterly cadence using an assumption of 1
> month to get a release out, but we've found that it actually takes 2-3
> months, maybe the quarterly cadence is just too aggressive?
>
> On 7/20/20, 3:12 PM, "Alexander Murmann" wrote:
>
> > Hi Owen,
> >
> > I am not proposing to abandon our time-based releases. It's unprecedented
> > for one of our releases to take this long. Even if we were to cut the
> > release now, it would likely not receive any attention till 1.13 is out.
> > So I don't think there is any benefit in cutting the release now. In
> > addition there are all the downsides, I discussed above that are unique
> > to this situation.
> >
> > An aside:
> >
> > > [..] developers will hold off on high-risk changes and focus more on
> > > hardening as the cut date approaches.
> >
> > To me that wasn't ever part of the reason for timed releases, but
> > predictability for users and for developers to know by when features need
> > to be done to ship. If we feel a change is risky, let's write tests till
> > we feel it's safe.
> >
> > On Mon, Jul 20, 2020 at 12:19 PM Owen Nichols wrote:
> >
> > > The Geode community adopted a time-based quarterly cadence two years ago
> > > in the hope it would lead to higher stability and more predictable
> > > releases. The idea was that by knowing exactly when a branch cut is
> > > upcoming, developers will hold off on high-risk changes and focus more
> > > on hardening as the cut date approaches. The flip side was that the next
> > > release cut was never more than 3 months away, making it more palatable
> > > to delay features to the next release for the greater good.
> > >
> > > I am concerned about reneging on this promise so close to a date that
> > > developers have already been planning around. Develop has seen 259
> > > commits since support/1.13 was cut, which is a full release worth. Some
> > > feature work such as geode-redis is eagerly anticipating a prompt branch
> > > cut and swift release thereafter.
> > >
> > > Are you proposing to abandon time-based release cadence entirely? If
> > > not, can you provide more detail on the new schedule you are envisioning
> > > (e.g. still 4x/yr, but shifted out by a month? Or move to 3x/yr,
> > > starting by delaying 1.14 by a month?).
> > >
> > > I don't know if this is the forum to reflect on *why* it takes so long
> > > to stabilize from develop and get to something releasable, but if we
> > > accept that the release process routinely takes 2-3 months (not the 1
> > > month our quarterly cadence was predicated on), then taking this
> > > opportunity to move to a 3x/year cadence might be the smart play.
> > >
> > > -Owen
> > >
> > > On 7/20/20, 9:55 AM, "Donal Evans" wrote:
> > >
> > > > +1 to postponing 1.14.
> > > >
> > > > Given the limited resources we have in terms of people who shepherd
> > > > the release process and ensure the quality of what we end up
> > > > releasing, it would put an unsustainable amount of strain on those who
> > > > have already been working extremely hard on getting 1.13 finished if
> > > > we rolled right into 1.14 without time to breathe and hopefully ramp
> > > > up some more people to take over parts of the release process.
> > > >
> > > > I'm also not in favour of abandoning 1.13 entirely, as there's been a
> > > > huge effort on the part of some community members to get it into a
> > > > good state to release, and dropping 1.13 now would effectively be
> > > > seeing all that work go to waste. It also wouldn't address the core
> > > > issue that those most heavily involved in the release process and in
> > > > identifying and addressing potential release blockers are in danger of
> > > > being exhausted by the non-stop process of finding and fixing bugs in
> > > > the release, since 1.14 will have all of the same blockers that 1.13
> > > > currently has, plus an
Re: API (Recommended way) to get heap and disk usage for cluster nodes
Steve,

Here are some ways to access these statistics using JMX.

To access JVM metrics for a member, including heap usage, invoke the
MemberMXBean showJVMMetrics operation. These values are obtained from
ManagementFactory.getMemoryMXBean(), ManagementFactory.getThreadMXBean() and
ManagementFactory.getGarbageCollectorMXBeans().

The DiskStoreMXBean attributes contain the disk metrics you're looking for
(including TotalBytesOnDisk). These are per DiskStore per member.

I attached a couple of jconsole charts showing these:

jconsole_jvm_metrics.png - shows a member's JVMMetrics
jconsole_diskstore_attributes.png - shows a DiskStore's attributes

I also attached a java client that dumps the JVM metrics, but it can easily
be changed to periodically get them and take action depending on the state.

java DumpJvmMetrics localhost 1091
...
=== GemFire:type=Member,member=server-1 ===
JVM Metrics:
committedMemory->518979584
gcCount->0
gcTimeMillis->31
initMemory->536870912
maxMemory->8502706176
totalThreads->60
usedMemory->145895584
...

I also attached a java client that dumps the DiskStore attributes, but it can
easily be changed to periodically get them and take action depending on the
state.

java DumpDiskStores localhost 1091
...
=== GemFire:service=DiskStore,name=disk_store_1,type=Member,member=server-1 ===
Disk Store:
attribute=Name; value=disk_store_1
attribute=DiskReadsRate; value=0.0
attribute=DiskWritesRate; value=0.0
attribute=TotalBackupInProgress; value=0
attribute=TotalBackupCompleted; value=0
attribute=ForceCompactionAllowed; value=false
attribute=MaxOpLogSize; value=10
attribute=DiskDirectories; value=[/path/to/server-1/.]
attribute=DiskReadsAvgLatency; value=0
attribute=DiskWritesAvgLatency; value=0
attribute=FlushTimeAvgLatency; value=0
attribute=TotalRecoveriesInProgress; value=0
attribute=TimeInterval; value=1000
attribute=AutoCompact; value=true
attribute=CompactionThreshold; value=50
attribute=WriteBufferSize; value=32768
attribute=DiskUsageWarningPercentage; value=90.0
attribute=DiskUsageCriticalPercentage; value=99.0
attribute=QueueSize; value=0
attribute=TotalQueueSize; value=0
attribute=TotalBytesOnDisk; value=2216241
attribute=DiskUsagePercentage; value=-1.0
attribute=DiskFreePercentage; value=-1.0
...

Thanks,
Barry

From: steve mathew
Sent: Wednesday, July 8, 2020 11:20 AM
To: dev@geode.apache.org
Cc: u...@geode.apache.org
Subject: Re: API (Recommended way) to get heap and disk usage for cluster nodes

Thanks Jacob and Anthony for sharing the details.

I have tried to understand the supported list_of_mbeans but am finding it
tough to understand completely. I can see "DiskStoreMXBean", and the document
says it can provide region-specific disk usage.

For my experiment, I am looking for *MBeans that provide a data node's
(member-specific) heap and disk usage*.

It would be a great help if someone could guide me on these MBeans and how to
use them to get the required stats (or point me to a reference outlining
these details).

Thanks
-Steve

On Wed, Jul 8, 2020 at 10:27 PM Anthony Baker wrote:

> Another option is JMX, see
> https://geode.apache.org/docs/guide/19/managing/management/list_of_mbeans.html
>
> Anthony
>
>
> On Jul 8, 2020, at 9:24 AM, Jacob Barrett <jabarr...@vmware.com> wrote:
>
> Steve,
>
> Geode is in a transition from its proprietary on-disk stats format to
> Micrometer (http://micrometer.io/). Some of what you are looking for may
> already be exposed via Micrometer. If so, you can use whatever registry you
> choose to publish those stats. If the metric you need has not been converted
> to Micrometer, it's pretty easy in most cases to refactor it, and we would
> welcome a JIRA or, even better, a PR.
>
> -Jake
>
>
> On Jul 7, 2020, at 9:58 PM, steve mathew <steve.mathe...@gmail.com> wrote:
>
> Hello Geode Dev and users
>
> We have a requirement to constantly monitor the resource utilization (disk
> and heap usage) of the cluster nodes from external processes.
> We are seeking help to understand the recommended way (or the APIs
> available) to get this in a separate process. We need to trigger some
> actions/custom logic if usage goes above some threshold.
>
> Thanks
> Steve.
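
The attached DumpJvmMetrics and DumpDiskStores clients are not reproduced in this archive. As a rough illustration of the approach Barry describes, here is a minimal sketch of an external JMX client that polls once and compares heap usage to a threshold. It uses only the standard javax.management API. The class name ResourceMonitor, the 80% threshold and the helper methods are hypothetical. The JMX service URL layout, and the assumption that showJVMMetrics arrives on a remote client as CompositeData with the keys shown in the output above (usedMemory, maxMemory), are assumptions to verify against your Geode version.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical sketch; not the client attached to the original email.
final class ResourceMonitor {

  // Hypothetical threshold, chosen only for illustration.
  static final double HEAP_THRESHOLD_PERCENT = 80.0;

  /** Returns used/max as a percentage, or -1.0 when max is not positive. */
  static double percentUsed(long used, long max) {
    return max <= 0 ? -1.0 : 100.0 * used / max;
  }

  /** True only for a valid (non-negative) percentage above the threshold. */
  static boolean exceeds(double percent, double threshold) {
    return percent >= 0 && percent > threshold;
  }

  static void check(MBeanServerConnection conn) throws Exception {
    // Member MBeans, named as in the output above: GemFire:type=Member,member=server-1
    ObjectName memberPattern = new ObjectName("GemFire:type=Member,member=*");
    for (ObjectName member : conn.queryNames(memberPattern, null)) {
      // Assumption: the operation takes no arguments and maps to CompositeData remotely.
      CompositeData jvm = (CompositeData)
          conn.invoke(member, "showJVMMetrics", new Object[0], new String[0]);
      long used = ((Number) jvm.get("usedMemory")).longValue();
      long max = ((Number) jvm.get("maxMemory")).longValue();
      double heapPercent = percentUsed(used, max);
      System.out.printf("%s heap used: %.1f%%%n", member, heapPercent);
      if (exceeds(heapPercent, HEAP_THRESHOLD_PERCENT)) {
        System.out.println("  heap above threshold; custom action would go here");
      }
    }

    // DiskStore MBeans: one per DiskStore per member.
    ObjectName storePattern = new ObjectName("GemFire:service=DiskStore,type=Member,*");
    for (ObjectName store : conn.queryNames(storePattern, null)) {
      Object bytes = conn.getAttribute(store, "TotalBytesOnDisk");
      System.out.println(store + " TotalBytesOnDisk: " + bytes);
    }
  }

  public static void main(String[] args) throws Exception {
    if (args.length < 2) {
      System.out.println("usage: java ResourceMonitor <jmx-manager-host> <jmx-manager-port>");
      return;
    }
    // Assumption: the JMX manager is reachable through the standard RMI connector URL.
    JMXServiceURL url = new JMXServiceURL(
        "service:jmx:rmi:///jndi/rmi://" + args[0] + ":" + Integer.parseInt(args[1]) + "/jmxrmi");
    try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
      check(connector.getMBeanServerConnection());
    }
  }
}
```

It would be run the same way as the clients above, for example `java ResourceMonitor localhost 1091`. To monitor continuously, check() could be called from a scheduled task. The guard for negative percentages mirrors the -1.0 values that DiskUsagePercentage and DiskFreePercentage report in the sample output.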
Re: [Discuss] Cutting Geode 1.14
The wiki states: "Geode cuts support branches for new minor releases on a
time-based schedule (Monday on or after Feb 1, May 1, Aug 1, Nov 1)."

I guess let's leave this inaccurate statement as-is for now until 1.13 has
shipped and we have a better sense of how releases will be initiated in the
future.

On 7/21/20, 11:57 AM, "Alexander Murmann" wrote:

> Owen, we cannot plan an updated timeline till we have shipped 1.13. If the
> goal is to avoid overlap between 1.13 and 1.14 and give folks a break, we
> need to know when 1.13 ships.
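
For reference, the wiki rule quoted in this thread ("Monday on or after Feb 1, May 1, Aug 1, Nov 1") can be checked against the schedule Owen listed with a few lines of java.time. This is only a sketch; the class and method names (BranchCutDates, cutDate) are hypothetical and not part of any Geode tooling.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.Month;
import java.time.temporal.TemporalAdjusters;

// Hypothetical helper that applies the wiki's "Monday on or after the 1st" rule.
final class BranchCutDates {

  /** Returns the first Monday on or after the 1st of the given month. */
  static LocalDate cutDate(int year, Month month) {
    return LocalDate.of(year, month, 1)
        .with(TemporalAdjusters.nextOrSame(DayOfWeek.MONDAY));
  }

  public static void main(String[] args) {
    System.out.println("support/1.14: " + cutDate(2020, Month.AUGUST));   // 2020-08-03
    System.out.println("support/1.15: " + cutDate(2020, Month.NOVEMBER)); // 2020-11-02
    System.out.println("support/1.16: " + cutDate(2021, Month.FEBRUARY)); // 2021-02-01
    System.out.println("support/1.17: " + cutDate(2021, Month.MAY));      // 2021-05-03
  }
}
```

The four results match the dates in the schedule posted earlier in the thread (Aug 3, 2020; Nov 2, 2020; Feb 1, 2021; May 3, 2021).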