Re: Question about INDEX_THRESHOLD_SIZE

2022-06-08 Thread Nabarun Nag
Hi Mario,

We have been looking into how to move forward with this issue. After 
discussing with the team, and with the 1.15 release scheduled for this month, 
we plan to move the fix to the 1.16.0 release. The plan is as follows:

  *   Revert GEODE-9632 from develop and backport the revert to 1.15.0 (this 
is needed to follow protocol) so that the 1.15.0 release task can be completed.
  *   Work with you and the team to find a complete solution/improvement, 
push it to develop again, and release it in 1.16.0.
  *   Document the additional purpose of INDEX_THRESHOLD_SIZE in the Apache 
Geode docs.

Please do reach out to us if you have any questions.

Regards
Nabarun Nag


From: Mario Kevo 
Sent: Tuesday, June 7, 2022 2:55 AM
To: dev@geode.apache.org 
Subject: Re: Question about INDEX_THRESHOLD_SIZE

Hi all,

I dug further into this problem and saw what happens if we have a range query 
like this:


query --query="SELECT  e.key, e.value from /example-region.entrySet e 
where e.value.positions['SUN'] LIKE 'abc%'"

It is turned into two comparisons. The first finds all entries whose attribute 
is < 'abd' and stores them as intermediate results; the second finds all 
entries whose attribute is >= 'abc'. Their intersection can be empty: the 
first comparison (LT) collects every entry whose attribute sorts below 'abd', 
which also includes values such as '1234', 'aab', ..., since all of them are 
lower in ASCII order. The problem is that, while entries are being added to 
the results, a limit is checked after every iteration. 
https://github.com/apache/geode/blob/develop/geode-core/src/main/java/org/apache/geode/cache/query/internal/index/CompactRangeIndex.java#L859
This limit is set in 
https://github.com/apache/geode/blob/develop/geode-core/src/main/java/org/apache/geode/cache/query/internal/index/CompactRangeIndex.java#L485
In that case, only the first 100 entries found are stored, and it is possible 
that many of them do not start with 'abc'.
This limit is separate from the one that can be added to the query to limit 
the printed results.
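
A minimal sketch of the effect described above, not Geode's actual code. It 
assumes the index behaves like a sorted list of keys, that each comparison 
stops once it has collected `threshold` entries, and that the two intermediate 
results are then intersected. The names range_scan and like_prefix and the 
sample data are made up for illustration; only the default of 100 comes from 
this thread.

```python
from bisect import bisect_left


def range_scan(sorted_keys, low=None, high=None, limit=None):
    """Return keys k with low <= k < high in index order, cut off at limit."""
    start = 0 if low is None else bisect_left(sorted_keys, low)
    end = len(sorted_keys) if high is None else bisect_left(sorted_keys, high)
    hits = sorted_keys[start:end]
    return hits if limit is None else hits[:limit]


def like_prefix(sorted_keys, lower, upper, threshold):
    """Evaluate LIKE 'abc%' as (key < upper) AND (key >= lower).

    Assumption: each scan is capped at `threshold` before the intersection.
    """
    less_than = set(range_scan(sorted_keys, high=upper, limit=threshold))
    at_least = set(range_scan(sorted_keys, low=lower, limit=threshold))
    return sorted(less_than & at_least)


# 150 values that sort below 'abc', followed by 5 genuine 'abc%' matches.
keys = sorted([f"{i:04d}" for i in range(150)] + [f"abc{i}" for i in range(5)])

# With a threshold of 100 the LT scan fills up with '0000'..'0099' and never
# reaches the 'abc*' keys, so the intersection is empty.
truncated = like_prefix(keys, "abc", "abd", threshold=100)

# With a threshold at least as large as the region, the result is complete.
full = like_prefix(keys, "abc", "abd", threshold=len(keys))

print(truncated)
print(full)
```

In this sketch the result is only complete once the threshold is at least as 
large as the number of keys that sort below 'abd'.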

If we only check >= 'abc', the results will also include entries whose 
attribute looks like 'bcd', ...

If we change indexThresholdSize when starting the servers, this limit changes, 
and we get correct results as long as indexThresholdSize is at least as high 
as the number of entries in the region (in many cases the user cannot know how 
many entries the region will have).

I tried changing the default to Integer.MAX_VALUE, but then some tests fail, 
so I don't think that is the best solution.
The test that reproduces the problem is available in 
#7754.

Does anyone have an idea of what the best solution for this issue would be?

Thanks and BR,
Mario


From: Jason Huynh 
Sent: Tuesday, March 15, 2022 9:11 PM
To: dev@geode.apache.org 
Subject: Re: Question about INDEX_THRESHOLD_SIZE

Additional thought:
It would be nice to set/check CAN_APPLY_LIMIT_AT_INDEX on a per node basis or 
at least a finer grained setting.


  *   AND junctions: it probably shouldn’t be applied at an index level
  *   OR junctions: it could be, as it’s a union
Combinations of the two or ancestor nodes should be smarter but that requires 
additional changes to make elegant…
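
One way to picture a per-node decision is the sketch below. It is not how 
Geode's query engine is structured: the Comparison and Junction classes and 
the limit_decisions function are hypothetical names, and the rule is 
deliberately conservative (any comparison with an AND ancestor loses 
permission, while OR passes permission through unchanged).

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Comparison:
    """Leaf node: a single indexed comparison (hypothetical stand-in)."""
    label: str


@dataclass
class Junction:
    """Inner node joining child conditions with 'AND' or 'OR'."""
    operator: str
    children: List["Node"]


Node = Union[Comparison, Junction]


def limit_decisions(node, inherited=True, out=None):
    """Map each comparison label to whether its index lookup may be limited."""
    if out is None:
        out = {}
    if isinstance(node, Comparison):
        out[node.label] = inherited
        return out
    # Under an AND, a truncated operand can drop rows the intersection needs,
    # so nothing below it may be limited. Under an OR, the branches are
    # unioned, so the permission is passed down as-is.
    allowed = inherited and node.operator == "OR"
    for child in node.children:
        limit_decisions(child, allowed, out)
    return out


# LIKE 'abc%' modelled as an AND of two comparisons, OR-ed with another filter.
like = Junction("AND", [Comparison("value < 'abd'"), Comparison("value >= 'abc'")])
query = Junction("OR", [like, Comparison("status = 'active'")])

decisions = limit_decisions(query)
for label, allowed in decisions.items():
    print(label, "->", allowed)
```

Here only the standalone comparison under the OR keeps the limit; both halves 
of the LIKE are scanned in full before being intersected.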

From: Jason Huynh 
Date: Tuesday, March 15, 2022 at 1:03 PM
To: dev@geode.apache.org 
Subject: Re: Question about INDEX_THRESHOLD_SIZE
Hi Mario,



Digging a little bit more, I’m assuming the CompiledLike probably should 
reassess whether limits should be applied at the index level.  I imagine the 
“someth%” is currently being turned into two compiled comparisons joined by an 
AND?
The first comparison 

Re: [PROPOSAL] re-cut support/1.15

2022-06-08 Thread Owen Nichols
Hello Geode Community, counting the labels in Jira I see:

May 6: 1 needsTriage and 2 blocks-1.15.0.
May 20: 0 needsTriage and 3 blocks-1.15.0.
June 8: 0 needsTriage and 0 blocks-1.15.0. Yay!

At this time, please consider support/1.15 “frozen”.  If there are any further 
changes needed, in addition to adding one of the labels above, please also 
email the dev list, otherwise I will soon begin preparing RC1 for voting next 
week.

-1.15.0 Release Manager

From: Owen Nichols 
Date: Monday, May 9, 2022 at 10:31 AM
To: dev@geode.apache.org 
Subject: Re: [PROPOSAL] re-cut support/1.15
** The support/1.15 branch has now been (re)cut and develop is now 1.16. **

Please use your best judgement in determining what to backport.

PRs against support/1.15 are welcome (but optional!).  Committers should merge 
their own PRs.

If your backport will take a while, add the "blocks-1.15.0" label in Jira.

From: Owen Nichols 
Date: Friday, May 6, 2022 at 10:47 AM
To: dev@geode.apache.org 
Subject: Re: [PROPOSAL] re-cut support/1.15
Great news!  I would be delighted to continue as Release Manager for 1.15.0.

To track progress toward code-complete, I will monitor the "needsTriage" and 
"blocks-1.15.0" labels (for Affects Version = 1.15.0).  Currently I see 1 
needsTriage [1] and 2 blocks-1.15.0 [2].

[1] https://tinyurl.com/5h58766f
[2] https://tinyurl.com/2p8bje4n

From: Anthony Baker 
Date: Friday, May 6, 2022 at 10:19 AM
To: dev@geode.apache.org 
Subject: Re: [PROPOSAL] annul support/1.15
Owen, with all the recent work I think we are in an excellent position to 
resume work on the 1.15 release. While there are a few things still outstanding, 
let’s go ahead and recut the release branch as of Monday, 2022-05-09. Would you 
be willing to resume release manager duties?

@Everyone - please chime in if you have in-progress work that you want to ship 
with 1.15 (ideally this is labeled in JIRA with “blocks-1.15.0”).

Thanks,
Anthony


> On Mar 16, 2022, at 2:12 PM, Owen Nichols  wrote:
>
> Seven weeks after cutting support/1.15, Jira now shows 11 blockers, up from 5 
> a few weeks ago.  I wonder if perhaps we cut the release branch prematurely?  
> I propose that we abandon this branch and focus on getting develop closer to 
> what we want to ship, then discuss re-cutting the branch.
>
> If this proposal is approved, I will archive support/1.15 as 
> support/1.15.old, revert develop's numbering to 1.15.0, and bulk-update all 
> Jira tickets fixed in 1.16.0 to fixed in 1.15.0 instead.  Build numbering 
> would start from 1.15.0-build.1000 to easily distinguish pre- and post- recut.
>
> Please vote/discuss with a goal of reaching consensus by 3PM PDT Monday Mar 
> 21.
>
> Thanks,
> -Geode 1.15.0 Release Manager
>
>