houserjohn commented on PR #13914:
URL: https://github.com/apache/lucene/pull/13914#issuecomment-2658101119

   Hey @HoustonPutman, I just published 
[GH#14238](https://github.com/apache/lucene/pull/14238) which contains all of 
the unit tests that I've created so far. Note that there was a slight API 
change between the main branch and this PR, so I included some unit tests that 
work for this PR below. While some of these tests are not considering the 
caveat (the change in behavior) you mentioned, I believe there are a few unit 
tests that capture a few existing issues. For instance:
   
   ```
   public void testComputeDynamicNumericRangesWithLargeTopN() {
       List<DynamicRangeUtil.DynamicRangeInfo> expectedRangeInfoList = new 
ArrayList<>();
       long[] values = new long[] {487, 439, 794, 277};
       long[] weights = new long[] {59, 508, 736, 560};
   
       expectedRangeInfoList.add(new DynamicRangeUtil.DynamicRangeInfo(1, 560L, 
277L, 277L, 277D));
       expectedRangeInfoList.add(new DynamicRangeUtil.DynamicRangeInfo(1, 508L, 
439L, 439L, 439D));
       expectedRangeInfoList.add(new DynamicRangeUtil.DynamicRangeInfo(2, 795L, 
487L, 794L, 640.5D));
       assertDynamicNumericRangeResults(values, weights, 42, 1997L, 1863L, 
expectedRangeInfoList);
     }
   ```
   Gives the exception:
   > 
       java.lang.IllegalArgumentException: All kWeights must be < beforeWeight 
+ rangeWeight
           at 
__randomizedtesting.SeedInfo.seed([913AAD1D60B9263B:FFD4EE9EA025DBF]:0)
           at 
org.apache.lucene.core@11.0.0-SNAPSHOT/org.apache.lucene.util.WeightedSelector.checkArgs(WeightedSelector.java:82)
           at 
org.apache.lucene.core@11.0.0-SNAPSHOT/org.apache.lucene.util.WeightedSelector.select(WeightedSelector.java:57)
           at 
org.apache.lucene.facet.range.DynamicRangeUtil.computeDynamicNumericRanges(DynamicRangeUtil.java:266)
           at 
org.apache.lucene.facet.range.TestDynamicRangeUtil.testComputeDynamicNumericRangesWithLargeTopN(TestDynamicRangeUtil.java:169)
           at 
java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
   
   I've tried to track down this bug, and I haven't quite fixed it, but I 
believe the fix is related to these lines:
   ```
   --- 
a/lucene/facet/src/java/org/apache/lucene/facet/range/DynamicRangeUtil.java
   +++ 
b/lucene/facet/src/java/org/apache/lucene/facet/range/DynamicRangeUtil.java
   @@ -216,7 +216,7 @@ public final class DynamicRangeUtil {
          return dynamicRangeResult;
        }
    
   -    double rangeWeightTarget = (double) totalWeight / topN;
   +    double rangeWeightTarget = (double) totalWeight / Math.min(topN, len);
        double[] kWeights = new double[topN];
        for (int i = 0; i < topN; i++) {
          kWeights[i] = (i == 0 ? 0 : kWeights[i - 1]) + rangeWeightTarget;
   -- 
   ```
   
   Additionally:
   
   ```
   public void testComputeDynamicNumericRangesWithSameWeights() {
       List<DynamicRangeUtil.DynamicRangeInfo> expectedRangeInfoList = new 
ArrayList<>();
       long totalValue = 0;
       long[] values = new long[100];
       long[] weights = new long[100];
       for (int i = 0; i < 100; i++) {
         values[i] = i;
         weights[i] = 50;
         totalValue += i;
       }
   
       expectedRangeInfoList.add(new DynamicRangeUtil.DynamicRangeInfo(25, 
1250L, 0L, 24L, 12.0D));
       expectedRangeInfoList.add(new DynamicRangeUtil.DynamicRangeInfo(25, 
1250L, 25L, 49L, 37.0D));
       expectedRangeInfoList.add(new DynamicRangeUtil.DynamicRangeInfo(25, 
1250L, 50L, 74L, 62.0D));
       expectedRangeInfoList.add(new DynamicRangeUtil.DynamicRangeInfo(25, 
1250L, 75L, 99L, 87.0D));
       assertDynamicNumericRangeResults(values, weights, 4, totalValue, 5000L, 
expectedRangeInfoList);
     }
   ```
   Gives (Important values marked with **):
   ```
       java.lang.AssertionError: expected:<[DynamicRangeInfo[count=**25**, 
weight=1250, min=0, max=24, centroid=12.0], DynamicRangeInfo[count=25, 
weight=1250, min=25, max=49, centroid=37.0], DynamicRangeInfo[count=25, 
weight=1250, min=50, max=74, centroid=62.0], DynamicRangeInfo[count=25, 
weight=1250, min=75, max=99, centroid=87.0]]> but 
was:<[DynamicRangeInfo[count=**26**, weight=1300, min=0, max=25, 
centroid=12.5], DynamicRangeInfo[count=25, weight=1250, min=26, max=50, 
centroid=38.0], DynamicRangeInfo[count=25, weight=1250, min=51, max=75, 
centroid=63.0], DynamicRangeInfo[count=24, weight=1200, min=76, max=99, 
centroid=87.5]]>
           at 
__randomizedtesting.SeedInfo.seed([DA6EB4C0C0CA4022:DBA0A4538AB03899]:0)
           at junit@4.13.1/org.junit.Assert.fail(Assert.java:89)
           at junit@4.13.1/org.junit.Assert.failNotEquals(Assert.java:835)
           at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:120)
           at junit@4.13.1/org.junit.Assert.assertEquals(Assert.java:146)
           at 
org.apache.lucene.facet.range.TestDynamicRangeUtil.compareDynamicRangeResult(TestDynamicRangeUtil.java:361)
           at 
org.apache.lucene.facet.range.TestDynamicRangeUtil.assertDynamicNumericRangeResults(TestDynamicRangeUtil.java:351)
           at 
org.apache.lucene.facet.range.TestDynamicRangeUtil.testComputeDynamicNumericRangesWithSameWeights(TestDynamicRangeUtil.java:156)
   ```
   
   I know you mentioned there is a change in behavior in the caveat, but I do 
believe that this example should probably return ranges with equal counts.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to