[GitHub] [lucene] iverase merged pull request #1065: LUCENE-10678: Fix potential overflow when computing the partition point on the BKD tree

2022-08-11 Thread GitBox


iverase merged PR #1065:
URL: https://github.com/apache/lucene/pull/1065


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-10678) computing the partition point on a BKD tree merge can overflow

2022-08-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578472#comment-17578472
 ] 

ASF subversion and git services commented on LUCENE-10678:
--

Commit fe8d11254a8a768608d7bb5e2bf8dcfd2c2c9310 in lucene's branch 
refs/heads/main from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=fe8d11254a8 ]

LUCENE-10678: Fix potential overflow when computing the partition point on the 
BKD tree (#1065)

We currently compute the partition point for a set of points by multiplying the 
number of nodes that need to be on the left of the BKD tree by 
maxPointsInLeafNode. This multiplication is done in int arithmetic, so if the 
partition point is bigger than Integer.MAX_VALUE it will overflow. This commit 
moves the multiplication to long arithmetic so it doesn't overflow.
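
The overflow described in the commit message can be reproduced in isolation. The following is a minimal sketch, not the BKDWriter code: the class name, method names, and sample values are hypothetical and chosen only to show the difference between multiplying in int and in long.

```java
class PartitionPointDemo {
    // Hypothetical helper showing the buggy pattern: both operands are int,
    // so the product is computed in 32 bits and wraps before it is widened.
    public static long partitionPointInt(int numLeftLeafNodes, int maxPointsInLeafNode) {
        return numLeftLeafNodes * maxPointsInLeafNode;
    }

    // Casting one operand to long first forces a 64-bit multiplication.
    public static long partitionPointLong(int numLeftLeafNodes, int maxPointsInLeafNode) {
        return (long) numLeftLeafNodes * maxPointsInLeafNode;
    }

    public static void main(String[] args) {
        int numLeftLeafNodes = 5_000_000;   // illustrative value
        int maxPointsInLeafNode = 512;      // illustrative value
        // 5,000,000 * 512 = 2,560,000,000, which exceeds Integer.MAX_VALUE.
        System.out.println(partitionPointInt(numLeftLeafNodes, maxPointsInLeafNode));  // -1734967296
        System.out.println(partitionPointLong(numLeftLeafNodes, maxPointsInLeafNode)); // 2560000000
    }
}
```

Where silent wrap-around is not acceptable, Math.multiplyExact(int, int) is another option: it throws ArithmeticException instead of returning a wrapped value.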

> computing the partition point on a BKD tree merge can overflow
> --
>
> Key: LUCENE-10678
> URL: https://issues.apache.org/jira/browse/LUCENE-10678
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ignacio Vera
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I just discovered a bad bug in the BKD tree when doing merges. Before calling 
> the BKDTreeRadix selector we need to compute the partition point, which is 
> done by multiplying two integers. If the partition point is > Integer.MAX_VALUE 
> then it will overflow.
> https://github.com/apache/lucene/blob/35ca2d79f73c6dfaf5e648fe241f7e0b37084a90/lucene/core/src/java/org/apache/lucene/util/bkd/BKDWriter.java#L2021
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




[jira] [Commented] (LUCENE-10678) computing the partition point on a BKD tree merge can overflow

2022-08-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578473#comment-17578473
 ] 

ASF subversion and git services commented on LUCENE-10678:
--

Commit 0b9850448560aae4715719823af9922de2e2dfe2 in lucene's branch 
refs/heads/branch_9x from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=0b985044856 ]

LUCENE-10678: Fix potential overflow when computing the partition point on the 
BKD tree (#1065)

We currently compute the partition point for a set of points by multiplying the 
number of nodes that need to be on the left of the BKD tree by 
maxPointsInLeafNode. This multiplication is done in int arithmetic, so if the 
partition point is bigger than Integer.MAX_VALUE it will overflow. This commit 
moves the multiplication to long arithmetic so it doesn't overflow.







[jira] [Commented] (LUCENE-10678) computing the partition point on a BKD tree merge can overflow

2022-08-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578474#comment-17578474
 ] 

ASF subversion and git services commented on LUCENE-10678:
--

Commit 21f892d09698208ce146775e5b7641c554410002 in lucene's branch 
refs/heads/branch_9_3 from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=21f892d0969 ]

LUCENE-10678: Fix potential overflow when computing the partition point on the 
BKD tree (#1065)

We currently compute the partition point for a set of points by multiplying the 
number of nodes that need to be on the left of the BKD tree by 
maxPointsInLeafNode. This multiplication is done in int arithmetic, so if the 
partition point is bigger than Integer.MAX_VALUE it will overflow. This commit 
moves the multiplication to long arithmetic so it doesn't overflow.
# Conflicts:
#   lucene/CHANGES.txt








[jira] [Commented] (LUCENE-10678) computing the partition point on a BKD tree merge can overflow

2022-08-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578481#comment-17578481
 ] 

ASF subversion and git services commented on LUCENE-10678:
--

Commit b19ba1098cb557fec168c569f8b4bdff9d56260c in lucene-solr's branch 
refs/heads/LUCENE-10678 from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b19ba1098cb ]

LUCENE-10678: Fix potential overflow when computing the partition point on the 
BKD tree (#1065)

We currently compute the partition point for a set of points by multiplying the 
number of nodes that need to be on the left of the BKD tree by 
maxPointsInLeafNode. This multiplication is done in int arithmetic, so if the 
partition point is bigger than Integer.MAX_VALUE it will overflow. This commit 
moves the multiplication to long arithmetic so it doesn't overflow.








[GitHub] [lucene-solr] iverase opened a new pull request, #2668: LUCENE-10678: Fix potential overflow when computing the partition point on the BKD tree (#1065)

2022-08-11 Thread GitBox


iverase opened a new pull request, #2668:
URL: https://github.com/apache/lucene-solr/pull/2668

   We currently compute the partition point for a set of points by multiplying
   the number of nodes that need to be on the left of the BKD tree by
   maxPointsInLeafNode. This multiplication is done in int arithmetic, so if the
   partition point is bigger than Integer.MAX_VALUE it will overflow.
   This commit moves the multiplication to long arithmetic so it doesn't overflow.





[jira] [Resolved] (LUCENE-10678) computing the partition point on a BKD tree merge can overflow

2022-08-11 Thread Ignacio Vera (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera resolved LUCENE-10678.
---
Fix Version/s: 9.3.1, 8.11.3, 9.4
     Assignee: Ignacio Vera
   Resolution: Fixed

> computing the partition point on a BKD tree merge can overflow
> --
>
> Key: LUCENE-10678
> URL: https://issues.apache.org/jira/browse/LUCENE-10678
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ignacio Vera
>Assignee: Ignacio Vera
>Priority: Major
> Fix For: 9.3.1, 8.11.3, 9.4
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I just discovered a bad bug in the BKD tree when doing merges. Before calling 
> the BKDTreeRadix selector we need to compute the partition point, which is 
> done by multiplying two integers. If the partition point is > Integer.MAX_VALUE 
> then it will overflow.
> https://github.com/apache/lucene/blob/35ca2d79f73c6dfaf5e648fe241f7e0b37084a90/lucene/core/src/java/org/apache/lucene/util/bkd/BKDWriter.java#L2021
>  






[GitHub] [lucene-solr] iverase merged pull request #2668: LUCENE-10678: Fix potential overflow when computing the partition point on the BKD tree (#1065)

2022-08-11 Thread GitBox


iverase merged PR #2668:
URL: https://github.com/apache/lucene-solr/pull/2668





[jira] [Commented] (LUCENE-10678) computing the partition point on a BKD tree merge can overflow

2022-08-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578489#comment-17578489
 ] 

ASF subversion and git services commented on LUCENE-10678:
--

Commit d426ff43c719acda20f5fc97a26f9f0774a36284 in lucene-solr's branch 
refs/heads/branch_8_11 from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d426ff43c71 ]

LUCENE-10678: Fix potential overflow when computing the partition point on the 
BKD tree (#1065) (#2668)

We currently compute the partition point for a set of points by multiplying the 
number of nodes that need to be on the left of the BKD tree by 
maxPointsInLeafNode. This multiplication is done in int arithmetic, so if the 
partition point is bigger than Integer.MAX_VALUE it will overflow. This commit 
moves the multiplication to long arithmetic so it doesn't overflow.

> computing the partition point on a BKD tree merge can overflow
> --
>
> Key: LUCENE-10678
> URL: https://issues.apache.org/jira/browse/LUCENE-10678
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ignacio Vera
>Assignee: Ignacio Vera
>Priority: Major
> Fix For: 8.11.3, 9.4, 9.3.1
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I just discovered a bad bug in the BKD tree when doing merges. Before calling 
> the BKDTreeRadix selector we need to compute the partition point, which is 
> done by multiplying two integers. If the partition point is > Integer.MAX_VALUE 
> then it will overflow.
> https://github.com/apache/lucene/blob/35ca2d79f73c6dfaf5e648fe241f7e0b37084a90/lucene/core/src/java/org/apache/lucene/util/bkd/BKDWriter.java#L2021
>  






[GitHub] [lucene] msokolov commented on pull request #1057: LUCENE-10670: Add a codec class to track merge time of each index part

2022-08-11 Thread GitBox


msokolov commented on PR #1057:
URL: https://github.com/apache/lucene/pull/1057#issuecomment-1212151454

   > I really don't think we should be doing this with a codec-wrapper. You can
   > get this data already from InfoStream!

   InfoStream is not very structured though -- it's hard to see how you could
   extract merge times from it. You'd have to parse formatted Strings, right? It
   also has this comment in its javadoc: `NOTE: Enabling infostreams may cause
   performance degradation in some components.`





[GitHub] [lucene-jira-archive] mocobeta opened a new pull request, #143: Add instruction manual for infra team

2022-08-11 Thread GitBox


mocobeta opened a new pull request, #143:
URL: https://github.com/apache/lucene-jira-archive/pull/143

   I wrote a copy-pastable instruction manual for infra team.





[GitHub] [lucene-jira-archive] mocobeta merged pull request #143: Add instruction manual for infra team

2022-08-11 Thread GitBox


mocobeta merged PR #143:
URL: https://github.com/apache/lucene-jira-archive/pull/143





[jira] [Created] (LUCENE-10679) FairDistributionMergePolicy

2022-08-11 Thread Atri Sharma (Jira)
Atri Sharma created LUCENE-10679:


 Summary: FairDistributionMergePolicy
 Key: LUCENE-10679
 URL: https://issues.apache.org/jira/browse/LUCENE-10679
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Atri Sharma
Assignee: Atri Sharma


TieredMergePolicy and LogMergePolicy can define merge specifications which have 
a skew in the distribution of overall "work" (i.e. number of documents to 
process) amongst threads. This is especially true when the underlying segment 
distribution is highly skewed.

 

A more balanced distribution can be achieved by performing a variation of the 
integer partitioning algorithm. Initial tests show a more balanced distribution 
on a simulated set of skewed segment distributions.
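
The issue does not say which variation of integer partitioning is intended. As one possible illustration only, the sketch below uses a greedy largest-first heuristic: segments are sorted by document count and each is assigned to the group with the smallest running total. The class name, method name, and sample numbers are hypothetical and are not part of Lucene or of the proposed policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class FairPartitionSketch {
    // Splits segment doc counts into numThreads groups with roughly equal totals.
    public static List<List<Integer>> partition(int[] segmentDocCounts, int numThreads) {
        if (numThreads < 1) {
            throw new IllegalArgumentException("numThreads must be >= 1");
        }
        int[] sorted = segmentDocCounts.clone();
        Arrays.sort(sorted); // ascending; walked from the end for largest-first

        long[] totals = new long[numThreads];
        List<List<Integer>> groups = new ArrayList<>();
        // Min-heap of group indices ordered by each group's current total.
        PriorityQueue<Integer> lightestFirst =
            new PriorityQueue<>(Comparator.comparingLong((Integer g) -> totals[g]));
        for (int g = 0; g < numThreads; g++) {
            groups.add(new ArrayList<>());
            lightestFirst.add(g);
        }

        for (int i = sorted.length - 1; i >= 0; i--) {
            int g = lightestFirst.poll();   // group with the least work so far
            groups.get(g).add(sorted[i]);
            totals[g] += sorted[i];         // updated while g is outside the heap
            lightestFirst.add(g);
        }
        return groups;
    }

    public static void main(String[] args) {
        // A skewed distribution: two large segments and several small ones.
        System.out.println(partition(new int[] {1000, 900, 200, 100, 50, 50}, 2));
    }
}
```

The greedy heuristic is not guaranteed to find the best possible split, but it is simple and tends to keep the groups close in total size, which is the property the issue is after.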






[jira] [Commented] (LUCENE-10654) New companion doc value format for LatLonShape and XYShape field types

2022-08-11 Thread Nick Knize (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578674#comment-17578674
 ] 

Nick Knize commented on LUCENE-10654:
-

Nightly test failure on XY bounding box:


{code:java}
Reproduce with: gradlew :lucene:core:test --tests 
"org.apache.lucene.document.TestShapeDocValues.testXYPolygonBBox" 
-Ptests.jvms=4 -Ptests.haltonfailure=false 
-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -Ptests.seed=ABDF070B81479950 
-Ptests.multiplier=2 -Ptests.nightly=true -Ptests.badapples=false 
-Ptests.gui=true -Ptests.file.encoding=ISO-8859-1 
-Ptests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene/Lucene-NightlyTests-main/test-data/enwiki.random.lines.txt
{code}



{code:java}
1 tests failed.
FAILED:  org.apache.lucene.document.TestShapeDocValues.testXYPolygonBBox

Error Message:
java.lang.AssertionError: expected:<-2.028229934961692E32> but 
was:<-2.026382696309321E32>
{code}


This is caused because {{TestUtil.nextPolygon}} is producing a polygon 
with an extruded colinear self-intersecting vertex and 
{{BaseXYShapeTestCase}} is not rejecting this as an invalid polygon because the 
test case uses {{randomBoolean}}. The simple fix is to switch the test case to 
always throw an exception on invalid polygons so we never test with a 
non-compliant polygon. The queries passed because the tessellator would 
filter out the dirty vertex. This test failed because the dirty vertex just 
happened to be the minimum X value. So this does expose an inconsistency where 
an invalid polygon will have a bounding box inconsistent with the raw geometry. 
I think that's okay because we have API guardrails to enable or disable strict 
validation and I don't think that should be removed.
I will open a PR to switch the base test cases over to strict geometry 
validation instead of random validation.
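
The bounding box mismatch can be pictured with a toy example. This sketch does not use Lucene's Tessellator or any test utility; the class, the method, and the coordinates are hypothetical, and "skipping" a vertex merely stands in for the tessellator filtering out the dirty vertex.

```java
class BoundingBoxSketch {
    // Smallest x among the vertices, skipping one index (pass -1 to skip nothing).
    public static double minX(double[] xs, int skippedVertex) {
        double min = Double.POSITIVE_INFINITY;
        for (int i = 0; i < xs.length; i++) {
            if (i != skippedVertex) {
                min = Math.min(min, xs[i]);
            }
        }
        return min;
    }

    public static void main(String[] args) {
        // Vertex 3 plays the role of the dirty vertex and holds the minimum x.
        double[] xs = {0.0, 4.0, 4.0, -3.0, 0.0};
        System.out.println(minX(xs, -1)); // raw geometry: -3.0
        System.out.println(minX(xs, 3));  // cleaned geometry: 0.0
    }
}
```

When the filtered vertex is not an extreme value, both computations agree, which is consistent with the failure showing up only for particular random seeds.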


> New companion doc value format for LatLonShape and XYShape field types
> --
>
> Key: LUCENE-10654
> URL: https://issues.apache.org/jira/browse/LUCENE-10654
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Nick Knize
>Priority: Major
> Fix For: 9.4
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> {{XYDocValuesField}} provides doc value support for {{XYPoint}}. 
> {{LatLonDocValuesField}} provides doc value support for {{LatLonPoint}}.
> However, neither {{LatLonShape}} nor {{XYShape}} currently has a doc value 
> format. 
> This lack of doc value support for shapes means facets, aggregations, and 
> IndexOrDocValues queries are currently not possible for Shape field types. 
> This gap needs to be closed in Lucene.
> To support IndexOrDocValues queries along with various geometry aggregations 
> and facets, the ability to compute the spatial relation with the doc value is 
> needed. This is straightforward with {{XYPoint}} and {{LatLonPoint}} since 
> the doc value encoding is nothing more than a simple 2D integer encoding of 
> the x,y and lat,lon dimensional components. Accomplishing the same with a 
> naive integer encoded binary representation for N-vertex shapes would be 
> costly. 
> {{ComponentTree}} already provides an efficient in memory structure for 
> quickly computing spatial relations over Shape types based on a binary tree 
> of tessellated triangles provided by the {{Tessellator}}. Furthermore, this 
> tessellation is already computed at index time. If we create an on-disk 
> representation of {{ComponentTree}} 's binary tree of tessellated triangles 
> and use this as the doc value {{binaryValue}} format we will be able to 
> efficiently compute spatial relations with this binary representation and 
> achieve the same facet/aggregation result over shapes as we can with points 
> today (e.g., grid facets, centroid, area, etc).






[jira] [Comment Edited] (LUCENE-10654) New companion doc value format for LatLonShape and XYShape field types

2022-08-11 Thread Nick Knize (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578674#comment-17578674
 ] 

Nick Knize edited comment on LUCENE-10654 at 8/11/22 9:08 PM:
--

Nightly test failure on XY bounding box:


{code:java}
Reproduce with: gradlew :lucene:core:test --tests 
"org.apache.lucene.document.TestShapeDocValues.testXYPolygonBBox" 
-Ptests.jvms=4 -Ptests.haltonfailure=false 
-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -Ptests.seed=ABDF070B81479950 
-Ptests.multiplier=2 -Ptests.nightly=true -Ptests.badapples=false 
-Ptests.gui=true -Ptests.file.encoding=ISO-8859-1 
-Ptests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene/Lucene-NightlyTests-main/test-data/enwiki.random.lines.txt
{code}



{code:java}
1 tests failed.
FAILED:  org.apache.lucene.document.TestShapeDocValues.testXYPolygonBBox

Error Message:
java.lang.AssertionError: expected:<-2.028229934961692E32> but 
was:<-2.026382696309321E32>
{code}


This is caused because {{TestUtil.nextPolygon}} is producing a polygon with 
an extruded colinear self-intersecting vertex and {{BaseXYShapeTestCase}} 
is not rejecting this as an invalid polygon because the test case uses 
{{randomBoolean}}. The simple fix is to switch the test case to always throw an 
exception on invalid polygons so we never test with a non-compliant polygon. 
The queries passed because the tessellator would filter out the dirty 
vertex. This test failed because the dirty vertex just happened to be the 
minimum X value. So this does expose an inconsistency where an invalid polygon 
will have a bounding box inconsistent with the raw geometry. I think that's 
okay because we have API guardrails to enable or disable strict validation and 
I don't think that should be removed.

I will open a PR to switch the base test cases over to strict geometry 
validation instead of random validation.




[GitHub] [lucene] nknize opened a new pull request, #1066: LUCENE-10654: Fix ShapeDocValue Bounding Box failure

2022-08-11 Thread GitBox


nknize opened a new pull request, #1066:
URL: https://github.com/apache/lucene/pull/1066

   The base spatial test case may create invalid self-crossing polygons. These
   polygons are cleaned by the tessellator, which may result in an inconsistent
   bounding box between the tessellated shape and the original, invalid geometry.
   This commit fixes the shape doc value test case to compute the bounding box
   from the cleaned geometry instead of relying on the potentially invalid
   original geometry. A logic bug is also cleaned up in the tessellator where the
   original edge membership check was always being performed regardless of
   colinearity.
   





[GitHub] [lucene-jira-archive] mocobeta commented on issue #104: Should we regenerate another full export?

2022-08-11 Thread GitBox


mocobeta commented on issue #104:
URL: https://github.com/apache/lucene-jira-archive/issues/104#issuecomment-1212671024

   Hi, here is the latest migration result. All recent improvements (e.g.,
   #128, #131, #136) are applied.
   https://github.com/mocobeta/forks-migration-test-3/issues

