[GitHub] [lucene] mocobeta commented on a diff in pull request #811: Add some basic tasks to help/workflow

2022-04-14 Thread GitBox


mocobeta commented on code in PR #811:
URL: https://github.com/apache/lucene/pull/811#discussion_r850154058


##
help/workflow.txt:
##
@@ -25,11 +25,22 @@ Assemble a single module's JAR (here for lucene-core):
 gradlew -p lucene/core assemble
 ls lucene/core/build/libs
 
+Assemble all JARs:

Review Comment:
   I copied these lines and then modified them:
   
https://github.com/apache/lucene/blob/9b96a05f50a42f2a4c46fccbadd3057ba9b22933/help/workflow.txt#L24-L25
   
   If "JAR" is too specific, how about "artifacts"?
   ```
   Assemble all Lucene artifacts (JARs, and so on).
   gradlew assemble
   ```
   
   Gradle's documentation says "Translates Assembly language source files into 
object files." This would be correct but too general to me...
   
https://docs.gradle.org/current/dsl/org.gradle.language.assembler.tasks.Assemble.html



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] mocobeta commented on a diff in pull request #811: Add some basic tasks to help/workflow

2022-04-14 Thread GitBox


mocobeta commented on code in PR #811:
URL: https://github.com/apache/lucene/pull/811#discussion_r850154058


##
help/workflow.txt:
##
@@ -25,11 +25,22 @@ Assemble a single module's JAR (here for lucene-core):
 gradlew -p lucene/core assemble
 ls lucene/core/build/libs
 
+Assemble all JARs:

Review Comment:
   I copied these lines and then modified them:
   
https://github.com/apache/lucene/blob/9b96a05f50a42f2a4c46fccbadd3057ba9b22933/help/workflow.txt#L24-L25
   
   If "JAR" is too specific, how about "artifacts"?
   ```
   Assemble all Lucene artifacts (JARs, and so on).
   gradlew assemble
   ```
   
   ~Gradle's documentation says "Translates Assembly language source files into 
object files." This would be correct but too general to me...~
   
~https://docs.gradle.org/current/dsl/org.gradle.language.assembler.tasks.Assemble.html~
   Correction: this doc appears to be about literal assembly language, not the 
conventional `assemble` task.






[GitHub] [lucene] mocobeta commented on a diff in pull request #811: Add some basic tasks to help/workflow

2022-04-14 Thread GitBox


mocobeta commented on code in PR #811:
URL: https://github.com/apache/lucene/pull/811#discussion_r850154058


##
help/workflow.txt:
##
@@ -25,11 +25,22 @@ Assemble a single module's JAR (here for lucene-core):
 gradlew -p lucene/core assemble
 ls lucene/core/build/libs
 
+Assemble all JARs:

Review Comment:
   I copied these lines and then modified them:
   
https://github.com/apache/lucene/blob/9b96a05f50a42f2a4c46fccbadd3057ba9b22933/help/workflow.txt#L24-L25
   
   If "JAR" is too specific, how about "artifacts"?
   ```
   Assemble all Lucene artifacts (JARs, and so on).
   gradlew assemble
   ```
   
   ~Gradle's documentation says "Translates Assembly language source files into 
object files." This would be correct but too general to me...~
   
~https://docs.gradle.org/current/dsl/org.gradle.language.assembler.tasks.Assemble.html~
   Correction: this doc appears to be about literal assembly language (I didn't 
know Gradle supports assembly...), not the conventional `assemble` task.






[jira] [Updated] (LUCENE-10515) Sorting on FloatFiled or FloatPoint producing erroneous results

2022-04-14 Thread SAILENDRA PAVAN (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SAILENDRA PAVAN updated LUCENE-10515:
-
Description: 
When we use FloatField for sorting, decimal values are rounded off and the
results are erroneous. Even after upgrading to version 6.0.0, the FloatPoint
results are also erroneous. We get correct results only when we use the field
as FloatDocValues.

I have attached a sample project that showcases the bug I encountered.

I have debugged it to see what happens internally; below are a few things I
found.
In !FloatFieldStored.png! we can see the values I used for sorting:
12.5, 15, 0, 0, 12. We can clearly see that 12.5 and 12 are saved as 1.7E-44
in the queue.

When we use `FloatDocValuesField` to store the value, sorting works as
expected. We can see that the values are stored without rounding in
FloatDocValuesStored.

The only difference I see is in Lucene50DocValuesProducer. For FloatPoint, the
numeric entry format is CONST_COMPRESSED, while for the other field types, such
as IntPoint, LongPoint, or FloatDocValuesField, the entry format is
TABLE_COMPRESSED.


 

 

  was:
When we use FloatField for sorting, decimal values are getting rounded off and 
results is erroneous. Even after upgrading version to 6.0.0 FloatPoint results 
are also erroneous. We are getting correct results only when we use field as 
FloatDocValues.

I have attached a sample project which can show case the bug which i 
encountered.

I have debug to see what happened internally, below are few things which i 
encountered.
We can see in FloatFieldStored.png I used the following values to sort
12.5,15,0,0,12. We can clearly see 12.5 and 12 are saved as 1.7E-44 in queue.

when we use `FloatDocValuesField` to store the value then the sorting is 
working as expected. we can see values are stored without rounding off in 
FloatDocValuesStored.

Only difference i see in the file Lucene50DocValuesProducer . In case of 
floatPoint numeric entry format is CONST_COMPRESSED and for rest of field types 
like int point, long point or FloatDocValuesField the entry format is 
TABLE_COMPRESSED. 


 

 


> Sorting on FloatFiled or FloatPoint producing erroneous results
> ---
>
> Key: LUCENE-10515
> URL: https://issues.apache.org/jira/browse/LUCENE-10515
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 5.3.1
>Reporter: SAILENDRA PAVAN
>Priority: Major
> Attachments: FloatDocValuesStored.png, FloatFieldStored.png, 
> Lucene50DocValuesProducer.png, LuenceSortFloatIssue.zip
>
>
> When we use FloatField for sorting, decimal values are getting rounded off 
> and results is erroneous. Even after upgrading version to 6.0.0 FloatPoint 
> results are also erroneous. We are getting correct results only when we use 
> field as FloatDocValues.
> I have attached a sample project which can show case the bug which i 
> encountered.
> I have debug to see what happened internally, below are few things which i 
> encountered.
> We can see in  !FloatFieldStored.png!  I used the following values to sort
> 12.5,15,0,0,12. We can clearly see 12.5 and 12 are saved as 1.7E-44 in queue.
> when we use `FloatDocValuesField` to store the value then the sorting is 
> working as expected. we can see values are stored without rounding off in 
> FloatDocValuesStored.
> Only difference i see in the file Lucene50DocValuesProducer . In case of 
> floatPoint numeric entry format is CONST_COMPRESSED and for rest of field 
> types like int point, long point or FloatDocValuesField the entry format is 
> TABLE_COMPRESSED. 
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-10515) Sorting on FloatFiled or FloatPoint producing erroneous results

2022-04-14 Thread SAILENDRA PAVAN (Jira)
SAILENDRA PAVAN created LUCENE-10515:


 Summary: Sorting on FloatFiled or FloatPoint producing erroneous 
results
 Key: LUCENE-10515
 URL: https://issues.apache.org/jira/browse/LUCENE-10515
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/search
Affects Versions: 5.3.1
Reporter: SAILENDRA PAVAN
 Attachments: FloatDocValuesStored.png, FloatFieldStored.png, 
Lucene50DocValuesProducer.png, LuenceSortFloatIssue.zip

When we use FloatField for sorting, decimal values are rounded off and the
results are erroneous. Even after upgrading to version 6.0.0, the FloatPoint
results are also erroneous. We get correct results only when we use the field
as FloatDocValues.

I have attached a sample project that showcases the bug I encountered.

I have debugged it to see what happens internally; below are a few things I
found.
FloatFieldStored.png shows the values I used for sorting: 12.5, 15, 0, 0, 12.
We can clearly see that 12.5 and 12 are saved as 1.7E-44 in the queue.
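
One possible reading of the 1.7E-44 value (an assumption, not something this report confirms): if a value is stored as a plain integer and later read back as raw IEEE 754 float bits, the integer 12 decodes to 12 × 2^-149, roughly 1.68e-44. Truncating 12.5 and 12 to an integer gives 12 in both cases, which would explain why both end up with the same tiny value. The sketch below uses only the JDK; the class and the helper name `decodeAsFloatBits` are hypothetical and are not Lucene APIs.

```java
// Minimal sketch, assuming the stored value is a truncated integer that is
// later reinterpreted as raw float bits. Not taken from the Lucene sources.
class FloatBitsDemo {

    // Hypothetical helper: treats a stored integer as a raw IEEE 754 bit pattern.
    static float decodeAsFloatBits(long stored) {
        return Float.intBitsToFloat((int) stored);
    }

    public static void main(String[] args) {
        float[] inputs = {12.5f, 15f, 0f, 0f, 12f};
        for (float v : inputs) {
            long truncated = (long) v;                  // assumption: fraction is dropped on write
            float misread = decodeAsFloatBits(truncated); // subnormal float, close to zero
            System.out.println(v + " -> stored " + truncated + " -> read back " + misread);
        }
    }
}
```

If this is what happens, every non-zero input collapses to a subnormal number near zero, so the sort order no longer reflects the original values.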

When we use `FloatDocValuesField` to store the value, sorting works as
expected. We can see that the values are stored without rounding in
FloatDocValuesStored.
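
As far as I understand, `FloatDocValuesField` keeps the float intact by storing its raw bit pattern (`Float.floatToRawIntBits`) as the numeric doc value, and a float sort decodes it with `Float.intBitsToFloat`. The JDK-only sketch below mimics that round trip for the values from this report; `encode`, `decode`, and `sortedDecoded` are hypothetical names for illustration, not Lucene methods.

```java
import java.util.Arrays;

// Minimal sketch of a lossless float round trip through an integer store.
class FloatRawBitsRoundTrip {

    // Encode: raw IEEE 754 bits, widened to long (assumed storage form).
    static long encode(float value) {
        return Float.floatToRawIntBits(value);
    }

    // Decode: narrow back to int and reinterpret the bits as a float.
    static float decode(long stored) {
        return Float.intBitsToFloat((int) stored);
    }

    // Round-trips every value through encode/decode, then sorts ascending.
    static float[] sortedDecoded(float[] values) {
        float[] out = new float[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = decode(encode(values[i]));
        }
        Arrays.sort(out);
        return out;
    }

    public static void main(String[] args) {
        float[] inputs = {12.5f, 15f, 0f, 0f, 12f};
        System.out.println(Arrays.toString(sortedDecoded(inputs)));
    }
}
```

Because the encoding is a bit-for-bit copy, 12.5 and 12 stay distinct and sort in the expected order.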

The only difference I see is in Lucene50DocValuesProducer. For FloatPoint, the
numeric entry format is CONST_COMPRESSED, while for the other field types, such
as IntPoint, LongPoint, or FloatDocValuesField, the entry format is
TABLE_COMPRESSED.


 

 






[jira] [Updated] (LUCENE-10515) Sorting on FloatFiled or FloatPoint producing erroneous results

2022-04-14 Thread SAILENDRA PAVAN (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SAILENDRA PAVAN updated LUCENE-10515:
-
Description: 
When we use FloatField for sorting, decimal values are rounded off and the
results are erroneous. Even after upgrading to version 6.0.0, the FloatPoint
results are also erroneous. We get correct results only when we use the field
as FloatDocValues.

I have attached a sample project that showcases the bug I encountered.

I debugged it to see what happens internally; below are a few things I found.
 !FloatFieldStored.png! 
We can see that I used the following values for sorting: 12.5, 15, 0, 0, 12.
We can clearly see that 12.5 and 12 are saved as 1.7E-44 in the queue.

When we use `FloatDocValuesField` to store the value, sorting works as
expected. We can see that the values are stored without rounding in
FloatDocValuesStored.

The only difference I see is in Lucene50DocValuesProducer. For FloatPoint, the
numeric entry format is CONST_COMPRESSED, while for the other field types, such
as IntPoint, LongPoint, or FloatDocValuesField, the entry format is
TABLE_COMPRESSED.


 

 

  was:
When we use FloatField for sorting, decimal values are getting rounded off and 
results is erroneous. Even after upgrading version to 6.0.0 FloatPoint results 
are also erroneous. We are getting correct results only when we use field as 
FloatDocValues.

I have attached a sample project which can show case the bug which i 
encountered.

I have debug to see what happened internally, below are few things which i 
encountered.
We can see in  !FloatFieldStored.png!  I used the following values to sort
12.5,15,0,0,12. We can clearly see 12.5 and 12 are saved as 1.7E-44 in queue.

when we use `FloatDocValuesField` to store the value then the sorting is 
working as expected. we can see values are stored without rounding off in 
FloatDocValuesStored.

Only difference i see in the file Lucene50DocValuesProducer . In case of 
floatPoint numeric entry format is CONST_COMPRESSED and for rest of field types 
like int point, long point or FloatDocValuesField the entry format is 
TABLE_COMPRESSED. 


 

 


> Sorting on FloatFiled or FloatPoint producing erroneous results
> ---
>
> Key: LUCENE-10515
> URL: https://issues.apache.org/jira/browse/LUCENE-10515
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 5.3.1
>Reporter: SAILENDRA PAVAN
>Priority: Major
> Attachments: FloatDocValuesStored.png, FloatFieldStored.png, 
> Lucene50DocValuesProducer.png, LuenceSortFloatIssue.zip
>
>
> When we use FloatField for sorting, decimal values are getting rounded off 
> and results is erroneous. Even after upgrading version to 6.0.0 FloatPoint 
> results are also erroneous. We are getting correct results only when we use 
> field as FloatDocValues.
> I have attached a sample project which can show case the bug which i 
> encountered.
> I debugged to see what happened internally, below are few things which i 
> encountered.
>  !FloatFieldStored.png! 
> We can see I used the following values to sort
> 12.5,15,0,0,12. We can clearly see 12.5 and 12 are saved as 1.7E-44 in queue.
> when we use `FloatDocValuesField` to store the value then the sorting is 
> working as expected. we can see values are stored without rounding off in 
> FloatDocValuesStored.
> Only difference i see in the file Lucene50DocValuesProducer . In case of 
> floatPoint numeric entry format is CONST_COMPRESSED and for rest of field 
> types like int point, long point or FloatDocValuesField the entry format is 
> TABLE_COMPRESSED. 
>  
>  






[jira] [Updated] (LUCENE-10515) Sorting on FloatFiled or FloatPoint producing erroneous results

2022-04-14 Thread SAILENDRA PAVAN (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SAILENDRA PAVAN updated LUCENE-10515:
-
Description: 
When we use FloatField for sorting, decimal values are rounded off and the
results are erroneous. Even after upgrading to version 6.0.0, the FloatPoint
results are also erroneous. We get correct results only when we use the field
as FloatDocValues.

I have attached a sample project that showcases the bug I encountered.

I debugged it to see what happens internally; below are a few things I found.

I used the following values for sorting: 12.5, 15, 0, 0, 12. We can clearly
see that 12.5 and 12 are saved as 1.7E-44 in the queue.
 !FloatFieldStored.png! 

When we use `FloatDocValuesField` to store the value, sorting works as
expected. The image below shows that the values are stored without rounding.
 !FloatDocValuesStored.png! 

The only difference I see is in Lucene50DocValuesProducer. For FloatPoint, the
numeric entry format is CONST_COMPRESSED, while for the other field types, such
as IntPoint, LongPoint, or FloatDocValuesField, the entry format is
TABLE_COMPRESSED.
 !Lucene50DocValuesProducer.png! 


 

 

  was:
When we use FloatField for sorting, decimal values are getting rounded off and 
results is erroneous. Even after upgrading version to 6.0.0 FloatPoint results 
are also erroneous. We are getting correct results only when we use field as 
FloatDocValues.

I have attached a sample project which can show case the bug which i 
encountered.

I debugged to see what happened internally, below are few things which i 
encountered.

I used the following values to sort 12.5,15,0,0,12. We can clearly see 12.5 and 
12 are saved as 1.7E-44 in queue.
 !FloatFieldStored.png! 

when we use `FloatDocValuesField` to store the value then the sorting is 
working as expected. we can see values are stored without rounding off in below 
image
 !FloatDocValuesStored.png! 

Only difference i see in the file Lucene50DocValuesProducer . In case of 
floatPoint numeric entry format is CONST_COMPRESSED and for rest of field types 
like int point, long point or FloatDocValuesField the entry format is 
TABLE_COMPRESSED. 


 

 


> Sorting on FloatFiled or FloatPoint producing erroneous results
> ---
>
> Key: LUCENE-10515
> URL: https://issues.apache.org/jira/browse/LUCENE-10515
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 5.3.1
>Reporter: SAILENDRA PAVAN
>Priority: Major
> Attachments: FloatDocValuesStored.png, FloatFieldStored.png, 
> Lucene50DocValuesProducer.png, LuenceSortFloatIssue.zip
>
>
> When we use FloatField for sorting, decimal values are getting rounded off 
> and results is erroneous. Even after upgrading version to 6.0.0 FloatPoint 
> results are also erroneous. We are getting correct results only when we use 
> field as FloatDocValues.
> I have attached a sample project which can show case the bug which i 
> encountered.
> I debugged to see what happened internally, below are few things which i 
> encountered.
> I used the following values to sort 12.5,15,0,0,12. We can clearly see 12.5 
> and 12 are saved as 1.7E-44 in queue.
>  !FloatFieldStored.png! 
> when we use `FloatDocValuesField` to store the value then the sorting is 
> working as expected. we can see values are stored without rounding off in 
> below image
>  !FloatDocValuesStored.png! 
> Only difference i see in the file Lucene50DocValuesProducer . In case of 
> floatPoint numeric entry format is CONST_COMPRESSED and for rest of field 
> types like int point, long point or FloatDocValuesField the entry format is 
> TABLE_COMPRESSED. 
>  !Lucene50DocValuesProducer.png! 
>  
>  






[jira] [Updated] (LUCENE-10515) Sorting on FloatFiled or FloatPoint producing erroneous results

2022-04-14 Thread SAILENDRA PAVAN (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SAILENDRA PAVAN updated LUCENE-10515:
-
Description: 
When we use FloatField for sorting, decimal values are rounded off and the
results are erroneous. Even after upgrading to version 6.0.0, the FloatPoint
results are also erroneous. We get correct results only when we use the field
as FloatDocValues.

I have attached a sample project that showcases the bug I encountered.

I debugged it to see what happens internally; below are a few things I found.

I used the following values for sorting: 12.5, 15, 0, 0, 12. We can clearly
see that 12.5 and 12 are saved as 1.7E-44 in the queue.
 !FloatFieldStored.png! 

When we use `FloatDocValuesField` to store the value, sorting works as
expected. The image below shows that the values are stored without rounding.
 !FloatDocValuesStored.png! 

The only difference I see is in Lucene50DocValuesProducer. For FloatPoint, the
numeric entry format is CONST_COMPRESSED, while for the other field types, such
as IntPoint, LongPoint, or FloatDocValuesField, the entry format is
TABLE_COMPRESSED.


 

 

  was:
When we use FloatField for sorting, decimal values are getting rounded off and 
results is erroneous. Even after upgrading version to 6.0.0 FloatPoint results 
are also erroneous. We are getting correct results only when we use field as 
FloatDocValues.

I have attached a sample project which can show case the bug which i 
encountered.

I debugged to see what happened internally, below are few things which i 
encountered.
 !FloatFieldStored.png! 
We can see I used the following values to sort
12.5,15,0,0,12. We can clearly see 12.5 and 12 are saved as 1.7E-44 in queue.

when we use `FloatDocValuesField` to store the value then the sorting is 
working as expected. we can see values are stored without rounding off in 
FloatDocValuesStored.

Only difference i see in the file Lucene50DocValuesProducer . In case of 
floatPoint numeric entry format is CONST_COMPRESSED and for rest of field types 
like int point, long point or FloatDocValuesField the entry format is 
TABLE_COMPRESSED. 


 

 


> Sorting on FloatFiled or FloatPoint producing erroneous results
> ---
>
> Key: LUCENE-10515
> URL: https://issues.apache.org/jira/browse/LUCENE-10515
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 5.3.1
>Reporter: SAILENDRA PAVAN
>Priority: Major
> Attachments: FloatDocValuesStored.png, FloatFieldStored.png, 
> Lucene50DocValuesProducer.png, LuenceSortFloatIssue.zip
>
>
> When we use FloatField for sorting, decimal values are getting rounded off 
> and results is erroneous. Even after upgrading version to 6.0.0 FloatPoint 
> results are also erroneous. We are getting correct results only when we use 
> field as FloatDocValues.
> I have attached a sample project which can show case the bug which i 
> encountered.
> I debugged to see what happened internally, below are few things which i 
> encountered.
> I used the following values to sort 12.5,15,0,0,12. We can clearly see 12.5 
> and 12 are saved as 1.7E-44 in queue.
>  !FloatFieldStored.png! 
> when we use `FloatDocValuesField` to store the value then the sorting is 
> working as expected. we can see values are stored without rounding off in 
> below image
>  !FloatDocValuesStored.png! 
> Only difference i see in the file Lucene50DocValuesProducer . In case of 
> floatPoint numeric entry format is CONST_COMPRESSED and for rest of field 
> types like int point, long point or FloatDocValuesField the entry format is 
> TABLE_COMPRESSED. 
>  
>  






[jira] [Updated] (LUCENE-10515) Sorting on FloatFiled or FloatPoint producing erroneous results

2022-04-14 Thread SAILENDRA PAVAN (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SAILENDRA PAVAN updated LUCENE-10515:
-
Description: 
When we use FloatField for sorting, decimal values are rounded off and the
results are erroneous. Even after upgrading to version 6.0.0, the FloatPoint
results are also erroneous. We get correct results only when we use the field
as FloatDocValues.

I have attached a sample project that showcases the bug I encountered.

I debugged it to see what happens internally; below are a few things I found.

I used the following values for sorting: 12.5, 15, 0, 0, 12. We can clearly
see that 12.5 and 12 are saved as 1.7E-44 in the queue.
 !FloatFieldStored.png! 

When we use `FloatDocValuesField` to store the value, sorting works as
expected. The image below shows that the values are stored without rounding.
 !FloatDocValuesStored.png! 

The only difference I see is in Lucene50DocValuesProducer. For FloatPoint, the
numeric entry format is CONST_COMPRESSED, while for the other field types, such
as IntPoint, LongPoint, or FloatDocValuesField, the entry format is
TABLE_COMPRESSED.
 !screenshot-1.png! 

 

 

  was:
When we use FloatField for sorting, decimal values are getting rounded off and 
results is erroneous. Even after upgrading version to 6.0.0 FloatPoint results 
are also erroneous. We are getting correct results only when we use field as 
FloatDocValues.

I have attached a sample project which can show case the bug which i 
encountered.

I debugged to see what happened internally, below are few things which i 
encountered.

I used the following values to sort 12.5,15,0,0,12. We can clearly see 12.5 and 
12 are saved as 1.7E-44 in queue.
 !FloatFieldStored.png! 

when we use `FloatDocValuesField` to store the value then the sorting is 
working as expected. we can see values are stored without rounding off in below 
image
 !FloatDocValuesStored.png! 

Only difference i see in the file Lucene50DocValuesProducer . In case of 
floatPoint numeric entry format is CONST_COMPRESSED and for rest of field types 
like int point, long point or FloatDocValuesField the entry format is 
TABLE_COMPRESSED. 
 !Lucene50DocValuesProducer.png! 


 

 


> Sorting on FloatFiled or FloatPoint producing erroneous results
> ---
>
> Key: LUCENE-10515
> URL: https://issues.apache.org/jira/browse/LUCENE-10515
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 5.3.1
>Reporter: SAILENDRA PAVAN
>Priority: Major
> Attachments: FloatDocValuesStored.png, FloatFieldStored.png, 
> Lucene50DocValuesProducer.png, LuenceSortFloatIssue.zip, screenshot-1.png
>
>
> When we use FloatField for sorting, decimal values are getting rounded off 
> and results is erroneous. Even after upgrading version to 6.0.0 FloatPoint 
> results are also erroneous. We are getting correct results only when we use 
> field as FloatDocValues.
> I have attached a sample project which can show case the bug which i 
> encountered.
> I debugged to see what happened internally, below are few things which i 
> encountered.
> I used the following values to sort 12.5,15,0,0,12. We can clearly see 12.5 
> and 12 are saved as 1.7E-44 in queue.
>  !FloatFieldStored.png! 
> when we use `FloatDocValuesField` to store the value then the sorting is 
> working as expected. we can see values are stored without rounding off in 
> below image
>  !FloatDocValuesStored.png! 
> Only difference i see in the file Lucene50DocValuesProducer . In case of 
> floatPoint numeric entry format is CONST_COMPRESSED and for rest of field 
> types like int point, long point or FloatDocValuesField the entry format is 
> TABLE_COMPRESSED. 
>  !screenshot-1.png! 
>  
>  






[jira] [Updated] (LUCENE-10515) Sorting on FloatFiled or FloatPoint producing erroneous results

2022-04-14 Thread SAILENDRA PAVAN (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SAILENDRA PAVAN updated LUCENE-10515:
-
Attachment: screenshot-1.png

> Sorting on FloatFiled or FloatPoint producing erroneous results
> ---
>
> Key: LUCENE-10515
> URL: https://issues.apache.org/jira/browse/LUCENE-10515
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 5.3.1
>Reporter: SAILENDRA PAVAN
>Priority: Major
> Attachments: FloatDocValuesStored.png, FloatFieldStored.png, 
> Lucene50DocValuesProducer.png, LuenceSortFloatIssue.zip, screenshot-1.png
>
>
> When we use FloatField for sorting, decimal values are getting rounded off 
> and results is erroneous. Even after upgrading version to 6.0.0 FloatPoint 
> results are also erroneous. We are getting correct results only when we use 
> field as FloatDocValues.
> I have attached a sample project which can show case the bug which i 
> encountered.
> I debugged to see what happened internally, below are few things which i 
> encountered.
> I used the following values to sort 12.5,15,0,0,12. We can clearly see 12.5 
> and 12 are saved as 1.7E-44 in queue.
>  !FloatFieldStored.png! 
> when we use `FloatDocValuesField` to store the value then the sorting is 
> working as expected. we can see values are stored without rounding off in 
> below image
>  !FloatDocValuesStored.png! 
> Only difference i see in the file Lucene50DocValuesProducer . In case of 
> floatPoint numeric entry format is CONST_COMPRESSED and for rest of field 
> types like int point, long point or FloatDocValuesField the entry format is 
> TABLE_COMPRESSED. 
>  !Lucene50DocValuesProducer.png! 
>  
>  






[jira] [Updated] (LUCENE-10515) Sorting on FloatFiled or FloatPoint producing erroneous results

2022-04-14 Thread SAILENDRA PAVAN (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SAILENDRA PAVAN updated LUCENE-10515:
-
Description: 
When we use FloatField for sorting, decimal values are rounded off and the
results are erroneous. Even after upgrading to version 6.0.0, the FloatPoint
results are also erroneous. We get correct results only when we use the field
as FloatDocValues.

I have attached a sample project that showcases the bug I encountered:
[^LuenceSortFloatIssue.zip]

I debugged it to see what happens internally; below are a few things I found.

I used the following values for sorting: 12.5, 15, 0, 0, 12. We can clearly
see that 12.5 and 12 are saved as 1.7E-44 in the queue.
!FloatFieldStored.png!

When we use `FloatDocValuesField` to store the value, sorting works as
expected. The image below shows that the values are stored without rounding.
!FloatDocValuesStored.png!

The only difference I see is in Lucene50DocValuesProducer. For FloatPoint, the
numeric entry format is CONST_COMPRESSED, while for the other field types, such
as IntPoint, LongPoint, or FloatDocValuesField, the entry format is
TABLE_COMPRESSED.
!screenshot-1.png!

 

 

  was:
When we use FloatField for sorting, decimal values are getting rounded off and 
results is erroneous. Even after upgrading version to 6.0.0 FloatPoint results 
are also erroneous. We are getting correct results only when we use field as 
FloatDocValues.

I have attached a sample project which can show case the bug which i 
encountered.

I debugged to see what happened internally, below are few things which i 
encountered.

I used the following values to sort 12.5,15,0,0,12. We can clearly see 12.5 and 
12 are saved as 1.7E-44 in queue.
 !FloatFieldStored.png! 

when we use `FloatDocValuesField` to store the value then the sorting is 
working as expected. we can see values are stored without rounding off in below 
image
 !FloatDocValuesStored.png! 

Only difference i see in the file Lucene50DocValuesProducer . In case of 
floatPoint numeric entry format is CONST_COMPRESSED and for rest of field types 
like int point, long point or FloatDocValuesField the entry format is 
TABLE_COMPRESSED. 
 !screenshot-1.png! 

 

 


> Sorting on FloatFiled or FloatPoint producing erroneous results
> ---
>
> Key: LUCENE-10515
> URL: https://issues.apache.org/jira/browse/LUCENE-10515
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 5.3.1
>Reporter: SAILENDRA PAVAN
>Priority: Major
> Attachments: FloatDocValuesStored.png, FloatFieldStored.png, 
> Lucene50DocValuesProducer.png, LuenceSortFloatIssue.zip, screenshot-1.png
>
>
> When we use FloatField for sorting, decimal values are getting rounded off 
> and results are erroneous. Even after upgrading to version 6.0.0, FloatPoint 
> results are also erroneous. We get correct results only when we use the 
> field as FloatDocValues.
> I have attached a sample project that showcases the bug I encountered 
> [^LuenceSortFloatIssue.zip]
> I debugged to see what happens internally; below are a few things I found.
> I used the following values to sort: 12.5, 15, 0, 0, 12. We can clearly see 
> that 12.5 and 12 are saved as 1.7E-44 in the queue.
> !FloatFieldStored.png!
> When we use `FloatDocValuesField` to store the value, sorting works as 
> expected. We can see the values are stored without rounding in the image 
> below.
> !FloatDocValuesStored.png!
> The only difference I see is in the file Lucene50DocValuesProducer. For 
> FloatPoint the numeric entry format is CONST_COMPRESSED, while for the other 
> field types (int point, long point, or FloatDocValuesField) the entry format 
> is TABLE_COMPRESSED. 
> !screenshot-1.png!
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] mocobeta commented on a diff in pull request #811: Add some basic tasks to help/workflow

2022-04-14 Thread GitBox


mocobeta commented on code in PR #811:
URL: https://github.com/apache/lucene/pull/811#discussion_r850322876


##
help/workflow.txt:
##
@@ -25,11 +25,22 @@ Assemble a single module's JAR (here for lucene-core):
 gradlew -p lucene/core assemble
 ls lucene/core/build/libs
 
+Assemble all JARs:
+gradlew assemble
+
 Create all distributable packages, POMs, etc. and create a
 local maven repository for inspection:
 gradlew mavenLocal
 ls -R build/maven-local/
 
+Assemble Javdocs on a module:

Review Comment:
   > This is a general question on where to explain how gradle works (-p 
command) and whether this document should provide ready-to-use commands doing 
things or an explanation how to assemble those commands from scratch. 
   
   To me, a list of ready-to-use commands would be sufficient for this help 
document; here I'd give priority to brevity over detailed explanations. The 
usage of `-p` would be easily inferred from the examples, and I think 
developers who need more information about Gradle can always refer to 
Gradle's documentation or other resources (just as I do).
   Maybe I'm not the right person to give an opinion on documentation for 
beginners... but I guess readers of this guide are experienced Java developers 
anyway?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] mocobeta commented on a diff in pull request #811: Add some basic tasks to help/workflow

2022-04-14 Thread GitBox


mocobeta commented on code in PR #811:
URL: https://github.com/apache/lucene/pull/811#discussion_r850334643


##
help/workflow.txt:
##
@@ -25,11 +25,22 @@ Assemble a single module's JAR (here for lucene-core):
 gradlew -p lucene/core assemble
 ls lucene/core/build/libs
 
+Assemble all JARs:
+gradlew assemble
+
 Create all distributable packages, POMs, etc. and create a
 local maven repository for inspection:
 gradlew mavenLocal
 ls -R build/maven-local/
 
+Assemble Javdocs on a module:

Review Comment:
   > Separately - do you ever assemble javadocs for a single module? What's the 
point of doing that?
   
   Yes, sometimes, to check that Javadoc changes render correctly in HTML (it 
is also convenient for linting in my workflow). I added this line mainly for 
developers who want to contribute documentation.






[GitHub] [lucene] mayya-sharipova merged pull request #800: Make constructor for QueryOffsetRange public

2022-04-14 Thread GitBox


mayya-sharipova merged PR #800:
URL: https://github.com/apache/lucene/pull/800





[GitHub] [lucene] LuXugang commented on pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-04-14 Thread GitBox


LuXugang commented on PR #792:
URL: https://github.com/apache/lucene/pull/792#issuecomment-1099309953

   Thanks @jtibshirani for reviewing such a big PR, even though I tried to 
split it into several commits by phase of modification.
   
   >The PR moves the ordToDoc mapping from the metadata file to the vector data 
file. This is great, but it means that we should update the way ordToDoc is 
loaded. We are still loading ordToDoc when the format is opened, which is not 
good, since we are not supposed to touch any data files at this point. I think 
we should follow the same pattern as in the Lucene90DocValuesProducer class 
where we only load the DirectMonotonicReader.Meta file when opening the format, 
then load the full reader later each time we search or load vector values?
   
   Your suggestion really makes sense. So should I move ordToDoc into 
`OffHeapVectorValues` and make it off-heap? 
   By the way, ordToDoc is invoked frequently in 
`Lucene91HnswVectorsReader#search`. I worry that going off-heap will add 
latency compared with loading everything into memory.
   
   >This PR both moves the ordToDoc mapping to disk, and adds an IndexedDISI to 
support fast iteration. It'd be nice to focus on one change at a time, since it 
makes it easier to understand and review. Maybe we could just move ordToDoc to 
disk in this PR. Or do you think the two changes need to go together?
   
   In `Lucene91HnswVectorsReader`, ordToDoc was an array used for both 
iteration and mapping, but in this PR ordToDoc is only used for mapping, so I 
had to make the two changes together.
   
   





[GitHub] [lucene] HoustonPutman commented on pull request #523: LUCENE-10293: adds a stand-alone prepareSonatypeBundle that compiles …

2022-04-14 Thread GitBox


HoustonPutman commented on PR #523:
URL: https://github.com/apache/lucene/pull/523#issuecomment-1099460990

   I did use this patch, and it was all flattened. However Apache Nexus 
responded with a very long error message saying that every file in the zip was 
"invalid". (I then tried every possible combination of directories that I could 
try since flattened didn't work)
   
   In the meantime I have a small script that I might use for 9.0.0. Agreed 
that I would like it to be portable, but for now this is the only way I've been 
able to get it to work. We can always move back to this when we find out how to 
get it to work.
   
   It's much more digestible than the Spark script. You can check it out here: 
https://github.com/apache/solr/pull/807
   





[GitHub] [lucene] nknize commented on a diff in pull request #809: LUCENE-10514: Component2D#Within methods should return NOTWITHIN when the query geometry contains the triangle

2022-04-14 Thread GitBox


nknize commented on code in PR #809:
URL: https://github.com/apache/lucene/pull/809#discussion_r850671907


##
lucene/core/src/java/org/apache/lucene/geo/Polygon2D.java:
##
@@ -257,10 +257,13 @@ public WithinRelation withinLine(
   boolean ab,
   double bX,
   double bY) {
-if (ab == true

Review Comment:
   I follow what you're saying wrt the title of the PR. I think I'm having 
trouble following the commit message and it might just need to be clearer?
   
   ```
   1. We currently might return disjoint when a query geometry fully contains a 
triangle / line / point. 
   2. This causes issues as when an inner node is fully contained in the query 
shape, we marked those documents as NOTWITHIN. 
   3. This PR brings these behaviour together by making sure we always return 
NOTWITHIN for fully contained triangles.
   ```
   Wrt 2: inner nodes are driven by the bbox dimensions, so why do we care 
about the tessellated triangles for this case? If the bbox falls within the 
query geometry, its children should be accepted for `INTERSECTS` and `WITHIN` 
but rejected for `CONTAINS` and `DISJOINT`. I think the tessellations start to 
matter when traversing the leaf nodes? In that case a document should be 
collected if a triangle edge is contained by the query geometry in a `WITHIN` 
or `INTERSECTS` query but rejected in a `CONTAINS` or `DISJOINT` query. 
   
   Can we maybe update the description to more clearly identify what failing 
query relations are being fixed by the change? 
   






[GitHub] [lucene] dweiss commented on a diff in pull request #811: Add some basic tasks to help/workflow

2022-04-14 Thread GitBox


dweiss commented on code in PR #811:
URL: https://github.com/apache/lucene/pull/811#discussion_r850758274


##
help/workflow.txt:
##
@@ -25,11 +25,22 @@ Assemble a single module's JAR (here for lucene-core):
 gradlew -p lucene/core assemble
 ls lucene/core/build/libs
 
+Assemble all JARs:
+gradlew assemble
+
 Create all distributable packages, POMs, etc. and create a
 local maven repository for inspection:
 gradlew mavenLocal
 ls -R build/maven-local/
 
+Assemble Javdocs on a module:

Review Comment:
   ok, thanks.






[GitHub] [lucene] dweiss commented on a diff in pull request #811: Add some basic tasks to help/workflow

2022-04-14 Thread GitBox


dweiss commented on code in PR #811:
URL: https://github.com/apache/lucene/pull/811#discussion_r850758533


##
help/workflow.txt:
##
@@ -25,11 +25,22 @@ Assemble a single module's JAR (here for lucene-core):
 gradlew -p lucene/core assemble
 ls lucene/core/build/libs
 
+Assemble all JARs:

Review Comment:
   LGTM!






[GitHub] [lucene] dweiss commented on pull request #523: LUCENE-10293: adds a stand-alone prepareSonatypeBundle that compiles …

2022-04-14 Thread GitBox


dweiss commented on PR #523:
URL: https://github.com/apache/lucene/pull/523#issuecomment-1099571094

   It is strange because I remember verifying that it worked... I'll take 
another look at this once I'm back from Easter break.





[GitHub] [lucene] jtibshirani commented on pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-04-14 Thread GitBox


jtibshirani commented on PR #792:
URL: https://github.com/apache/lucene/pull/792#issuecomment-1099651418

   > I worry that off-heap will cause latency compared with loading all to 
memory.
   
   Doesn't your PR already move the ordToDoc data structure off-heap? 
`DirectMonotonicReader` reads directly from the index input 
(`RandomAccessIndexInput`). My suggestion is to follow the pattern in 
`Lucene90DocValuesProducer#getNumeric`. We would load 
`DirectMonotonicReader.Meta` when creating the `FieldEntry`, and then create a 
new `DirectMonotonicReader` when the `search` or `getVectorValues` methods are 
called.
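   
   The pattern described above can be sketched without any Lucene types. In this minimal sketch every name (`OrdToDocMeta`, `OrdToDocReader`, `FieldEntrySketch`) is a hypothetical stand-in: the small metadata object plays the role of `DirectMonotonicReader.Meta` and is read when the field entry is created, while the reader over the data is created only when a search-like call happens. A `long[]` stands in for the data file.

```java
class LazyOrdToDocSketch {
  /** Stand-in for the small metadata that is read when the format is opened. */
  static final class OrdToDocMeta {
    final int numValues;

    OrdToDocMeta(int numValues) {
      this.numValues = numValues;
    }
  }

  /** Stand-in for a reader over the data file; cheap to create, reads on demand. */
  static final class OrdToDocReader {
    private final long[] data;

    OrdToDocReader(OrdToDocMeta meta, long[] data) {
      if (data.length != meta.numValues) {
        throw new IllegalArgumentException("meta/data mismatch");
      }
      this.data = data;
    }

    long get(int ord) {
      return data[ord];
    }
  }

  /** Holds only metadata; never touches the data at construction time. */
  static final class FieldEntrySketch {
    final OrdToDocMeta meta;
    int readersCreated = 0;

    FieldEntrySketch(OrdToDocMeta meta) {
      this.meta = meta;
    }

    // Called from search-like code paths: a fresh reader per call.
    OrdToDocReader newReader(long[] dataFile) {
      readersCreated++;
      return new OrdToDocReader(meta, dataFile);
    }
  }

  public static void main(String[] args) {
    long[] dataFile = {3, 7, 42};
    FieldEntrySketch entry = new FieldEntrySketch(new OrdToDocMeta(dataFile.length));
    System.out.println(entry.readersCreated);             // 0: nothing read at open time
    System.out.println(entry.newReader(dataFile).get(2)); // 42: ord 2 maps to doc 42
  }
}
```

   The sketch only shows where each step happens; it says nothing about the latency of the real off-heap reader.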





[GitHub] [lucene] gautamworah96 commented on pull request #811: Add some basic tasks to help/workflow

2022-04-14 Thread GitBox


gautamworah96 commented on PR #811:
URL: https://github.com/apache/lucene/pull/811#issuecomment-1099803914

   I don't have any comments on this PR but a general +1 on the effort. I've 
sometimes found myself stumbling around the package searching for gradle 
commands that do what I want to do. Having a single file with some ready to use 
commands sounds perfect. I would also give a +1 for directly linking it from 
CONTRIBUTING.md. The easier we make it for contributors the better!
   
   I sometimes also use the `-Ptests.iters=` param for beasting 
out multiple runs of a single test to catch easy edge cases that I might have 
missed (this was another trick that I just stumbled upon through JIRA). Maybe 
we could add this to the workflow file as well?





[jira] [Created] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)
kkewwei created LUCENE-10516:


 Summary: reduce unnecessary loop matches in BKDReader
 Key: LUCENE-10516
 URL: https://issues.apache.org/jira/browse/LUCENE-10516
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/other
Reporter: kkewwei


In `BKDReader.visitSparseRawDocValues()`, we read a batch of docIds that all 
share the same point value (scratchPackedValue), then call 
`visitor.visit(scratchIterator, scratchPackedValue)` to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
We know that packedValue is the same for every doc, so if the first doc 
matches the range, the rest of the batch matches too. Checking the value again 
for each doc in the loop is therefore redundant.

We should implement the method as follows:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196
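
To make the saving concrete, here is a small self-contained sketch. It does not use Lucene's visitor interface: `BatchVisitSketch`, `RangeVisitor`, and the `int[]` batch are hypothetical stand-ins, and the range check compares a single byte for simplicity. It counts how often the range check runs for a batch of docIDs that share one value.

```java
import java.util.ArrayList;
import java.util.List;

class BatchVisitSketch {
  /** Hypothetical visitor that collects docIDs whose value lies in [min, max]. */
  static final class RangeVisitor {
    final int min;
    final int max;
    final List<Integer> hits = new ArrayList<>();
    int matchCalls = 0; // how many times the range check ran

    RangeVisitor(int min, int max) {
      this.min = min;
      this.max = max;
    }

    boolean matches(byte[] packedValue) {
      matchCalls++;
      return packedValue[0] >= min && packedValue[0] <= max;
    }

    void visit(int docID) {
      hits.add(docID);
    }

    // Per-document check, like the default implementation quoted above.
    void visitPerDoc(int[] docIDs, byte[] packedValue) {
      for (int docID : docIDs) {
        if (matches(packedValue)) {
          visit(docID);
        }
      }
    }

    // Single check for the whole batch, like the proposed override.
    void visitBatch(int[] docIDs, byte[] packedValue) {
      if (matches(packedValue)) {
        for (int docID : docIDs) {
          visit(docID);
        }
      }
    }
  }

  public static void main(String[] args) {
    int[] docIDs = {1, 5, 9, 12};
    byte[] value = {7};

    RangeVisitor perDoc = new RangeVisitor(0, 10);
    perDoc.visitPerDoc(docIDs, value);
    RangeVisitor batch = new RangeVisitor(0, 10);
    batch.visitBatch(docIDs, value);

    System.out.println(perDoc.matchCalls + " vs " + batch.matchCalls); // 4 vs 1
    System.out.println(perDoc.hits.equals(batch.hits));                // true
  }
}
```

Both variants collect the same docIDs; only the number of range checks differs, and the gap grows with the batch size.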







[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Description: 
In *BKDReader.visitSparseRawDocValues()*, we read a batch of docIds that all 
share the same point value (scratchPackedValue), then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
We know that packedValue is the same for every doc, so if the first doc 
matches the range, the rest of the batch matches too. Checking the value again 
for each doc in the loop is therefore redundant.

We should implement the method as follows:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196



> reduce unnecessary loop matches in BKDReader
> 
>
> Key: LUCENE-10516
> URL: https://issues.apache.org/jira/browse/LUCENE-10516
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: kkewwei
>Priority: Major
>
> In *BKDReader.visitSparseRawDocValues()*, we will read a batch of docIds 
> which have the same point value:scratchPackedValue, then call 
> *visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs 
> match the range.
> {code:java}
> default void visit(DocIdSetIterator iterator, byte[] packedValue) throws 
> IOException {
>   int docID;
>   while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) { 
> visit(docID, packedValue); 
>   }
> }
> {code}
> we know that the packedValue are same, if the first doc match the range, the 
> batch of docIds will also match the range, so the loop is useless.
> we should call the method as follow:
> {code:java}
>   public void visit(DocIdSetIterator iterator, byte[] packedValue) 
> throws IOException {
> if (matches(packedValue)) {
>   int docID;
>   while ((docID = iterator.nextDoc()) != 
> DocIdSetIterator.NO_MORE_DOCS) {
> visit(docID);
>   }
> }
>   }
> {code}
> https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196






[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Description: 
In *BKDReader.visitSparseRawDocValues()*, we read a batch of docIds that all 
share the same point value (scratchPackedValue), then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
We know that packedValue is the same for the whole batch of docIds, so if the 
first doc matches the range, the other docIds in the batch match too. Checking 
the value again for each doc in the loop is therefore redundant.

We should implement the method as follows:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196



> reduce unnecessary loop matches in BKDReader
> 
>
> Key: LUCENE-10516
> URL: https://issues.apache.org/jira/browse/LUCENE-10516
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: kkewwei
>Priority: Major
>
> In *BKDReader.visitSparseRawDocValues()*, we will read a batch of docIds 
> which have the same point value:scratchPackedValue, then call 
> *visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs 
> match the range.
> {code:java}
> default void visit(DocIdSetIterator iterator, byte[] packedValue) throws 
> IOException {
>   int docID;
>   while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) { 
> visit(docID, packedValue); 
>   }
> }
> {code}
> we know that the packedValue are same for the batch of docIds, if the first 
> doc match the range, the batch of other docIds will also match the range, so 
> the loop is useless.
> we should call the method as follow:
> {code:java}
>   public void visit(DocIdSetIterator iterator, byte[] packedValue) 
> throws IOException {
> if (matches(packedValue)) {
>   int docID;
>   while ((docID = iterator.nextDoc()) != 
> DocIdSetIterator.NO_MORE_DOCS) {
> visit(docID);
>   }
> }
>   }
> {code}
> https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196






[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Description: 
In *BKDReader.visitSparseRawDocValues()*, we read a batch of docIds that all 
share the same point value (scratchPackedValue), then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
We know that packedValue is the same for the whole batch of docIds, so if the 
first doc matches the range, the other docIds in the batch match too. Checking 
the value again for each doc in the loop is therefore redundant.

We should implement the method as follows:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196

We should also consider overriding *visit(DocIdSetIterator iterator, byte[] 
packedValue)* in *ExitableDirectoryReader$ExitableIntersectVisitor* to avoid 
falling back to the default implementation:
{code:java}
@Override
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  queryCancellation.checkCancelled();
  in.visit(iterator, packedValue);
}
{code}




> reduce unnecessary loop matches in BKDReader
> 
>
> Key: LUCENE-10516
> URL: https://issues.apache.org/jira/browse/LUCENE-10516
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: kkewwei
>Priority: Major
>

[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Description: 
In *BKDReader.visitSparseRawDocValues()*, we read a batch of docIDs that all 
share the same point value, scratchPackedValue, and then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  // Calls visit(docID, packedValue) once per docID, although packedValue never changes.
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
Since packedValue is the same for every docID in the batch, if the first doc 
matches the range then all the other docs in the batch match as well, so 
re-checking the value inside the loop is unnecessary.

The method should instead look like this:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  // Check the shared value once, then collect every docID in the batch.
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196

We may also need to override *visit(DocIdSetIterator iterator, byte[] 
packedValue)* in *ExitableDirectoryReader$ExitableIntersectVisitor* to avoid 
calling the default implementation:
{code:java}
@Override
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  // Check for cancellation once per batch, then delegate the whole batch.
  queryCancellation.checkCancelled();
  in.visit(iterator, packedValue);
}
{code}
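
The reason a wrapping visitor matters can be shown with another small, self-contained Java sketch. Again, nothing here is the Lucene API: Visitor, BulkAwareVisitor, CheckingVisitor, the overrideBulk flag and the counters are hypothetical names for illustration, and the counter in CheckingVisitor merely stands in for a cancellation check. If the wrapper does not override the bulk method, the interface default unrolls the batch into per-doc calls, so the delegate's own bulk method is never reached.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified stand-ins for illustration only; these are not the Lucene classes.
final class DelegatingVisitSketch {

  interface Visitor {
    void visit(int docID, byte[] packedValue);

    /** Default bulk method: falls back to one per-doc call for each docID. */
    default void visit(Iterator<Integer> docIDs, byte[] packedValue) {
      while (docIDs.hasNext()) {
        visit(docIDs.next(), packedValue);
      }
    }
  }

  /** Delegate with its own bulk method that checks the shared value once. */
  static class BulkAwareVisitor implements Visitor {
    final List<Integer> hits = new ArrayList<>();
    int matchCalls = 0;

    boolean matches(byte[] packedValue) {
      matchCalls++;
      return packedValue[0] > 0; // toy predicate
    }

    @Override
    public void visit(int docID, byte[] packedValue) {
      if (matches(packedValue)) {
        hits.add(docID);
      }
    }

    @Override
    public void visit(Iterator<Integer> docIDs, byte[] packedValue) {
      if (matches(packedValue)) {
        while (docIDs.hasNext()) {
          hits.add(docIDs.next());
        }
      }
    }
  }

  /** Wrapper that counts a check before delegating (stand-in for a cancellation check). */
  static class CheckingVisitor implements Visitor {
    final Visitor in;
    final boolean overrideBulk;
    int checks = 0;

    CheckingVisitor(Visitor in, boolean overrideBulk) {
      this.in = in;
      this.overrideBulk = overrideBulk;
    }

    @Override
    public void visit(int docID, byte[] packedValue) {
      checks++;
      in.visit(docID, packedValue);
    }

    @Override
    public void visit(Iterator<Integer> docIDs, byte[] packedValue) {
      if (!overrideBulk) {
        // Behaviour without an override: the default unrolls the batch per doc.
        Visitor.super.visit(docIDs, packedValue);
        return;
      }
      // Behaviour with an override: one check, then hand the whole batch to the delegate.
      checks++;
      in.visit(docIDs, packedValue);
    }
  }

  /** Returns {wrapper checks, delegate matches() calls, number of hits} for a batch of three. */
  static int[] run(boolean overrideBulk) {
    BulkAwareVisitor delegate = new BulkAwareVisitor();
    CheckingVisitor wrapper = new CheckingVisitor(delegate, overrideBulk);
    wrapper.visit(List.of(7, 9, 11).iterator(), new byte[] {1});
    return new int[] {wrapper.checks, delegate.matchCalls, delegate.hits.size()};
  }

  public static void main(String[] args) {
    System.out.println("no override: " + java.util.Arrays.toString(run(false)));
    System.out.println("override:    " + java.util.Arrays.toString(run(true)));
  }
}
```

In this sketch the delegate's single-check bulk path is only used when the wrapper forwards the iterator as a whole; otherwise both the wrapper's check and the delegate's match run once per docID.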








[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Affects Version/s: (was: 8.11.1)

> reduce unnecessary loop matches in BKDReader
> 
>
> Key: LUCENE-10516
> URL: https://issues.apache.org/jira/browse/LUCENE-10516
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/other
> Affects Versions: 8.6.2
> Reporter: kkewwei
> Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Affects Version/s: 8.11.1
   8.6.2

> reduce unnecessary loop matches in BKDReader
> 
>
> Key: LUCENE-10516
> URL: https://issues.apache.org/jira/browse/LUCENE-10516
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/other
> Affects Versions: 8.6.2, 8.11.1
> Reporter: kkewwei
> Priority: Major
>






[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Description: 
In *BKDReader.visitSparseRawDocValues()*, we read a batch of docIDs that all 
share the same point value, *scratchPackedValue*, and then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  // Calls visit(docID, packedValue) once per docID, although packedValue never changes.
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
Since packedValue is the same for every docID in the batch, if the first doc 
matches the range then all the other docs in the batch match as well, so 
re-checking the value inside the loop is unnecessary.

The method should instead look like this:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  // Check the shared value once, then collect every docID in the batch.
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196

We may also need to override *visit(DocIdSetIterator iterator, byte[] 
packedValue)* in *ExitableDirectoryReader$ExitableIntersectVisitor* to avoid 
calling the default implementation:
{code:java}
@Override
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  // Check for cancellation once per batch, then delegate the whole batch.
  queryCancellation.checkCancelled();
  in.visit(iterator, packedValue);
}
{code}








[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Description: 
In *BKDReader.visitSparseRawDocValues()*, we read a batch of docIDs that all 
share the same point value, *scratchPackedValue*, and then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  // Calls visit(docID, packedValue) once per docID, although packedValue never changes.
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
Since packedValue is the same for every docID in the batch, if the first doc 
matches the range then all the other docs in the batch match as well, so 
re-checking the value inside the loop seems unnecessary.

The method should instead look like this:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  // Check the shared value once, then collect every docID in the batch.
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196

We may also need to override *visit(DocIdSetIterator iterator, byte[] 
packedValue)* in *ExitableDirectoryReader$ExitableIntersectVisitor* to avoid 
calling the default implementation:
{code:java}
@Override
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  // Check for cancellation once per batch, then delegate the whole batch.
  queryCancellation.checkCancelled();
  in.visit(iterator, packedValue);
}
{code}








[GitHub] [lucene] LuXugang commented on pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-04-14 Thread GitBox


LuXugang commented on PR #792:
URL: https://github.com/apache/lucene/pull/792#issuecomment-1099848115

   > We would load DirectMonotonicReader.Meta when creating the FieldEntry, and
   > then create a new DirectMonotonicReader when the search or getVectorValues
   > methods are called.
   
   Thanks @jtibshirani, addressed in
https://github.com/apache/lucene/pull/792/commits/8039e9e9300cd831b9de36dd219d571467579b31

