[ 
https://issues.apache.org/jira/browse/LUCENE-10534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529572#comment-17529572
 ] 

Kevin Risden edited comment on LUCENE-10534 at 4/29/22 5:15 PM:
----------------------------------------------------------------

Updated metrics: the new MaxFloatFunction logic, which avoids the duplicate 
exists() call, still shows a benefit even when combined with the new 
FloatFieldSource from LUCENE-10542. 

| Benchmark                                                       | Mode  | Cnt | Score and Error  | Units |
|-----------------------------------------------------------------|-------|-----|------------------|-------|
| MyBenchmark.testMaxFloatFunction                                | thrpt | 25  | 64.159  ±  2.031 | ops/s |
| MyBenchmark.testNewMaxFloatFunction                             | thrpt | 25  | 94.997  ±  2.365 | ops/s |
| MyBenchmark.testMaxFloatFunctionNewFloatFieldSource             | thrpt | 25  | 123.191 ±  9.291 | ops/s |
| MyBenchmark.testNewMaxFloatFunctionNewFloatFieldSource          | thrpt | 25  | 123.817 ±  6.191 | ops/s |
| MyBenchmark.testMaxFloatFunctionRareField                       | thrpt | 25  | 244.921 ±  6.439 | ops/s |
| MyBenchmark.testNewMaxFloatFunctionRareField                    | thrpt | 25  | 239.288 ±  5.136 | ops/s |
| MyBenchmark.testMaxFloatFunctionNewFloatFieldSourceRareField    | thrpt | 25  | 271.521 ±  3.870 | ops/s |
| MyBenchmark.testNewMaxFloatFunctionNewFloatFieldSourceRareField | thrpt | 25  | 279.334 ± 10.511 | ops/s |
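
To make the comparison in the table above easier to read, the new/old throughput ratios can be computed directly from the reported scores. This is only a small arithmetic sketch; the class and method names are made up for illustration and are not part of the benchmark code.

```java
public class ThroughputRatio {
    /** Ratio of candidate throughput to baseline throughput (both in ops/s). */
    static double ratio(double candidate, double baseline) {
        return candidate / baseline;
    }

    public static void main(String[] args) {
        // Scores copied from the updated table (new logic vs. old logic).
        String[] labels = {
            "MaxFloatFunction",
            "MaxFloatFunction + NewFloatFieldSource",
            "MaxFloatFunction, rare field",
            "MaxFloatFunction + NewFloatFieldSource, rare field"
        };
        double[] oldScores = {64.159, 123.191, 244.921, 271.521};
        double[] newScores = {94.997, 123.817, 239.288, 279.334};

        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%-52s %.3fx%n", labels[i], ratio(newScores[i], oldScores[i]));
        }
    }
}
```

The ratios ignore the ± error column, so pairs whose error ranges overlap should be read with that in mind.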


was (Author: risdenk):
Updated metrics: the new MaxFloatFunction logic, which avoids the duplicate 
exists() call, still shows a benefit even when combined with the new 
FloatFieldSource from LUCENE-10542. 

| Benchmark                                                       | Mode  | Cnt | Score and Error  | Units |
|-----------------------------------------------------------------|-------|-----|------------------|-------|
| MyBenchmark.testMaxFloatFunction                                | thrpt | 25  | 69.949 ± 4.043   | ops/s |
| MyBenchmark.testMaxFloatFunctionNewFloatFieldSource             | thrpt | 25  | 112.326 ± 3.228  | ops/s |
| MyBenchmark.testNewMaxFloatFunction                             | thrpt | 25  | 93.216 ± 2.757   | ops/s |
| MyBenchmark.testNewMaxFloatFunctionNewFloatFieldSource          | thrpt | 25  | 123.364 ± 7.861  | ops/s |
| MyBenchmark.testMaxFloatFunctionRareField                       | thrpt | 25  | 257.339 ± 33.849 | ops/s |
| MyBenchmark.testMaxFloatFunctionNewFloatFieldSourceRareField    | thrpt | 25  | 287.175 ± 22.840 | ops/s |
| MyBenchmark.testNewMaxFloatFunctionRareField                    | thrpt | 25  | 235.268 ± 4.103  | ops/s |
| MyBenchmark.testNewMaxFloatFunctionNewFloatFieldSourceRareField | thrpt | 25  | 272.397 ± 8.406  | ops/s |

> MinFloatFunction / MaxFloatFunction exists check can be slow
> ------------------------------------------------------------
>
>                 Key: LUCENE-10534
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10534
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Kevin Risden
>            Assignee: Kevin Risden
>            Priority: Minor
>         Attachments: flamegraph.png, flamegraph_getValueForDoc.png
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> MinFloatFunction 
> (https://github.com/apache/lucene/blob/main/lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/MinFloatFunction.java)
>  and MaxFloatFunction 
> (https://github.com/apache/lucene/blob/main/lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/MaxFloatFunction.java)
>  both check whether values exist. This is needed because the underlying 
> ValueSource returns 0.0f both for a valid value and for a document that has 
> no value.
> Even though the check was changed to anyExists and short-circuits once a 
> value is found, in the worst case no value is found and the check has to go 
> all the way through to the raw data. The check is only needed when 0.0f is 
> returned, to determine whether it is a valid value or the not-found case.
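
The idea in the description above (only consult exists() when the returned value is the ambiguous 0.0f) can be sketched as follows. This is a minimal, self-contained illustration and not the actual Lucene patch: `FloatSource` and `ArraySource` are hypothetical stand-ins for Lucene's per-document value classes, and the real MinFloatFunction / MaxFloatFunction code may be structured differently.

```java
public class LazyExistsMax {
    /** Hypothetical per-document float source (NOT a Lucene interface). */
    interface FloatSource {
        float floatVal(int doc); // returns 0.0f when the document has no value
        boolean exists(int doc); // assumed to be the expensive call
    }

    /** Array-backed source that counts exists() calls, for illustration only. */
    static class ArraySource implements FloatSource {
        final float[] values;
        final boolean[] present;
        int existsCalls = 0;

        ArraySource(float[] values, boolean[] present) {
            this.values = values;
            this.present = present;
        }

        public float floatVal(int doc) {
            return present[doc] ? values[doc] : 0.0f;
        }

        public boolean exists(int doc) {
            existsCalls++;
            return present[doc];
        }
    }

    /**
     * Max over all sources for one document. A non-zero value can only come
     * from a document that has a value, so exists() is consulted only when
     * floatVal() returned 0.0f. Returns 0.0f if no source has a value.
     */
    static float max(FloatSource[] sources, int doc) {
        boolean found = false;
        float best = Float.NEGATIVE_INFINITY;
        for (FloatSource s : sources) {
            float v = s.floatVal(doc);
            if (v != 0.0f || s.exists(doc)) {
                best = Math.max(best, v);
                found = true;
            }
        }
        return found ? best : 0.0f;
    }

    public static void main(String[] args) {
        ArraySource a = new ArraySource(new float[] {-2.0f}, new boolean[] {true});
        ArraySource b = new ArraySource(new float[] {0.0f}, new boolean[] {false});
        float result = max(new FloatSource[] {a, b}, 0);
        System.out.println(result + " (exists calls: a=" + a.existsCalls + ", b=" + b.existsCalls + ")");
    }
}
```

In this sketch the source that returns a non-zero value never pays for exists(); only the source returning 0.0f does, which is the case the description calls out as unavoidable.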



