Re: [PR] Use growNoCopy when copying bytes in BytesRefBuilder [lucene]

2024-10-20 Thread via GitHub


iverase merged PR #13930:
URL: https://github.com/apache/lucene/pull/13930


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Speedup PriorityQueue a little [lucene]

2024-10-20 Thread via GitHub


dweiss commented on code in PR #13936:
URL: https://github.com/apache/lucene/pull/13936#discussion_r1808146179


##
lucene/core/src/java/org/apache/lucene/util/PriorityQueue.java:
##
@@ -270,7 +280,7 @@ public final boolean remove(T element) {
 return false;
   }
 
-  private final boolean upHeap(int origPos) {
+  private boolean upHeap(int origPos, T[] heap) {

Review Comment:
   > the compiler can't hoist the field access out of the loop I think (maybe 
in some happy cases escape analysis helps here).
   
   I don't think there's anything in the spec preventing it from doing so. The 
final keyword is indeed for the java compiler, not for the jvm, but... you know 
- it's easy to show that c2 can happily hoist out field reads, try it.
   ```
   public final class SuperSoft {
 private static boolean ready;
   
 public static void startThread() {
   new Thread() {
   public void run() {
   try {
   sleep(2000);
   } catch (Exception e) { /* ignore */ }
   System.out.println("Marking loop exit.");
   ready = true;
   }
   }.start();
 }
   
 public static void main(String[] args) {
   startThread();
   System.out.println("Entering the loop...");
   while (!ready) {
   // Do nothing.
   }
   System.out.println("Done, I left the loop!");
 }
   }
   ```
   
   This aside, I am not rejecting the change - I just suggested to rename one 
local variable (s) and to remove method parameter in favor of a single local 
variable read - this should result in identical code to what your 1% gain was 
producing, if my gut feeling is right.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Speedup OrderIntervalsSource some more [lucene]

2024-10-20 Thread via GitHub


jpountz commented on code in PR #13937:
URL: https://github.com/apache/lucene/pull/13937#discussion_r1808166752


##
lucene/queries/src/java/org/apache/lucene/queries/intervals/OrderedIntervalsSource.java:
##
@@ -161,8 +163,8 @@ public int nextInterval() throws IOException {
 final int end = last.end();
 this.end = end;
 int slop = end - start + 1;
-for (IntervalIterator subIterator : subIterators) {
-  slop -= subIterator.width();
+for (int j = 0; j < subIterators.size(); j++) {

Review Comment:
   Can you add a small code comment, so that someone doesn't change it back to 
a for..in loop in the future?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Speedup PriorityQueue a little [lucene]

2024-10-20 Thread via GitHub


original-brownbear commented on code in PR #13936:
URL: https://github.com/apache/lucene/pull/13936#discussion_r1807955670


##
lucene/core/src/java/org/apache/lucene/util/PriorityQueue.java:
##
@@ -270,7 +280,7 @@ public final boolean remove(T element) {
 return false;
   }
 
-  private final boolean upHeap(int origPos) {
+  private boolean upHeap(int origPos, T[] heap) {

Review Comment:
   Whether or not it's worth doing this kind of optimization for the observed 
gain is a tricky question. From the perspective of a user of a large (and 
read-heavy) ES, Opensearch or similar deployment, an O(1%) gain might translate 
into a lot of dollars saved and this kind of thing is well worth the effort.
   Personally, an extra 10 lines of code for the observed speedups seems like a 
reasonable deal, but that's admittedly quite subjective. Maybe a stronger 
argument would be: optimizing this kind of thing in core hot code removes 
potential bottlenecks from the system, enabling other optimizations. If the 
core logic puts massive pressure on e.g. the CPU cache then optimizations (or 
regressions!) in higher-level code are masked on CPUs with smaller caches. So 
doing a 1% optimization and living with slightly more complicated code makes 
more sense here than a 1% gain would in more "peripheral" code. Also, you could 
use that same angle and argue that this code hardly ever gets touched, so the 
maintenance burden added matters less than it would elsewhere.
   
   That said :) as far as the technical details go I don't think it can hoist 
out those reads and it's not an exclusively C2/hotstpot specific thing either. 
Since Java allows using reflection to update final field values (except for 
fields that are either static, on a record or on a hidden classs) the compiler 
can't hoist the field access out of the loop I think (maybe in some happy cases 
escape analysis helps here).
   You can make the JIT hoist these things via `-XX:+TrustFinalNonStaticFields` 
which gives me a result like (main vs main with that flag set).
   
   
   results
   
   TaskQPS baseline  StdDevQPS 
my_modified_version  StdDevPct diff p-value
   HighTermTitleBDVSort   25.72  (7.0%)   25.77  
(6.5%)0.2% ( -12% -   14%) 0.897
BrowseRandomLabelSSDVFacets3.37  (6.0%)3.40  
(4.2%)1.0% (  -8% -   11%) 0.409
  OrHighMed  214.06  (3.7%)  216.37  
(3.8%)1.1% (  -6% -8%) 0.206
  OrHighNotHigh  353.55  (8.4%)  358.52  
(8.9%)1.4% ( -14% -   20%) 0.475
AndHighHigh  111.32  (4.9%)  113.08  
(5.5%)1.6% (  -8% -   12%) 0.179
  OrNotHighHigh  567.88  (4.8%)  577.94  
(4.9%)1.8% (  -7% -   12%) 0.108
   PKLookup  241.21  (2.1%)  245.53  
(2.1%)1.8% (  -2% -6%) 0.000
   HighTerm  455.94  (6.6%)  464.35  
(7.5%)1.8% ( -11% -   17%) 0.250
MedTerm  590.06  (6.5%)  601.24  
(6.0%)1.9% (  -9% -   15%) 0.182
 AndHighMed  156.22  (3.1%)  159.19  
(2.9%)1.9% (  -3% -8%) 0.005
LowTerm  750.87  (4.6%)  765.45  
(4.2%)1.9% (  -6% -   11%) 0.052
BrowseRandomLabelTaxoFacets4.48  (8.6%)4.57  
(3.9%)2.0% (  -9% -   15%) 0.182
   OrNotHighMed  479.29  (4.6%)  489.00  
(5.4%)2.0% (  -7% -   12%) 0.074
  HighTermMonthSort 1515.68  (6.4%) 1546.97  
(7.0%)2.1% ( -10% -   16%) 0.171
 OrHighHigh   85.48  (4.6%)   87.32  
(5.3%)2.2% (  -7% -   12%) 0.055
   MedTermDayTaxoFacets   19.13  (3.0%)   19.55  
(4.1%)2.2% (  -4% -9%) 0.007
MedIntervalsOrdered   28.59  (6.3%)   29.23  
(4.7%)2.2% (  -8% -   14%) 0.079
  OrHighLow  610.70  (5.0%)  624.94  
(5.0%)2.3% (  -7% -   13%) 0.040
   OrHighNotMed  474.52  (5.5%)  485.78  
(5.7%)2.4% (  -8% -   14%) 0.061
 Fuzzy2   66.51  (3.2%)   68.09  
(3.0%)2.4% (  -3% -8%) 0.001
   BrowseDateSSDVFacets1.24  (7.7%)1.27  
(8.1%)2.4% ( -12% -   19%) 0.181
MedSpanNear  119.05  (4.4%)  121.94  
(4.4%)2.4% (  -6% -   11%) 0.016
  HighTermTitleSort   76.83  (4.8%)   78.72  
(3.7%)2.5% (  -5% -   11%) 0.011
   AndHighHighDayTaxoFacets   14.60  (3.8%)   14.96  
(3.5%)2.5% (  -4% -   10%) 0.003
  Brows

Re: [PR] Speedup PriorityQueue a little [lucene]

2024-10-20 Thread via GitHub


original-brownbear commented on code in PR #13936:
URL: https://github.com/apache/lucene/pull/13936#discussion_r1807873638


##
lucene/core/src/java/org/apache/lucene/util/PriorityQueue.java:
##
@@ -270,7 +280,7 @@ public final boolean remove(T element) {
 return false;
   }
 
-  private final boolean upHeap(int origPos) {
+  private boolean upHeap(int origPos, T[] heap) {

Review Comment:
   > although I think assigning to a local variable within the method would 
yield the same result
   
   Not quite, the idea was that I already have `heap` in a local in the caller, 
so if I pass it as an argument I save a field read and as an added bonus get a 
smaller method that inlines better.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Speedup PriorityQueue a little [lucene]

2024-10-20 Thread via GitHub


original-brownbear commented on code in PR #13936:
URL: https://github.com/apache/lucene/pull/13936#discussion_r1807876659


##
lucene/core/src/java/org/apache/lucene/util/PriorityQueue.java:
##
@@ -270,7 +280,7 @@ public final boolean remove(T element) {
 return false;
   }
 
-  private final boolean upHeap(int origPos) {
+  private boolean upHeap(int origPos, T[] heap) {

Review Comment:
   > >long-term maintenance as worth the tiny performance benefit
   
   With this class in particular I'm not sure the argument holds. Isn't the 
whole point of it the ability to mutate top and resort via `updateTop` as an 
optimization over the JDKs priority queue? If the implementation is slower than 
java.util.PriorityQueue, then what's the point? :) Also, I'm still not sure I 
agree with the "tiny" part :) 
   Granted there's limits to the benchmark data provided, but it's more likely 
than not that a couple things improved by 3%+ isn't it? Plus, I could see a 
possible compounding effect with further optimizations in the users of the PQ 
if those can be reduced in size enough to have `lessThan` inline and not be a 
megamorphic callsite here and there.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Speedup PriorityQueue a little [lucene]

2024-10-20 Thread via GitHub


original-brownbear commented on code in PR #13936:
URL: https://github.com/apache/lucene/pull/13936#discussion_r1807876827


##
lucene/core/src/java/org/apache/lucene/util/PriorityQueue.java:
##
@@ -117,26 +117,29 @@ public PriorityQueue(int maxSize, Supplier 
sentinelObjectSupplier) {
* ArrayIndexOutOfBoundsException} is thrown.
*/
   public void addAll(Collection elements) {
-if (this.size + elements.size() > this.maxSize) {
+int s = size;

Review Comment:
   Right that was a little weird sorry :) renamed now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Speedup PriorityQueue a little [lucene]

2024-10-20 Thread via GitHub


rmuir commented on PR #13936:
URL: https://github.com/apache/lucene/pull/13936#issuecomment-2424867914

   This results in a lot more code complexity, which makes maintenance 
difficult. Maybe the version of java you are testing with has a bug in its 
register allocator or something? 
   
   seriously? I think we should take a step back before making all of our code 
more complex for a 1% benefit which might just be an upstream compiler bug.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Speedup PriorityQueue a little [lucene]

2024-10-20 Thread via GitHub


original-brownbear commented on PR #13936:
URL: https://github.com/apache/lucene/pull/13936#issuecomment-2425018683

   I think the queue methods changed here  in isolation get a far bigger 
improvement than 1% in many cases. Plus making methods like the ones adjusted 
here smaller and easier on the CPU cache tends to help the performance of 
"neighboring" code as well in many case (hence the across the board speedup in 
the luceneutil run).
   
   I don't think this is the result of a JVM bug, it's just something that is 
hard to optimize by the compiler with Java so dynamic. It's a combination of 
two things. 
   
   1. Current JIT implementations don't really take advantage of `final` fields 
in normal objects https://openjdk.org/jeps/8132243, 
https://bugs.openjdk.org/browse/JDK-8058164 and so on. That's what makes 
caching stuff like size or the heap array so helpful.
   2. Field loads simple result in larger methods when measuring byte code size 
than caching in a local variable. Unless JIT implementations become more 
sophisticated (looking at e.g. 
https://mail.openjdk.org/pipermail/core-libs-dev/2023-July/109461.html it 
doesn't look like that's happening anytime soon), avoiding field access tends 
to result in deeper inlining here and there.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Speedup PriorityQueue a little [lucene]

2024-10-20 Thread via GitHub


dweiss commented on code in PR #13936:
URL: https://github.com/apache/lucene/pull/13936#discussion_r1807927256


##
lucene/core/src/java/org/apache/lucene/util/PriorityQueue.java:
##
@@ -270,7 +280,7 @@ public final boolean remove(T element) {
 return false;
   }
 
-  private final boolean upHeap(int origPos) {
+  private boolean upHeap(int origPos, T[] heap) {

Review Comment:
   > Not quite, the idea was that I already have heap in a local in the caller, 
so if I pass it as an argument I save a field read and as an added bonus get a 
smaller method that inlines better.
   
   I did understand the intention but I think the difference, if any, will be 
noticeable only if the loop doesn't hoist out the field read (which, I think it 
should?). My suggestion keeps the variables local, which helps in understanding 
of what it does. But anyway. I'm not entirely sold on these low-level 
optimizations that target c2/hotspot. There is so many moving parts here... 
operating system and CPU included. Eh.
   
   > Isn't the whole point of it the ability to mutate top and resort via 
updateTop as an optimization over the JDKs priority queue? If the 
implementation is slower than java.util.PriorityQueue, then what's the point? :)
   
   I believe the differences were also functional - insertWithOverflow is one 
particular example that comes to mind and would require more complex logic in 
the JDK's PQ. Another is resigning from one level of indirection (method 
instead of Comparator) - these choices predate a lot of newer Java's 
offerings - perhaps it could be implemented in a different way now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Speedup PriorityQueue a little [lucene]

2024-10-20 Thread via GitHub


dweiss commented on code in PR #13936:
URL: https://github.com/apache/lucene/pull/13936#discussion_r1807867567


##
lucene/core/src/java/org/apache/lucene/util/PriorityQueue.java:
##
@@ -270,7 +280,7 @@ public final boolean remove(T element) {
 return false;
   }
 
-  private final boolean upHeap(int origPos) {
+  private boolean upHeap(int origPos, T[] heap) {

Review Comment:
   I'd create a local heap variable (var heap = this.heap) locally in this 
method, not pass it as an argument. It is confusing why you'd want it as an 
argument. I agree with Robert here that we should perhaps see long-term 
maintenance as worth the tiny performance benefit (although I think assigning 
to a local variable within the method would yield the same result).
   



##
lucene/core/src/java/org/apache/lucene/util/PriorityQueue.java:
##
@@ -117,26 +117,29 @@ public PriorityQueue(int maxSize, Supplier 
sentinelObjectSupplier) {
* ArrayIndexOutOfBoundsException} is thrown.
*/
   public void addAll(Collection elements) {
-if (this.size + elements.size() > this.maxSize) {
+int s = size;

Review Comment:
   Can we at least rename "s" to "size" and use this.size as the right hand 
side of this assignment?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org