rmuir commented on PR #14031:
URL: https://github.com/apache/lucene/pull/14031#issuecomment-2512878266

   Good here too. We can also save another 5 bytes with something like this. 
It seems to help a tiny bit for me according to JMH, too.
   
   Not sure if it makes the code harder or easier to read/maintain. I sorta 
like that today it is clear at a glance that there are no data dependencies. 
We could also move the i2/i3/i4 to the top of the loop to accomplish that if 
we wanted.
   
   ```
   --- a/lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java
   +++ b/lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java
   @@ -129,18 +129,21 @@ final class PanamaVectorUtilSupport implements VectorUtilSupport {
          acc1 = fma(va, vb, acc1);
    
          // two
   -      FloatVector vc = FloatVector.fromArray(FLOAT_SPECIES, a, i + floatSpeciesLength);
   -      FloatVector vd = FloatVector.fromArray(FLOAT_SPECIES, b, i + floatSpeciesLength);
   +      final int i2 = i + floatSpeciesLength;
   +      FloatVector vc = FloatVector.fromArray(FLOAT_SPECIES, a, i2);
   +      FloatVector vd = FloatVector.fromArray(FLOAT_SPECIES, b, i2);
          acc2 = fma(vc, vd, acc2);
    
          // three
   -      FloatVector ve = FloatVector.fromArray(FLOAT_SPECIES, a, i + 2 * floatSpeciesLength);
   -      FloatVector vf = FloatVector.fromArray(FLOAT_SPECIES, b, i + 2 * floatSpeciesLength);
   +      final int i3 = i2 + floatSpeciesLength;
   +      FloatVector ve = FloatVector.fromArray(FLOAT_SPECIES, a, i3);
   +      FloatVector vf = FloatVector.fromArray(FLOAT_SPECIES, b, i3);
          acc3 = fma(ve, vf, acc3);
    
          // four
   -      FloatVector vg = FloatVector.fromArray(FLOAT_SPECIES, a, i + 3 * floatSpeciesLength);
   -      FloatVector vh = FloatVector.fromArray(FLOAT_SPECIES, b, i + 3 * floatSpeciesLength);
   +      final int i4 = i3 + floatSpeciesLength;
   +      FloatVector vg = FloatVector.fromArray(FLOAT_SPECIES, a, i4);
   +      FloatVector vh = FloatVector.fromArray(FLOAT_SPECIES, b, i4);
          acc4 = fma(vg, vh, acc4);
        }
        // vector tail: less scalar computations for unaligned sizes, esp with big vector sizes
   ```
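
For comparison, here is a rough sketch of the "move i2/i3/i4 to the top of the loop" variant mentioned above. It is a scalar stand-in, not the real Panama code: the class `HoistedOffsets`, the constant `LANES` (playing the role of `floatSpeciesLength`), and the helper `block` (playing the role of a `fromArray` + `fma` pair) are all made up for illustration. The only point is that the offsets are computed once up front, so the four accumulator updates below them visibly depend only on their own offset.

```java
final class HoistedOffsets {
  // Hypothetical stand-in for floatSpeciesLength; the real value depends on the CPU.
  static final int LANES = 4;

  static float dotProduct(float[] a, float[] b) {
    float acc1 = 0, acc2 = 0, acc3 = 0, acc4 = 0;
    int i = 0;
    // Largest index that is a multiple of the unrolled stride (4 blocks of LANES).
    final int unrolledBound = a.length - (a.length % (4 * LANES));
    for (; i < unrolledBound; i += 4 * LANES) {
      // All offsets are derived at the top of the loop body.
      final int i2 = i + LANES;
      final int i3 = i2 + LANES;
      final int i4 = i3 + LANES;

      // Each accumulator only reads its own offset.
      acc1 += block(a, b, i);
      acc2 += block(a, b, i2);
      acc3 += block(a, b, i3);
      acc4 += block(a, b, i4);
    }
    float res = acc1 + acc2 + acc3 + acc4;
    // Scalar tail for the elements that did not fit the unrolled loop.
    for (; i < a.length; i++) {
      res += a[i] * b[i];
    }
    return res;
  }

  // Scalar stand-in for one vector load pair plus multiply-add over LANES elements.
  private static float block(float[] a, float[] b, int offset) {
    float sum = 0;
    for (int lane = 0; lane < LANES; lane++) {
      sum += a[offset + lane] * b[offset + lane];
    }
    return sum;
  }
}
```

Whether this layout changes the byte count or the JMH numbers relative to the diff above is not something this sketch can show; that would need to be measured against the real method.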


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

