uschindler commented on PR #12731:
URL: https://github.com/apache/lucene/pull/12731#issuecomment-1785453474

   > Last time i tried to figure out WTF was happening here, I think i 
determined that floating point reproducibility was still preventing this from 
happening? That there isn't like a "bail out" from this on the vector API, 
instead just some clever wording in the javadocs of `reduceLanes`
   > 
   > Which is really sad, how is the vector API supposed to be usable if 
everyone has to unroll their own loops in order to use 100% of the hardware 
instead of 25%.
   
   The float use case is problematic because the order of multiplications and 
sums changes the result: floating-point arithmetic is not associative. So you 
can't simply rewrite the code to run in parallel, as the result would be 
different. This is also the reason why the auto-vectorizer can't do anything 
here.
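
A minimal sketch of the ordering problem in plain Java, not taken from the PR. The class and method names are made up for illustration, and the values are chosen so that the rounding is easy to follow by hand:

```java
// Hypothetical demo class, not part of Lucene or the PR.
final class FloatOrderDemo {

    // Evaluates (a + b) + c.
    static float leftToRight(float a, float b, float c) {
        return (a + b) + c;
    }

    // Evaluates a + (b + c): same operands, different grouping.
    static float rightToLeft(float a, float b, float c) {
        return a + (b + c);
    }

    public static void main(String[] args) {
        float a = 1e8f, b = -1e8f, c = 1f;
        // a and b cancel exactly first, so c survives.
        System.out.println(leftToRight(a, b, c)); // prints 1.0
        // b + c rounds back to -1e8f (adjacent floats near 1e8 are 8 apart),
        // so c is lost before a is added.
        System.out.println(rightToLeft(a, b, c)); // prints 0.0
    }
}
```

A compiler that regrouped the additions on its own would silently change which of these two answers the program produces.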
   
   I think the Panama API should let the user query how many parallel units 
are available, so that work can be split dynamically to match the hardware.
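
A rough sketch of what "splitting work" could look like, written as scalar Java rather than against the Vector API. The `accumulators` parameter is a hypothetical stand-in for the number of parallel units; nothing here assumes the Panama API exposes such a value, and the class name is invented for illustration:

```java
// Hypothetical helper, not part of Lucene or the Vector API.
final class SplitDotProduct {

    // Dot product using 'accumulators' independent partial sums.
    // Independent sums have no data dependency on each other, which is
    // what lets hardware work on them in parallel.
    static float dot(float[] a, float[] b, int accumulators) {
        float[] acc = new float[accumulators];
        int bound = a.length - (a.length % accumulators);
        int i = 0;
        for (; i < bound; i += accumulators) {
            for (int k = 0; k < accumulators; k++) {
                acc[k] += a[i + k] * b[i + k];
            }
        }
        // Combine the partial sums, then handle the leftover tail.
        float sum = 0f;
        for (float partial : acc) {
            sum += partial;
        }
        for (; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] a = {1e8f, 1f, -1e8f, 1f};
        float[] b = {1f, 1f, 1f, 1f};
        System.out.println(dot(a, b, 1)); // prints 1.0
        System.out.println(dot(a, b, 2)); // prints 2.0
    }
}
```

The two calls disagree because the number of accumulators changes the summation order, so a count chosen at runtime from the hardware would make results depend on the machine.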


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

