[ 
https://issues.apache.org/jira/browse/LUCENE-10113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-10113.
------------------------------------
    Resolution: Fixed

Thanks [~rcmuir]!

> Use VarHandles to access int/long/short types in byte arrays (e.g. 
> ByteArrayDataInput)
> --------------------------------------------------------------------------------------
>
>                 Key: LUCENE-10113
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10113
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/store
>    Affects Versions: main (9.0)
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>            Priority: Major
>             Fix For: main (9.0)
>
>          Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> LUCENE-10112 reminded me about something i wanted to do long ago: Basically 
> for all IndexInputs/DataInputs we are able to natively read short, int, long 
> using little endian with single CPU instructions (due to using ByteBuffer's 
> methods that support primitive reads). Only ByteArrayDataInput still uses 
> manual code beased on the the inherited byte-by-byte approach to read single 
> bytes and combining the bytes using little endian.
> The approach here is to use Java 9+ VarHandles to allow reading 
> int/long/short as single cpu instructions and not manually recombining the 
> bytes. The trick is to make a "view" var handle which allows to access the 
> byte array using the same mechanisms as ByteBuffers or JDK 17 MemorySegments 
> (under the hood it uses Unsafe to use CPU instructions and optionally swap 
> bytes if platform endianness is BE).
> In LUCENE-10112 there were similar stuff done with LZ4 and a microbenchmark 
> was written that showed a significant speed improvement when accessing the 
> types with VarHandle.
> P.S.: The same applies to FST.BytesReader and/or ByteSliceReader, but I am no 
> sure if those use the int/short/long ones at all. At least this one does not 
> override the methods to read ints, longs and shorts, so there is no 
> optimization at all. FST seems to read bytes and byte[] only and 
> ByteSliceReader mostly VInts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to