[ 
https://issues.apache.org/jira/browse/LUCENE-10113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17417659#comment-17417659
 ] 

ASF subversion and git services commented on LUCENE-10113:
----------------------------------------------------------

Commit c57d6e5f8c8a8cae943f99ba7393550579666eb4 in lucene's branch 
refs/heads/main from Uwe Schindler
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=c57d6e5 ]

LUCENE-10113: Use VarHandles to access int/long/short types in byte arrays 
(e.g. ByteArrayDataInput) (#308)

Co-authored-by: Robert Muir <rm...@apache.org>

> Use VarHandles to access int/long/short types in byte arrays (e.g. 
> ByteArrayDataInput)
> --------------------------------------------------------------------------------------
>
>                 Key: LUCENE-10113
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10113
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/store
>    Affects Versions: main (9.0)
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>            Priority: Major
>             Fix For: main (9.0)
>
>          Time Spent: 5h
>  Remaining Estimate: 0h
>
> LUCENE-10112 reminded me of something I wanted to do long ago: Basically, 
> for all IndexInputs/DataInputs we are able to natively read short, int, long 
> using little endian with single CPU instructions (due to using ByteBuffer's 
> methods that support primitive reads). Only ByteArrayDataInput still uses 
> manual code based on the inherited byte-by-byte approach: it reads single 
> bytes and combines them using little endian.
> The approach here is to use Java 9+ VarHandles to allow reading 
> int/long/short as single CPU instructions instead of manually recombining the 
> bytes. The trick is to make a "view" var handle which allows accessing the 
> byte array using the same mechanisms as ByteBuffers or JDK 17 MemorySegments 
> (under the hood it uses Unsafe to use CPU instructions and optionally swap 
> bytes if platform endianness is BE).
> In LUCENE-10112 something similar was done with LZ4, and a microbenchmark 
> was written that showed a significant speed improvement when accessing the 
> types with VarHandle.
> P.S.: The same applies to FST.BytesReader and/or ByteSliceReader, but I am not 
> sure if those use the int/short/long ones at all. At least this one does not 
> override the methods to read ints, longs and shorts, so there is no 
> optimization at all. FST seems to read bytes and byte[] only and 
> ByteSliceReader mostly VInts.
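
The "view" var handle technique described above can be sketched as follows. This is a minimal illustration, not the code from the commit: the class name ByteArrayViewSketch and its method names are hypothetical, and only the JDK call MethodHandles.byteArrayViewVarHandle (Java 9+) is taken as given. It contrasts the manual byte-by-byte little-endian combination with a single read through the var handle.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Hypothetical sketch class; not part of Lucene.
public class ByteArrayViewSketch {

  // View var handles that interpret a byte[] as little-endian short/int/long.
  // The index passed to get() is a byte offset into the array.
  static final VarHandle VH_LE_SHORT =
      MethodHandles.byteArrayViewVarHandle(short[].class, ByteOrder.LITTLE_ENDIAN);
  static final VarHandle VH_LE_INT =
      MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);
  static final VarHandle VH_LE_LONG =
      MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);

  // Manual approach: read four single bytes and combine them as little endian.
  static int readIntManual(byte[] b, int pos) {
    return (b[pos] & 0xFF)
        | ((b[pos + 1] & 0xFF) << 8)
        | ((b[pos + 2] & 0xFF) << 16)
        | ((b[pos + 3] & 0xFF) << 24);
  }

  // VarHandle approach: one plain get per value; the JIT can compile this
  // to a single load (plus a byte swap on big-endian platforms).
  static short readShort(byte[] b, int pos) {
    return (short) VH_LE_SHORT.get(b, pos);
  }

  static int readInt(byte[] b, int pos) {
    return (int) VH_LE_INT.get(b, pos);
  }

  static long readLong(byte[] b, int pos) {
    return (long) VH_LE_LONG.get(b, pos);
  }

  public static void main(String[] args) {
    byte[] data = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09};
    // Both approaches should agree, including at an unaligned offset.
    System.out.println(readIntManual(data, 1) == readInt(data, 1));
    System.out.println(Long.toHexString(readLong(data, 0)));
  }
}
```

Plain get and set through a byte array view var handle work at any byte offset, so no alignment handling is needed for simple reads like these; out-of-range offsets throw IndexOutOfBoundsException.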



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

