[ 
https://issues.apache.org/jira/browse/LUCENE-9791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288339#comment-17288339
 ] 

Paweł Bugalski commented on LUCENE-9791:
----------------------------------------

This is what it would look like:
{code:java}
private boolean equals(int id, BytesRef b) {
  final int textStart = bytesStart[id];
  final byte[] bytes = pool.buffers[textStart >> BYTE_BLOCK_SHIFT];
  int pos = textStart & BYTE_BLOCK_MASK;
  final int length;
  final int offset;
  if ((bytes[pos] & 0x80) == 0) {
    // high bit clear: length is 1 byte
    length = bytes[pos];
    offset = pos + 1;
  } else {
    // high bit set: length is 2 bytes (low 7 bits first, then the next 8 bits)
    length = (bytes[pos] & 0x7f) + ((bytes[pos + 1] & 0xff) << 7);
    offset = pos + 2;
  }
  return Arrays.equals(
      bytes,
      offset,
      offset + length,
      b.bytes,
      b.offset,
      b.offset + b.length);
}
{code}

It works, but it is basically a merge of setBytesRef and BytesRef#bytesEquals, 
and the only performance benefit that I was expecting is not present: 
BytesRefHash does not allow terms to break across buffer boundaries, so neither 
this solution nor the previous one does any allocations. 
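
For illustration only, the length-prefix layout that the method above decodes can be sketched in a standalone form. This is a minimal sketch, not the BytesRefHash implementation: the class LengthPrefixSketch and its methods write and equalsAt are hypothetical names, a single flat byte[] stands in for the block pool, and the ranged Arrays.equals overload requires Java 9 or later.

```java
import java.util.Arrays;

// Hypothetical standalone sketch; a flat byte[] stands in for the block pool.
final class LengthPrefixSketch {

  // Writes a 1- or 2-byte length prefix followed by the term bytes.
  // Returns the position just past the written term.
  static int write(byte[] buf, int pos, byte[] term) {
    int len = term.length;
    if (len < 128) {
      // fits in 7 bits: single prefix byte with the high bit clear
      buf[pos++] = (byte) len;
    } else {
      // low 7 bits with the high bit set as a marker, then the next 8 bits
      buf[pos++] = (byte) (0x80 | (len & 0x7f));
      buf[pos++] = (byte) ((len >> 7) & 0xff);
    }
    System.arraycopy(term, 0, buf, pos, len);
    return pos + len;
  }

  // Compares the term stored at pos with the candidate in place.
  // Only local variables are used, so no shared scratch state is touched.
  static boolean equalsAt(byte[] buf, int pos, byte[] candidate) {
    final int length;
    final int offset;
    if ((buf[pos] & 0x80) == 0) {
      length = buf[pos];
      offset = pos + 1;
    } else {
      length = (buf[pos] & 0x7f) + ((buf[pos + 1] & 0xff) << 7);
      offset = pos + 2;
    }
    return Arrays.equals(buf, offset, offset + length, candidate, 0, candidate.length);
  }
}
```

A term shorter than 128 bytes takes one prefix byte, and a longer one (up to 32767 bytes in this sketch) takes two. The comparison reads straight from the backing array, so it does not allocate.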

> Monitor (aka Luwak) has concurrency issues related to BytesRefHash#find
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-9791
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9791
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/other
>    Affects Versions: master (9.0), 8.7, 8.8
>            Reporter: Paweł Bugalski
>            Priority: Major
>         Attachments: LUCENE-9791.patch
>
>
> _org.apache.lucene.monitor.Monitor_ can sometimes *NOT* match a document that 
> should be matched by one of the registered queries if match operations are run 
> concurrently from multiple threads. 
> This is because, in a concurrent environment, 
> _TermFilteredPresearcher_ might sometimes not select a query that could later 
> match one of the documents being matched.
> Internally _TermFilteredPresearcher_ is using a term acceptor: an instance of 
> _org.apache.lucene.monitor.QueryIndex.QueryTermFilter_. _QueryTermFilter_ is 
> correctly initialized under lock and its internal state (a map of 
> _org.apache.lucene.util.BytesRefHash_ instances) is correctly published. 
> Later on, when those instances are used concurrently, a problem with 
> _org.apache.lucene.util.BytesRefHash#find_ is triggered, since it is not 
> thread safe.
> _org.apache.lucene.util.BytesRefHash#find_ internally uses a private 
> _org.apache.lucene.util.BytesRefHash#equals_ method, which uses an 
> instance field _scratch1_ as a temporary buffer to compare its _BytesRef_ 
> parameter with the contents of _ByteBlockPool_. This is not thread safe and can 
> cause incorrect answers as well as _ArrayIndexOutOfBoundsException_. 
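
The hazard described in the issue can be sketched in isolation. The following is a hypothetical, simplified model and not the Lucene code: the class ScratchRaceSketch, its fields, and both methods are invented for illustration. equalsUnsafe keeps intermediate values in instance fields, as _scratch1_ is described to do, while equalsSafe keeps them in local variables.

```java
import java.util.Arrays;

// Hypothetical model of the shared-scratch problem; not the BytesRefHash code.
final class ScratchRaceSketch {
  private final byte[] pool;   // all terms concatenated
  private final int[] starts;  // start offset of each term in pool
  private final int[] lengths; // length of each term

  // Shared mutable scratch state: concurrent callers can overwrite each other.
  private int scratchOffset;
  private int scratchLength;

  ScratchRaceSketch(byte[]... terms) {
    int total = 0;
    for (byte[] t : terms) {
      total += t.length;
    }
    pool = new byte[total];
    starts = new int[terms.length];
    lengths = new int[terms.length];
    int pos = 0;
    for (int i = 0; i < terms.length; i++) {
      starts[i] = pos;
      lengths[i] = terms[i].length;
      System.arraycopy(terms[i], 0, pool, pos, terms[i].length);
      pos += terms[i].length;
    }
  }

  // NOT thread safe: another thread may change the scratch fields between the
  // writes and the comparison, which can yield a wrong answer or an
  // out-of-range slice (ArrayIndexOutOfBoundsException).
  boolean equalsUnsafe(int id, byte[] candidate) {
    scratchOffset = starts[id];
    scratchLength = lengths[id];
    return Arrays.equals(
        pool, scratchOffset, scratchOffset + scratchLength, candidate, 0, candidate.length);
  }

  // Thread safe for readers: all intermediate state lives on the caller's stack.
  boolean equalsSafe(int id, byte[] candidate) {
    final int offset = starts[id];
    final int length = lengths[id];
    return Arrays.equals(pool, offset, offset + length, candidate, 0, candidate.length);
  }
}
```

Both methods give the same answers when called from a single thread. Only equalsSafe stays correct when several threads query the same instance, provided the instance is not modified after construction.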



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
