[ 
https://issues.apache.org/jira/browse/LUCENE-10405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487895#comment-17487895
 ] 

ASF subversion and git services commented on LUCENE-10405:
----------------------------------------------------------

Commit 4c578017af84672edc45f4ebf7a411774a85d9bf in lucene's branch 
refs/heads/main from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=4c57801 ]

LUCENE-10405:  binary and Sorted doc values are stored as BytesRef instead of 
BytesRefHash in memory index (#647)

When using the MemoryIndex, binary and Sorted doc values are stored 
as BytesRef instead of BytesRefHash so they don't have a limit on size.

> MemoryIndex: Binary and Sorted doc values should not be added to a 
> BytesRefHash
> -------------------------------------------------------------------------------
>
>                 Key: LUCENE-10405
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10405
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Ignacio Vera
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently when we add a binary or sorted doc value in the MemoryIndex, it 
> will get stored in a BytesRefHash. This is not necessary as we only expect 
> one doc value per document so they don't need to be deduped.
> In addition, a BytesRefHash has a limit on term size (~32kb) which those doc 
> values don't have in normal codecs. Since this is a different behaviour, 
> it can be considered a bug. We should store those doc values as a plain 
> byte[] (or BytesRef). 
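
The difference described in the issue can be sketched in plain Java without
any Lucene dependency. This is an illustration only, not the code from the
commit: DedupPool and PlainValue are hypothetical stand-ins for a
BytesRefHash-style pool and plain byte[] storage, and MAX_TERM_LENGTH is an
assumed constant chosen to approximate the "~32kb" limit mentioned above.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DocValueStorageSketch {

    // Assumption: approximates the "~32kb" term-size limit from the issue text.
    static final int MAX_TERM_LENGTH = 32766;

    /** Hypothetical deduplicating pool with a per-term size limit. */
    static final class DedupPool {
        private final Map<ByteBuffer, Integer> ids = new HashMap<>();
        private final List<byte[]> terms = new ArrayList<>();

        /** Returns the id of the value, reusing the id of an equal value. */
        int add(byte[] value) {
            if (value.length > MAX_TERM_LENGTH) {
                throw new IllegalArgumentException(
                    "term too large: " + value.length + " bytes");
            }
            // ByteBuffer compares by content, so equal byte arrays share a key.
            ByteBuffer key = ByteBuffer.wrap(value.clone());
            Integer existing = ids.get(key);
            if (existing != null) {
                return existing;
            }
            int id = terms.size();
            terms.add(value.clone());
            ids.put(key, id);
            return id;
        }
    }

    /** Hypothetical plain holder: one value per document, no dedup, no limit. */
    static final class PlainValue {
        private byte[] value;

        void set(byte[] newValue) {
            value = newValue.clone();
        }

        byte[] get() {
            return value == null ? null : value.clone();
        }
    }

    public static void main(String[] args) {
        byte[] large = new byte[MAX_TERM_LENGTH + 1];

        // The pool rejects the oversized value.
        try {
            new DedupPool().add(large);
            System.out.println("pool accepted the value");
        } catch (IllegalArgumentException e) {
            System.out.println("pool rejected the value: " + e.getMessage());
        }

        // The plain holder stores it as-is.
        PlainValue plain = new PlainValue();
        plain.set(large);
        System.out.println("plain storage holds " + plain.get().length + " bytes");
    }
}
```

Since only one binary or sorted doc value is expected per document, the
deduplication done by the pool buys nothing here, while its size check is
what makes large values fail.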



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org