[ https://issues.apache.org/jira/browse/LUCENE-10405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487895#comment-17487895 ]
ASF subversion and git services commented on LUCENE-10405:
----------------------------------------------------------

Commit 4c578017af84672edc45f4ebf7a411774a85d9bf in lucene's branch refs/heads/main from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=4c57801 ]

LUCENE-10405: binary and Sorted doc values are stored as BytesRef instead of BytesRefHash in memory index (#647)

When using the MemoryIndex, binary and Sorted doc values are stored as BytesRef instead of BytesRefHash so they don't have a limit on size.

> MemoryIndex: Binary and Sorted doc values should not be added to a BytesRefHash
> -------------------------------------------------------------------------------
>
>                 Key: LUCENE-10405
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10405
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Ignacio Vera
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, when we add a binary or sorted doc value in the MemoryIndex, it
> is stored in a BytesRefHash. This is not necessary: we only expect one doc
> value per document, so the values do not need to be deduplicated.
> In addition, a BytesRefHash has a limit on term size (~32 KB) which those doc
> values don't have in normal codecs. Because this is a different behaviour,
> it can be considered a bug. We should store those doc values as a plain
> byte[] (or BytesRef).
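
The storage trade-off described in the issue can be illustrated with a small, standalone sketch. It does not use Lucene: DedupingStore and PlainSlot are hypothetical stand-ins for a hash-based, length-limited term store and a plain byte[] slot, and MAX_HASHED_LENGTH is an illustrative cap chosen only to mirror the "~32 KB" figure quoted above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class DocValueStorageSketch {

    // Illustrative cap only; picked to resemble the "~32 KB" limit from the issue.
    static final int MAX_HASHED_LENGTH = 32 * 1024 - 2;

    /** Hypothetical stand-in for a deduplicating store with a per-value size limit. */
    static final class DedupingStore {
        private final Map<String, Integer> ids = new HashMap<>();

        /** Returns the id of the value, reusing the id of an equal value added earlier. */
        int add(byte[] value) {
            if (value.length > MAX_HASHED_LENGTH) {
                throw new IllegalArgumentException("value too large: " + value.length + " bytes");
            }
            // ISO_8859_1 maps every byte to exactly one char, so the key is lossless.
            String key = new String(value, StandardCharsets.ISO_8859_1);
            Integer existing = ids.get(key);
            if (existing != null) {
                return existing;
            }
            int id = ids.size();
            ids.put(key, id);
            return id;
        }
    }

    /** Hypothetical stand-in for a single doc value kept as a plain byte[] copy. */
    static final class PlainSlot {
        private byte[] value;

        void set(byte[] v) {
            value = Arrays.copyOf(v, v.length); // no dedup, no size cap
        }

        byte[] get() {
            return value;
        }
    }

    public static void main(String[] args) {
        byte[] large = new byte[40_000]; // bigger than the illustrative cap

        PlainSlot slot = new PlainSlot();
        slot.set(large);
        System.out.println("plain slot holds " + slot.get().length + " bytes");

        try {
            new DedupingStore().add(large);
        } catch (IllegalArgumentException e) {
            System.out.println("hash-style store rejected: " + e.getMessage());
        }
    }
}
```

With one value per document, deduplication has nothing to merge, so the plain slot gives up nothing and avoids the size cap.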