[ https://issues.apache.org/jira/browse/LUCENE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17018237#comment-17018237 ]
ASF subversion and git services commented on LUCENE-9053:
---------------------------------------------------------
Commit 8147e491ce3905bb3543f2c7e34a4ecb60382b49 in lucene-solr's branch
refs/heads/master from Michael McCandless
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8147e49 ]
LUCENE-9053: improve FST's package-info.java comment to clarify required
(Unicode code point) sort order for FST.Builder
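
To illustrate the sort order the commit message refers to, here is a minimal JDK-only sketch (no Lucene dependency) that compares strings by Unicode code point rather than by UTF-16 code unit. The class name CodePointOrder and the helper sortedByCodePoint are hypothetical names chosen for this example; they are not part of Lucene. Code point order coincides with unsigned UTF-8 byte order, which is the order the builder sees when keys are added as UTF-8 bytes.

```java
import java.util.Arrays;
import java.util.Comparator;

public class CodePointOrder {
    // Compares two strings by Unicode code point, not by UTF-16 code unit.
    static final Comparator<String> BY_CODE_POINT = (a, b) -> {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                return Integer.compare(ca, cb);
            }
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // One string is a prefix of the other: the shorter one sorts first.
        return Integer.compare(a.length() - i, b.length() - j);
    };

    // Hypothetical helper: returns a copy of the keys in code point order.
    static String[] sortedByCodePoint(String... keys) {
        String[] copy = keys.clone();
        Arrays.sort(copy, BY_CODE_POINT);
        return copy;
    }

    public static void main(String[] args) {
        String mathA = "\uD835\uDC34"; // U+1D434, a supplementary character
        String figure = "\uFB01gure";  // starts with U+FB01, the fi ligature

        // UTF-16 order: the high surrogate 0xD835 is below 0xFB01.
        System.out.println(mathA.compareTo(figure) < 0);              // true
        // Code point order: 0x1D434 is above 0xFB01.
        System.out.println(BY_CODE_POINT.compare(mathA, figure) > 0); // true
    }
}
```

The two orders disagree only when a supplementary character (encoded as a surrogate pair in UTF-16) is compared against a BMP character in the range U+E000 to U+FFFF, which is the situation in this issue.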
> java.lang.AssertionError: inputs are added out of order lastInput=[f0 9d 9c
> 8b] vs input=[ef ac 81 67 75 72 65]
> ---------------------------------------------------------------------------------------------------------------
>
> Key: LUCENE-9053
> URL: https://issues.apache.org/jira/browse/LUCENE-9053
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: gitesh
> Priority: Minor
>
> Even if the inputs are sorted in Unicode order, I get the following exception
> while creating an FST:
>
> {code:java}
> // Input values (keys). These must be provided to Builder in Unicode sorted order!
> String inputValues[] = {"𝐴", "figure", "flagship"};
> long outputValues[] = {5, 7, 12};
> PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
> Builder<Long> builder = new Builder<Long>(FST.INPUT_TYPE.BYTE1, outputs);
> BytesRefBuilder scratchBytes = new BytesRefBuilder();
> IntsRefBuilder scratchInts = new IntsRefBuilder();
> for (int i = 0; i < inputValues.length; i++) {
>   scratchBytes.copyChars(inputValues[i]);
>   builder.add(Util.toIntsRef(scratchBytes.get(), scratchInts), outputValues[i]);
> }
> FST<Long> fst = builder.finish();
> Long value = Util.get(fst, new BytesRef("figure"));
> System.out.println(value);
> {code}
> Please note that figure and flagship are using the ligature character fl
> above.
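
For readers decoding the assertion message in the issue title, the following JDK-only sketch prints the UTF-8 bytes of two keys and compares them as unsigned bytes. The class and method names (Utf8OrderCheck, utf8Hex, compareUtf8) are hypothetical and only serve this illustration. It assumes the bytes in the message correspond to U+1D70B and to "figure" written with the U+FB01 ligature, which is what those byte sequences decode to.

```java
import java.nio.charset.StandardCharsets;

public class Utf8OrderCheck {
    // Formats the UTF-8 encoding of s in the style of the assertion message.
    static String utf8Hex(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", bytes[i] & 0xff));
        }
        return sb.append(']').toString();
    }

    // Compares the UTF-8 encodings of a and b byte by byte, as unsigned values.
    static int compareUtf8(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int diff = (x[i] & 0xff) - (y[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        String lastInput = new String(Character.toChars(0x1D70B));
        String input = "\uFB01gure";

        System.out.println(utf8Hex(lastInput)); // [f0 9d 9c 8b]
        System.out.println(utf8Hex(input));     // [ef ac 81 67 75 72 65]
        // A positive result means lastInput sorts after input in byte order,
        // so adding them in this sequence is out of order.
        System.out.println(compareUtf8(lastInput, input) > 0); // true
    }
}
```

The first byte already decides the comparison: 0xf0 is greater than 0xef, so the key added first sorts after the key added second.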