[ https://issues.apache.org/jira/browse/LUCENE-9130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020375#comment-17020375 ]
Michael McCandless commented on LUCENE-9130:
--------------------------------------------

{quote}from the document here, i presume the slop is only used for looking FORWARD, not looking AROUND
{quote}
Hmm, why is that? I see this reference in that javadoc:
{quote}, and at a maximum edit distance of {{slop}}.
{quote}
and:
{quote}The slop is an edit distance between respective positions of terms as defined in this [{{PhraseQuery}}|https://lucene.apache.org/core/8_4_0/core/org/apache/lucene/search/PhraseQuery.html] and the positions of terms in a document.

For instance, when searching for {{"quick fox"}}, it is expected that the difference between the positions of {{fox}} and {{quick}} is 1. So {{"a quick brown fox"}} would be at an edit distance of 1 since the difference of the positions of {{fox}} and {{quick}} is 2. Similarly, {{"the fox is quick"}} would be at an edit distance of 3 since the difference of the positions of {{fox}} and {{quick}} is -2. The slop defines the maximum edit distance for a document to match.

More exact matches are scored higher than sloppier matches, thus search results are sorted by exactness.
{quote}
I think the intention is that it is an arbitrary edit distance, with the same +1 penalty for any edit (not sure if transpositions count as 1 or 2 though). Maybe propose a patch to improve this description?
{quote}Also, IMO user can surely report issues which he thought as BUG, that's issue tracker is used for. Users are also part of community, detailed discussions can help others who has similar obstacles.
{quote}
+1, we can always iterate on the issue and then resolve as {{Not a Bug}} if need be.
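
The two-term arithmetic in the javadoc passage quoted above can be checked with a small plain-Java sketch. It is not Lucene's sloppy phrase matching: the class and method names ({{SlopSketch}}, {{editDistance}}, {{matches}}) are hypothetical, tokens are split on whitespace, only the first occurrence of each term is considered, and only a two-term phrase is handled. It only reproduces the "expected difference is 1" reasoning from the javadoc, which treats a negative position difference (looking backward) as an edit distance too.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: mirrors the two-term example in the PhraseQuery javadoc. */
class SlopSketch {

    /**
     * Edit distance for the two-term phrase "first second": how far the observed
     * position difference (second - first) is from the expected difference of 1.
     */
    static int editDistance(String text, String first, String second) {
        // Assumption: whitespace tokenization, one position per token.
        List<String> tokens = Arrays.asList(text.trim().split("\\s+"));
        int firstPos = tokens.indexOf(first);   // first occurrence only
        int secondPos = tokens.indexOf(second); // first occurrence only
        if (firstPos < 0 || secondPos < 0) {
            throw new IllegalArgumentException("both terms must occur in the text");
        }
        int expectedDiff = 1;
        int observedDiff = secondPos - firstPos; // negative when the terms are reversed
        return Math.abs(observedDiff - expectedDiff);
    }

    /** A document matches when its edit distance does not exceed the slop. */
    static boolean matches(String text, String first, String second, int slop) {
        return editDistance(text, first, second) <= slop;
    }

    public static void main(String[] args) {
        System.out.println(editDistance("a quick brown fox", "quick", "fox")); // 1
        System.out.println(editDistance("the fox is quick", "quick", "fox"));  // 3
        System.out.println(matches("the fox is quick", "quick", "fox", 2));    // false
        System.out.println(matches("the fox is quick", "quick", "fox", 3));    // true
    }
}
```

Under these assumptions, the reversed phrase still matches once the slop is at least 3, which is consistent with reading the slop as looking in both directions rather than only forward.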

> Failed to match when create PhraseQuery with terms analyzed from long query text
> --------------------------------------------------------------------------------
>
>                 Key: LUCENE-9130
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9130
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 8.4
>            Reporter: Chen Zhixiang
>            Priority: Major
>         Attachments: LongTextFieldSearchTest.java
>
>
> When I use a long text (which is equal to the doc's StringField at indexing time) to build a PhraseQuery, I cannot match the document. But a BooleanQuery with MUST/AND mode succeeds.
>
> The long query text is an address string:
> "申长路988弄虹桥万科中心地下停车场LG2层2179-2184车位(锡虹路入,LG1层开到底下LG2)"
> The test case is attached.
> Logs:
>
> 15:46:11.940 [main] INFO test.LongTextFieldSearchTest - indexed terms: 开, 层, 心, 弄, 万, 停车场, 地下, 科, 虹桥, 底下, 锡, 入, 2184, 中, 路, 到, 1, 2, 申, 2179, 车位, 988, 虹, lg, 长
> 15:46:11.956 [main] INFO test.LongTextFieldSearchTest - terms: 申, 长, 路, 988, 弄, 虹桥, 万, 科, 中, 心, 地下, 停车场, lg, 2, 层, 2179, 2184, 车位, 锡, 虹, 路, 入, lg, 1, 层, 开, 到, 底下, lg, 2
> 15:46:11.962 [main] INFO test.LongTextFieldSearchTest - query: +(+address:申 +address:长 +address:路 +address:988 +address:弄 +address:虹桥 +address:万 +address:科 +address:中 +address:心 +address:地下 +address:停车场 +address:lg +address:2 +address:层 +address:2179 +address:2184 +address:车位 +address:锡 +address:虹 +address:路 +address:入 +address:lg +address:1 +address:层 +address:开 +address:到 +address:底下 +address:lg +address:2)
> 15:46:11.988 [main] INFO test.LongTextFieldSearchTest - results.totalHits.value=1
> 15:46:12.181 [main] INFO test.LongTextFieldSearchTest - indexed terms: 开, 层, 心, 弄, 万, 停车场, 地下, 科, 虹桥, 底下, 锡, 入, 2184, 中, 路, 到, 1, 2, 申, 2179, 车位, 988, 虹, lg, 长
> 15:46:12.185 [main] INFO test.LongTextFieldSearchTest - terms: 申, 长, 路, 988, 弄, 虹桥, 万, 科, 中, 心, 地下, 停车场, lg, 2, 层, 2179, 2184, 车位, 锡, 虹, 路, 入, lg, 1, 层, 开, 到, 底下, lg, 2
> 15:46:12.188 [main] INFO test.LongTextFieldSearchTest - query: +address:"申 长 路 988 弄 虹桥 万 科 中 心 地下 停车场 lg 2 层 2179 2184 车位 锡 虹 路 入 lg 1 层 开 到 底下 lg 2"~2
> 15:46:12.210 [main] INFO test.LongTextFieldSearchTest - results.totalHits.value=0
> 15:46:12.214 [main] INFO test.LongTextFieldSearchTest - no matching phrase



--
This message was sent by Atlassian Jira
(v8.3.4#803005)