[
https://issues.apache.org/jira/browse/LUCENE-9130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017914#comment-17017914
]
Chen Zhixiang commented on LUCENE-9130:
---------------------------------------
PhraseQuery.Builder:

    public Builder add(Term term) {
      return add(term, positions.isEmpty() ? 0 : 1 + positions.get(positions.size() - 1));
    }

    /**
     * Adds a term to the end of the query phrase.
     * The relative position of the term within the phrase is specified explicitly, but must be greater than
     * or equal to that of the previously added term.
     * A greater position allows phrases with gaps (e.g. in connection with stopwords).
     * If the position is equal, you most likely should be using
     * {@link MultiPhraseQuery} instead which only requires one term at each position to match; this class requires
     * all of them.
     */
    public Builder add(Term term, int position) {
    ...

I used the first API, add(term), which places each term at the position
immediately after the previous one. There is a second overload that takes an
explicit position argument.
In this case I should probably pass positions 0, 2, 4, 6, 7, which can be
obtained by analyzing the raw query text.
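
To illustrate the difference between the two overloads, here is a minimal sketch in plain Java with no Lucene dependency. It assumes the analyzer reports one position increment per token (in Lucene this would come from PositionIncrementAttribute.getPositionIncrement()) and that the resulting positions are then passed to PhraseQuery.Builder.add(Term, int). The class name PhrasePositions and both helper methods are hypothetical, and the increments 1, 2, 2, 2, 1 are chosen only to reproduce the positions 0, 2, 4, 6, 7 mentioned above.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: models how term positions are derived.
final class PhrasePositions {

    // Positions implied by calling add(term) repeatedly: each term sits
    // immediately after the previous one, starting at 0.
    static int[] consecutive(int termCount) {
        int[] positions = new int[termCount];
        for (int i = 0; i < termCount; i++) {
            positions[i] = i;
        }
        return positions;
    }

    // Positions implied by the analyzer's position increments: start at -1
    // and add each token's increment, so gaps in the token stream are kept.
    static int[] fromIncrements(int[] increments) {
        int[] positions = new int[increments.length];
        int position = -1;
        for (int i = 0; i < increments.length; i++) {
            position += increments[i];
            positions[i] = position;
        }
        return positions;
    }

    public static void main(String[] args) {
        // Assumed increments for five tokens with gaps between some of them.
        int[] increments = {1, 2, 2, 2, 1};
        System.out.println(Arrays.toString(consecutive(increments.length))); // [0, 1, 2, 3, 4]
        System.out.println(Arrays.toString(fromIncrements(increments)));     // [0, 2, 4, 6, 7]
    }
}
```

With real Lucene classes, each computed position would be supplied as the second argument of builder.add(new Term(field, text), position) instead of relying on the single-argument overload.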
> Failed to match when create PhraseQuery with terms analyzed from long query
> text
> --------------------------------------------------------------------------------
>
> Key: LUCENE-9130
> URL: https://issues.apache.org/jira/browse/LUCENE-9130
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/search
> Affects Versions: 8.4
> Reporter: Chen Zhixiang
> Priority: Major
> Attachments: LongTextFieldSearchTest.java
>
>
> When I use a long text (which is equal to the doc's StringField at indexing
> time) to build a PhraseQuery, I cannot match the document. But a BooleanQuery
> in MUST/AND mode succeeds.
>
> The long query text is an address string:
> "申长路988弄虹桥万科中心地下停车场LG2层2179-2184车位(锡虹路入,LG1层开到底下LG2)"
> A test case is attached.
> Logs:
>
> 15:46:11.940 [main] INFO test.LongTextFieldSearchTest - indexed terms: 开, 层,
> 心, 弄, 万, 停车场, 地下, 科, 虹桥, 底下, 锡, 入, 2184, 中, 路, 到, 1, 2, 申, 2179, 车位, 988, 虹,
> lg, 长
> 15:46:11.956 [main] INFO test.LongTextFieldSearchTest - terms: 申, 长, 路, 988,
> 弄, 虹桥, 万, 科, 中, 心, 地下, 停车场, lg, 2, 层, 2179, 2184, 车位, 锡, 虹, 路, 入, lg, 1, 层,
> 开, 到, 底下, lg, 2
> 15:46:11.962 [main] INFO test.LongTextFieldSearchTest - query: +(+address:申
> +address:长 +address:路 +address:988 +address:弄 +address:虹桥 +address:万
> +address:科 +address:中 +address:心 +address:地下 +address:停车场 +address:lg
> +address:2 +address:层 +address:2179 +address:2184 +address:车位 +address:锡
> +address:虹 +address:路 +address:入 +address:lg +address:1 +address:层 +address:开
> +address:到 +address:底下 +address:lg +address:2)
> 15:46:11.988 [main] INFO test.LongTextFieldSearchTest -
> results.totalHits.value=1
> 15:46:12.181 [main] INFO test.LongTextFieldSearchTest - indexed terms: 开, 层,
> 心, 弄, 万, 停车场, 地下, 科, 虹桥, 底下, 锡, 入, 2184, 中, 路, 到, 1, 2, 申, 2179, 车位, 988, 虹,
> lg, 长
> 15:46:12.185 [main] INFO test.LongTextFieldSearchTest - terms: 申, 长, 路, 988,
> 弄, 虹桥, 万, 科, 中, 心, 地下, 停车场, lg, 2, 层, 2179, 2184, 车位, 锡, 虹, 路, 入, lg, 1, 层,
> 开, 到, 底下, lg, 2
> 15:46:12.188 [main] INFO test.LongTextFieldSearchTest - query: +address:"申 长
> 路 988 弄 虹桥 万 科 中 心 地下 停车场 lg 2 层 2179 2184 车位 锡 虹 路 入 lg 1 层 开 到 底下 lg 2"~2
> 15:46:12.210 [main] INFO test.LongTextFieldSearchTest -
> results.totalHits.value=0
> 15:46:12.214 [main] INFO test.LongTextFieldSearchTest - no matching phrase