[
https://issues.apache.org/jira/browse/SOLR-14300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057682#comment-17057682
]
Hongtai Xue commented on SOLR-14300:
------------------------------------
Hi, I have attached a patch that fixes this issue.
h3. about bug
The if statement in the loop below is wrong.
{code:java}
for (BooleanClause clause : clauses) {
  ...
  // NOTE: for a query such as "A:1 OR B:1 OR A:2 OR B:2", when parsing
  // reaches "B:2", fieldValues is not null here because "B:1" has already
  // been stored in fmap.
  fieldValues = fmap.get(sfield);
  ...
  if ((fieldValues == null && useTermsQuery) || !sfield.indexed()) {
    // BUG: if B is not indexed, fieldValues is overwritten here
    // and "B:1" is lost.
    fieldValues = new ArrayList<>(2);
    fmap.put(sfield, fieldValues);
  }
  ...
}
{code}
As the comments above show, if sfield is not indexed, fieldValues is always
overwritten, even when it is not null.
Another question is why only "q=A:1 OR B:1 OR A:2 OR B:2" causes the problem,
while "q=A:1 OR A:2 OR B:1 OR B:2" is OK.
The answer is
[here|https://github.com/apache/lucene-solr/blob/branch_8_4/solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java#L705].
The buggy code runs only when the field changes from one clause to the next.
If consecutive clauses use the same field, the check is skipped and nothing is
overwritten.
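The ordering effect can be reproduced with a small stand-alone model. This is only a sketch and not the code in SolrQueryParserBase: the class {{FieldGroupingSketch}}, the {{group}} method, the string-based clauses and the fixed {{useTermsQuery = true}} are all assumptions made for illustration. Field A is treated as indexed and field B as not indexed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FieldGroupingSketch {

    // Hypothetical, simplified model of the clause loop. Each clause is a
    // {field, value} pair. The per-field list is looked up again only when
    // the field differs from the previous clause's field.
    static Map<String, List<String>> group(String[][] clauses,
                                           Set<String> nonIndexed,
                                           boolean fixed) {
        Map<String, List<String>> fmap = new LinkedHashMap<>();
        boolean useTermsQuery = true; // assumption, keeps the model small
        String lastField = null;
        List<String> fieldValues = null;

        for (String[] clause : clauses) {
            String field = clause[0];
            if (!field.equals(lastField)) {
                lastField = field;
                boolean indexed = !nonIndexed.contains(field);
                fieldValues = fmap.get(field);
                boolean init = fixed
                        ? fieldValues == null && (useTermsQuery || !indexed)
                        : (fieldValues == null && useTermsQuery) || !indexed;
                if (init) {
                    // With the old condition this also runs for a non-null
                    // list of a non-indexed field, dropping earlier values.
                    fieldValues = new ArrayList<>(2);
                    fmap.put(field, fieldValues);
                }
            }
            fieldValues.add(clause[1]);
        }
        return fmap;
    }

    public static void main(String[] args) {
        String[][] abab = {{"A", "1"}, {"B", "1"}, {"A", "2"}, {"B", "2"}};
        String[][] aabb = {{"A", "1"}, {"A", "2"}, {"B", "1"}, {"B", "2"}};
        Set<String> nonIndexed = Set.of("B");

        System.out.println(group(abab, nonIndexed, false)); // {A=[1, 2], B=[2]}
        System.out.println(group(aabb, nonIndexed, false)); // {A=[1, 2], B=[1, 2]}
        System.out.println(group(abab, nonIndexed, true));  // {A=[1, 2], B=[1, 2]}
    }
}
```

In this model the ABAB order loses "B:1" with the old condition, because the list for B is looked up a second time and replaced, while the AABB order never revisits B after the field changes.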
h3. how to fix
It is a very simple bug, and the patch changes only one line to fix it.
{code:java}
- if ((fieldValues == null && useTermsQuery) || !sfield.indexed()) {
+ if (fieldValues == null && (useTermsQuery || !sfield.indexed())) {
{code}
With this change, fieldValues is initialized only when it is null.
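To see exactly where the two conditions disagree, the small check below enumerates every combination of the three inputs. It is a stand-alone sketch: the class {{ConditionCompare}} and its method names are hypothetical, and the boolean parameters stand in for {{fieldValues == null}}, {{useTermsQuery}} and {{sfield.indexed()}}.

```java
public class ConditionCompare {

    // Condition before the patch.
    static boolean oldCondition(boolean isNull, boolean useTermsQuery, boolean indexed) {
        return (isNull && useTermsQuery) || !indexed;
    }

    // Condition after the patch.
    static boolean newCondition(boolean isNull, boolean useTermsQuery, boolean indexed) {
        return isNull && (useTermsQuery || !indexed);
    }

    // Counts the input combinations on which the two conditions disagree
    // and prints each of them.
    static int countDifferences() {
        boolean[] values = {true, false};
        int differences = 0;
        for (boolean isNull : values) {
            for (boolean useTermsQuery : values) {
                for (boolean indexed : values) {
                    boolean before = oldCondition(isNull, useTermsQuery, indexed);
                    boolean after = newCondition(isNull, useTermsQuery, indexed);
                    if (before != after) {
                        differences++;
                        System.out.printf(
                            "fieldValues==null: %b, useTermsQuery: %b, indexed: %b -> old=%b, new=%b%n",
                            isNull, useTermsQuery, indexed, before, after);
                    }
                }
            }
        }
        return differences;
    }

    public static void main(String[] args) {
        System.out.println("differing combinations: " + countDifferences());
    }
}
```

The two conditions differ only when fieldValues is not null and the field is not indexed, which is the case where the old code discarded values collected earlier.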
h3. test
We confirmed that the issue is fixed.
With the patch, the following two queries produce the same parsed query.
* query1:
[http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+cat:book+OR+name_str:Jhereg+OR+cat:cd)&debug=query]
{code:json}
"debug":{
"rawquerystring":" (name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)",
"querystring":" (name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)",
"parsedquery":"cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))",
"parsedquery_toString":"cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])",
"QParser":"LuceneQParser"}
{code}
* query2:
[http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+name_str:Jhereg+OR+cat:book+OR+cat:cd)&debug=query]
{code:json}
"debug":{
"rawquerystring":" (name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)",
"querystring":" (name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)",
"parsedquery":"cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))",
"parsedquery_toString":"cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])",
"QParser":"LuceneQParser"}}
{code}
> Some conditional clauses on unindexed field will be ignored by query parser
> in some specific cases
> --------------------------------------------------------------------------------------------------
>
> Key: SOLR-14300
> URL: https://issues.apache.org/jira/browse/SOLR-14300
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: query parsers
> Affects Versions: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4
> Environment: Solr 7.3.1
> centos7.5
> Reporter: Hongtai Xue
> Priority: Minor
> Labels: newbie, patch
> Fix For: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4
>
> Attachments: SOLR-14300.patch
>
>
> In some specific cases, some conditional clauses on an unindexed field are
> ignored.
> * for a query like q=A:1 OR B:1 OR A:2 OR B:2,
> if field B is not indexed (but docValues="true"), "B:1" will be lost.
>
> * but if you write the query as q=A:1 OR A:2 OR B:1 OR B:2,
> it works perfectly.
> The only difference between the two queries is the order of the clauses:
> one is *ABAB*, the other is *AABB*.
>
> *steps to reproduce*
> You can easily reproduce this problem on a Solr collection with the _default
> configset and the exampledocs/books.csv data.
> # create a _default collection
> {code:java}
> bin/solr create -c books -s 2 -rf 2{code}
> # post books.csv.
> {code:java}
> bin/post -c books example/exampledocs/books.csv{code}
> # run the following queries.
> ** query1:
> [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+cat:book+OR+name_str:Jhereg+OR+cat:cd)&debug=query]
> ** query2:
> [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+name_str:Jhereg+OR+cat:book+OR+cat:cd)&debug=query]
> ** the parsed queries of the two requests are different.
> *** query1. ("name_str:Foundation" is lost.)
> {code:json}
> "debug":{
> "rawquerystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)",
> "querystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)",
> "parsedquery":"+(cat:book cat:cd (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))",
> "parsedquery_toString":"+(cat:book cat:cd name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])",
> "QParser":"LuceneQParser"}}{code}
> *** query2. ("name_str:Foundation" isn't lost.)
> {code:json}
> "debug":{
> "rawquerystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)",
> "querystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)",
> "parsedquery":"+(cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])))",
> "parsedquery_toString":"+(cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))",
> "QParser":"LuceneQParser"}{code}