RE: Complexphrase treats wildcards differently than other query parsers

Allison, Timothy B. Fri, 06 Oct 2017 11:55:29 -0700

That could be it.  I'm not able to reproduce this with trunk.  More next week.


In trunk, if I add this to schema15.xml:
  <fieldType name="text_iso_latin1_mapping" class="solr.TextField">
    <analyzer>
      <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
      <tokenizer class="solr.MockTokenizerFactory"/>
    </analyzer>
  </fieldType>
  <field name="iso-latin1" type="text_iso_latin1_mapping" indexed="true" stored="true"/>

then this test passes:

  @Test
  public void testCharFilter() {
    assertU(adoc("iso-latin1", "cr\u00E6zy tr\u00E6n", "id", "1"));
    assertU(commit());
    assertU(optimize());

    assertQ(req("q", "{!complexphrase} iso-latin1:craezy")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:traen")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:caezy~1")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:crae*")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:*aezy")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:crae*y")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:\"craezy traen\"")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:\"caezy~1 traen\"")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:\"craez* traen\"")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:\"*aezy traen\"")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:\"crae*y traen\"")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );
  }
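
For context, the test above relies on the char filter folding U+00E6 into "ae" before tokenization, so that queries for "craezy" and "traen" line up with the indexed text. Below is a minimal plain-Java sketch of that folding, assuming a single rule (U+00E6 => "ae") taken from mapping-ISOLatin1Accent.txt. The class AccentMappingSketch and its method applyMapping are hypothetical stand-ins for illustration only; this is not how MappingCharFilterFactory is implemented.

```java
import java.util.Map;

public class AccentMappingSketch {
    // Assumption: only one rule is modelled here (U+00E6 => "ae").
    // The real mapping file contains many more rules.
    private static final Map<Character, String> MAPPING = Map.of('\u00E6', "ae");

    // Replaces every mapped character with its replacement string and
    // copies all other characters through unchanged.
    static String applyMapping(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String replacement = MAPPING.get(c);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Same text the test indexes into the iso-latin1 field.
        System.out.println(applyMapping("cr\u00E6zy tr\u00E6n"));
    }
}
```

If the same folding is applied to wildcard and fuzzy terms at query time, queries such as crae* or *aezy can match the folded indexed tokens, which is what the assertions in the test check.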



-----Original Message-----
From: Bjarke Buur Mortensen [mailto:morten...@eluence.com] 
Sent: Friday, October 6, 2017 6:46 AM
To: solr-user@lucene.apache.org
Subject: Re: Complexphrase treats wildcards differently than other query parsers

Thanks a lot for your effort, Tim.

Looking at it from the Solr side, I see some use of local classes. The snippet 
below in particular caught my eye (in 
solr/core/src/java/org/apache/solr/search/ComplexPhraseQParserPlugin.java).
The instance of ComplexPhraseQueryParser is not the clean one from Lucene, but 
a modified one. If any of the modifications messes with the analysis logic, 
well then that might answer it.

What do you make of it?

lparser = new ComplexPhraseQueryParser(defaultField, getReq().getSchema().getQueryAnalyzer()) {
  protected Query newWildcardQuery(org.apache.lucene.index.Term t) {
    try {
      org.apache.lucene.search.Query wildcardQuery =
          reverseAwareParser.getWildcardQuery(t.field(), t.text());
      setRewriteMethod(wildcardQuery);
      return wildcardQuery;
    } catch (SyntaxError e) {
      throw new RuntimeException(e);
    }
  }

  private Query setRewriteMethod(org.apache.lucene.search.Query query) {
    if (query instanceof MultiTermQuery) {
      ((MultiTermQuery) query).setRewriteMethod(
          org.apache.lucene.search.MultiTermQuery.SCORING_BOOLEAN_REWRITE);
    }
    return query;
  }

  protected Query newRangeQuery(String field, String part1, String part2,
      boolean startInclusive, boolean endInclusive) {
    boolean reverse = reverseAwareParser.isRangeShouldBeProtectedFromReverse(field, part1);
    return super.newRangeQuery(field,
        reverse ? reverseAwareParser.getLowerBoundForReverse() : part1, part2,
        startInclusive || reverse, endInclusive);
  }
};

Thanks,
Bjarke
