<face_palm/> Right. Sorry. Despite appearances to the contrary, I'm not a bot designed to lead you down the garden path of debugging for yourself with the goal of increasing the size of the Solr contributor pool...
I confirmed the failure in 6.x, but all seems to work in 7.x and trunk. I opened SOLR-11450 and attached a unit test based on your correction of mine. 😊

Thank you, again!

-----Original Message-----
From: Bjarke Buur Mortensen [mailto:morten...@eluence.com]
Sent: Monday, October 9, 2017 8:39 AM
To: solr-user@lucene.apache.org
Subject: Re: Complexphrase treats wildcards differently than other query parsers

Thanks again, Tim,

following your recipe, I was able to write a failing test:

    assertQ(req("q", "{!complexphrase} iso-latin1:cr\u00E6zy*")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

Notice how cr\u00E6zy* is used as a query term which mimics the behaviour I originally reported, namely that CPQP does not analyse it because of the wildcard and thus does not hit the charfilter from the query side.

2017-10-06 20:54 GMT+02:00 Allison, Timothy B. <talli...@mitre.org>:

> That could be it. I'm not able to reproduce this with trunk. More
> next week.
>
> In trunk, if I add this to schema15.xml:
>
>   <fieldType name="text_iso_latin1_mapping" class="solr.TextField">
>     <analyzer>
>       <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>       <tokenizer class="solr.MockTokenizerFactory"/>
>     </analyzer>
>   </fieldType>
>   <field name="iso-latin1" type="text_iso_latin1_mapping" indexed="true" stored="true"/>
>
> This test passes.
>
> @Test
> public void testCharFilter() {
>   assertU(adoc("iso-latin1", "cr\u00E6zy tr\u00E6n", "id", "1"));
>   assertU(commit());
>   assertU(optimize());
>
>   assertQ(req("q", "{!complexphrase} iso-latin1:craezy")
>       , "//result[@numFound='1']"
>       , "//doc[./str[@name='id']='1']"
>   );
>
>   assertQ(req("q", "{!complexphrase} iso-latin1:traen")
>       , "//result[@numFound='1']"
>       , "//doc[./str[@name='id']='1']"
>   );
>
>   assertQ(req("q", "{!complexphrase} iso-latin1:caezy~1")
>       , "//result[@numFound='1']"
>       , "//doc[./str[@name='id']='1']"
>   );
>
>   assertQ(req("q", "{!complexphrase} iso-latin1:crae*")
>       , "//result[@numFound='1']"
>       , "//doc[./str[@name='id']='1']"
>   );
>
>   assertQ(req("q", "{!complexphrase} iso-latin1:*aezy")
>       , "//result[@numFound='1']"
>       , "//doc[./str[@name='id']='1']"
>   );
>
>   assertQ(req("q", "{!complexphrase} iso-latin1:crae*y")
>       , "//result[@numFound='1']"
>       , "//doc[./str[@name='id']='1']"
>   );
>
>   assertQ(req("q", "{!complexphrase} iso-latin1:\"craezy traen\"")
>       , "//result[@numFound='1']"
>       , "//doc[./str[@name='id']='1']"
>   );
>
>   assertQ(req("q", "{!complexphrase} iso-latin1:\"caezy~1 traen\"")
>       , "//result[@numFound='1']"
>       , "//doc[./str[@name='id']='1']"
>   );
>
>   assertQ(req("q", "{!complexphrase} iso-latin1:\"craez* traen\"")
>       , "//result[@numFound='1']"
>       , "//doc[./str[@name='id']='1']"
>   );
>
>   assertQ(req("q", "{!complexphrase} iso-latin1:\"*aezy traen\"")
>       , "//result[@numFound='1']"
>       , "//doc[./str[@name='id']='1']"
>   );
>
>   assertQ(req("q", "{!complexphrase} iso-latin1:\"crae*y traen\"")
>       , "//result[@numFound='1']"
>       , "//doc[./str[@name='id']='1']"
>   );
> }
>
> -----Original Message-----
> From: Bjarke Buur Mortensen [mailto:morten...@eluence.com]
> Sent: Friday, October 6, 2017 6:46 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Complexphrase treats wildcards differently than other query parsers
>
> Thanks a lot for your effort, Tim.
>
> Looking at it from the Solr side, I see some use of local classes.
> The snippet below in particular caught my eye (in
> solr/core/src/java/org/apache/solr/search/ComplexPhraseQParserPlugin.java).
> The instance of ComplexPhraseQueryParser is not the clean one from
> Lucene, but a modified one. If any of the modifications messes with
> the analysis logic, well then that might answer it.
>
> What do you make of it?
>
> lparser = new ComplexPhraseQueryParser(defaultField, getReq().getSchema().getQueryAnalyzer())
> {
>   protected Query newWildcardQuery(org.apache.lucene.index.Term t) {
>     try {
>       org.apache.lucene.search.Query wildcardQuery =
>           reverseAwareParser.getWildcardQuery(t.field(), t.text());
>       setRewriteMethod(wildcardQuery);
>       return wildcardQuery;
>     } catch (SyntaxError e) {
>       throw new RuntimeException(e);
>     }
>   }
>
>   private Query setRewriteMethod(org.apache.lucene.search.Query query) {
>     if (query instanceof MultiTermQuery) {
>       ((MultiTermQuery) query).setRewriteMethod(
>           org.apache.lucene.search.MultiTermQuery.SCORING_BOOLEAN_REWRITE);
>     }
>     return query;
>   }
>
>   protected Query newRangeQuery(String field, String part1, String part2,
>       boolean startInclusive, boolean endInclusive) {
>     boolean reverse = reverseAwareParser.isRangeShouldBeProtectedFromReverse(field, part1);
>     return super.newRangeQuery(field,
>         reverse ? reverseAwareParser.getLowerBoundForReverse() : part1, part2,
>         startInclusive || reverse, endInclusive);
>   }
> };
>
> Thanks,
> Bjarke
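
For anyone reading the thread later: the effect Bjarke reports (a wildcard term such as cr\u00E6zy* skipping the charfilter on the query side) can be modelled without Solr. The sketch below is a toy, not Lucene or Solr code. The class name CharFilterWildcardSketch, its methods, and the analyzeWildcardTerms flag are all made up for illustration, and the mapping table holds only the single rule the tests in this thread rely on (U+00E6 to "ae"). It assumes whitespace tokenization and whole-term wildcard matching.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Toy model of a query-side char filter that may be skipped for wildcard terms. */
class CharFilterWildcardSketch {

    // Hypothetical stand-in for the mapping file: just one rule, U+00E6 -> "ae".
    static final Map<String, String> MAPPING = new LinkedHashMap<>();
    static {
        MAPPING.put("\u00E6", "ae");
    }

    /** Plays the role of the mapping char filter: rewrites characters before tokenizing. */
    static String charFilter(String text) {
        String out = text;
        for (Map.Entry<String, String> rule : MAPPING.entrySet()) {
            out = out.replace(rule.getKey(), rule.getValue());
        }
        return out;
    }

    /** Converts a term with '*' and '?' wildcards into a regex over a whole token. */
    static Pattern wildcardToPattern(String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : term.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /**
     * The document text always goes through the char filter (index side).
     * A query term goes through it unless it contains a wildcard and
     * analyzeWildcardTerms is false, which models the reported behaviour.
     */
    static boolean matches(String docText, String queryTerm, boolean analyzeWildcardTerms) {
        boolean hasWildcard = queryTerm.indexOf('*') >= 0 || queryTerm.indexOf('?') >= 0;
        String effective = (hasWildcard && !analyzeWildcardTerms)
                ? queryTerm
                : charFilter(queryTerm);
        Pattern pattern = wildcardToPattern(effective);
        for (String token : charFilter(docText).split("\\s+")) {
            if (pattern.matcher(token).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String doc = "cr\u00E6zy tr\u00E6n";
        System.out.println(matches(doc, "craezy", false));      // true: plain term is filtered
        System.out.println(matches(doc, "crae*", false));       // true: already in mapped form
        System.out.println(matches(doc, "cr\u00E6zy*", false)); // false: wildcard term not filtered
        System.out.println(matches(doc, "cr\u00E6zy*", true));  // true: wildcard term filtered
    }
}
```

In this model, Tim's test queries pass either way because they are already written in the mapped form (craezy, crae*), while Bjarke's query only matches when the wildcard term is also run through the char filter.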