[ https://issues.apache.org/jira/browse/SOLR-13797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mike Drob updated SOLR-13797:
-----------------------------
    Fix Version/s: master (9.0)
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

Thanks for the review Anshum. Fixed the annotation and pushed.

> SolrResourceLoader produces inconsistent results when given bad arguments
> -------------------------------------------------------------------------
>
>                 Key: SOLR-13797
>                 URL: https://issues.apache.org/jira/browse/SOLR-13797
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>    Affects Versions: 7.7.2, 8.2
>            Reporter: Mike Drob
>            Assignee: Mike Drob
>            Priority: Major
>             Fix For: master (9.0)
>
>         Attachments: SOLR-13797.v1.patch, SOLR-13797.v2.patch
>
>
> SolrResourceLoader will attempt to do some magic to infer what the user
> wanted when loading TokenFilter and Tokenizer classes. However, this can end
> up putting the wrong class in the cache such that the request succeeds the
> first time but fails subsequent times. It should either succeed or fail
> consistently on every call.
>
> This can be triggered in a variety of ways, but the simplest is maybe by
> specifying the wrong element type in an indexing chain. Consider the field
> type definition:
> {code:xml}
> <fieldType name="text_en_partial" class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.NGramTokenizerFactory" minGramSize="1" maxGramSize="2"/>
>   </analyzer>
> </fieldType>
> {code}
> If loaded by itself (e.g. docker container for standalone validation) then
> the schema will pass and collection will succeed, with Solr actually figuring
> out that it needs an {{NGramTokenFilterFactory}}. However, if this is loaded
> on a cluster with other collections where the {{NGramTokenizerFactory}} has
> been loaded correctly then we get {{ClassCastException}}. Or if this
> collection is loaded first then others using the Tokenizer will fail instead.
>
> I'd argue that succeeding on both calls is the better approach because it
> does what the user likely wants instead of what the user explicitly asks for,
> and creates a nicer user experience that is marginally less pedantic.
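
The order-dependent behaviour described above can be reproduced with a small toy model. The sketch below is not Solr's actual SolrResourceLoader code: the class LoaderCacheSketch, its nested stand-in types, the KNOWN registry and the keyByType flag are all hypothetical names invented for illustration. It assumes a cache keyed only by the requested class name, plus a fallback that guesses a sibling class when the requested one does not match the expected type. Keying the cache by name and expected type together is shown as one possible way to make both calls succeed consistently; it is not necessarily what the attached patches do.

```java
import java.util.HashMap;
import java.util.Map;

public class LoaderCacheSketch {
    // Hypothetical stand-ins for the analysis factory types; not Solr's real classes.
    public interface TokenizerFactory {}
    public interface TokenFilterFactory {}
    public static class NGramTokenizerFactory implements TokenizerFactory {}
    public static class NGramTokenFilterFactory implements TokenFilterFactory {}

    // Toy registry standing in for the classpath.
    static final Map<String, Class<?>> KNOWN = Map.of(
            "NGramTokenizerFactory", NGramTokenizerFactory.class,
            "NGramTokenFilterFactory", NGramTokenFilterFactory.class);

    private final Map<String, Class<?>> cache = new HashMap<>();
    private final boolean keyByType;

    // keyByType=false models a cache keyed by name only (assumed source of the bug).
    public LoaderCacheSketch(boolean keyByType) {
        this.keyByType = keyByType;
    }

    public <T> Class<? extends T> findClass(String name, Class<T> expected) {
        String key = keyByType ? name + "|" + expected.getName() : name;
        Class<?> cached = cache.get(key);
        if (cached != null) {
            // Throws ClassCastException if an earlier call cached the other kind of factory.
            return cached.asSubclass(expected);
        }
        Class<?> found = KNOWN.get(name);
        if (found == null || !expected.isAssignableFrom(found)) {
            // The "magic": guess the sibling class that matches the expected type.
            String base = name.replaceAll("(Tokenizer|TokenFilter)Factory$", "");
            String suffix = TokenizerFactory.class.equals(expected)
                    ? "TokenizerFactory" : "TokenFilterFactory";
            found = KNOWN.get(base + suffix);
        }
        if (found == null) {
            throw new IllegalArgumentException("Cannot resolve " + name);
        }
        cache.put(key, found);
        return found.asSubclass(expected);
    }

    public static void main(String[] args) {
        for (boolean byType : new boolean[] {false, true}) {
            LoaderCacheSketch loader = new LoaderCacheSketch(byType);
            // Misconfigured <filter> request arrives first and is resolved by inference.
            System.out.println(loader.findClass("NGramTokenizerFactory", TokenFilterFactory.class));
            try {
                // A correct <tokenizer> request for the same name follows.
                System.out.println(loader.findClass("NGramTokenizerFactory", TokenizerFactory.class));
            } catch (ClassCastException e) {
                System.out.println("keyByType=" + byType + " -> ClassCastException");
            }
        }
    }
}
```

With the name-only cache, whichever request arrives first decides which class is cached, so the other kind of request fails afterwards. Reversing the two calls moves the failure to the misconfigured filter instead, which matches the two orderings described in the issue.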