On 2019-02-18T18:12:44, David '-1' Schmid wrote:
> Will report back if that's working out.

It's working!
If anybody wants to replicate this, here's what I ended up with.

managed-schema:

    <!-- the field(Type) where the original values are stored -->
    <fieldType name="important_strings" class="solr.StrField"
               sortMissingLast="true" docValues="true" indexed="true" stored="true"
               multiValued="true"/>
    <field name="author" type="important_strings"/>

    <!-- lower case tokenization for case insensitive matches -->
    <fieldType name="text_lower" class="solr.TextField" multiValued="true" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
    <field name="author_lower" type="text_lower"/>
    <copyField source="author" dest="author_lower"/>

    <!-- as above but with added edgeNGrams -->
    <fieldType name="text_prefix" class="solr.TextField" multiValued="true" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
    <field name="author_ngram" type="text_prefix"/>
    <copyField source="author" dest="author_ngram"/>

The requestHandler uses the three fields above to provide suggestions.

solrconfig.xml:

    <requestHandler class="solr.SearchHandler" name="/suggest_author">
      <lst name="defaults">
        <str name="defType">edismax</str>
        <str name="rows">10</str>
        <str name="fl">author</str>
        <str name="qf">author_lower^10 author_ngram</str>
      </lst>
    </requestHandler>

If a token completely matches an author's first name or surname, the
complete match (author_lower, boosted by 10) ranks above the partial
match from author_ngram.

Let's say I want to find "Hauck" and get a result for the first four
chars:

    curl 'http://localhost:8983/solr/dblp/suggest_author?q=hauc'

    "docs": [
      {
        "author": [
          "Gregor Hauc"
        ]
      },
      {
        "author": [
          "Andrej Kovacic",
          "Gregor Hauc",
          "Brina Buh",
          "Mojca Indihar Stemberger"
        ]
      },
      {
        "author": [
          "Franz J. Hauck",
          "Franz Johannes Hauck"
        ]
      },
      /* ... */
    ]

Once I get the last character in, complete matches are boosted over
partial matches:

    curl 'http://localhost:8983/solr/dblp/suggest_author?q=hauck'

    "docs": [
      {
        "author": [
          "Rainer Hauck"
        ]
      },
      {
        "author": [
          "Julia Hauck"
        ]
      },
      {
        "author": [
          "Bernd Hauck"
        ]
      },
      /* ... */
    ]

As these are not the people I was looking for, I start typing the
first name:

    curl 'http://localhost:8983/solr/dblp/suggest_author?q=hauck%20fra'

    "docs": [
      {
        "author": [
          "Fra Angelico Viray"
        ]
      },
      {
        "author": [
          "Alberto Del Fra"
        ]
      },
      {
        "author": [
          "Alberto Del Fra"
        ]
      },
      /* ... */
    ]

Oh no, now my previous match has been replaced by some other match.
This can be circumvented by adding "q.op=AND" to require both terms
to match:

    curl 'http://localhost:8983/solr/dblp/suggest_author?q.op=AND&q=hauck%20fra'

    "docs": [
      {
        "author": [
          "Franz J. Hauck",
          "Franz Johannes Hauck"
        ]
      },
      /* ... */
    ]

Which achieves what I wanted, really. q.op can be set in solrconfig
to always use AND.

Adding hl=true to the query will provide highlighting:

    curl 'http://localhost:8983/solr/dblp/suggest_author?q.op=AND&q=hauck%20fra&hl=true'

    "highlighting": {
      "homepages/h/FranzJHauck": {
        "author_lower": [
          "Franz J. <em>Hauck</em>"
        ],
        "author_ngram": [
          "<em>Franz</em> J. <em>Hauck</em>"
        ]
      },

I'm pretty happy with this :D

The original idea came from the book "Solr in Action" by Trey Grainger
and Timothy Potter. It's from 2014 (builds on Solr 4.7), so it might
need some adaptations :D

regards,
-1
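
For reference, setting q.op in solrconfig so that AND is always used
might look like the sketch below. It is the /suggest_author handler
from this mail with one extra line in its defaults. Placing q.op in
"defaults" rather than somewhere else is an assumption on my part, not
something tested against this setup.

```xml
<requestHandler class="solr.SearchHandler" name="/suggest_author">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="rows">10</str>
    <str name="fl">author</str>
    <str name="qf">author_lower^10 author_ngram</str>
    <!-- assumed addition: require all query terms to match by default -->
    <str name="q.op">AND</str>
  </lst>
</requestHandler>
```

Since the value sits in "defaults", a request can still override it,
for example by passing q.op=OR explicitly.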