Hello, I've had a breakthrough with my partial string search problem, although I don't understand why it works.
I found yet another example,
https://medium.com/aubergine-solutions/partial-string-search-in-apache-solr-4b9200e8e6bb
and this one uses a different tokenizer, WhitespaceTokenizerFactory:

<fieldType name="text_ngrm" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="50"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

The analysis results look very different. It seems to be returning the desired results so far.

[image: image.png]

I don't understand why the other examples that worked for other people weren't working for me. Is it version 8? StandardTokenizerFactory didn't work, and when I was trying with the KeywordTokenizerFactory it wasn't even matching the full search term.

If anyone can shed any light, then I'd be grateful.
Thanks.

On Wed, Aug 5, 2020 at 7:12 PM Philip Smith <phi...@keep.edu.hk> wrote:

> Hello,
> I'm new to Solr and to this user group. Any help with this problem
> would be greatly appreciated.
>
> I'm trying to get partial keyword search results working. This seems like
> a fairly common problem, I've found numerous google results offering
> solutions
> for instance
> https://stackoverflow.com/questions/28753671/how-to-configure-solr-to-do-partial-word-matching
> but when I attempt to implement them I'm not receiving the desired
> results.
>
> I'm running solr 8.5.2 in standalone mode, manually editing the configs.
>
> I have configured the title field as
>
> <field name="title" type="edge_ngram_test_5" indexed="true" stored="true" multiValued="false"/>
>
> I have also tried it with this parameter omitTermFreqAndPositions="true"
>
> The field type definition is:
>
> <fieldType name="edge_ngram_test_5" class="solr.TextField" omitNorms="false">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="35"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> I'm using edismax and searching on title.
>
> http://localhost:8983/solr/events/select?defType=edismax&df=title&fl=title&q=educatio
>
> when using edge_ngram_test_5
>
> edu correctly finds 4 results
> educa finds 0
> educat finds 0
> educati finds 0
> educatio finds 0
> education correctly finds 4.
>
> Steps taken between changes to the schema.
> bin/solr restart
> reimport data
> core admin
> reload core
>
> In admin, I see the correct value,
> Type edge_ngram_test_5 when I check in schema.
>
> In admin, when I check in analysis and search on text analyse
>
> [image: image.png]
> it appears to be breaking the word down into letters as I would guess is
> the correct step.
>
> These are the query results:
> [image: image.png]
>
> it looks like it is applying the correct filter names and the search term
> isn't being altered. I don't understand enough to be able to determine why
> the query can't find the search result when it appears to have been
> indexed.
> Any advice is very welcome as I've spent hours trying to get this
> working.
>
> I've also tried with:
>
> <fieldType name="edge_n2_kw_text" class="solr.TextField" omitNorms="true" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> <fieldType name="text_edgengram_prod" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="30"/> <!-- RDH - removed side="front"-->
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> <fieldType name="edge_ngram_test_4" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="English"/>
>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>   </analyzer>
> </fieldType>
>
> Thanks in advance for any insights offered.
> Kind regards,
> Phil.
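
One possible explanation for the edge_ngram_test_5 results quoted above, offered as an unverified sketch: in that type PorterStemFilterFactory runs before EdgeNGramFilterFactory at index time. If the stemmer reduces "education" to "educ" (which is how the Porter algorithm normally treats that word), the only grams indexed are "ed", "edu" and "educ". The query "edu" matches a gram and the query "education" is itself stemmed to "educ", but "educa" through "educatio" are presumably left unchanged by the stemmer and so have no matching gram. Under that assumption, a variant with the stemmer removed from both analyzers might be worth testing. The type name edge_ngram_no_stem is hypothetical; every other name comes from the configs above.

```xml
<!-- Hypothetical variant of edge_ngram_test_5 with PorterStemFilterFactory removed,
     so that edge n-grams are built from the full lowercased token rather than its stem. -->
<fieldType name="edge_ngram_no_stem" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- "education" should now yield ed, edu, educ, educa, ... up to the full word -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="35"/>
  </analyzer>
  <analyzer type="query">
    <!-- No n-gram filter at query time: the typed prefix is matched against indexed grams -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Trying this would need the same restart, reimport and core reload steps listed in the quoted message, and the Analysis screen in the admin UI should show whether the indexed grams for "education" now extend past "educ".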