Sure, this is the schema I used...
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <!-- Index time: keep the whole value as one token, then emit its front edge n-grams -->
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords_en.txt" enablePositionIncrements="true"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2"
            maxGramSize="15" side="front"/>
  </analyzer>
  <!-- Query time: same chain without the n-gram filter, so the typed text is matched as-is -->
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords_en.txt" enablePositionIncrements="true"/>
  </analyzer>
</fieldType>
The input was copied into this field with a copyField. So, using your
example, "Katy Perry" is lowercased and indexed as:
ka, kat, katy, "katy " (with the trailing space), katy p, etc.
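
To make the splitting concrete, here is a rough Python sketch of what the index analyzer does to one value. It is not Solr code; edge_ngrams is a made-up helper, it ignores the stop filter, and it assumes the minGramSize/maxGramSize values from the schema above.

```python
def edge_ngrams(text, min_gram=2, max_gram=15):
    """Approximate the index chain: one keyword token -> lowercase -> front edge n-grams."""
    token = text.lower()  # the keyword tokenizer keeps the whole input as a single token
    longest = min(len(token), max_gram)  # grams never exceed maxGramSize or the token length
    return [token[:n] for n in range(min_gram, longest + 1)]


grams = edge_ngrams("Katy Perry")
print(grams)
# ['ka', 'kat', 'katy', 'katy ', 'katy p', 'katy pe', 'katy per', 'katy perr', 'katy perry']
```

Anything the user types that equals one of these grams (after lowercasing) matches the document.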
So while the user is still typing, you can make an AJAX call that
searches for whatever they have typed so far and returns the matches.
This setup returns Katy Perry when the user has typed only 'Ka', whereas
other schemas wouldn't.
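
As a sketch of the request such a call might send, the snippet below builds a select URL for the text typed so far. The base URL, the field name artist_ngram and the helper build_suggest_url are all assumptions for illustration, not part of my setup. The input is sent as a quoted phrase on the assumption that this keeps the query parser from splitting it on whitespace before the keyword tokenizer sees it.

```python
from urllib.parse import urlencode


def build_suggest_url(base_url, field, typed, rows=5, min_gram=2):
    """Return a select URL for the text typed so far, or None if it is too short."""
    if len(typed) < min_gram:
        return None  # shorter than minGramSize, so no indexed gram can match
    # Escape backslashes and quotes, then wrap the input in quotes as a single phrase
    escaped = typed.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": f'{field}:"{escaped}"', "rows": rows, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and field name
url = build_suggest_url("http://localhost:8983/solr", "artist_ngram", "Katy Pe")
print(url)
```

The front end would fire this on each keystroke (ideally with a short delay) and show the returned documents as suggestions.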
Note that there are some assumptions here. For instance, the entire phrase
is kept as a single token, so a search for Perry wouldn't return a match.
I still found it more effective for autosuggest than either the Suggester
options in Solr 3.4 or the Spellchecker.
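
The limitation is easy to see in a small Python sketch. front_grams is a made-up helper that mimics the gram generation, and the per-word variant is only a hypothetical comparison, not something I have tested in Solr.

```python
def front_grams(token, min_gram=2, max_gram=15):
    """Lowercased front edge n-grams of a single token, as a set."""
    token = token.lower()
    return {token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)}


# Whole phrase as one token, as in the schema above
phrase_grams = front_grams("Katy Perry")

# Hypothetical variant: split on whitespace first and gram each word separately
word_grams = set().union(*(front_grams(word) for word in "Katy Perry".split()))

print("perry" in phrase_grams)   # False: every gram starts at the beginning of the phrase
print("perry" in word_grams)     # True: the second word gets its own grams
print("katy pe" in word_grams)   # False: per-word grams never span the space
```

So the choice is a trade-off: the single-token approach follows the user across the space in "katy pe", while a per-word approach would match on the second word but not on a phrase prefix.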
It is worth using Spellcheck on a failed search however.
Hope that helps
On 12/10/2011 09:34, Oliver Beattie wrote:
Hi Doug,
Sounds very interesting; would you mind sharing some details of how
exactly you did this? What request handler did you use etc?
Many thanks,
Oliver
On 11 October 2011 17:37, Doug McKenzie<doug.mcken...@firebox.com> wrote:
I've just done something similar and, rather than using the Spellchecker,
went for EdgeNGramFilters for the suggestions instead. Worth looking into, imo.
On 11/10/2011 16:13, Oliver Beattie wrote:
Hi,
I'm sure this is something that's probably been covered before, and I
shouldn't need to ask. But anyway. I'm trying to build an autosuggest
with org.apache.solr.spelling.suggest.Suggester
The content being searched is music artist names, so I need to be able
to deal with suggesting things like "Katy Perry" if the user types
"Katy Pe" (sorry, couldn't think of a more tasteful example off the
cuff). I've tried a few things, but so far none give satisfactory
results. Here's my current configuration:
<fieldType name="autosuggestString" class="solr.TextField" omitNorms="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>
…for which I have a copyField called suggestionArtist. In my
solrconfig.xml I have:
<searchComponent name="autosuggester" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">autosuggester</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup</str>
    <str name="field">suggestionArtist</str>
    <float name="threshold">0.0005</float>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">autosuggester</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="components">
    <str>autosuggester</str>
  </arr>
</requestHandler>
If anyone could give me any pointers, I'd be really grateful.
—Oliver