Hi Rafael,

Your problem is clear, and it has been explored a few times in the past. I agree with you in the first instance.
The Suggester's basic unit of information is a term, not a document. This means it does not make much sense to return duplicate terms just because they come from different documents. The term ID should be the term itself, as there is no way for a human to perceive any difference between two identical terms returned by the Suggester.

That consideration aside, are you using an intermediate API to query Solr? (You definitely should.) If you are using a client, its language should provide a data structure you can use to avoid duplicates. Java, for example, gives you HashSet, TreeSet, and the related classes.

Hope this helps,

Cheers

2015-07-01 18:40 GMT+01:00 Rafael <rafael.man...@gmail.com>:

> Hi, I'm building an autocomplete solution on top of Solr for an ebook
> seller, but my database is completely denormalized. For example, I have this
> kind of records:
>
> author           | title                       | price
> -----------------+-----------------------------+---------
> J. R. R. Tolkien | Lord of the Rings           | $10.0
> J. R. R. Tolkien | Lord of the Rings Vol. 3    | $12.0
> J. R. R. Tolkien | Lord of the Rings           | $11.0
> J. R. R. Tolkien | Lord of the Rings Vol. 3    | $7.5
> J. R. R. Tolkien | Lord of the Rings Hardcover | $30.5
>
> *We are already spending effort to normalize the database, but it will
> take a while.*
>
> Thus, when I try to implement a suggest on the author field, for example, if I
> type "*J.*" I'd get "*J. R. R. Tolkien*" 4 times.
>
> My Suggester configuration is pretty standard:
>
> <!-- schema -->
> <fieldType name="textSuggest" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> <!-- Solrconfig -->
> <searchComponent name="suggest" class="solr.SuggestComponent">
>   <lst name="suggester">
>     <str name="name">mySuggester</str>
>     <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
>     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
>     <str name="field">author</str>
>     <str name="suggestAnalyzerFieldType">textSuggest</str>
>   </lst>
> </searchComponent>
>
> <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
>   <lst name="defaults">
>     <str name="suggest">true</str>
>     <str name="suggest.count">20</str>
>     <str name="suggest.dictionary">mySuggester</str>
>   </lst>
>   <arr name="components">
>     <str>suggest</str>
>   </arr>
> </requestHandler>
>
> And I'm using Solr 5.2.1.
>
> *Question:* Is there a way to get only unique values for suggestions? Or
> would it be simpler to export a file (or even a new table in the database)
> without duplicated values?
>
> Thanks.
>

--
--------------------------
Benedetti Alessandro
Visiting card: http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience - 1794 England
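
P.S. To make the client-side de-duplication mentioned above concrete, here is a minimal sketch in Java. It assumes you have already pulled the suggested terms out of the /suggest response into a plain list of strings. The class name SuggestionDeduper, the method dedupe and the sample input are made up for illustration and are not part of Solr or SolrJ. I use LinkedHashSet here rather than HashSet or TreeSet only because it keeps the terms in the order they arrived.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class SuggestionDeduper {

    // Hypothetical helper (not a Solr or SolrJ API): removes repeated
    // suggestion terms while keeping the order in which they were returned.
    static List<String> dedupe(List<String> terms) {
        // LinkedHashSet drops duplicates and preserves insertion order,
        // unlike HashSet (no guaranteed order) or TreeSet (sorted order).
        return new ArrayList<>(new LinkedHashSet<>(terms));
    }

    public static void main(String[] args) {
        // Made-up sample standing in for the terms parsed from a response.
        List<String> raw = Arrays.asList(
                "J. R. R. Tolkien",
                "J. R. R. Tolkien",
                "J. K. Rowling",
                "J. R. R. Tolkien");

        System.out.println(dedupe(raw));
    }
}
```

Keep in mind that if you de-duplicate after the fact you may end up with fewer than suggest.count entries, so you might want to request more suggestions than you plan to show.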