Thanks, Alessandro! Well, I'm using Ruby with rsolr as the client library. I didn't understand what you said about the term id. Do I have to create this field? Or is it a "hidden field" that Solr uses under the hood?
[]'s
Rafael

On Thu, Jul 2, 2015 at 6:41 AM, Alessandro Benedetti <benedetti.ale...@gmail.com> wrote:

> Hi Rafael,
> Your problem is clear and it has actually been explored a few times in the
> past.
> I agree with you in the first instance.
>
> A Suggester's basic unit of information is a term, not a document.
> This means that it does not make much sense to return duplicate terms
> (just because they come from different docs).
> The term id should be the term itself, as there is no way for a human to
> perceive any difference between two copies of the same term returned by
> the Suggester.
>
> So, this consideration apart, are you using an intermediate API to query
> Solr? (You definitely should.)
> If you are using a client, your client language should provide a data
> structure you can use to avoid duplicates.
> Java, for example, gives you HashSet, TreeSet and all the related
> classes.
>
> Hope this helps,
>
> Cheers
>
> 2015-07-01 18:40 GMT+01:00 Rafael <rafael.man...@gmail.com>:
>
> > Hi, I'm building an autocomplete solution on top of Solr for an ebook
> > seller, but my database is completely denormalized. For example, I have
> > this kind of record:
> >
> > author           | title                       | price
> > -----------------+-----------------------------+---------
> > J. R. R. Tolkien | Lord of the Rings           | $10.0
> > J. R. R. Tolkien | Lord of the Rings Vol. 3    | $12.0
> > J. R. R. Tolkien | Lord of the Rings           | $11.0
> > J. R. R. Tolkien | Lord of the Rings Vol. 3    | $7.5
> > J. R. R. Tolkien | Lord of the Rings Hardcover | $30.5
> >
> > *We are already spending effort to normalize the database, but it will
> > take a while.*
> >
> > Thus, when I try to implement a suggester on the author field, for
> > example, if I type "*J.*" I'd get "*J. R. R. Tolkien*" 4 times.
> >
> > My Suggester configuration is pretty standard:
> >
> > <!-- schema -->
> > <fieldType name="textSuggest" class="solr.TextField"
> >            positionIncrementGap="100">
> >   <analyzer type="index">
> >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> > <!-- Solrconfig -->
> > <searchComponent name="suggest" class="solr.SuggestComponent">
> >   <lst name="suggester">
> >     <str name="name">mySuggester</str>
> >     <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
> >     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
> >     <str name="field">author</str>
> >     <str name="suggestAnalyzerFieldType">textSuggest</str>
> >   </lst>
> > </searchComponent>
> >
> > <requestHandler name="/suggest" class="solr.SearchHandler"
> >                 startup="lazy">
> >   <lst name="defaults">
> >     <str name="suggest">true</str>
> >     <str name="suggest.count">20</str>
> >     <str name="suggest.dictionary">mySuggester</str>
> >   </lst>
> >   <arr name="components">
> >     <str>suggest</str>
> >   </arr>
> > </requestHandler>
> >
> > And I'm using Solr 5.2.1.
> >
> > *Question:* Is there a way to get only unique values for suggestions? Or
> > would it be simpler to export a file (or even a new table in the
> > database) without duplicated values?
> >
> > Thanks.
>
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
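
For what it's worth, the client-side de-duplication suggested above could look roughly like this in Ruby. This is only a sketch under assumptions: unique_suggestions is a made-up helper name, and the sample hash assumes the /suggest JSON response nests suggestions under "suggest" -> dictionary name -> typed query -> "suggestions", so check that against what your Solr actually returns. The term text itself serves as the identity, so no extra id field is involved.

```ruby
require "json"

# Hypothetical helper (not part of rsolr or Solr): collects the suggested
# terms from an already-parsed /suggest response and drops repeats.
# Array#uniq keeps the first occurrence, so Solr's ordering is preserved.
def unique_suggestions(response, dictionary, query)
  suggestions = response.dig("suggest", dictionary, query, "suggestions") || []
  suggestions.map { |s| s["term"] }.uniq
end

# Sample payload; its shape is an assumption about the response format.
sample = JSON.parse(<<~PAYLOAD)
  {
    "suggest": {
      "mySuggester": {
        "J.": {
          "numFound": 4,
          "suggestions": [
            {"term": "J. R. R. Tolkien", "weight": 0, "payload": ""},
            {"term": "J. R. R. Tolkien", "weight": 0, "payload": ""},
            {"term": "J. R. R. Tolkien", "weight": 0, "payload": ""},
            {"term": "J. R. R. Tolkien", "weight": 0, "payload": ""}
          ]
        }
      }
    }
  }
PAYLOAD

puts unique_suggestions(sample, "mySuggester", "J.")
```

One thing to keep in mind with this approach: suggest.count is applied before the duplicates are removed, so a request for 20 suggestions may leave far fewer distinct terms after filtering.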