Hi, I'm building an autocomplete solution on top of Solr for an ebook
seller, but my database is completely denormalized. For example, I have
records like these:

*author           | title                       | price*
-----------------+-----------------------------+---------
J. R. R. Tolkien | Lord of the Rings           | $10.0
J. R. R. Tolkien | Lord of the Rings Vol. 3    | $12.0
J. R. R. Tolkien | Lord of the Rings           | $11.0
J. R. R. Tolkien | Lord of the Rings Vol. 3    | $7.5
J. R. R. Tolkien | Lord of the Rings Hardcover | $30.5

*We are already working on normalizing the database, but it will
take a while.*


Thus, when I set up suggestions on the author field and type "*J.*", for
example, I get "*J. R. R. Tolkien*" 4 times.

My Suggester Configuration is pretty standard:

<!-- schema -->
    <fieldType name="textSuggest" class="solr.TextField"
               positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>


<!-- Solrconfig -->
  <searchComponent name="suggest" class="solr.SuggestComponent">
    <lst name="suggester">
      <str name="name">mySuggester</str>
      <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
      <str name="dictionaryImpl">DocumentDictionaryFactory</str>
      <str name="field">author</str>
      <str name="suggestAnalyzerFieldType">textSuggest</str>
    </lst>
  </searchComponent>

  <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="suggest">true</str>
      <str name="suggest.count">20</str>
      <str name="suggest.dictionary">mySuggester</str>
    </lst>
    <arr name="components">
      <str>suggest</str>
    </arr>
  </requestHandler>


And I'm using Solr 5.2.1.

*Question:* Is there a way to get only unique values in the suggestions? Or
would it be simpler to export a file (or even a new table in the database)
without duplicate values?
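
To make the export option concrete, this is roughly what I have in mind. It is
only a minimal sketch, assuming the rows can be read as (author, title, price)
tuples; the names `unique_authors`, `write_dictionary` and the output file
`authors.txt` are hypothetical and not part of my current setup.

```python
from typing import Iterable, List, Tuple

Row = Tuple[str, str, str]  # (author, title, price)


def unique_authors(rows: Iterable[Row]) -> List[str]:
    """Return author names in first-seen order, without duplicates.

    Duplicates are detected case-insensitively (an assumption, chosen to
    match the lowercasing done by the textSuggest analyzer).
    """
    seen = set()
    result = []
    for author, _title, _price in rows:
        name = author.strip()
        key = name.lower()
        if key and key not in seen:
            seen.add(key)
            result.append(name)
    return result


def write_dictionary(authors: Iterable[str], path: str) -> None:
    """Write one author per line to a plain-text file (hypothetical export)."""
    with open(path, "w", encoding="utf-8") as handle:
        for author in authors:
            handle.write(author + "\n")


# Sample rows copied from the table above.
ROWS = [
    ("J. R. R. Tolkien", "Lord of the Rings", "$10.0"),
    ("J. R. R. Tolkien", "Lord of the Rings Vol. 3", "$12.0"),
    ("J. R. R. Tolkien", "Lord of the Rings", "$11.0"),
    ("J. R. R. Tolkien", "Lord of the Rings Vol. 3", "$7.5"),
    ("J. R. R. Tolkien", "Lord of the Rings Hardcover", "$30.5"),
]

if __name__ == "__main__":
    write_dictionary(unique_authors(ROWS), "authors.txt")
```

I assume a file like this could then back the suggester through a file-based
dictionary instead of DocumentDictionaryFactory, but I have not tried it.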

Thanks.
