Nitin,

I have not tested shingles with collations, but my guess is that the
collation feature will not work as expected against a shingled index. Try
re-indexing without the shingles and see whether that gives you more
intuitive results. If it helps and you still want to correct whitespace
errors, consider using WordBreakSolrSpellChecker instead of shingles (the
main Solr example demonstrates how).
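
As a rough sketch of the suggestion above (not a tested configuration): the
field type name textSpellNoShingle is hypothetical, the field name gram_ci is
taken from the configuration quoted further down, and the wordbreak entry is
modeled on the one in the stock Solr example solrconfig.xml. The parameter
values are starting points to adjust, not recommendations.

```xml
<!-- schema.xml: hypothetical field type, same analysis as textSpellCi
     but with the ShingleFilterFactory removed -->
<fieldType name="textSpellNoShingle" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- solrconfig.xml: an additional spellchecker entry placed inside the
     existing "spellcheck" searchComponent, next to the "default" checker -->
<lst name="spellchecker">
  <str name="name">wordbreak</str>
  <str name="classname">solr.WordBreakSolrSpellChecker</str>
  <str name="field">gram_ci</str>
  <str name="combineWords">true</str>
  <str name="breakWords">true</str>
  <int name="maxChanges">10</int>
</lst>

<!-- solrconfig.xml: in the /spell handler defaults, name both dictionaries
     so their suggestions are combined when building collations -->
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck.dictionary">wordbreak</str>
```

Pointing gram_ci at a different field type only takes effect after a full
re-index.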

Beyond that, without some queries *and* the full spellcheck response, and an 
explanation as to why you feel the spellcheck response is wrong, I'm not sure 
you will get much more help with this.

Here is what "hits" in the collation response means:

> By "hits", it means if you replaced the "q" parameter on the original
> query but left everything else the same (filters, etc), this is how many
> results you would get.
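
For orientation only, this is roughly where "hits" appears when
spellcheck.collateExtendedResults=true and the response is requested with
wt=xml. The values are copied from the "gone wthh thes wnd" case reported
further down, not from a real response, and the surrounding elements are
omitted; the element type of "hits" may differ between Solr versions.

```xml
<!-- Illustrative layout of one collation entry (not actual output) -->
<lst name="collation">
  <str name="collationQuery">gone with the wind</str>
  <!-- number of results the collationQuery would return as "q",
       with all other parameters (filters, etc.) left unchanged -->
  <int name="hits">117</int>
  <lst name="misspellingsAndCorrections">
    <str name="wthh">with</str>
    <str name="thes">the</str>
    <str name="wnd">wind</str>
  </lst>
</lst>
```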

James Dyer
Ingram Content Group


-----Original Message-----
From: Nitin Solanki [mailto:nitinml...@gmail.com] 
Sent: Monday, February 09, 2015 11:38 PM
To: solr-user@lucene.apache.org
Subject: Re: Collations are not working fine.

Hi James Dyer,

I have not done stemming, and my spellcheck.alternativeTermCount is set
equal to spellcheck.count. I have pasted my solrconfig.xml and schema.xml
configuration below.


*URL: *
localhost:8983/solr/wikingram/spell?q=gram_ci:"deligh"&wt=json&indent=true&shards.qt=/spell

*solrconfig.xml:*

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <str name="queryAnalyzerFieldType">textSpellCi</str>
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">gram_ci</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <str name="distanceMeasure">internal</str>
      <float name="accuracy">0.5</float>
      <int name="maxEdits">2</int>
      <int name="minPrefix">0</int>
      <int name="maxInspections">5</int>
      <int name="minQueryLength">2</int>
      <float name="maxQueryFrequency">0.9</float>
      <str name="comparatorClass">freq</str>
    </lst>
</searchComponent>

<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="df">gram_ci</str>
      <str name="spellcheck.dictionary">default</str>
      <str name="spellcheck">on</str>
      <str name="spellcheck.extendedResults">true</str>
      <str name="spellcheck.count">25</str>
      <str name="spellcheck.onlyMorePopular">true</str>
      <str name="spellcheck.maxResultsForSuggest">100000000</str>
      <str name="spellcheck.alternativeTermCount">25</str>
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.maxCollations">50</str>
      <str name="spellcheck.maxCollationTries">50</str>
      <str name="spellcheck.collateExtendedResults">true</str>
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
</requestHandler>

*Schema.xml: *

<field name="gram_ci" type="textSpellCi" indexed="true" stored="true"
       multiValued="false"/>

</fieldType>
<fieldType name="textSpellCi" class="solr.TextField"
           positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ShingleFilterFactory" maxShingleSize="5"
                minShingleSize="2" outputUnigrams="true"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ShingleFilterFactory" maxShingleSize="5"
                minShingleSize="2" outputUnigrams="true"/>
    </analyzer>
</fieldType>

On Tue, Feb 10, 2015 at 1:23 AM, Dyer, James <james.d...@ingramcontent.com>
wrote:

> Nitin,
>
> My guess here is that your spellcheck field is a field that has stemming.
> This might be why you get a collation that returns "wind" even though the
> user queried "wnd" and it does not get any suggestions.  Perhaps "wnd" is
> stemmed the same as "wind" ?  (Spellcheck usually works best if you
> "copyField" the query field to something that is tokenized but not heavily
> analyzed, and use the copy as the spellcheck dictionary.)
>
> The other problem might be because "wind" is in the index but you are not
> using "spellcheck.alternativeTermCount".  If you set this to the same value
> as "spellcheck.count", then it will give suggestions even when words exist
> in the index.
>
> By "hits", it means if you replaced the "q" parameter on the original
> query but left everything else the same (filters, etc), this is how many
> results you would get.
>
> If you need more help, please include in your message the pertinent
> sections of solrconfig.xml, schema.xml and also the full query url you are
> using and the full spellcheck response.
>
> James Dyer
> Ingram Content Group
>
>
> -----Original Message-----
> From: Nitin Solanki [mailto:nitinml...@gmail.com]
> Sent: Monday, February 09, 2015 7:47 AM
> To: solr-user@lucene.apache.org
> Subject: Collations are not working fine.
>
> I am working on spell checking in Solr. I have implemented Suggestions and
> collations in my spell checker component.
>
> Most of the time collations work fine, but in a few cases they fail.
>
> *Working*:
> I tried the query *gone wthh thes wnd*. Here "wnd" doesn't give the
> suggestion "wind", but the collation comes out right: "gone with the
> wind", hits = 117.
>
>
> *Not working:*
> But when I tried the query *gone wthh thes wint*, "wint" does give the
> suggestion "wind", but the collation comes out wrong: instead of "gone
> with the wind" it gives "gone with the west", hits = 1.
>
> I would also like to know what *hits* means in collations.
>
