Perhaps it's not the ideal tool here, but decompounding with a simple
dictionary decompounder token filter should fix this problem: with 'wal' and
'mart' in the dictionary, 'walmart' is also indexed as 'wal' and 'mart'.
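
A minimal sketch of what I mean, not a tested configuration. The field type
name and the dictionary file name are made up for illustration, and the size
limits are assumptions you would tune. The dictionary file would hold one
lowercase subword per line (here 'wal' and 'mart').

```xml
<!-- Sketch only: "text_decompound" and "compound-words.txt" are hypothetical names. -->
<fieldType name="text_decompound" class="solr.TextField" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Lowercase first so WalMart, Walmart and walmart are treated alike. -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Keeps the original token and adds any dictionary subwords found in it.
         minSubwordSize must be 3 or less, otherwise 'wal' is skipped. -->
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="compound-words.txt"
            minWordSize="5" minSubwordSize="3" maxSubwordSize="15"
            onlyLongestMatch="false"/>
  </analyzer>
</fieldType>
```

With something like this, 'walmart' should be indexed as 'walmart', 'wal' and
'mart', so a query for 'wal mart' has terms in common with all four documents.
Like synonyms, it only helps for the words you put in the dictionary.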

 
 
-----Original message-----
> From:Erick Erickson <erickerick...@gmail.com>
> Sent: Saturday 3rd August 2013 13:33
> To: solr-user@lucene.apache.org
> Subject: Re: SOLR matching keywords with / without whitespace
> 
> No good way comes immediately to mind. How would Solr know
> that 'wal mart' should be concatenated but 'many people' should
> not?
> 
> You can do this to some extent with synonyms, but it depends on
> knowing ahead of time what all the possibilities are.
> 
> Best
> Erick
> 
> 
> On Fri, Aug 2, 2013 at 1:27 PM, SolrLover <bbar...@gmail.com> wrote:
> 
> > I am trying to match keywords with / without white space, but one of the
> > cases always fails.
> >
> > For ex:
> >
> > I am indexing 4 documents
> >
> > name: wal mart
> > name: walmart
> > name: WalMart
> > name: Walmart
> >
> > Now searching on name either using
> > wal mart
> > walmart
> > Walmart
> > WalMart
> >
> > should return all the above 4 documents but searching using keyword 'wal
> > mart' returns only the first document and not the remaining 3 documents.
> >
> > I am using the shingle filter factory to create combinations of the words
> > during indexing. Please find my configuration below. Can someone let me
> > know where I am going wrong?
> >
> >     <fieldType name="shingleString" class="solr.TextField" omitNorms="true">
> >       <analyzer type="index">
> >         <charFilter class="solr.PatternReplaceCharFilterFactory"
> >                     pattern="'+" replacement=""/>
> >         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >         <filter class="solr.ASCIIFoldingFilterFactory"/>
> >         <filter class="solr.ShingleFilterFactory" minShingleSize="2"
> >                 maxShingleSize="3" outputUnigrams="true"/>
> >         <filter class="solr.PatternReplaceFilterFactory"
> >                 pattern="\W+" replacement=""/>
> >         <filter class="solr.LowerCaseFilterFactory"/>
> >       </analyzer>
> >       <analyzer type="query">
> >         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >         <filter class="solr.ShingleFilterFactory" minShingleSize="2"
> >                 maxShingleSize="99" outputUnigrams="true"/>
> >         <filter class="solr.PatternReplaceFilterFactory"
> >                 pattern="\W+" replacement=""/>
> >         <filter class="solr.LowerCaseFilterFactory"/>
> >       </analyzer>
> >     </fieldType>
> >   </types>
> >
> >
> >
> >
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/SOLR-matching-keywords-with-without-whitespace-tp4082244.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
> 
