But what is your generic problem, then? You are probably not looking for
"andthe" kind of tokens.

However, a shingle filter plus a regex that removes whitespace can give you
"anytwo wordstogether smooshed" tokens in the index.
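
A rough sketch of that chain, assuming a hypothetical field type name
"text_smooshed" and illustrative shingle sizes (neither comes from your
setup, so adjust them to your schema):

```xml
<fieldType name="text_smooshed" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <!-- Emit single words plus every pair of adjacent words, e.g. "sound stage" -->
    <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="2"
            outputUnigrams="true" />
    <!-- Remove the space inside each shingle: "sound stage" becomes "soundstage" -->
    <filter class="solr.PatternReplaceFilterFactory" pattern="\s+" replacement=""
            replace="all" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
```

With this, indexing "sound stage" should produce the tokens "sound", "stage"
and "soundstage", so a query for "soundstage" has something to match. Setting
tokenSeparator="" on the shingle filter may make the regex step unnecessary.
The cost is a larger index, since every adjacent word pair becomes a term.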

Regards,
     Alex


On Fri, Aug 3, 2018, 7:19 AM Clemens Wyss DEV, <clemens...@mysign.ch> wrote:

> Hi Markus,
> thanks for the quick answer.
>
> "sound stage" was just an example. We are looking for a generic solution
> ...
>
> Is it "ok" to apply an NGramFilter for query-analyzing?
> <analyzer type="query">
>         <tokenizer class="solr.WhitespaceTokenizerFactory" />
>         <filter class="solr.LowerCaseFilterFactory" />
>         <filter class="solr.NGramFilterFactory" minGramSize="3"
>                 maxGramSize="15" />
> </analyzer>
>
> I guess (besides the performance impact) this reduces search results
> accuracy?
>
> -Clemens
>
> -----Original Message-----
> From: Markus Jelsma <markus.jel...@openindex.io>
> Sent: Friday, 3 August 2018 12:43
> To: solr-user@lucene.apache.org
> Subject: RE: indexing two words, searching single word
>
> Hello,
>
> If your content is English, you could use synonyms to work around the
> problem of the language's few compound words. However, if you are dealing
> with a Germanic compounding language, the HyphenationCompoundWordTokenFilter
> [1] or DictionaryCompoundWordTokenFilter is a better choice. The former is
> much more flexible but has its drawbacks.
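>
> A minimal sketch of both options, assuming hypothetical file names
> (synonyms.txt, german-dictionary.txt) and illustrative size values; the
> filters would go into an analyzer chain after the tokenizer:
>
> ```xml
> <!-- English: list the few compounds explicitly. synonyms.txt would
>      contain a line such as: soundstage, sound stage -->
> <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"
>         ignoreCase="true" expand="true" />
>
> <!-- Germanic languages: split compounds against a word list -->
> <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
>         dictionary="german-dictionary.txt" minWordSize="5"
>         minSubwordSize="2" maxSubwordSize="15" onlyLongestMatch="true" />
> ```
>
> The synonym filter is shown as it might appear in a query analyzer; using
> it at index time with multi-word entries generally also needs a
> FlattenGraphFilterFactory after it.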
>
> Regards,
> Markus
>
>
> [1] https://lucene.apache.org/core/7_4_0/analyzers-common/org/apache/lucene/analysis/compound/HyphenationCompoundWordTokenFilterFactory.html
>
>
>
> -----Original message-----
> > From:Clemens Wyss DEV <clemens...@mysign.ch>
> > Sent: Friday 3rd August 2018 12:22
> > To: solr-user@lucene.apache.org
> > Subject: indexing two words, searching single word
> >
> > Sounds like a rather simple issue:
> > if I index "sound stage" and search for "soundstage" I get no hits
> >
> > What am I doing wrong
> > a) when indexing
> > b) when searching
> > ?
> >
> > Thx in advance
> > - Clemens
> >
>
