Hi Phil,

The WordDelimiterFilterFactory (
https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory)
can be used together with WhitespaceTokenizerFactory to avoid splitting
at hyphens etc.  Use generateWordParts="0"...
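
As a rough sketch of the suggestion above, a field type along these lines
could go in the schema. The field type name "text_entity" is made up for
illustration, and every attribute other than generateWordParts="0" is my
assumption about what fits your case, not something you must use:

```xml
<!-- Hypothetical field type; the name "text_entity" is illustrative only. -->
<fieldType name="text_entity" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split on whitespace only, so Wainui-8 reaches the filter as one token. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Assumed settings: emit no separate word or number parts
         (no standalone "Wainui" or "8"), keep the original token,
         and also index the joined form "Wainui8". -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="0"
            generateNumberParts="0"
            catenateWords="0"
            catenateNumbers="0"
            catenateAll="1"
            splitOnCaseChange="0"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With settings like these, Wainui-8 should not match documents that contain
only Wainui-9 or plain Wainui, because no standalone "wainui" token is
indexed. The catenateAll="1" setting is there so that a token with attached
punctuation, such as "Wainui-8," at the end of a clause, should still match
through the joined form. As far as I know, the standard query parser only
treats a hyphen as an operator at the start of a term, so q=Wainui-8 should
not need the backslash; check the result on the Analysis page of the admin
UI to be sure.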

Thanks

On Thu, Jun 8, 2017 at 10:39 PM, Phil Scadden <p.scad...@gns.cri.nz> wrote:

> We have important entities referenced in indexed documents which follow
> a naming convention of geographicname-number, e.g. Wainui-8.
> I want the tokenizer to treat it as Wainui-8 when indexing, and when I
> search I want a q of Wainui-8 (must it be specified as Wainui\-8 ??) to
> return docs with Wainui-8 but not with Wainui-9 or plain Wainui.
>
> Docs are PDFs, and I am using Tika to extract text.
>
> How do I set up Solr for queries like this?
