The shingle filter, which joins adjacent tokens into a single multi-word token, will likely be part of the strategy:

https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-ShingleFilter

and the keep word filter, which discards every token not found in a given word list, will help if you have an authority list of locations:

https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-KeepWordFilter
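
As a rough, untested sketch of how the two filters might be combined with the synonym filter from your analyzer: the field type name "location_facet" and the file "locations-keep.txt" are made up for illustration, and maxShingleSize="3" assumes no standard location is longer than three words.

```xml
<!-- Sketch only: the type name "location_facet" and the file
     "locations-keep.txt" are hypothetical. -->
<fieldType name="location_facet" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Synonym mapping taken from the original post -->
    <filter class="solr.SynonymGraphFilterFactory"
            synonyms="locations-classifier.txt" ignoreCase="true" expand="false"
            tokenizerFactory="solr.StandardTokenizerFactory"/>
    <filter class="solr.FlattenGraphFilterFactory"/>
    <!-- Join adjacent tokens into shingles of up to three words -->
    <filter class="solr.ShingleFilterFactory"
            minShingleSize="2" maxShingleSize="3" outputUnigrams="true"/>
    <!-- Keep only tokens listed in the authority file, one entry per line -->
    <filter class="solr.KeepWordFilterFactory"
            words="locations-keep.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
```

If locations-keep.txt contains the line "Wakefield West Yorkshire", the intent is that only that shingle survives the keep word filter, so the facet would show one entry instead of three separate tokens.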

Joel Bernstein
http://joelsolr.blogspot.com/

On Sun, Apr 9, 2017 at 1:54 AM, marotosg <marot...@gmail.com> wrote:

> Hi,
>
> I am trying to extract locations from a location field which contains
> location information in different formats. My initial idea is to extract
> only UK and USA locations and standardize them.
>
> For instance if my field contains "Wakefield" then I will convert it to
> "Wakefield West Yorkshire".
> After that my new field will be a facet field with standard locations.
>
> I am trying to achieve this with the new SynonymGraphFilterFactory, but what
> I get after applying it is a list of separate tokens like this:
> Wakefield (1)
> West (1)
> Yorkshire (1).
>
> How is it possible to get all the tokens as one, i.e. Wakefield West
> Yorkshire (1)?
>
> Here is the analyzer field definition
> <analyzer type="index">
>   <tokenizer class="solr.StandardTokenizerFactory"/>
>   <filter class="solr.SynonymGraphFilterFactory"
>           synonyms="locations-classifier.txt" ignoreCase="true" expand="false"
>           tokenizerFactory="solr.StandardTokenizerFactory"/>
>   <filter class="solr.FlattenGraphFilterFactory"/>
> </analyzer>
>
>
> thanks
