Hello, Solr Community:
Actually, you can set up a tokenizer for managed synonyms.
However, this configuration is not covered in the reference guide, and I do
not know how to add a tokenizer through an API call.
So you may need to manually edit a JSON file in the config directory.
In _schema_analysis_synonyms_<Name of Resource>.json under the config
directory, you will see the JSON below.
{
  "responseHeader":{
    "status":0,
    "QTime":3},
  "synonymMappings":{
    "initArgs":{
      "ignoreCase":true,
      "format":"solr"},
    "initializedOn":"2014-12-16T22:44:05.33Z",
    "managedMap":{
      "GB": ["GiB", "Gigabyte"],
      "TV": ["Television"],
      "happy": ["glad", "joyful"]}}}
To add a tokenizer, add the following key-value pair under the "initArgs"
key.
"tokenizerFactory":"solr.<Name Of Tokenizer>Factory"
You will end up with the following JSON.
{
  "responseHeader":{
    "status":0,
    "QTime":3},
  "synonymMappings":{
    "initArgs":{
      "ignoreCase":true,
      "format":"solr",
      "tokenizerFactory":"solr.<Name Of Tokenizer>Factory"
    },
    "initializedOn":"2014-12-16T22:44:05.33Z",
    "managedMap":{
      "GB": ["GiB", "Gigabyte"],
      "TV": ["Television"],
      "happy": ["glad", "joyful"]}}}
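If you prefer not to edit the file by hand, the same change could be scripted.
The following is only a rough sketch in Python, assuming the file has the
layout shown above with "initArgs" nested under "synonymMappings". The
function name add_tokenizer_factory, the resource name "english", and the
choice of solr.KeywordTokenizerFactory are just examples of mine, not
anything Solr provides. The demo writes to a temporary directory rather than
a real config directory.

```python
import json
import tempfile
from pathlib import Path


def add_tokenizer_factory(path, factory):
    """Set initArgs.tokenizerFactory in a managed synonyms JSON file.

    Assumes the layout from this mail: a top-level "synonymMappings"
    object that contains "initArgs". Returns the updated document.
    """
    path = Path(path)
    data = json.loads(path.read_text(encoding="utf-8"))
    # Create "initArgs" if it is missing, then add or overwrite the key.
    init_args = data["synonymMappings"].setdefault("initArgs", {})
    init_args["tokenizerFactory"] = factory
    path.write_text(
        json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return data


if __name__ == "__main__":
    # Hypothetical sample file standing in for the real one.
    sample = {
        "responseHeader": {"status": 0, "QTime": 3},
        "synonymMappings": {
            "initArgs": {"ignoreCase": True, "format": "solr"},
            "initializedOn": "2014-12-16T22:44:05.33Z",
            "managedMap": {
                "GB": ["GiB", "Gigabyte"],
                "TV": ["Television"],
                "happy": ["glad", "joyful"],
            },
        },
    }
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "_schema_analysis_synonyms_english.json"
        target.write_text(json.dumps(sample), encoding="utf-8")
        updated = add_tokenizer_factory(target, "solr.KeywordTokenizerFactory")
        print(json.dumps(updated["synonymMappings"]["initArgs"], indent=2))
```

The script leaves "managedMap" and the other keys as they are. I would expect
that Solr needs the core or collection to be reloaded before it picks up the
edited file, but that step is not shown here.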
I would like to add this configuration to the Solr reference guide, but I
have not created a JIRA issue yet.
--
Sincerely,
Kaya
github: https://github.com/28kayak
On Tue, Jul 7, 2020 at 11:55, Koji Sekiguchi <[email protected]> wrote:
> I think the question makes sense: since SynonymGraphFilterFactory accepts
> tokenizerFactory, he asked whether the managed version of
> SynonymGraphFilter could accept it as well.
>
>
> https://lucene.apache.org/solr/guide/8_5/filter-descriptions.html#synonym-graph-filter
>
> The answer seems to be NO.
>
> Koji
>
>
> On 2020/07/07 8:18, Erick Erickson wrote:
> > This question doesn’t really make sense. You don’t specify tokenizers on
> > filters, they’re specified at the _field_ level.
> >
> > You can certainly define as many field(type)s as you want, each with a
> > different analysis chain and those chains can be made up of whatever you
> > want to use, and there are lots of choices.
> >
> > If you are asking to do _additional_ tokenization on the output of a
> > synonym filter, no.
> >
> > Perhaps if you defined the problem you’re trying to solve we could make
> > some suggestions.
> >
> > Best,
> > Erick
> >
> >> On Jul 6, 2020, at 6:43 PM, Thomas Corthals <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> Is it possible to specify a Tokenizer Factory on a Managed Synonym Graph
> >> Filter? I would like to use a Standard Tokenizer or Keyword Tokenizer on
> >> some fields.
> >>
> >> Best,
> >>
> >> Thomas
> >
> >
>