Re: Synonym filters memory usage

Dominique Bejean Wed, 02 Oct 2019 02:50:18 -0700

Thank you for all your responses.
Dominique

Le lun. 30 sept. 2019 à 13:38, Erick Erickson <erickerick...@gmail.com> a
écrit :


> Solr/Lucene _better_ not have a copy of the synonym map for every segment,
> if so it’s a JIRA for sure. I’ve seen indexes with 100s of segments. With a
> large synonym file it’d be terrible.
>
> I would be really, really, really surprised if this is the case. The
> Lucene people are very careful with memory usage and would hop on this in
> an instant if true I’d guess.
>
> Best,
> Erick
>
> > On Sep 30, 2019, at 5:27 AM, Andrea Gazzarini <a.gazzar...@sease.io>
> wrote:
> >
> > That sounds really strange to me.
> > Segments are created gradually depending on changes applied to the
> index, while the Schema should have a completely different lifecycle,
> independent from that.
> > If that is true, that would mean each time a new segment is created Solr
> would instantiate a new Schema instance (or at least, assuming this is
> valid only for synonyms, one SynonymFilterFactory, one SynonymFilter, one
> SynonymMap), which again, sounds really strange.
> >
> > Thanks for the point, I'll check and I'll let you know
> >
> > Cheers,
> > Andrea
> >
> > On 30/09/2019 09:58, Bernd Fehling wrote:
> >> Yes, I think so.
> >> While integrating a Thesaurus as synonyms.txt I saw massive memory
> usage.
> >> A heap dump and analysis with MemoryAnalyzer pointed out that the
> >> SynonymMap took 3 times a huge amount of memory, together with each
> >> opened index segment.
> >> Just try it and check that by yourself with heap dump and
> MemoryAnalyzer.
> >>
> >> Regards
> >> Bernd
> >>
> >>
> >> Am 30.09.19 um 09:44 schrieb Andrea Gazzarini:
> >>> mmm, ok for the core but are you sure things in this case are working
> per-segment? I would expect a FilterFactory instance per index, initialized
> at schema loading time.
> >>>
> >>> On 30/09/2019 09:04, Bernd Fehling wrote:
> >>>> And I think this is per core per index segment.
> >>>>
> >>>> 2 cores per instance, each core with 3 index segments, sums up to 6
> times
> >>>> the 2 SynonymMaps. Results in 12 times SynonymMaps.
> >>>>
> >>>> Regards
> >>>> Bernd
> >>>>
> >>>>
> >>>> Am 30.09.19 um 08:41 schrieb Andrea Gazzarini:
> >>>>>   Hi,
> >>>>> looking at the stateful nature of SynonymGraphFilter/FilterFactory
> classes,
> >>>>> the answer should be 2 times (one time per type instance).
> >>>>> The SynonymMap, which internally holds the synonyms table, is a
> private
> >>>>> member of the filter factory and it is loaded each time the factory
> needs
> >>>>> to create a type.
> >>>>>
> >>>>> Best,
> >>>>> Andrea
> >>>>>
> >>>>> On 29/09/2019 23:49, Dominique Bejean wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> My concern is about memory used by synonym filter, especially if
> synonyms
> >>>>> resources files are large.
> >>>>>
> >>>>> If in my schema, there are two field types "TypeSyno1" and
> "TypeSyno2"
> >>>>> using synonym filter with the same synonyms files.
> >>>>> For each of these two field types there are two fields
> >>>>>
> >>>>> Field1 type is TypeSyno1
> >>>>> Field2 type is TypeSyno1
> >>>>> Field3 type is TypeSyno2
> >>>>> Field4 type is TypeSyno2
> >>>>>
> >>>>> How many times is the synonym file loaded in memory ?
> >>>>> 4 times, so one time per field ?
> >>>>> 2 times, so one time per instanciated type ?
> >>>>>
> >>>>> Regards
> >>>>>
> >>>>> Dominique
> >>>
> >
> > --
> > Andrea Gazzarini
> > Search Consultant, R&D Software Engineer
> >
> >
> >
> > mobile: +39 349 513 86 25
> > email: a.gazzar...@sease.io
> >
>
>

Re: Synonym filters memory usage

Reply via email to