Thank you for all your responses.

Dominique

On Mon, 30 Sep 2019 at 13:38, Erick Erickson <erickerick...@gmail.com> wrote:

> Solr/Lucene _better_ not have a copy of the synonym map for every segment,
> if so it’s a JIRA for sure. I’ve seen indexes with 100s of segments. With a
> large synonym file it’d be terrible.
>
> I would be really, really, really surprised if this is the case. The
> Lucene people are very careful with memory usage and would hop on this in
> an instant if true I’d guess.
>
> Best,
> Erick
>
> > On Sep 30, 2019, at 5:27 AM, Andrea Gazzarini <a.gazzar...@sease.io> wrote:
> >
> > That sounds really strange to me.
> > Segments are created gradually depending on changes applied to the
> > index, while the Schema should have a completely different lifecycle,
> > independent from that.
> > If that is true, that would mean each time a new segment is created Solr
> > would instantiate a new Schema instance (or at least, assuming this is
> > valid only for synonyms, one SynonymFilterFactory, one SynonymFilter, one
> > SynonymMap), which again, sounds really strange.
> >
> > Thanks for the point, I'll check and I'll let you know
> >
> > Cheers,
> > Andrea
> >
> > On 30/09/2019 09:58, Bernd Fehling wrote:
> >> Yes, I think so.
> >> While integrating a Thesaurus as synonyms.txt I saw massive memory usage.
> >> A heap dump and analysis with MemoryAnalyzer pointed out that the
> >> SynonymMap took 3 times a huge amount of memory, together with each
> >> opened index segment.
> >> Just try it and check that by yourself with heap dump and MemoryAnalyzer.
> >>
> >> Regards
> >> Bernd
> >>
> >>
> >> On 30.09.19 at 09:44, Andrea Gazzarini wrote:
> >>> mmm, ok for the core but are you sure things in this case are working
> >>> per-segment? I would expect a FilterFactory instance per index,
> >>> initialized at schema loading time.
> >>>
> >>> On 30/09/2019 09:04, Bernd Fehling wrote:
> >>>> And I think this is per core per index segment.
> >>>>
> >>>> 2 cores per instance, each core with 3 index segments, sums up to 6 times
> >>>> the 2 SynonymMaps. Results in 12 times SynonymMaps.
> >>>>
> >>>> Regards
> >>>> Bernd
> >>>>
> >>>>
> >>>> On 30.09.19 at 08:41, Andrea Gazzarini wrote:
> >>>>> Hi,
> >>>>> looking at the stateful nature of SynonymGraphFilter/FilterFactory classes,
> >>>>> the answer should be 2 times (one time per type instance).
> >>>>> The SynonymMap, which internally holds the synonyms table, is a private
> >>>>> member of the filter factory and it is loaded each time the factory needs
> >>>>> to create a type.
> >>>>>
> >>>>> Best,
> >>>>> Andrea
> >>>>>
> >>>>> On 29/09/2019 23:49, Dominique Bejean wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> My concern is about memory used by synonym filter, especially if synonym
> >>>>> resource files are large.
> >>>>>
> >>>>> If in my schema, there are two field types "TypeSyno1" and "TypeSyno2"
> >>>>> using synonym filter with the same synonyms files.
> >>>>> For each of these two field types there are two fields
> >>>>>
> >>>>> Field1 type is TypeSyno1
> >>>>> Field2 type is TypeSyno1
> >>>>> Field3 type is TypeSyno2
> >>>>> Field4 type is TypeSyno2
> >>>>>
> >>>>> How many times is the synonym file loaded in memory ?
> >>>>> 4 times, so one time per field ?
> >>>>> 2 times, so one time per instantiated type ?
> >>>>>
> >>>>> Regards
> >>>>>
> >>>>> Dominique
> >>>
> >
> > --
> > Andrea Gazzarini
> > Search Consultant, R&D Software Engineer
> >
> > mobile: +39 349 513 86 25
> > email: a.gazzar...@sease.io
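
For reference, the counts discussed in this thread can be reproduced with a small back-of-the-envelope sketch. It only does the arithmetic for the competing hypotheses (one SynonymMap per field type, per field type per core, or per field type per core per segment); it does not call Solr or Lucene, and the function name and parameters are made up for illustration. Which hypothesis actually holds is exactly what the thread leaves open.

```python
def synonym_map_copies(field_types, cores=1, segments_per_core=1, per_segment=False):
    """Count SynonymMap copies under a given (hypothetical) loading model.

    field_types       -- field types that declare a synonym filter
    cores             -- cores per Solr instance loading the same schema
    segments_per_core -- open index segments per core
    per_segment       -- True models Bernd's observation (one copy per segment)
    """
    copies = field_types * cores
    if per_segment:
        copies *= segments_per_core
    return copies


# Andrea's answer: one map per field type (TypeSyno1, TypeSyno2), a single core.
print(synonym_map_copies(field_types=2))

# Same schema loaded by 2 cores, still one map per field type per core.
print(synonym_map_copies(field_types=2, cores=2))

# Bernd's reading: 2 cores x 3 segments x 2 field types.
print(synonym_map_copies(field_types=2, cores=2, segments_per_core=3, per_segment=True))
```

The number of fields (4 in Dominique's example) does not appear in any of the three models, because none of the replies suggests a copy per field.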