On Fri, 29 Mar 2024 14:56:14 +0100 Ondrej Zary <z...@gsystem.sk> wrote:
> Package: dovecot-fts-xapian
> Version: 1.5.5-1+b2
> Severity: important
> Tags: upstream
> X-Debbugs-Cc: z...@gsystem.sk
>
> Dear Maintainer,
> dovecot-fts-xapian crashes with "Memory too low (text) 'std::bad_alloc'" error when indexing large mailboxes.

This points to a configuration issue. dovecot-fts-xapian may well require more memory than dovecot-fts-flatcurve, but if you hit the limit with the default settings, you should raise the memory limit for the indexer worker (are you using the default values?).

Note that the fts xapian docs tell you to set vsz_limit for the 'service indexer-worker' to 2GB. Did you set it?

Note that I am not saying that fts xapian is perfect or that fts-flatcurve is not better, only that there is documentation to follow when using fts xapian. Running out of memory is not an fts xapian bug, only a non-compliant configuration. It could be that fts xapian requires more memory by design.

> I've compiled and tested 1.7.8 and 1.7.9 but it's not getting any better.
> They introduce major bugs such as excessive debug logging regardless of verbosity configured.

Could you share your workaround (and maybe send a PR upstream, i.e. https://github.com/grosjo/fts-xapian), or open a bug report explaining the context and the workaround? I might stay with fts xapian for a while with trixie, so it would be great if such bugs were fixed before the release.

> I've patched that manually but the basic thing - indexing - does not work reliably.
> It was not able to complete indexing of the mailbox (about 18K messages, 18GB) - it stuck on one message with 100% CPU usage (on all cores!) for hours,
> probably an infinite loop.

Do you have logs, or a list of the processes? It could even be dovecot constantly respawning indexer workers because they keep crashing, because the memory limit for the indexer worker is too low for an fts xapian worker (at least one that has all indexing options enabled, which as far as I know is the default for fts xapian but not for fts flatcurve).

> This package is so buggy that it's useless.
> Removed it and compiled dovecot-fts-flatcurve (1.0.2) instead (files to create a deb package are already in #1010868).
> It simply works - indexed the mailbox without any problems in a couple of minutes!

My maildir is 118GB and I have seen no infinite loop. Still, I am a user and could be interested in fts flatcurve if it really is that fast. Did you enable substring matching in fts-flatcurve and still index 18GB in a few minutes? Are you on an SSD, NVMe, or a 5400 rpm HDD? (I don't see how you could build a database of terms for 18GB in a few minutes on an HDD.)

Cheers,
Alban
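
P.S. For reference, a minimal sketch of the vsz_limit setting mentioned above, assuming the usual dovecot service-block syntax. Which file it goes in (dovecot.conf or a file under conf.d/) depends on your setup and is an assumption here; please check the fts xapian README for the currently recommended values.

```
service indexer-worker {
  # Raise the virtual memory limit for indexer workers to 2GB,
  # the value the fts xapian docs ask for.
  vsz_limit = 2G
}
```

After changing it, reload dovecot and retry the indexing to see whether the std::bad_alloc errors go away.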