It's definitely safe to have one rolling mode writing and one repacking. I wouldn't run multiple repacks in parallel, as they can wind up doing duplicate work (though the end result should always be correct and safe).
Here's what we run:

    # Any time the disk gets over 50%, compress -o single down to data
    13 * * * * [% INCLUDE cronjob c='/home/mod_perl/hm/scripts/xapian_compact.pl -a -o -d 50 temp data' %]

    # Copy the temporary search databases down to data during the week
    43 1 * * 1,2,3,4,5,6 [% INCLUDE cronjob c='/home/mod_perl/hm/scripts/xapian_compact.pl -a temp,meta data' %]

    # Sundays repack the entire data directory
    43 1 * * 0 [% INCLUDE cronjob c='/home/mod_perl/hm/scripts/xapian_compact.pl -a temp,meta,data data' %]

    # Late on Sundays, pack any oversized data directories down to archive
    0 15 * * 0 [% INCLUDE cronjob c='/home/mod_perl/hm/scripts/xapian_archive.pl -a' %]

And here's the interesting logic. In xapian_compact.pl:

    if ($Opts{d}) {
      my $Path = $Slot->SearchPath();
      my $Usage = df($Path);
      my $RunUsage = df("/run/cyrus");
      return Process::Status->new(0)
        if ($Usage->{per} < $Opts{d} and $RunUsage->{per} < $Opts{d});
    }

    my @args = (-z => $dest, -t => $src);
    push @args, '-v' if $Opts{v};
    push @args, '-o' if $Opts{o};
    push @args, '-F' if $Opts{F};
    push @args, '-X' if $Opts{X};
    push @args, ('-T' => $Opts{T}) if $Opts{T};
    push @args, ('-u' => $Opts{u}) if $Opts{u};

    my %RunOpts = (
      PrintOutput => 1,
    );
    $RunOpts{Nice} = 1 unless $Opts{N};
    $RunOpts{Daemon} = 1 if $Opts{D};

    $0 = "xapian_compact: $SN";
    $Slot->RunCommand(\%RunOpts, 'squatter', @args);

And in xapian_archive.pl:

    my $Percent = $Opts{P} || 20;

    [...]

    foreach my $user (sort keys %$DataUsage) {
      my $au = $ArchiveUsage->{$user} || 1;
      my $du = $DataUsage->{$user} || 1;
      if ($du < 5000) {
        print "Too small $user ($du)\n";
        next;
      }
      my $This = int($du * 100 / $au);
      if ($This < $Percent) {
        print "Not enough dirty $user: ($du, $au)\n";
        next;
      }
      print "Recompacting $user: ($du, $au)\n";
      my @args = (-z => 'archive', -t => 'data,archive');

    [...]

In summary, repack data down to archive if data is more than 1/5 size of existing archive.

So each of these scripts is a wrapper around squatter to help it run automatically.

Bron.
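P.S. For anyone who wants to experiment with the archive threshold without the surrounding script, the decision in xapian_archive.pl can be pulled out into a standalone function. This is only a minimal sketch: the name archive_decision and the sample sizes are made up for illustration and are not part of the real script, and the sizes are assumed to be in whatever units the usage tables report.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of xapian_archive.pl: decide whether a
# user's data tier should be repacked down into archive.
# $du and $au are the data and archive sizes, in the same units.
sub archive_decision {
    my ($du, $au, $percent) = @_;
    $percent ||= 20;    # same default as "$Opts{P} || 20"
    $du ||= 1;          # missing or zero usage is treated as 1,
    $au ||= 1;          # which also avoids dividing by zero

    # Skip users whose data tier is below the fixed minimum size.
    return 'too small' if $du < 5000;

    # Data size as an integer percentage of the archive size.
    my $ratio = int($du * 100 / $au);
    return 'not dirty enough' if $ratio < $percent;

    return 'recompact';
}

# Made-up sizes, purely for illustration.
for my $case ([4000, 100000], [6000, 100000], [30000, 100000], [6000, 0]) {
    my ($du, $au) = @$case;
    printf "%-6d %-6d %s\n", $du, $au, archive_decision($du, $au);
}
```

With those sample values the four cases come out as "too small", "not dirty enough", "recompact" and "recompact". The last one shows that a user with no archive yet is always repacked once their data tier passes the minimum size.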
On Mon, Feb 11, 2019, at 21:55, Egoitz Aurrekoetxea wrote:
> Now I'm noticing, for instance, that to move data between Xapian databases
> you need to launch something like:
>
> sudo -u cyrus /usr/cyrus/bin/squatter -C /usr/local/etc/imapd.conf -v -z archive -t temp,meta,data,archive -u user/ego...@sarenet.es
>
> Perhaps it would be better to do:
>
> sudo -u cyrus /usr/cyrus/bin/squatter -C /usr/local/etc/imapd.conf -F -v -z archive -t temp,meta,data,archive -u user/ego...@sarenet.es
>
> But then, having two Squatter processes running at the same time, one for
> rolling mode and one for moving/repacking data, should not be an issue?
>
> Thanks mates!!
>
> ---
>
> sarenet
> Egoitz Aurrekoetxea
> Systems Department
> 944 209 470
> Parque Tecnológico. Edificio 103
> 48170 Zamudio (Bizkaia)
> ego...@sarenet.es
> www.sarenet.es
>
> Before printing this email, consider whether it is necessary.
>
> On 11-02-2019 11:22, Egoitz Aurrekoetxea wrote:
>> Hi Bron,
>>
>> So, it would be interesting to run it once a day... for instance in
>> cyrus.conf, in the events section:
>>
>> repack_xapian cmd="squatter -F" at=0200
>>
>> Is it necessary to stop the other rolling Squatter we run, in the same
>> cyrus.conf, as:
>>
>> START {
>> # do not delete this entry!
>> recover cmd="ctl_cyrusdb -r"
>>
>> squatter cmd="squatter -R"
>> }
>>
>> Thank you so much for all the clarifications mate :) really :)
>>
>> Cheers!
>>
>> On 11-02-2019 10:23, Bron Gondwana wrote:
>>> Conversations.db is an index over lots of interesting bits of the message,
>>> but the key part that's used by Xapian is the mapping from G key (aka:
>>> GUID, aka: sha1 of the message RFC822 data) to individual email. It's used
>>> for deduplication and for mapping from results to messages.
>>>
>>> The data in conversations.db is added and removed in real time as messages
>>> are appended and updated in the cyrus.index.
>>>
>>> The data in the xapian databases on the other hand is append only - so you
>>> can wind up with hits that no longer map to existing emails. The way to
>>> solve that is with a xapian repack that filters messages - which can be
>>> done using the -F flag to squatter.
>>>
>>> Cheers,
>>>
>>> Bron.
>>>
>>> On Sat, Feb 9, 2019, at 23:04, Egoitz Aurrekoetxea wrote:
>>>> Good morning,
>>>>
>>>> As far as I understood, for Xapian you first create its conversations
>>>> database in order for it to work. Later you create database(s) for each
>>>> mailbox where Xapian can search. You can move data between them, and new
>>>> mail gets indexed, for instance by Squatter in rolling mode... that's ok...
>>>> and understood, I think. I was wondering, what happens when mail indexed in
>>>> the archive database is removed and then does not exist any more in the
>>>> database... does Squatter rolling log manage that too?
>>>>
>>>> By the way, I was wondering: if mail gets indexed in the tier databases
>>>> (for instance in Fastmail in temp, meta, data, archive...), what's the role
>>>> or function of the conversations databases you create with
>>>> ctl_conversationsdb -b -r?
>>>>
>>>> Cheers!
-- 
Bron Gondwana, CEO, FastMail Pty Ltd
br...@fastmailteam.com
----
Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus