Apparently, though unproven, at 18:31 on Friday 19 November 2010, Paul Hartman did opine thusly:
> On Fri, Nov 19, 2010 at 9:17 AM, Alan McKinnon <alan.mckin...@gmail.com> wrote:
> > Hi all,
> >
> > Haven't had much luck finding this info:
> >
> > If I reboot this machine and start KDE, Nepomuk starts a rather
> > long-lived index of my home directory. It takes up about 30-40% cpu and
> > lasts as much as 15 minutes sometimes. This is annoying because after a
> > reboot I usually want to catch up on mail, rss feeds and fire up
> > VirtualBox. So nepomuk is just wasting my time at this point.
>
> My /guess/ is that it scans every time you restart to be sure nothing
> changed while it was shut down. It doesn't know if you've dual-booted,
> logged into xfce, mounted the disk in another machine, had fsck remove
> files, etc.
>
> I think Tracker behaves the same way in gnome-land.

I think that's a bit silly, to do a full scan just in case stuff changed. If
so, a very simple optimization would be to calculate a hash of some aspect of
a directory, store the hash persistently, and only do a full scan if the hash
is different.

I haven't read the code, so I'm in no real position to know how it's done or
how to optimize it.

> > How does nepomuk know when to do its thing, how can I tweak what it does
> > and how can I discover why it feels it necessary to reindex my entire
> > maildir when surely it has a perfectly valid index already from just
> > before I shut down?
>
> I am pretty sure it is tied to your KDE user session, and not running
> as a system daemon in the background. Perhaps you can suspend it via
> some autostarting script, and then resume it after whatever amount of
> time you're comfortable with.
>
> Looking in here:
> http://api.kde.org/4.5-api/kdebase-runtime-apidocs/nepomuk/html/classNepomuk_1_1IndexScheduler.html
>
> In the indexing speed settings, it says:
> "
> enum Nepomuk::IndexScheduler::IndexingSpeed
>
> Enumerator:
> FullSpeed     Index at full speed, i.e. do not use any artificial delays.
>               This is the mode used if the user is "away".
>
> ReducedSpeed  Reduce the indexing speed mildly.
>               This is the normal mode used while the user works. The indexer
>               uses small delay between indexing two files in order to keep
>               the load on CPU and IO down.
>
> SnailPace     Like ReducedSpeed delays are used but they are much
>               longer to get even less CPU and IO load.
>               This mode is used for the first 2 minutes after startup to
>               give the KDE session manager time to start up the KDE session
>               rapidly.
> "
>
> So based on that, for the first 2 minutes after KDE starts it should
> be using the least aggressive indexing speed (but indexing
> nevertheless).

Good find.

Personally, I'd like it to wait for 10-20 minutes after session start, then
just run at SnailPace, period. This machine is seldom booted or even logged
out of KDE (I suspend), so I can tolerate the wait as it's rare.

> (Personally I've always had all that indexing/social-semantic-desktop
> stuff disabled completely.)

Maybe I should too. But I *did* want to use this nepomuk thing for a while and
see for myself what the semantic desktop can do. It looks like it could be
awesomely useful (like Google turned out to be awesomely useful), but it takes
real usage to know.

-- 
alan dot mckinnon at gmail dot com
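
For illustration only, the directory-hash idea suggested above could be sketched roughly as follows. This is not how Nepomuk works (the post itself says the code was not read). The function names, the state file, and the choice to hash each file's relative path, size and mtime are all assumptions made for this example.

```python
import hashlib
import json
import os
from pathlib import Path


def directory_fingerprint(root):
    """Hash the relative path, size and mtime of every file under root.

    Hypothetical helper: "some aspect of a directory" is taken here to mean
    file metadata, which is cheap to read compared with file contents.
    """
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)
            except OSError:
                continue  # file vanished between listing and stat
            rel = os.path.relpath(full, root)
            record = f"{rel}\0{st.st_size}\0{st.st_mtime_ns}\n"
            digest.update(record.encode("utf-8", "surrogateescape"))
    return digest.hexdigest()


def needs_full_scan(root, state_file):
    """Return True if root differs from the fingerprint saved in state_file.

    The new fingerprint is stored persistently on every call, so a second
    call with no changes in between returns False. state_file is assumed
    to live outside root; otherwise writing it would change the fingerprint.
    """
    current = directory_fingerprint(root)
    state_path = Path(state_file)
    try:
        previous = json.loads(state_path.read_text()).get("fingerprint")
    except (OSError, ValueError):
        previous = None  # no usable saved state: treat as changed
    state_path.write_text(json.dumps({"fingerprint": current}))
    return current != previous
```

Computing the fingerprint still walks the whole tree and stats every file, so it only saves the cost of re-reading and re-indexing file contents. An indexer would call something like needs_full_scan() once at session start and skip the full scan when it returns False.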