Apparently, though unproven, at 18:31 on Friday 19 November 2010, Paul Hartman 
did opine thusly:

> On Fri, Nov 19, 2010 at 9:17 AM, Alan McKinnon <alan.mckin...@gmail.com> wrote:
> > Hi all,
> > 
> > Haven't had much luck finding this info:
> > 
> > If I reboot this machine and start KDE, Nepomuk starts a rather
> > long-lived index of my home directory. It takes up about 30-40% cpu and
> > lasts as much as 15 minutes sometimes. This is annoying because after a
> > reboot I usually want to catch up on mail, rss feeds and fire up
> > VirtualBox. So nepomuk is just wasting my time at this point.
> 
> My /guess/ is that it scans every time you restart to be sure nothing
> changed while it was shut down. It doesn't know if you've dual-booted,
> logged into xfce, mounted the disk in another machine, had fsck remove
> files, etc.
> 
> I think Tracker behaves the same way in gnome-land.

I think that's a bit silly, doing a full scan just in case stuff changed.

If so, a very simple optimization would be to calculate a hash of some aspect 
of a directory, store the hash persistently, and only do a full scan if the 
hash is different.

I haven't read the code, so I'm in no real position to know how it's done or 
how to optimize it.
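
Purely as a sketch of the hash idea, and not a description of what Nepomuk 
actually does: the snippet below fingerprints a directory from each file's 
relative path, size and mtime, stores the result, and reports whether a full 
scan looks necessary. The names directory_fingerprint, needs_full_scan and 
the state file are made up for illustration. Note that it still has to stat 
every file, so it only saves the cost of reading and re-indexing contents.

```python
import hashlib
import os


def directory_fingerprint(root):
    """Hash the relative path, size and mtime of every file under root."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished while we were walking
            rel = os.path.relpath(path, root)
            record = f"{rel}\0{st.st_size}\0{st.st_mtime_ns}\n"
            digest.update(record.encode("utf-8", "surrogateescape"))
    return digest.hexdigest()


def needs_full_scan(root, state_file):
    """Return True if root changed since the fingerprint was last stored.

    state_file is a hypothetical location for the stored hash; it must live
    outside root, otherwise writing it would change the fingerprint.
    """
    current = directory_fingerprint(root)
    try:
        with open(state_file) as fh:
            previous = fh.read().strip()
    except FileNotFoundError:
        previous = None  # first run: nothing stored yet
    if current == previous:
        return False
    with open(state_file, "w") as fh:
        fh.write(current)
    return True
```

An indexer would call needs_full_scan() once at session start and skip the 
full pass when it returns False.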

> > How does nepomuk know when to do its thing, how can I tweak what it does
> > and how can I discover why it feels it necessary to reindex my entire
> > maildir when surely it has a perfectly valid index already from just
> > before I shut down?
> 
> I am pretty sure it is tied to your KDE user session, and not running
> as a system daemon in the background. Perhaps you can suspend it via
> some autostarting script, and then resume it after whatever amount of
> time you're comfortable with.
> 
> Looking in here:
> http://api.kde.org/4.5-api/kdebase-runtime-apidocs/nepomuk/html/classNepomuk_1_1IndexScheduler.html
> 
> In the indexing speed settings, it says:
> "
> enum Nepomuk::IndexScheduler::IndexingSpeed
> 
> Enumerator:
>     FullSpeed         Index at full speed, i.e. do not use any artificial delays.
>     This is the mode used if the user is "away".
> 
>     ReducedSpeed      Reduce the indexing speed mildly.
>     This is the normal mode used while the user works. The indexer
> uses small delay between indexing two files in order to keep the load
> on CPU and IO down.
> 
>     SnailPace         Like ReducedSpeed delays are used but they are much
> longer to get even less CPU and IO load.
>     This mode is used for the first 2 minutes after startup to give
> the KDE session manager time to start up the KDE session rapidly.
> "
> 
> So based on that, for the first 2 minutes after KDE starts it should
> be using the least aggressive indexing speed (but indexing
> nevertheless).

Good find. Personally, I'd like it to wait for 10-20 minutes after session 
start, then just run at SnailPace, period. This machine is seldom booted or 
even logged out of KDE (I suspend), so I can tolerate the wait as it's rare.
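
To make that wish concrete, here is a rough sketch of the schedule I have in 
mind. It is not Nepomuk's API or its real settings: pause_before, run_indexer 
and the delay values are all hypothetical. It stays idle for a fixed quiet 
period after session start, then indexes one file at a time with a constant 
delay in between, never speeding up.

```python
import time


def pause_before(elapsed, start_delay=15 * 60, per_file_delay=2.0):
    """Seconds to sleep before indexing the next file.

    elapsed is the time in seconds since session start. Both delays are
    made-up example values, not anything Nepomuk defines.
    """
    if elapsed < start_delay:
        return start_delay - elapsed  # still in the quiet period: wait it out
    return per_file_delay  # afterwards: constant slow pace


def run_indexer(files, index_one, clock=time.monotonic, sleep=time.sleep, **delays):
    """Index files one by one, sleeping according to pause_before()."""
    session_start = clock()
    for path in files:
        sleep(pause_before(clock() - session_start, **delays))
        index_one(path)
```

The clock and sleep parameters are only there so the schedule can be tried 
out without actually waiting 15 minutes.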

> (Personally I've always had all that indexing/social-semantic-desktop
> stuff disabled completely.)

Maybe I should too. But I *did* want to use this nepomuk thing for a while and 
see for myself what the semantic desktop can do. It looks like it could be 
awesomely useful (like Google turned out to be awesomely useful), but it takes 
real usage to know.




-- 
alan dot mckinnon at gmail dot com
