I had a fun little weekend project: I tried using inotify to speed up dpkg file triggers, with man-db as the test case, though the approach isn't limited to man-db.
My code consists of two programs: one parses a .triggers file and collects the events in the background, and the other asks for those changes. I didn't need to recompile any programs for this, just add a few lines to man-db's postinst file.

Some benchmarks, to motivate this thing. I chose "the" as a benchmark since it's a simple package with a single man page in section 1, and man1 is a directory with a few thousand entries.

    # time { for a in {1..10}; do dpkg --remove the; dpkg -i /var/cache/apt/archives/the_3.3~rc1-2_amd64.deb ; done }

Plain old man-db trigger:

    real 0m39.733s
    user 0m11.949s
    sys  0m7.768s

With inotify and using mandb -f:

    real 0m26.910s
    user 0m11.081s
    sys  0m4.428s

That's a lot of stat calls left uncalled.

The reason my test code handles only single man pages is that mandb accepts only one -f parameter. I didn't try changing that for this test. mandb still has a nontrivial startup time, and I wouldn't call it in a loop as it is.

I'd say that doing this is worthwhile, but I'd like to discuss the specifics.

How closely should this be associated with dpkg itself? Starting the collection process takes about 200ms, so I'm not quite sure how well launching it at the same time as dpkg itself would work. With apt-get or aptitude that would pose no problem. On the other hand, man is an example where we could eliminate that delay by applying some domain-specific knowledge: stop readdir early if a directory contains any non-directories, since we know that, for man, such a directory won't have subdirectories. We're only adding inotify watches on directories.

Who should decide which packages have inotify data collection enabled? I don't expect this level of detail to be useful for all packages.

How configurable should this be? I doubt any trigger would benefit from getting a list of a hundred files or so; it would be better off just doing a full run of whatever it does.
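To make the early readdir stop for man a bit more concrete, here is a minimal sketch in Python. It is not the attached test code. The function name dirs_to_watch is hypothetical, and the sketch only builds the list of directories that would each get an inotify watch, without touching inotify itself. It relies on the man-specific assumption above: a directory holding any non-directory entry has no subdirectories.

```python
import os


def dirs_to_watch(root):
    """Return the directories under root that would get an inotify watch.

    Assumption (man-specific, taken from the discussion above): a
    directory containing any non-directory entry has no subdirectories,
    so we stop reading it as soon as we meet one.
    """
    watch = []
    stack = [root]
    while stack:
        path = stack.pop()
        watch.append(path)
        with os.scandir(path) as entries:
            for entry in entries:
                if not entry.is_dir(follow_symlinks=False):
                    # Leaf directory such as man1: skip the remaining
                    # (possibly few thousand) entries.
                    break
                stack.append(entry.path)
    return watch
```

On a man tree this reads at most one file entry per section directory rather than all of them, which is where I'd expect most of the startup delay to go. Whether that really removes the 200ms is something I haven't measured.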
I'd keep this information optional, with triggers falling back to what they currently do. There's a chance (however small) that inotify fills up its event buffer, in which case any data collection routine has no choice but to bail out, and we have non-Linux systems to consider too. I'm not entirely sure this thing couldn't have false negatives, where changes go unnoticed. But triggers are supposed to cope with that already.

I haven't tried looking at dpkg's source to see how it decides to call a file trigger, why it doesn't make a file list available, or what would need to be done to expose one. I know that it doesn't use inotify. Strictly speaking, none of what I did really necessitates dpkg's, apt's or anyone's cooperation, if I made it an independent daemon and just let a package's postinst trigger optionally use it when it's active.

I've attached my test code. I don't know what earlier attempts there have been at doing this sort of thing. Most file alteration monitoring software (e.g. fam, gamin, incron) is geared more towards having actions happen when files change, not towards recording the changes.
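The optional-with-fallback behaviour could be sketched roughly like this, again in Python and again not the attached code. The function name choose_mandb_command is hypothetical, and a plain "mandb" stands in for whatever the man-db trigger currently runs. The only mandb feature assumed is the single -f parameter mentioned earlier.

```python
def choose_mandb_command(changed):
    """Pick the command a man-db trigger would run.

    changed is a list of changed man page paths, or None when no data
    is available (collector not running, non-Linux system, or the
    inotify event buffer overflowed).
    """
    # Stand-in for what the trigger does today (an assumption).
    full_run = ["mandb"]
    if changed is None:
        return full_run
    if len(changed) != 1:
        # mandb accepts only one -f, and an empty list may be a false
        # negative, so both cases get the full run.
        return full_run
    return ["mandb", "-f", changed[0]]
```

The point is that the full run stays the default, and the file list is only ever an optimisation on top of it.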
inotify-interest.tar.gz