URL: <http://savannah.gnu.org/bugs/?28730>

                 Summary: Bad Mach object cache reuse
                 Project: The GNU Hurd
            Submitted by: sthibaul
            Submitted on: Sun 24 Jan 2010 23:28:15 CET
                Category: Meta
                Severity: 3 - Normal
                Priority: 5 - Normal
              Item Group: None
                  Status: None
                 Privacy: Public
             Assigned to: None
         Originator Name:
        Originator Email:
             Open/Closed: Open
         Discussion Lock: Any
         Reproducibility: None
              Size (loc): None
         Planned Release: None
                  Effort: 0.00
Wiki-like text discussion box:

    _______________________________________________________

Details:

/var/tmp/hurd-20090404 holds 4142 files totalling 77MB
/var/tmp/sudo* holds 434 files totalling 26MB
vm_object_cached_max is set to 4000 in my gnumach, thus not enough for
the whole hurd-20090404
I have 256MB memory, thus plenty for cache

$ time rgrep foobarbaz /var/tmp/hurd-20090404
1m5
$ time rgrep foobarbaz /var/tmp/sudo*
15s
$ time rgrep foobarbaz /var/tmp/sudo*
10s
$ time rgrep foobarbaz /var/tmp/sudo*
8s
$ time rgrep foobarbaz /var/tmp/sudo*
5s
$ time rgrep foobarbaz /var/tmp/sudo*
4s
$ reboot
$ time rgrep foobarbaz /var/tmp/sudo*
15s
$ time rgrep foobarbaz /var/tmp/sudo*
4s

This means that filling the cache with hurd-20090404 first gets in the
way of getting sudo* into the cache.

I've dug a bit: the problem lies in the periodic sync of ext2fs. If I
pass --sync=60 to ext2fs, and then retry

$ time rgrep foobarbaz /var/tmp/hurd-20090404
1m5
$ ... wait for ext2fs sync to happen, then as soon as it's over,
$ time rgrep foobarbaz /var/tmp/sudo*
15s
$ time rgrep foobarbaz /var/tmp/sudo*
4s

What I can notice is that on ext2fs sync, all its cached objects get
taken out of the object cache (for the sync) and then put back again
(probably because of the call to memory_object_lock_request()).

In the first scenario, this means the following happens in the object
cache (shown in brackets; objects on the left were queued first).

[]
$ time rgrep foobarbaz /var/tmp/hurd-20090404
[ ---------------- hurd-20090404 objs ------------------- ]
$ time rgrep foobarbaz /var/tmp/sudo*
[ hurd-20090404 objs ------------------------ | sudo objs ]
# ext2fs sync
[ ------ mixture of hurd-20090404 objs and sudo objs ---- ]
# rgrep continues, new sudo objs push the mixture out, and some sudo
# objs get dropped
[ mixture of hurd-20090404 objs and sudo objs | sudo objs ]
# ext2fs sync
[ ------ mixture of hurd-20090404 objs and sudo objs ---- ]
[ mixture of hurd-20090404 objs and sudo objs | sudo objs ]
15s

I.e. the ext2fs sync mixes old useless objects with sudo objs, in an
order determined by hash functions (see diskfs_node_iterate() or
ports_bucket_iterate()), and in the end a lot of the sudo objs have been
pushed out and dropped from the cache, which is unfortunate.

When raising the period to 60s, the rgrep has time to finish before its
sudo objects get scrambled with the hurd objects, and then cache hits
can happen.

To summarize, the issue is that write_all_disknodes() touches the cached
objects so heavily that GNU Mach believes they are being used, while
they are not.

Raising vm_object_cached_max is not a solution, since that won't prevent
the scrambling, just reduce its effect. Tinkering with
memory_object_lock_request() is not so easy either, the underlying issue
being that ext2fs issues syncs for all objects in a basically random
order. Could there be a way for ext2fs to know which objects need a
sync?
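
For illustration only, here is a toy Python model of the scrambling effect described above. It is a sketch under loud assumptions, not GNU Mach or ext2fs code: the cache is modelled as a plain oldest-first queue bounded by 4000 objects, the hash-order requeueing done by the sync is approximated by a seeded random shuffle, and a "sync" is triggered every sync_every file reads rather than on a timer. The names ObjectCache, grep_pass, simulate and sync_every are all hypothetical.

```python
import random
from collections import OrderedDict


class ObjectCache:
    """Toy bounded queue of cached objects, oldest first (not Mach code)."""

    def __init__(self, max_objects):
        self.max_objects = max_objects
        self.queue = OrderedDict()  # insertion order stands for queue order

    def access(self, name):
        """Use an object; return True if it was still in the cache."""
        hit = name in self.queue
        if hit:
            del self.queue[name]            # leaves the cache while in use
        self.queue[name] = None             # requeued at the tail on release
        while len(self.queue) > self.max_objects:
            self.queue.popitem(last=False)  # drop the oldest object
        return hit

    def scramble(self, rng):
        """Assumed model of a sync: every object is taken out and requeued
        in an order unrelated to its age (a shuffle stands in for hash order)."""
        names = list(self.queue)
        rng.shuffle(names)
        self.queue = OrderedDict.fromkeys(names)


def grep_pass(cache, files, sync_every=None, rng=None):
    """Read every file once; optionally 'sync' every sync_every reads."""
    hits = 0
    for i, name in enumerate(files, start=1):
        hits += cache.access(name)
        if sync_every and i % sync_every == 0:
            cache.scramble(rng)
    return hits


def simulate(sync_every=None, seed=0):
    """Fill the cache with hurd objects, then grep the sudo files twice."""
    rng = random.Random(seed)
    cache = ObjectCache(max_objects=4000)       # vm_object_cached_max
    hurd = [f"hurd-{i}" for i in range(4142)]   # file counts from the report
    sudo = [f"sudo-{i}" for i in range(434)]
    grep_pass(cache, hurd)
    first = grep_pass(cache, sudo, sync_every, rng)
    second = grep_pass(cache, sudo, sync_every, rng)
    return first, second


if __name__ == "__main__":
    print("no sync:   (first, second) hits =", simulate(sync_every=None))
    print("with sync: (first, second) hits =", simulate(sync_every=50))
```

Without any scrambling, the second sudo pass hits on all 434 objects. With scrambling, some sudo objects end up near the head of the queue among the old hurd objects and are dropped as new sudo objects come in, so the second pass sees fewer hits. The model says nothing about real timings; it only shows why requeueing in an age-unrelated order defeats the queue's ordering.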