I redid my patch series to include just the things I'm sure are bugfixes for git HEAD. I also redid their commit messages with line wrapping. Will send in a quoting cleanup patch later, with more stuff in one patch, when I'm ready to sign off on it.
I've played around with my prefetch idea, and now I'm happier with
it, having explored the possibilities a bit. It'd be neat if some
people could test this on their own systems, esp. ones with any kind
of slow hard drive or slow CPU, or a crufty /etc/bash_completion.d
with a lot of crap in it.
Sending an email chock-full of everything that's worth writing down,
just so it's in the list archives in case anyone ever wants it. I
didn't want to stick most of this into a git commit or a bug comment.
It seems to be hard to get Linux to really drop filesystem caches.
That, or my SATA hard drive's internal cache is big enough that it
doesn't need to seek around to get the requested data when all you're
doing is time . ./bash_completion between dropping caches. The drive
cache seems the more likely explanation: a run right after dropping
caches is faster than the first run after not doing one for a while,
but nowhere near as fast as with hot cache.
Intel Core2 Duo E6600 (2.4GHz, 2 non-hyperthreaded cores) 5GB RAM
/etc on ext4 (noatime) on a WD10EARS-00M (fairly old green-power magnetic HD)
/usr/local/src/bash-completion on xfs on the same HD
One point of interest with these results is that the very FIRST
test of no-prefetch is FAR higher than any of the others. IDK how to
get my system back to a state where it will take that long again. If
anyone has any suggestions, that'd be great. The hd churn from
grepping a linux kernel tree helps some, but it's not enough.
alias churn_hd="grep -r --include '*.c' xxxxx /usr/local/src/linux/ubuntu-trusty/ >/dev/null"
alias dropcache="sync; echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null; sleep 6"
# the sleep 6 seconds is to give things time to settle down right
# after dropping caches. Otherwise you get re-reading of
# constantly-accessed data contending with bash_completion, which is
# realistic but less repeatable.
no prefetch:
$ for i in {0..4}; do dropcache; time BASH_COMPLETION_DISABLE_PREFETCH=1 . ./bash_completion; churn_hd; done 2>&1 |
    grep '^real' --line-buffered |
    perl -ne '/m(.*)s/ and $tot+=$1 and ++$c; print; END { print " avg: ", $tot/$c, "\n"; }'
real 0m1.383s
real 0m0.605s
real 0m0.893s
real 0m0.797s
real 0m0.677s
avg (of last 4 only): 0.743
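A rough awk sketch of the same averaging, with the option of skipping the first N samples (handy for the "last 4 only" figures). avg_real is a hypothetical helper name, not part of bash_completion, and it assumes the "real 0m0.605s" format that bash's time keyword prints.

```bash
# avg_real [SKIP]: average the "real" times read from stdin,
# ignoring the first SKIP samples (default 0).
avg_real() {
    awk -v skip="${1:-0}" '
        $1 == "real" {
            n++
            if (n <= skip) next
            split($2, t, /[ms]/)          # "0m1.383s" -> t[1]="0", t[2]="1.383"
            total += t[1] * 60 + t[2]     # minutes and seconds to seconds
            count++
        }
        END { if (count) printf "avg: %.4f\n", total / count }
    '
}
```

Usage would be something like: ... done 2>&1 | grep '^real' | avg_real 1
Fed the five no-prefetch samples above with a skip of 1, it prints avg: 0.7430, which agrees with the 0.743 figure.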
My first version of prefetch: just cat
prefetch with: ( exec cat $glob &>/dev/null </dev/null )& disown $!
for i in {0..4}; do dropcache; time . ./bash_completion; churn_hd; done 2>&1 |
grep '^real'
real 0m0.761s 0m0.557s 0m0.821s 0m0.558s 0m0.749s (collapsed
for readability)
avg: 0.6892
with hot cache I get: avg: 0.0906
Trying to get fancier: fork off cat, and then access all the inodes
of the files we will want, to get a deeper read queue depth.
prefetch with
(
  shopt -s failglob  # cat doesn't run at all with no matches, even if nullglob is on.
  # I was worried that cat could get stuck reading from stdin,
  # but ( cat & ) redirects stdin from /dev/null because subshells don't have job control.
  exec &>/dev/null   # quash output of cat and any bash failglob errors
  cat $glob &        # contents
  true $glob/        # inodes. just expanding this glob will stat(2), no ls -dL needed.
) &
disown $!            # don't pollute job control
real 0m1.431s 0m0.665s 0m0.594s 0m0.713s 0m0.570s
avg (of last 4 only): 0.6355
real 0m0.701s 0m0.557s 0m0.713s 0m0.594s 0m0.797s
avg: 0.6724
real 0m0.594s 0m0.977s 0m0.558s 0m0.737s 0m0.593s
avg: 0.6918
real 0m0.893s 0m0.570s 0m0.749s 0m0.606s 0m0.569s
avg: 0.6774
avg of averages: 0.669s (still excluding the 1.4sec outlier)
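The stdin comment in that prefetch subshell can be checked directly. A minimal sketch, assuming a non-interactive bash (bash -c here, so job control is off): a command started with & gets /dev/null as its default stdin, so a backgrounded cat sees EOF immediately instead of reading the pipe. The variable names are only for illustration.

```bash
# Foreground cat inherits the pipe as stdin and copies it through.
fg_out=$(printf 'hello\n' | bash -c 'cat')

# Backgrounded cat in a shell without job control gets its stdin from
# /dev/null, so it reads nothing and exits straight away.
bg_out=$(printf 'hello\n' | bash -c 'cat & wait')

printf 'foreground: [%s]  background: [%s]\n' "$fg_out" "$bg_out"
```

The first capture should contain hello and the second should come back empty, which is why the prefetch cat can't get stuck waiting on the terminal.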
observations from strace:
$ for i in {0..4}; do sync; dropcache; (ls -dLF /; cat /dev/null; ) > /dev/null;
    strace -o prefetch.$i.glob-inode.cat-bg.strace -tt -s 256 -f \
        -e trace='!rt_sigprocmask,rt_sigaction' \
        bash -c "time . ./bash_completion";
    churn_hd;
  done 2>&1 | ...
real 0m1.135s 0m0.591s 0m0.590s 0m0.697s 0m0.588s
avg: 0.6165 (excluding the outlier). Faster I think because the
clock doesn't start until bash has loaded, so that gets libc and all
that cached.
in the cached case, strace bash -c 'time . ./bash_completion'
real 0m0.299s
user 0m0.103s
sys 0m0.064s
cat's read() system calls weren't always returning at the same time
as bash's. While bash was chewing on some of the bigger files, or
esp. on scripts that made it stat(2) other things while looking for
files (e.g. grub did a lot of that), the cat process got ahead of
bash and was able to get files into cache, so bash's read(2) returned
right away when it got there. This is exactly the behaviour I was
trying to get from a prefetch process.
IDK if drop_caches isn't clearing inode caches or something, but all
the stat(2) system calls from bash in the prefetch subshell happen
with no delay. Or maybe almost all the relevant inodes are near each
other on disk, and got read together in a block? Anyway, once one is
done, it's just boom, a sequence of stat calls while the other
processes are stuck on something.
I had been doing inode prefetch by running ls -dL on the glob, but
that always slowed things down, by maybe 0.08s, compared to just cat
prefetch, regardless of whether I forked cat & and exec'd ls, or vice
versa. I think the main problem was just ls finding all its
libraries and stuff, and the startup overhead. Expanding $glob/ to
make bash stat everything to see if it's a directory is a pretty neat
idea, IMHO. And it might help in a case where not all the inodes load
together. If they did all load together, just exec cat in the
background subshell would get it done as part of opening the files.
I'm leaving it in on the theory that inodes might be near each other
even if they don't all come in in the same disk read, so it's better
to prefetch all the stat info before reading file contents.
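To see what the trailing slash does, here is a small self-contained sketch. It assumes only mktemp and a scratch directory; the layout and variable names are made up. A pattern ending in / can only match directories, so the shell has to look up the type of every candidate while expanding it, which is the side effect the prefetch relies on. Which syscall actually gets used may vary between bash and libc versions; strace would show it.

```bash
tmp=$(mktemp -d)
mkdir "$tmp/subdir"
touch "$tmp/a" "$tmp/b"

# set -- replaces the positional parameters, so $# counts the matches.
set -- "$tmp"/*      # plain glob: every entry matches
all_matches=$#
set -- "$tmp"/*/     # trailing slash: only directories can match
dir_matches=$#

echo "plain glob: $all_matches entries, trailing-slash glob: $dir_matches"
rm -r "$tmp"
```

This prints 3 entries for the plain glob and 1 for the trailing-slash glob.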
On cygwin, the extra CPU time from the stat system calls, even in a
background process, might be a bad thing. cygwin stat is really slow,
last I heard. And so is fork. cygwin users might want to set
BASH_COMPLETION_DISABLE_PREFETCH=1, if this patch makes it in.
On a single-core CPU, it could be a very small slowdown. cat is
doing a decent number of system calls. Copying RAM around isn't a big
deal: the files are all very small, none big enough for cat to need
more than one size=65536 read. And since write(2) to /dev/null just
returns without doing anything, there's no extra copying there.
There is a fadvise(1), which would be perfect. (FADVISE_WILLNEED
does exactly what cat does, but without copying the data to
userspace, or writing it. It blocks until the data is cached, if the
readahead queue fills up.) However, the current implementation is
written in perl, so it's not useful to start it for a handful of small
files. It probably takes about as much disk IO to start it as
bash_completion load-time does total.
I tested having the prefetch process wait to finish stat(2)ing all the
files before running cat, and it performs essentially the same. I
think I'll go with this version, since if there is a significant
amount of disk activity from the inodes, that + cat's read requests
could make an annoying hiccup of disk load that might interfere with
something else the user had running. Prob. better to err on the side
of being less aggressive with read queue depth. Also, this way it
forks only once.
( # prefetch with
shopt -s failglob # don't even run cat if no matches
exec &>/dev/null
true $glob/
exec cat $glob
) & disown $!
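The same idea wrapped up as a standalone function, for anyone who wants to try it on some other directory. This is only a sketch and differs from the snippet above in a few places. prefetch_dir is a hypothetical name. It uses a plain * glob instead of the extglob exclusion pattern. It swaps failglob for nullglob plus an explicit empty check. It also quotes an array instead of expanding $glob unquoted.

```bash
# prefetch_dir DIR: hypothetical helper, not part of bash_completion.
# Reads DIR's entries in a background subshell to warm the page cache.
prefetch_dir() {
    local dir=$1
    [[ -d $dir ]] || return 0
    (
        exec &>/dev/null </dev/null     # no output, never touch the terminal
        shopt -s nullglob               # only affects this subshell
        files=( "$dir"/* )
        (( ${#files[@]} )) || exit 0    # nothing to prefetch
        true "$dir"/*/                  # inodes: expanding this checks each entry's type
        exec cat -- "${files[@]}"       # contents; errors on directories are discarded
    ) &
    # Keep the job out of job control; ignore the rare case where it
    # has already finished by the time we get here.
    disown $! 2>/dev/null || :
}
```

Calling prefetch_dir /etc/bash_completion.d returns immediately while the subshell reads in the background.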
real 0m1.143s 0m0.605s 0m0.677s 0m0.725s 0m0.546s
avg: 0.63825 (last 4 only)
real 0m0.689s 0m0.558s 0m0.785s 0m0.606s 0m0.833s
avg: 0.6942
real 0m0.629s 0m0.773s 0m0.581s 0m0.749s 0m0.594s
avg: 0.6652
average of averages: 0.666s
This is the version in the patch I'm attaching.
I also tested with (ls -dLF /; cat /dev/null; ) >/dev/null ahead of
the timed part, to get some essential stuff cached again.
prefetch:
(...; true $glob/
exec cat $glob ) & disown $!
$ for i in {0..4}; do sync; dropcache; (ls -dLF /; cat /dev/null; ) >/dev/null;
time . ./bash_completion; churn_hd; done 2>&1 | ...
real 0m1.033s 0m0.472s 0m0.581s 0m0.545s 0m0.749s
avg: 0.58675 (last 4)
real 0m0.533s 0m0.689s 0m0.509s 0m0.665s 0m0.533s
avg: 0.5858
real 0m0.665s 0m0.521s 0m0.713s 0m0.533s 0m0.617s
avg: 0.6098
real 0m0.485s 0m0.605s 0m0.545s 0m0.628s 0m0.581s
avg: 0.5688
avg of averages: 0.588s
no-prefetch with that ls and cat outside the timed part:
$ for i in {0..4}; do sync; dropcache; (ls -dLF /; cat /dev/null; ) >/dev/null;
time BASH_COMPLETION_DISABLE_PREFETCH=1 . ./bash_completion; churn_hd;
done 2>&1 | grep '^real' --line-buffered | ...
real 0m1.195s
real 0m0.641s
real 0m0.700s
real 0m0.665s
real 0m0.701s
avg: 0.67675 (only last 4)
real 0m0.617s
real 0m0.706s
real 0m0.617s
real 0m0.700s
real 0m0.641s
avg: 0.6562
real 0m0.821s
real 0m0.629s
real 0m0.845s
real 0m0.653s
real 0m0.917s
avg: 0.773
real 0m0.677s
real 0m0.773s
real 0m0.605s
real 0m1.169s
real 0m0.617s
avg: 0.685 (excluding outlier)
avg of averages: 0.698s
So, in this case prefetch is saving 0.11s, out of 0.7, or a 15%
speedup. Well that's not as good as I thought it was doing, but it's
pretty decent.
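The arithmetic behind that figure, as a quick awk check. The two inputs are the "avg of averages" values reported above; the variable names are just for illustration.

```bash
no_prefetch=0.698   # avg of averages without prefetch
prefetch=0.588      # avg of averages with prefetch

summary=$(awk -v base="$no_prefetch" -v pre="$prefetch" 'BEGIN {
    saved = base - pre
    printf "saved %.2fs of %.3fs (%.1f%%)", saved, base, saved / base * 100
}')
echo "$summary"
```

This prints saved 0.11s of 0.698s (15.8%), i.e. the roughly 15% quoted above.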
It's a good thing bash doesn't support inline assembly,
or I'd be at this for weeks... :P
Seriously though, a lot of people are stuck waiting for bash for a
second or so, and speeding it up a bit is worth putting effort into,
IMO.
--
#define X(x,y) x##y
Peter Cordes ; e-mail: X(peter@cor , des.ca)
"The gods confound the man who first found out how to distinguish the hours!
Confound him, too, who in this place set up a sundial, to cut and hack
my day so wretchedly into small pieces!" -- Plautus, 200 BC
From b703f34d537104a98b60c98f92c263e61077e9ed Mon Sep 17 00:00:00 2001
From: Peter Cordes <[email protected]>
Date: Wed, 3 Dec 2014 13:59:31 -0400
Subject: [PATCH 1/3] _longopt: fix parsing --help output that has -- in the
 description

This fixes parsing of things like grep --help:
  -r, --recursive           like --directories=recurse

using \([^-]\|-[^-]\)* instead of .* at the front of the pattern makes
the greedy match at the front stop at the first --, rather than getting
--directories= from the -r line.

Also move option completion ahead of the logic that checks previous arg
to see if this arg should be limited to a file or directory.  Too smart
for its own good in such a naive function, crossed up by things like
ls --directory or grep --files-with-matches.

The sed in the case $prev block doesn't need this, because it puts
$prev into the pattern, and it's already presumably a valid option.  It
will get the right --option wherever it is in the line containing it.

Also turns out that bash sorts and uniquifies the results itself, so
sort -u isn't needed.

Still not perfect, misses --silent from the help output line:
  -q, --quiet, --silent     suppress all normal output
Could maybe loop over the --matches in sed, now that we have a
sufficiently non-greedy regex to match things in front of --options,
but then you'd need a full-blown sed program with pattern and hold
space... yuck.  Or maybe use awk?  Or hardcode a pattern that can match
up to 3 long options on one line?

This also breaks on commands with weird --help output, like if they for
some reason have --something BEFORE an option name.  You could start to
work around that, with another group like --[^-A-Za-z0-9] to match a --
that isn't at the start of an option, but that's just gratuitously
unreadable.  Do that for more robustness if anyone ever turns it into a
sed program that loops over --option matches on a single line.
---
 bash_completion | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/bash_completion b/bash_completion
index 55c9e48661028cee71301fd244c12d516303d437..a1dbb48e8bb69be538cb6ae9bb4485b6fa4ec698 100644
--- a/bash_completion
+++ b/bash_completion
@@ -1777,6 +1777,20 @@ _longopt()
     local cur prev words cword split
     _init_completion -s || return
 
+    # Check for options first: some programs have options like
+    # --directory=recursive that don't take directory args
+    # It's more likely the user knows what they're doing,
+    # for this naive --help parsing function.
+    if [[ "$cur" == -* ]]; then
+        COMPREPLY=( $( compgen -W "$( LC_ALL=C $1 --help 2>&1 | \
+            sed -ne 's/\([^-]\|-[^-]\)*\(--[-A-Za-z0-9]\{1,\}=\{0,1\}\).*/\2/p' )" \
+            -- "$cur" ) )
+        # initial part of that regex matches only up to before the first --,
+        # to avoid tripping on " -r, --recursive like --directory=recursive" in grep --help, for example.
+        [[ $COMPREPLY == *= ]] && compopt -o nospace
+        return 0
+    fi
+
     case "${prev,,}" in
         --help|--usage|--version)
             return 0
@@ -1807,12 +1821,7 @@ _longopt()
 
     $split && return 0
 
-    if [[ "$cur" == -* ]]; then
-        COMPREPLY=( $( compgen -W "$( LC_ALL=C $1 --help 2>&1 | \
-            sed -ne 's/.*\(--[-A-Za-z0-9]\{1,\}=\{0,1\}\).*/\1/p' | sort -u )" \
-            -- "$cur" ) )
-        [[ $COMPREPLY == *= ]] && compopt -o nospace
-    elif [[ "$1" == @(mk|rm)dir ]]; then
+    if [[ "$1" == @(mk|rm)dir ]]; then
         _filedir -d
     else
         _filedir
-- 
2.1.3
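For anyone who wants to see the regex change in the _longopt patch above in isolation, this sketch runs the old and new sed expressions on the grep --help line quoted in the commit message. It assumes GNU sed, since \| in a basic regex is a GNU extension; the variable names are only for illustration.

```bash
# Sample line from grep --help, as quoted in the commit message.
line='  -r, --recursive           like --directories=recurse'

# Old pattern: the greedy .* runs up to the LAST -- on the line.
old=$(printf '%s\n' "$line" |
    sed -ne 's/.*\(--[-A-Za-z0-9]\{1,\}=\{0,1\}\).*/\1/p')

# New pattern: the prefix can't cross a --, so the FIRST long option wins.
new=$(printf '%s\n' "$line" |
    sed -ne 's/\([^-]\|-[^-]\)*\(--[-A-Za-z0-9]\{1,\}=\{0,1\}\).*/\2/p')

echo "old: $old"
echo "new: $new"
```

The old expression yields --directories= and the new one yields --recursive.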
From bede4c22106bdd601718019861ac8f017139069c Mon Sep 17 00:00:00 2001
From: Peter Cordes <[email protected]>
Date: Wed, 3 Dec 2014 14:00:58 -0400
Subject: [PATCH 2/3] upstart support for service completion

initctl list works for unprivileged users.

Wasn't sure what file to check to detect that upstart was present, but
/sbin should always be mounted, and upstart itself provides
/sbin/upstart-dbus-bridge, and it's not a conffile in /etc that someone
could move if they wanted to on their local system.  And it's
absolutely not going to have a name conflict with anything from another
package. :)

I think it's important to check that the system is using an upstart
init, so you don't run initctl when completing in a root shell on
another kind of system, and maybe do something like generating system
log messages.
---
 bash_completion | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/bash_completion b/bash_completion
index a1dbb48e8bb69be538cb6ae9bb4485b6fa4ec698..fd3bc41aaa160fdb3437db419d034c2be045ccc4 100644
--- a/bash_completion
+++ b/bash_completion
@@ -1137,6 +1137,10 @@ _services()
     COMPREPLY+=( $( systemctl list-units --full --all 2>/dev/null | \
         awk '$1 ~ /\.service$/ { sub("\\.service$", "", $1); print $1 }' ) )
 
+    if [[ -x /sbin/upstart-dbus-bridge ]]; then
+        COMPREPLY+=( $( initctl list 2>/dev/null | cut -d' ' -f1 ) )
+    fi
+
     COMPREPLY=( $( compgen -W '${COMPREPLY[@]#${sysvdirs[0]}/}' -- "$cur" ) )
 }
 
-- 
2.1.3
From ba68b737e6ccf0be3d8a5ab729eb3ab5e04fd2c1 Mon Sep 17 00:00:00 2001
From: Peter Cordes <[email protected]>
Date: Wed, 3 Dec 2014 14:02:38 -0400
Subject: [PATCH 3/3] speed up loading the compat dir with disk prefetch

Fork off a prefetch thread to make sure the HD isn't sitting idle while
there's still data we're going to need.

tail(1) might spend less CPU copying stuff around in RAM (it would make
fewer system calls writing /dev/null), but POSIX tail only takes one
arg.  There's a fadvise(1) which would be perfect if it was standard,
and not written in perl!

I'm seeing a moderate speedup for this change, about 15% on Linux 3.13
with a magnetic HD on an idle system, after a
  echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
And no slowdown with hot caches.  (dual core CPU)

Other changes:
Had to move the necessary stuff up near the top of the file.

Was able to greatly simplify the loop over BASH_COMPLETION_COMPAT_DIR
by using the glob in the first place, instead of ls and then filtering.

Took out the check for [[ -r $i ]] before sourcing.  If you have files
in /etc/bash_completion.d that aren't readable, you might not even
notice if bash_completion silently ignores them.  It's not like
anything else uses the directory, so don't be too quiet when there is a
problem.

I've even seen packages put completions in subdirectories (e.g. unison)
Could change from -f to -e to get warnings for that.  A package could
legitimately have a helper function or something in a subdir, though.
---
 bash_completion | 48 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/bash_completion b/bash_completion
index fd3bc41aaa160fdb3437db419d034c2be045ccc4..f2b7db58877c6800fe4e9a46d6d84dcab80ea013 100644
--- a/bash_completion
+++ b/bash_completion
@@ -47,9 +47,41 @@ readonly BASH_COMPLETION_COMPAT_DIR
 #
 _blacklist_glob='@(acroread.sh)'
 
+# Glob for matching various backup files.
+#
+_backup_glob='@(#*#|*@(~|.@(bak|orig|rej|swp|dpkg*|rpm@(orig|new|save))))'
+
 # Turn on extended globbing and programmable completion
 shopt -s extglob progcomp
 
+# source (or prefetch from disk) compat completion directory definitions
+_load_compat_dir()
+{
+    [[ -d $BASH_COMPLETION_COMPAT_DIR ]] || return
+    local i glob="$BASH_COMPLETION_COMPAT_DIR/!($_backup_glob|Makefile*|$_blacklist_glob)"
+
+    if [[ $1 == prefetch ]]; then
+        if [[ ! $BASH_COMPLETION_DISABLE_PREFETCH ]]; then
+            ( # fork a background subshell to let main continue ASAP
+                exec &>/dev/null
+                true $glob/     # inodes. expanding this glob will stat(2)
+                exec cat $glob  # contents
+            ) &
+            disown $!
+        fi
+    else
+        for i in $glob; do
+            # If there are unreadable files, user probably wants to know,
+            # so don't check -r
+            [[ -f $i ]] && . "$i"
+        done
+    fi
+}
+# called again near the end of this file, and then unset
+_load_compat_dir prefetch
+
+
+
 # A lot of the following one-liners were taken directly from the
 # completion examples provided with the bash 2.04 source distribution
 
@@ -1105,10 +1137,6 @@ _gids()
     fi
 }
 
-# Glob for matching various backup files.
-#
-_backup_glob='@(#*#|*@(~|.@(bak|orig|rej|swp|dpkg*|rpm@(orig|new|save))))'
-
 # Complete on xinetd services
 #
 _xinetd_services()
@@ -1999,16 +2027,8 @@ _xfunc()
     "$@"
 }
 
-# source compat completion directory definitions
-if [[ -d $BASH_COMPLETION_COMPAT_DIR && -r $BASH_COMPLETION_COMPAT_DIR && \
-    -x $BASH_COMPLETION_COMPAT_DIR ]]; then
-    for i in $(LC_ALL=C command ls "$BASH_COMPLETION_COMPAT_DIR"); do
-        i=$BASH_COMPLETION_COMPAT_DIR/$i
-        [[ ${i##*/} != @($_backup_glob|Makefile*|$_blacklist_glob) \
-            && -f $i && -r $i ]] && . "$i"
-    done
-fi
-unset i _blacklist_glob
+_load_compat_dir source
+unset _blacklist_glob _load_compat_dir
 
 # source user completion file
 [[ ${BASH_SOURCE[0]} != ~/.bash_completion && -r ~/.bash_completion ]] \
-- 
2.1.3
