> > > apt-file -x search bin/dig$
> > >
> > > ...does not return anything. While:
> >
> > Here the 'grep' is done against lines in *Contents*.gz in your cache
> > directory. This file format is:
> > /file/path                  section/package

Actually, it is:

path/to/file                     section/package1,section/other/package2,...

so this complicates the matching.  For one thing, users will routinely
want to type "/usr/bin/prog", which you handle by stripping off the
leading slash.  Secondly, if a user enters a strict regex for the
package name, it may not match when that package is the second or later
entry in the list.  And files can have spaces in their names (I checked,
and this isn't just theoretical; there are quite a few files with spaces).
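
To make the layout concrete, here is a minimal sketch in Python (not
apt-file's own Perl) that splits one Contents line into its path and its
package list.  It assumes only what is described above: the path may
contain spaces, the package list contains no whitespace and is
comma-separated.  The function name and the sample lines are made up
for illustration.

```python
import re

# Assumed layout: path, a run of whitespace, then a comma-separated
# list of section/package entries that itself contains no whitespace.
# The lazy group lets the path contain spaces without swallowing the
# separator in front of the package list.
CONTENTS_LINE = re.compile(r"^(\S.*?)\s+(\S+)\s*$")

def parse_contents_line(line):
    """Return (path, [section/package, ...]) or None if the line does not fit."""
    m = CONTENTS_LINE.match(line)
    if m is None:
        return None
    path, packages = m.groups()
    return path, packages.split(",")

if __name__ == "__main__":
    # Illustrative sample lines, not taken from a real Contents file.
    print(parse_contents_line("usr/bin/dig\t\tnet/dnsutils\n"))
    print(parse_contents_line("usr/share/doc/some file.txt   doc/pkg1,doc/other/pkg2\n"))
```

Because the package list is always the last whitespace-free field, a
pattern aimed at a package name has to allow for other entries and
commas on either side of it.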

Silently stripping the leading slash is too severe a default.  For
instance, I routinely want to find header files, so intuitively I would
try this:

$ apt-file -x search "/ldap\.h$"

and quickly find out that '$' doesn't work.  Because of the silent
removal of the first slash, I also get lots of other "matches" when I
remove the '$'.  If you noted this in the man page, then I think a
common idiom would be:

$ apt-file -F -x search //ldap\.h

when you actually want the leading slash and want the match to end at
the end of the name.

> > your pattern assumes that "bin" is the section and "dig" the package
> > name.
>
> Har, har, it is not my problem when you do it internally in this way. I
> have read the manpage and the help output and all it says leads me to an
> assumption that I can specify a regexp of a _file_ name when looking for
> a _file_ with apt-*file*.

I agree here.  It's very unintuitive when you expect the input and
output to be similar.  The documentation also doesn't mention what type
of regex is used; you would have to look at the source or resort to
trial and error.

> Either implement it in a way expected by the user (splitting the fields

I looked into splitting the fields and reordering them, and I think
it costs too much at runtime.  Even one extra regex per line really
adds up when you loop through that much data.

I wanted to have an intuitive version where it would do this (note:
you do need this type of regex because there are filenames with spaces
in the middle, and you don't want to capture trailing spaces):

($file, $packages) = /^(\S+(?:[\S\s]*\S+)?)\s+(\S+)\s*$/;

and then I could do:

$file =~ /$user_regex/;

so a '$' would work and so on.  This really increased the runtime
though and I don't think it is acceptable.
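
For readers who want to see the difference, here is a minimal sketch in
Python (not the Perl in apt-file) of matching the user's regex against
the file field only.  The helper names and the sample lines are
hypothetical, and the sketch does nothing about the runtime cost
described above.

```python
import re

def split_line(line):
    """Split a Contents line into (path, packages); the package list is the last field."""
    parts = line.rstrip().rsplit(None, 1)
    return tuple(parts) if len(parts) == 2 else None

def search_paths(lines, user_regex):
    """Apply user_regex to the path field only, so '^' and '$' refer to the path."""
    pat = re.compile(user_regex)
    hits = []
    for line in lines:
        fields = split_line(line)
        if fields and pat.search(fields[0]):
            hits.append(fields)
    return hits

if __name__ == "__main__":
    # Illustrative sample lines, not taken from a real Contents file.
    lines = [
        "usr/bin/dig\t\tnet/dnsutils\n",
        "usr/bin/digest\t\tutils/example\n",
    ]
    # Against the raw line, '$' sits after the package field, so nothing matches.
    print(re.search(r"bin/dig$", lines[0]))
    # Against the path field, the same pattern selects only usr/bin/dig.
    print(search_paths(lines, r"bin/dig$"))
```

Splitting from the right on whitespace avoids a second regex per line,
but it is still extra work for every line in the file.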

> in Contents input and reordering them) or at least document it properly
> (and do not call it simply regular expression, it is too generic and
> misleading, describe what it actually is (pattern for a line in Contents)).

Additionally, your regex is inserted into another regex.  You cannot
match the beginning of the line or package names because of this.  Try
it out with -v and see what it puts in addition to yours.

> > > apt-file search bin/dig | grep bin/dig$
> > > dnsutils: usr/bin/dig
> > >
> > > works pretty well.
> >
> > Well here the grep is done against apt-file output, thus it works fine.
>
> Of course. But it is everything but user-friendly since the user (me)
> expects that the regexp matching is applied in exactly the same way as in
> the presented output.
>
> You know that it works differently but you do not share your knowledge.

But after saying all of this, apt-file is useful.  I took a look into
the performance tonight and I think you can really speed it up in the
default configuration (without regex).  You can also speed it up with
POSIX regex but that's more annoying (I added a new switch to allow
POSIX regex).

http://pastebin.ca/41919

I pastebin'd a patch that uses zegrep instead of zcat plus a loop in
perl.  This is a dramatic speedup (the runs below were done in reverse
order; they are listed old, then new).  It also changes the way the
regexes work so that a regex user has more control over the results of
a file search.
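
The general idea behind the speedup is to throw away non-matching lines
as cheaply as possible and only do the per-line splitting on what is
left.  The patch does the first step with zegrep; the sketch below is
not the patch, it is a standalone Python illustration of the same
two-stage idea using the gzip and re modules.  The function name and
its parameters are assumptions made for the example.

```python
import gzip
import re

def search_contents(gz_path, literal, path_regex=None):
    """Return (path, packages) pairs from a gzip-compressed Contents file."""
    pat = re.compile(path_regex) if path_regex else None
    hits = []
    with gzip.open(gz_path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            # Stage 1: cheap substring test on the raw line.
            if literal not in line:
                continue
            # Stage 2: split only the survivors; the package list is the last field.
            parts = line.rstrip().rsplit(None, 1)
            if len(parts) != 2:
                continue
            path, packages = parts
            if pat is None or pat.search(path):
                hits.append((path, packages))
    return hits
```

Note that stage 1 looks at the whole line, so the literal can also hit
the package field; the optional path_regex in stage 2 is what restricts
the result to file names.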

% date ; /usr/bin/apt-file search nvidia | wc ; date
Fri Feb 17 01:19:00 EST 2006
    206     412   15018
Fri Feb 17 01:19:14 EST 2006

% date ; perl ./apt-file.both search nvidia | wc ; date
Fri Feb 17 01:18:51 EST 2006
    206     412   15018
Fri Feb 17 01:18:55 EST 2006

I also added a dependency on zpcregrep so you can keep the perl style
regex.  The speedup is smaller, though:

% date ; perl ./apt-file.both -x search nvidia | wc ; date
Fri Feb 17 01:19:23 EST 2006
    206     412   15018
Fri Feb 17 01:19:32 EST 2006


These are regex searches on file names with all three variants:

% date ; perl ./apt-file -x -p search "//ldap\.h" | wc ; date
Fri Feb 17 01:22:16 EST 2006
      8      16     418
Fri Feb 17 01:22:20 EST 2006

% date ; /usr/bin/apt-file -x search "//ldap\.h" | wc ; date
Fri Feb 17 01:22:36 EST 2006
      8      16     418
Fri Feb 17 01:22:49 EST 2006

% date ; perl ./apt-file -x search "//ldap\.h" | wc ; date
Fri Feb 17 01:22:59 EST 2006
      8      16     418
Fri Feb 17 01:23:10 EST 2006
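
To read the timings without doing clock arithmetic by hand, this small
Python sketch computes the elapsed seconds from the date stamps above.
The commands and times are copied from the transcripts; since date only
has one-second resolution, each figure is good to about a second.

```python
from datetime import datetime

# (command, start, end) copied from the date output in the transcripts.
RUNS = [
    ("/usr/bin/apt-file search nvidia", "01:19:00", "01:19:14"),
    ("perl ./apt-file.both search nvidia", "01:18:51", "01:18:55"),
    ("perl ./apt-file.both -x search nvidia", "01:19:23", "01:19:32"),
    ('perl ./apt-file -x -p search "//ldap\\.h"', "01:22:16", "01:22:20"),
    ('/usr/bin/apt-file -x search "//ldap\\.h"', "01:22:36", "01:22:49"),
    ('perl ./apt-file -x search "//ldap\\.h"', "01:22:59", "01:23:10"),
]

def elapsed_seconds(start, end):
    """Whole seconds between two HH:MM:SS stamps taken on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

if __name__ == "__main__":
    for cmd, start, end in RUNS:
        print(f"{elapsed_seconds(start, end):3d} s  {cmd}")
```

This gives 14 s, 4 s and 9 s for the three nvidia searches, and 4 s,
13 s and 11 s for the three ldap.h searches, in the order listed.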
