> > > apt-file -x search bin/dig$
> > >
> > > ...does not return anything. While:
> >
> > Here the 'grep' is done against lines in *Contents*.gz in your cache
> > directory. This file format is:
> > /file/path section/package

Actually, it is:

    path/to/file section/package1,section/other/package2,...

so this complicates the matching. For one thing, users will routinely
want to type "/usr/bin/prog", which you handle by stripping off the
leading slash. Secondly, if a user enters a strict regex for the package
name, it may not match if that package is second or later in the list.
And files can have spaces in their names (I checked, and this isn't just
theoretical; there are quite a few files with spaces).

Stripping the leading slash is too severe a default to leave
undocumented. For instance, I routinely want to find header files, so
intuitively I would try this:

    $ apt-file -x search "/ldap\.h$"

and quickly find out that '$' doesn't work. Because of the silent
removal of the first slash, I get lots of other "matches" too when I
remove the '$'. If you noted this in the man page, then I think a common
idiom would be:

    $ apt-file -F -x search //ldap\.h

when you actually want the leading slash and want the match to terminate
at the end of the name.

> > your pattern assumes that "bin" is the section and 'dig" the package
> > name.
>
> Har, har, it is not my problem when you do it internaly in this way. I
> have read the manpage and the help output and all it says leads me to an
> assumption that I can specify a regexp of a _file_ name when looking for
> a _file_ with apt-*file*.

I agree here. It's very unintuitive when you expect the output and input
to be similar. The documentation also doesn't mention what type of regex
is used; you would have to look at the source or go by trial and error.

> Either implement it in a way expected by the user (spliting the fields

I looked into splitting the fields and reordering them, and I think this
is too time consuming. Even one extra regex really adds up when you loop
through that much data. I wanted an intuitive version that would do this
(note: you do need this type of regex, because there are filenames with
spaces in the middle and you don't want to capture trailing spaces):

    ($file, $packages) = /^(\S+(?:[\S\s]*\S+)?)\s+(\S+)\s*$/;

and then I could do:

    $file =~ /$user_regex/;

so a '$' would work and so on. This really increased the runtime,
though, and I don't think that is acceptable.

> in Contents input and reordering them) or at least document it properly
> (and do not call it simply regular expresion, it is to generic and
> misleading, describe what it actually is (pattern for a line in Contents)).

Additionally, your regex is inserted into another regex. Because of
this, you cannot match the beginning of the line or package names. Try
it out with -v and see what it adds around yours.

> > > apt-file search bin/dig | grep bin/dig$
> > > dnsutils: usr/bin/dig
> > >
> > > works pretty well.
> >
> > Well here the grep is done against apt-file output, thus it works fine.
>
> Of course. But it is everything but user-friendly since the user (me)
> expects that the regexp matchin is applied in exactly the same way as in
> the presented output.
>
> You know that it works differently but you do not share your knowledge.

But after saying all of this, apt-file is useful. I took a look at the
performance tonight and I think you can really speed it up in the
default configuration (without regex). You can also speed it up with
POSIX regex, but that's more annoying (I added a new switch to allow
POSIX regex).

http://pastebin.ca/41919

I pastebin'd a patch that uses zegrep instead of zcat and looping with
perl. This is a dramatic runtime reduction (order reversed to show old
vs. new). It also changes the way the regexes work, so that a regex user
has more control over the results of a file search.

% date ; /usr/bin/apt-file search nvidia | wc ; date
Fri Feb 17 01:19:00 EST 2006
    206     412   15018
Fri Feb 17 01:19:14 EST 2006

% date ; perl ./apt-file.both search nvidia | wc ; date
Fri Feb 17 01:18:51 EST 2006
    206     412   15018
Fri Feb 17 01:18:55 EST 2006

I also added a dependency on zpcregrep so you can keep the perl style
regex.
This isn't the same type of speedup, though:

% date ; perl ./apt-file.both -x search nvidia | wc ; date
Fri Feb 17 01:19:23 EST 2006
    206     412   15018
Fri Feb 17 01:19:32 EST 2006

These are tests for regex on files with all three:

% date ; perl ./apt-file -x -p search "//ldap\.h" | wc ; date
Fri Feb 17 01:22:16 EST 2006
      8      16     418
Fri Feb 17 01:22:20 EST 2006

% date ; /usr/bin/apt-file -x search "//ldap\.h" | wc ; date
Fri Feb 17 01:22:36 EST 2006
      8      16     418
Fri Feb 17 01:22:49 EST 2006

% date ; perl ./apt-file -x search "//ldap\.h" | wc ; date
Fri Feb 17 01:22:59 EST 2006
      8      16     418
Fri Feb 17 01:23:10 EST 2006
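
P.S. To make the '$' problem concrete, here is a minimal shell sketch. It is not taken from the patch: the sample lines are made up to follow the "path/to/file section/package1,section/package2" layout, and the function names sample_contents, match_line and match_file are hypothetical. It only illustrates why a trailing '$' cannot match while the pattern is applied to the whole Contents line, and why it does match once the package field has been stripped off.

```shell
#!/bin/sh
# Made-up sample lines in the Contents layout described in this mail:
#   path/to/file section/package1,section/package2
sample_contents() {
    printf '%s\n' \
        'usr/bin/dig net/dnsutils' \
        'usr/include/ldap.h libdevel/libldap2-dev' \
        'usr/share/doc/foo/read me.txt doc/foo-doc,doc/foo-extra'
}

# Apply the pattern to the whole line. A trailing '$' then anchors after
# the package list, not after the file name.
match_line() {
    sample_contents | grep -E -- "$1"
}

# Apply the pattern to the file field only. The last whitespace-separated
# field (the package list) is removed first, so paths that contain spaces
# stay intact.
match_file() {
    sample_contents |
        sed -E 's/[[:space:]]+[^[:space:]]+[[:space:]]*$//' |
        grep -E -- "$1"
}
```

With these definitions, match_line 'bin/dig$' prints nothing, while match_file 'bin/dig$' prints the usr/bin/dig path. The extra sed pass is the same kind of per-line cost I measured above, so treat this as an illustration of the behaviour rather than a suggestion for the implementation.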