Package: apt
Version: 2.9.12
Severity: normal

APT should normalize HTTP URIs before sending requests as described in
RFC 3986 section 6.

One example of why this is important is lack of Path Segment
Normalization:
https://datatracker.ietf.org/doc/html/rfc3986#section-6.2.2.3

"The complete path segments "." and ".." are intended only for use
 within relative references (Section 4.1) and are removed as part of
 the reference resolution process (Section 5.2).  However, some
 deployed implementations incorrectly assume that reference resolution
 is not necessary when the reference is already a URI and thus fail to
 remove dot-segments when they occur in non-relative paths."

I have seen this in practice when trying to use an APT repo hosted on a
web server that itsels is not normalizing URIs correctly. If the
Packages file includes filenames such as ./testpackage_1.0.0-1_all.deb,
which is basically the default behavior when creating Packages with
dpkg-scanpackages . /dev/null, APT simply concatentes the filename and
sends requests such as GET /./testpackage_1.0.0-1_all.deb. If the
webserver does not remove the ./ part as it should, 404 errors are
served and the repository does not work.

Here is a basic HTTP web server in Python showing the problem:

from http.server import HTTPServer, BaseHTTPRequestHandler

class SimpleHTTPRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/Packages':
            self.send_response(200)
            self.send_header('Content-Length', '318')
            self.end_headers()
            self.wfile.write(b"""Package: testpackage
Version: 1.0.0-1
Architecture: all
Maintainer: John Doe <j...@example.org>
Installed-Size: 228
Filename: ./testpackage_1.0.0-1_all.deb
Size: 17658
SHA256: d4b2c11779e761692dd040f641d57edcc15be60f40b73507f684f138f86f8240
Section: admin
Priority: extra
Description: Test Package
 Nothing interesting

""")
        else:
            self.send_response(404)
            self.end_headers()
            self.wfile.write(b'404 Not Found')

if __name__ == "__main__":
    httpd = HTTPServer(('', 8000), SimpleHTTPRequestHandler)
    httpd.serve_forever()

To reproduce, add the following to sources.list:

  deb [trusted=yes] http://localhost:8000/ /

Then run `apt update` and `apt download testpackage`. The HTTP request
sent by APT will be:

  GET /./testpackage_1.0.0-1_all.deb

Instead of:

  GET /testpackage_1.0.0-1_all.deb

Debugging this sort of issues is made harder by the fact that HTTP
clients normally do normalize the URI, hence hiding the problem when
trying things such as:

  curl http://localhost:8000/./testpackage_1.0.0-1_all.deb

In the specific case of curl, the --path-as-is option can be used to
skip normalization and reproduce.

Thanks,
  Emanuele

Reply via email to