Andrea Pappacoda writes ("Bug#1106071: [RFC PATCH dgit v1] tag2upload: add pristine-tar support"): > I did two things:
Thanks. This seems to be going in the right direction. Very exciting! I have a bunch of comments about various details etc. Please don't be discouraged by my pickiness. It's no reflection on the quality of your work. Getting feedback like this from me is entirely normal and expected :-). > 1. Added a new pristine-tar=commit_id metadata field to the signed tag > generated by git-debpush This seems right to me. Should this be a "critical extension" in X.509 terminology? This would be the first such, (--quilt ought to have been but is grandfathered) but I think probably if someone hypothetically somehow sends this option to a t2u service where this support has not yet been deployed (or has been withdrawn!) what should happen? Probably it should fail. So how about [dgit ... !pristine-tar=... ...] or [dgit ... +pristine-tar=... ...] or some such? The new metadata item will need to be documented in tag2upload(5). > 2. If a pristine-tar commitid is present, the git repo is hard-reset to > it, then the tarball name is obtained from the tree, and > `pristine-tar checkout` is invoked to generate the tarball. By "the tarball name is obtained from the tree" you mean that the orig tarball name is obtained from the pristine-tar commit's tree. > I believe this is safe security-wise because the commit id represented > the expected status of the pristine-tar branch on the developer's > machine is signed at the time of the upload. If for some reason the > branch gets in-between the upload, and the expected commit is lost, > things will just fail instead of generating a "wrong" tarball. Do you > see flaws in this reasoning? I think this is correct from my understanding of pristine-tar. Subject to what I write below about pristine-tar vs HEAD. > Note: I did not test that my implementation is actually correct and > working. I did not add any new test either. I believe there are no > regressions in old tests but I'm testing this on a plane and didn't look > too much into the log output. Right. > Let me know that you think! Bye :) > --- a/git-debpush > +++ b/git-debpush ... > +pristine_tar_info='' > +if pristine_tar_commit=$(git rev-parse --verify --quiet > 'refs/heads/pristine-tar'); then > + pristine_tar_info=" pristine-tar=$pristine_tar_commit" > +fi Don't we want to do this only if the current upstream version actually has any informaation in pristine-tar ? If not, then the user didn't import the tarball, presumably ? What happens if another maintainer did this new upstream version, and our user hasn't got that up-to-date pristine-tar branch yet? I think ideally we'd use the orig from the archive, but there is a risk of lossage if the uploads come too close together - see ##1109130. > #**** Useful sanity checks **** I don't know what those might be. I guess we could leave them for now and see what things go wrong in reality. Ie, I don't think this is a blocker. > [dgit distro=$distro split$quilt_mode_text] > -[dgit please-upload source=$source version=$version$upstream_info] > +[dgit please-upload source=$source > version=$version$upstream_info$pristine_tar_info] We might want to linewrap this a bit more. That line is getting quite long. I guess we could put all the upstream and pristine-tar info in a separate line maybe? > git_tag_main_opts_args=(-m "$tagmessage" "$debian_tag" "$branch_commit") > diff --git a/infra/dgit-repos-server b/infra/dgit-repos-server > index f6a3716c..e797123b 100755 > --- a/infra/dgit-repos-server > +++ b/infra/dgit-repos-server This part looks correct. Except: > + "pristinetar=$t2u_pristinetar", I think this option should be called "pristine_tar". > -x git deborig "$s_u" > +if [ -n "$s_pristinetar" ]; then > + old_head=$(git rev-parse --verify HEAD) > + git reset --hard -- "$s_pristinetar" I don't think this is right, is it ? We want to rewind the local pristine-tar branch to that commit, not the HEAD. I'm assuming pristine-tar implicitly uses refs/heads/pristine-tar rather than HEAD. We should definitely have a test case simulating an attack where the salsa pristine-tar branch has got strange stuff in it. > + # Only get the first tarball file matching the current package version. > + # pristine-tar does not disallow having two tarballs for the same package > + # version using different compression algorithms, so this will just pick > a > + # random one. It might make sense to error out instead in such cases. We definitely don't want it to be random! Let's call this an error for now. If this turns out to be a problem in practice, and users don't want to just git rm the unwanted info from their pristine-tar branch, then I guess we'll need to add the tarball name to the tag or something. > + tarball=$(git ls-tree --name-only | while IFS= read -r file; do If you're using git-ls-tree then you need to use the nul-separated output format I think. Urgh. Feel free to switch implementation language. This part of the code needs to be super-careful and cope with all weird and unexpected git objects. For that matter, we should probably try to defend pristine-tar from "weird stuff" so we should probably check that the tarball metadata file is indeed a *file*. Ian. -- Ian Jackson <ijack...@chiark.greenend.org.uk> These opinions are my own. Pronouns: they/he. If I emailed you from @fyvzl.net or @evade.org.uk, that is a private address which bypasses my fierce spamfilter.