Mark Knecht posted on Mon, 04 Aug 2014 15:04:12 -0700 as excerpted:

> As the line in that favorite song goes "Paranoia strikes deep"...
FWIW, while my list sig is the proprietary-master quote from Richard Stallman below, since the (anti-)patriot bill was passed in reaction to 9/11, my private email sig is a famous quote from Benjamin Franklin: "They that can give up essential liberty to obtain a little temporary safety, deserve neither liberty nor safety." So "I'm with ya..."

> <NOTE>
> I am NOT trying to start ANY political discussion here. I hope no one
> will go too far down that path, at least here on this list. There are
> better places to do that.
>
> I am also NOT suggesting anything like what I ask next has happened,
> either here or elsewhere. It's just a question.
>
> Thanks in advance.
> </NOTE>
>
> I'm currently reading a new book by Glenn Greenwald called "No Place To
> Hide" which is about Greenwald's introduction to Edward Snowden and the
> release of all of the confidential NSA documents Snowden acquired. This
> got me wondering about Gentoo, or even just Linux in general. If the
> underlying issue in all of that Snowden stuff is that the NSA has the
> ability to intercept and hack into whatever they please, then how do I
> know that the source code I build on my Gentoo machines hasn't been
> modified by someone to provide access to my machine, networks, etc.?

These are good questions to ask, and to have some idea of the answers to, as well.

Big picture, at some level you pretty much have to accept that you /don't/ know. However, there's /some/ level of security... tho honestly a bit less on Gentoo than on some other distros (see below). It would still not be /entirely/ easy to subvert the tree widely (targeting an individual downloader is another question), but it could be done.

> Essentially, what is the security model for all this source code and
> how do I verify that it hasn't been tampered with in some manner?
>
> 1) That the code I build is exactly as written and accepted by the OS
> community?

At a basic level, source and ebuild integrity is what ebuild and sources digests are all about: they protect both from accidental corruption (where they're pretty good) and from deliberate tampering (where the protection may or may not be considered "acceptable" -- someone with the resources who wanted to badly enough could subvert it).

The idea is that the gentoo package maintainer creates hash digests of multiple types for both the ebuild and the sources, such that should the copy a gentoo user gets not match the copy the gentoo maintainer created, the package manager (PM, normally portage), if configured to do so (mainly FEATURES=strict; also see stricter and assume-digests, plus the webrsync-gpg feature mentioned below), will error out and refuse to emerge that package.
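To make that concrete, the digests live in a per-package Manifest file in the tree, one line per covered file. A sketch of the shape only -- the package name, sizes and hash values below are invented placeholders, and the hashes are truncated:

    DIST foo-1.2.3.tar.xz 2913489 SHA256 9f86d081884c7d65... SHA512 ee26b0dd4af7e749... WHIRLPOOL 19fa61d75522a466...
    EBUILD foo-1.2.3.ebuild 1374 SHA256 2c26b46b68ffc68f... SHA512 8a3dbd79e9bcf16c... WHIRLPOOL 4d9a1ea4c9e2fb8e...

With FEATURES="strict" set in make.conf, portage refuses to proceed when a fetched file fails to match its Manifest line. Note what's NOT listed in any Manifest, tho, which brings us to the points below.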
But there are serious limits to that protection. Here are a few points to consider:

1) While the ebuilds and sources are digested, those digests do *NOT* extend to the rest of the tree: the various files in the profile directory, the various eclasses, etc. So in theory at least, someone could mess with, say, the package.mask file in profiles, or one of the eclasses, and could potentially get away with it. But see point #3, as there's a (partial) workaround for the paranoid.

2) Meanwhile, a bare hash (unlike a gpg signature) carries no proof of who created it. Digest verification confirms that nothing changed in transit, but not that the digest itself came from the gentoo maintainer. So there's some risk that one or more gentoo rsync mirrors could be compromised, or could be run by a bad actor in the first place. Should that occur, the bad actor could replace BOTH the digested ebuild and/or sources AND the digest files, updating the latter to reflect his compromised version instead of the version originally digested by the gentoo maintainer.

Similarly, someone such as the NSA could, at least in theory, do the same thing in transit, targeting a specific user's downloads while leaving everyone else's downloads from the same mirror alone, so only the target got the compromised version. While there's a reasonable chance someone would catch a bad mirror, a specifically targeted downloader has little chance of detecting the problem, unless they're validating against other mirrors as well and/or comparing digests (over a secure channel) against those someone else downloaded.

So even digest-protected files aren't immune to compromise. But as I said above, there's a (partial) workaround. See point #3.

3) While points #1 and #2 apply to the tree as normally rsynced, gentoo does have a somewhat higher-security sync method, both for the paranoid and to support users behind firewalls which don't pass rsync. Instead of running emerge --sync, this method uses the emerge-webrsync tool, which downloads the entire main gentoo tree as a gpg-signed tarball. If you have FEATURES=webrsync-gpg set (see the make.conf manpage), portage will verify the gpg signature on this tarball.

The two caveats here are (1) that the webrsync tarball is generated only once per day, while the main tree is updated every few minutes, so the rsynced tree is going to be more current, and (2) that each snapshot is the entire tree, not just the changes, so for those updating daily or close to it, fetching the full tarball every day will be more network traffic. Tho I think the tarball is compressed (I've never tried this method personally so can't say for sure) while the rsync tree isn't, so if you're updating monthly, I'd guess it's less traffic to get the tarball.

The tarball is gpg-signed, which is more secure than simple hash digests, but the signature covers the entire thing, not individual files, so the digests still win on granularity. Additionally, the tarball signing is automated, so while a valid signature pretty well ensures that the tarball did indeed come from gentoo, should someone compromise gentoo infrastructure security and somehow get a bad file in place, the daily snapshot process would blindly package up and sign the bad file along with all the rest.

So, sync-method bottom line: if you're paranoid or simply want additional gpg-signed security, use emerge-webrsync along with FEATURES=webrsync-gpg instead of normal rsync-based syncing. That pretty well ensures that you're getting exactly the tree tarball gentoo built and signed, which is certainly far more secure than normal rsync syncing. But because the tarballing and signing is automated and covers the entire tree, there's still the possibility that one or more files in that tarball are compromised and it hasn't been detected yet.
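For anyone wanting to try the webrsync-gpg method, the setup looks something like the following. (A sketch under assumptions: the keyring location is my arbitrary choice, and the key ID is the gentoo release-snapshot signing key as I recall it -- verify the current key and its fingerprint against the gentoo.org documentation before trusting anything here.)

    # /etc/portage/make.conf
    FEATURES="webrsync-gpg"
    PORTAGE_GPG_DIR="/etc/portage/gnupg"

    # one-time keyring setup: import the gentoo release key
    mkdir -p /etc/portage/gnupg && chmod 700 /etc/portage/gnupg
    gpg --homedir /etc/portage/gnupg \
        --keyserver pool.sks-keyservers.net \
        --recv-keys 0xBB572E0E2D182910   # verify this ID first!

    # then sync via the signed daily snapshot instead of plain rsync
    emerge-webrsync

With that in place, portage rejects a snapshot whose signature doesn't verify. What it can't tell you, per the caveats above, is anything about the individual files inside the tarball.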
Meanwhile, I mentioned above that gentoo isn't as secure in this regard as a number of other Linux distros. This is DEFINITELY the case for normal rsync syncers, but even for webrsync-gpg syncers it remains the case to some extent. Unfortunately, in practice that isn't likely to change in the near term, and possibly not in the medium or longer term either, unless some big gentoo compromise is detected and makes the news. THEN we're likely to see changes.

Alternatively, when that big pie-in-the-sky main gentoo tree switch from cvs (yes, still) to git eventually happens, the switch to full signing will be quite a bit easier, tho there will still be policies to enforce, etc. But they've been talking about the switch to git for years as well, and... incrementally... drawing closer, to the point that major portions of gentoo are actually developed in git-based overlays these days. But will the main tree ever actually switch to git? Who knows? As of now it's still pie-in-the-sky, with no nailed-down plans. Perhaps at some point somebody and some gentoo council together will decide it's time and will move whatever mountains or molehills remain to get it done; at this point, I think that's mostly what it'll take. Perhaps not. But unless that somebody steps up and makes that push come hell or high water, then assuming gentoo's still around by then, come 2025 we could still be talking about doing it... someday...

Back to secure-by-policy gpg signing... The problem is that while we've known for years what must be done, and what other distros have already done, and while gentoo has made some progress down the security road, in the absence of that ACTIVE KNOWN COMPROMISE RIGHT NOW immediate threat, other things simply continue to be higher priority, while REAL gentoo security continues to be back-burnered.

Basically, what must be done is to enforce, all the way thru to refusing gentoo developer commits that don't match policy, a requirement that every gentoo dev has a registered gpg key (AFAIK that much is already the case) and that every commit they make is SIGNED by that personal developer key, with gentoo infra verifying those signatures and rejecting any commit that doesn't verify. FWIW, there are GLEPs detailing most of this; they've just never been fully implemented, tho bits and pieces have been, incrementally, over time.

As I said, other distros have done this, generally when they HAD to, when they had that compromise hitting the news. Tho I think a few distros have implemented such a signed-no-exceptions policy when some OTHER distro got hit. Gentoo hasn't had that happen yet, and while the infrastructure is generally there to sign at least individual package commits, and some devs actually do so (you can see the signed digests for some packages, for instance), that hasn't been enforced tree-wide. In fact, there are a few relatively minor but still important policy questions to resolve first, before such enforcement could actually be activated.
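FWIW, the mechanics of such signing are already pretty much a solved problem on the git side; it's the policy around it that's hard. A minimal sketch using stock git (the key ID, category/package and commit message are made-up placeholders):

    # dev side: sign the commit with your registered personal key
    git commit --gpg-sign=0xDEADBEEF -m "app-foo/bar: version bump"

    # infra side: display and check the signature on the latest commit
    git log --show-signature -1

    # scriptable form: %G? prints G (good), B (bad), or N (unsigned)
    git log --pretty='%G? %h %an %s' -1

A server-side hook rejecting any push whose commits don't show "G" is then a few lines of scripting. The tooling isn't the hold-up; policy questions like the one below are.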
Here's one such signing-policy question to consider. Currently, package-maintainer devs make changes to their ebuilds, and later, after a period of testing, arch devs keyword a particular ebuild stable for their arch. Occasionally arch devs may add a bit of conditional code that applies to their arch only, as well.

Now consider this. Suppose a compromised package is detected after the package has been keyworded stable. The last several signed commits to that package were keywording only, while the commit introducing the compromise was sometime earlier. Question: are the arch devs that signed their keywording-only commits responsible too, because they signed off on the package? That would mean they now have to inspect every package they keyword, checking for compromises that might not be entirely obvious to them. Or are they only responsible for the keywording changes they actually committed, with no obligation to inspect the rest of the ebuild they're now signing?

OK, so we say they're only responsible for the keywording. Simple enough. But what about this? Suppose they add an arch-conditional that, combined with earlier code in the package, results in a compromise. The conditional code they added looks straightforward enough on its own, and really does solve a problem on that arch, and without that code, the original code looks innocently functional as well. But together, anyone installing that package on that arch is now open to the world. Both devs signed, the code of both devs is legit and looks innocent enough on its own, but taken together, they result in a bad situation. Now it's not so clear that an arch dev shouldn't have to inspect and sign for the state of the package after his commit, is it? Yet enforcing that as policy would seriously slow down arch stable keywording, and some archs can't keep up as it is, so such a policy would be an effective death sentence for them as a gentoo-stable supported arch.

Certainly there are answers to that sort of question, and various distros have faced them and come up with their own policy answers, often because, in the face of a REAL DISTRO COMPROMISE making the news, they've had no other choice. To some extent, gentoo is lucky in that it hasn't been faced with making those hard choices yet. But the fact is, all gentoo users remain less safe than we could be, because those hard choices haven't been made and enforced... because we've not been forced to make them.

Meanwhile, even were we to have done all that, there's still the possibility that upstream development might be compromised. Every year or two, some upstream project or another makes news due to some compromise or another. Sometimes vulnerable versions have been distributed for a while, and various distros have picked them up. In an upstream-compromise situation like that, there's little a distro can do, with the exception of going slow enough that its packages are all effectively outdated. That also happens to be a relatively effective counter to this sort of issue: if a several-years-old version changes, it'll be detected right away, and (one hopes) most compromises of a project server will be detected within months at the longest, so anything a year or more old should be relatively safe, simply by virtue of its age. Obviously the people and enterprise distros willing to run years-outdated code do have that advantage, and that's a risk that people wishing to run reasonably current code simply have to take as a result of that choice, regardless of the distro they chose to get that current code from.

But even if you choose to run an old distro, so you aren't likely to be hit by current upstream compromises, and one that has and enforces a full signing policy, so every commit can be accounted for, and even if none of the developers at either the distro or upstream level deliberately breaks the trust and goes bad, there's still the issue below...
> 2) That the compilers and interpreters don't do anything except build
> the code?

There's a paper, very famous in security circles, that effectively proves that unless you can absolutely trust every single layer in the build line, including the hardware layer (which means its sources) and the compiler and tools used to build your operational tools, and the compiler and tools used to build them, and... all the way back... you simply cannot absolutely trust the results, period. I never kept the link, but the title actually stuck in memory well enough for me to google it: "Reflections on Trusting Trust" (Ken Thompson's 1984 Turing Award lecture). =:^) Here's the google link:

https://www.google.com/search?q=%22reflections+on+trusting+trust%22

That means that in order to absolutely prove the gcc (for example) on our own systems, even if we can read and understand every line of gcc source, we must absolutely prove the tools on the original installation media and in the stage tarballs that we used to build our system. Which means we must not only have the code to them and trust the builders, but we must have the code and trust the builders of the tools they used, and the builders and tools of those tools, and...

Meanwhile, the same rule effectively applies to the hardware as well. And while Richard Stallman may run a computer that is totally open source hardware and firmware (down to the BIOS or equivalent), for which he has all the schematics, etc, most of us run at least some semi-proprietary hardware of /some/ sort. Which means that even if we /could/ fully understand the sources ourselves, without them and without that full understanding, at that level we simply have to trust... someone... basically, the people who design and manufacture that hardware.

Thus, in practice, (nearly) everyone ends up drawing the line /somewhere/. The Stallmans of the world draw it pretty strictly, refusing to run anything with replaceable firmware which doesn't itself have sources available. (As Stallman defines it, if the firmware is effectively burned in such that the manufacturer themselves can't update it, that's good enough for the line he draws. Tho that leads to absurdities such as an OpenMOKO phone that, at extra expense, has the firmware burned onto a separate chip such that it can't be replaced by anyone, in order to be able to use hardware that would otherwise be running firmware that the supplier refuses to open-source -- because the extra expense to do it that way means the manufacturer can't replace the firmware either, so it's on the OK side of Stallman's line.)

Meanwhile, I personally draw the line at what runs at the OS level on my computer. That means I won't run proprietary graphics drivers or flash, but I will and do load source-less firmware onto the Radeon-based graphics hardware I run, in order to use the freedomware kernel drivers on the same hardware that I refuse to run the proprietary fglrx drivers on. Other people are fine running flash and/or proprietary graphics drivers, but won't run a mostly-proprietary full OS such as MS Windows or Apple OSX. Still others prefer to run open source where it fits their needs, but won't go out of their way to do so if proprietary works better for them, and still others simply don't care either way, running whatever works best regardless of the freedom or lack thereof of its sources.
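Coming back to the Thompson paper for a moment, since the attack is easier to grasp with a sketch: the trick is a compiler that recognizes what it's compiling. The toy shell stand-in below is purely illustrative (the two inject_* helpers are imaginary), but it shows the shape of it:

    #!/bin/sh
    # toy stand-in for a trojaned cc, per "Reflections on Trusting Trust"
    case "$*" in
      *login.c*)
        # compiling the login program: quietly splice in a backdoor
        inject_backdoor "$@" ;;    # imaginary helper
      *cc.c*)
        # compiling the compiler itself: splice this recognition logic
        # back in, so even a rebuild from clean sources stays trojaned
        inject_self "$@" ;;        # imaginary helper
      *)
        exec /usr/bin/cc.real "$@" ;;   # everything else builds normally
    esac

The kicker is the second case: once the binary compiler is trojaned, rebuilding it from fully clean, fully audited sources changes nothing, which is why the trust chain has to extend all the way back.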
Anyway, when it comes to hardware and compiler, in practice the best you can do is run a FLOSS compiler such as gcc, while trusting the tools you used to build the first ancestor: basically, the gcc and tools in the stage tarballs, as well as whatever you booted (probably either a gentoo installer or another distro) in order to chroot into that unpacked stage and build from there. Beyond that, well... good luck, but you're still going to end up drawing the line /somewhere/.

> There's certainly lots of other issues about security, like protecting
> passwords, protecting physical access to the network and machines, root
> kits and the like, etc., but assuming none of that is in question (I
> don't have any reason to think the NSA has been in my home!) ;-) I'm
> looking for info on how the code is protected from the time it's signed
> off until it's built and running here.
>
> If someone knows of a good web site to read on this subject let me know.
> I've gone through my Linux life more or less like most everyone went
> through life 20 years ago, but paranoia strikes deep.

Indeed. Hope the above was helpful. I think it's a pretty accurate picture, at least from my own perspective, as someone who cares enough about this to spend a not-insignificant amount of time keeping up on the current situation in this area, both for linux in general and for gentoo in particular.

--
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman