I agree with Dave’s assertion that this should be in a separate Random library. Not every project needs random numbers, and there could be a SecureRandom that exclusively uses CSPRNGs for its functionality.
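To make the SecureRandom idea a little more concrete, here is a minimal sketch of what such a type might look like in a separate library. Everything in it is an assumption made for illustration: `bytes(count:)` and `uint64()` are hypothetical names, and reading `/dev/urandom` through Foundation's `FileHandle` merely stands in for whatever platform CSPRNG a real implementation would wrap. A `nil` result means the entropy source could not be read.

```swift
import Foundation

/// Hypothetical sketch: a failable wrapper around the OS entropy device.
enum SecureRandom {
    /// Returns `count` random bytes, or nil if the device cannot be opened
    /// or hands back fewer bytes than requested.
    static func bytes(count: Int) -> [UInt8]? {
        guard count >= 0,
              let handle = FileHandle(forReadingAtPath: "/dev/urandom") else {
            return nil
        }
        defer { handle.closeFile() }
        let data = handle.readData(ofLength: count)
        return data.count == count ? [UInt8](data) : nil
    }

    /// Assembles a UInt64 from eight random bytes, or nil on failure.
    static func uint64() -> UInt64? {
        guard let raw = bytes(count: 8) else { return nil }
        return raw.reduce(UInt64(0)) { ($0 << 8) | UInt64($1) }
    }
}

// The call site decides what a missing entropy source means for it.
if let token = SecureRandom.uint64() {
    print("token:", token)
} else {
    print("no entropy available")
}
```

With an Optional result the failure is visible in the type, so the caller can fall back, report an error, or force-unwrap and trap as it sees fit.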
I also agree that trapping is not a preferred behavior. Optionals are slightly better, but in many instances the developer doesn’t care whether the random number is secure or non-reproducible. They really just want some number within a specified range that “seems” random enough for that instant. SecureRandom numbers could be optional or trap, since a lack of entropy might significantly affect the usage of the random number.

> On Oct 4, 2017, at 10:33 AM, Félix Cloutier via swift-evolution
> <[email protected]> wrote:
>
> Anything that hasn't killed the process seems fine, and you have to start
> from `main` for anything else. On iOS, you can be suspended at any time, but
> the program will only continue from the point that it was suspended if it
> hasn't been torn down; otherwise, it has to restart from the beginning and
> reload the known UI state that it is responsible for saving. Unless we go out
> of our way to destroy the PRNG, it won't go away from under the program's
> feet. I'm not aware of any OS that will core dump programs on shutdown and
> try to rehydrate them on reboot.
>
> Félix
>
>> On Oct 4, 2017, at 03:05, Xiaodi Wu <[email protected]> wrote:
>>
>> On Wed, Oct 4, 2017 at 04:55 Xiaodi Wu <[email protected]> wrote:
>> Seems like the API would be actively hiding the possibility of failure so
>> that you’d have to be in the know to prevent it. Those who don’t know about
>> it would be hunting down a ghost as they’re trying to debug, especially if
>> their program crashes rarely, stochastically, and non-reproducibly because a
>> third party library calls random() in code they can’t see. I think this
>> makes trapping the least acceptable of all options.
>>
>> I agree with Felix’s concern, which is why I brought up the question, but
>> ultimately the issue is unavoidable. It’s not down to global instance or
>> not.
>> If your source of random numbers is unseedable and may mix in
>> additional entropy at any time, then it may fail at any time, because the
>> moment at which a hardware restart happens may be transparent to the
>> process. The user must know about this or else we are laying a trap (pun
>> intended).
>>
>> On Wed, Oct 4, 2017 at 04:49 Jonathan Hull <[email protected]> wrote:
>> @Xiaodi: What do you think of the possibility of trapping in cases of low
>> entropy, and adding an additional global function that checks for entropy so
>> that conscientious programmers can avoid the trap and provide an alternative
>> (or error message)?
>>
>> Thanks,
>> Jon
>>
>>> On Oct 4, 2017, at 2:41 AM, Xiaodi Wu <[email protected]> wrote:
>>>
>>> On Wed, Oct 4, 2017 at 02:39 Félix Cloutier <[email protected]> wrote:
>>> I'm really not enthusiastic about `random() -> Self?` or
>>> `random() throws -> Self` when the only possible error is that some global
>>> object hasn't been initialized.
>>>
>>> The idea of having `random` straight on integers and floats and collections
>>> was to provide a simple interface, but using a global CSPRNG for those
>>> operations comes at a significant usability cost. I think that something
>>> has to go:
>>>
>>> * Drop the random methods on FixedWidthInteger, FloatingPoint
>>> * ...or drop the CSPRNG as a default
>>> * Drop the optional/throws, and trap on error
>>>
>>> I know I wouldn't use the `Int.random()` method if I had to unwrap every
>>> single result, when getting one non-nil result guarantees that the program
>>> won't see any other nil result again until it restarts.
>>>
>>> From the perspective of an app that can be suspended and resumed at any
>>> time, “until it restarts” could be as soon as the next invocation of
>>> `Int.random()`, could it not?
>>>
>>> Félix
>>>
>>>> On Oct 3, 2017, at 23:44, Jonathan Hull <[email protected]> wrote:
>>>>
>>>> I like the idea of splitting it into 2 separate “Random” proposals.
>>>>
>>>> The first would have Xiaodi’s built-in CSPRNG which only has the interface:
>>>>
>>>> On FixedWidthInteger:
>>>>     static func random() throws -> Self
>>>>     static func random(in range: ClosedRange<Self>) throws -> Self
>>>>
>>>> On Double:
>>>>     static func random() throws -> Double
>>>>     static func random(in range: ClosedRange<Double>) throws -> Double
>>>>
>>>> (Everything else we want, like shuffled(), could be built in later
>>>> proposals by calling those functions)
>>>>
>>>> The other option would be to remove the ‘throws’ from the above functions
>>>> (perhaps fatalError-ing), and provide an additional function which can be
>>>> used to check that there is enough entropy (so as to avoid the crash or
>>>> fall back to a worse source when the CSPRNG is unavailable).
>>>>
>>>> Then a second proposal would bring in the concept of RandomSources
>>>> (whatever we call them), which can return however many random bytes you
>>>> ask for… and a protocol for types which know how to initialize themselves
>>>> from those bytes. That might be spelled like
>>>> 'static func random(using: RandomSource) -> Self'. As a convenience, the
>>>> source would also be able to create FixedWidthIntegers and Doubles (both
>>>> with and without a range), and would also have the coinFlip() and
>>>> oneIn(UInt) -> Bool functions. Most types should be able to build
>>>> themselves off of that. There would be a default source which is built
>>>> from the first protocol.
>>>>
>>>> I also really think we should have a concept of Repeatably-Random as a
>>>> subprotocol for the second proposal. I see far too many shipping apps
>>>> which have bugs due to using arc4random when they really needed a
>>>> repeatable source (e.g. patterns and lines jump around when you resize
>>>> things).
>>>> If it was an easy option, people would use it when appropriate.
>>>> This would just mean a sub-protocol which has an initializer which takes a
>>>> seed, and the ability to save/restore state (similar to CGContexts).
>>>>
>>>> The second proposal would also include things like shuffled() and
>>>> shuffled(using:).
>>>>
>>>> Thanks,
>>>> Jon
>>>>
>>>>> On Oct 3, 2017, at 9:31 PM, Alejandro Alonso <[email protected]> wrote:
>>>>>
>>>>> I really like the schedule here. After reading for a while, I do agree
>>>>> with Brent that stdlib should be very primitive in the functionality that
>>>>> it provides. I also agree that the most important part right now is
>>>>> designing the internal crypto that the numeric types use to return
>>>>> their respective random numbers. On the discussion of how we should handle
>>>>> not enough entropy with the device random, from a user's perspective it
>>>>> makes sense that calling .random should just give me a random number, but
>>>>> from a developer's perspective I see Optional being the best choice here.
>>>>> While I think blocking could, in most cases, provide the user an easier
>>>>> API, we have to do this right and be safe here by providing a value that
>>>>> indicates that there is room for error here. As for the generator
>>>>> abstraction, I believe there should be a bare basic protocol that sets a
>>>>> layout for new generators and should be focusing on its requirements.
>>>>>
>>>>> Whether or not RandomAccessCollection and MutableCollection should get
>>>>> .random and .shuffle/.shuffled in this first proposal is completely up in
>>>>> the air for me. It makes sense, to me, to include the .random in this
>>>>> proposal and open another one for .shuffle/.shuffled, but I can see
>>>>> arguments that say we should create something separate for these two, or
>>>>> include all of it in this proposal.
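
As a rough illustration of the repeatable-source idea quoted above, the following sketch seeds a small deterministic generator and uses it to drive a shuffle. The protocol names `RandomSource` and `RepeatableRandomSource`, the `next()` requirement, and the free function `shuffled(_:using:)` are hypothetical spellings chosen for this example, not an API anyone in the thread has proposed. SplitMix64 is used only because it is short; it is not cryptographically secure.

```swift
/// Hypothetical protocol: a source of random 64-bit values.
protocol RandomSource {
    mutating func next() -> UInt64
}

/// Hypothetical sub-protocol: a source that can be rebuilt from a seed.
protocol RepeatableRandomSource: RandomSource {
    init(seed: UInt64)
}

/// SplitMix64: tiny, deterministic, and NOT cryptographically secure.
struct SplitMix64: RepeatableRandomSource {
    private var state: UInt64

    init(seed: UInt64) { state = seed }

    mutating func next() -> UInt64 {
        state &+= 0x9E37_79B9_7F4A_7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58_476D_1CE4_E5B9
        z = (z ^ (z >> 27)) &* 0x94D0_49BB_1331_11EB
        return z ^ (z >> 31)
    }
}

/// Fisher-Yates shuffle driven by any RandomSource.
/// The modulo step is slightly biased, which is tolerable for a sketch.
func shuffled<T, S: RandomSource>(_ items: [T], using source: inout S) -> [T] {
    var result = items
    guard result.count > 1 else { return result }
    for i in stride(from: result.count - 1, to: 0, by: -1) {
        let j = Int(source.next() % UInt64(i + 1))
        result.swapAt(i, j)
    }
    return result
}

var first = SplitMix64(seed: 42)
var second = SplitMix64(seed: 42)
let a = shuffled(Array(1...10), using: &first)
let b = shuffled(Array(1...10), using: &second)
// Two sources built from the same seed produce the same order.
```

Because the order depends only on the seed, a layout derived from it stays put across resizes and relaunches, which is the behavior that code reaching for arc4random cannot get.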
>>>>>
>>>>> - Alejandro
>>>>>
>>>>> On Sep 27, 2017, 7:29 PM -0500, Xiaodi Wu <[email protected]>, wrote:
>>>>>>
>>>>>> On Wed, Sep 27, 2017 at 00:18 Félix Cloutier <[email protected]> wrote:
>>>>>>> On Sep 26, 2017, at 16:14, Xiaodi Wu <[email protected]> wrote:
>>>>>>>
>>>>>>> On Tue, Sep 26, 2017 at 11:26 AM, Félix Cloutier
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> It's possible to use a CSPRNG-grade algorithm and seed it once to get a
>>>>>>> reproducible sequence, but when you use it as a CSPRNG, you typically
>>>>>>> feed entropy back into it at nondeterministic points to ensure that
>>>>>>> even if you started with a bad seed, you'll eventually get to an
>>>>>>> alright state. Unless you keep track of when entropy was mixed in and
>>>>>>> what the values were, you'll never get a reproducible CSPRNG.
>>>>>>>
>>>>>>> We would give developers a false sense of security if we provided them
>>>>>>> with CSPRNG-grade algorithms that we called CSPRNGs and that they could
>>>>>>> seed themselves. Just because it says "crypto-secure" in the name
>>>>>>> doesn't mean that it'll be crypto-secure if it's seeded with time().
>>>>>>> Therefore, "reproducible" vs "non-reproducible" looks like a good
>>>>>>> distinction to me.
>>>>>>>
>>>>>>> I disagree here, in two respects:
>>>>>>>
>>>>>>> First, whether or not a particular PRNG is cryptographically secure is
>>>>>>> an intrinsic property of the algorithm; whether it's "reproducible" or
>>>>>>> not is determined by the published API. In other words, the distinction
>>>>>>> between CSPRNG vs.
>>>>>>> non-CSPRNG is important to document because it's
>>>>>>> semantics that cannot be deduced by the user otherwise, and it is an
>>>>>>> important one for writing secure code because it tells you whether an
>>>>>>> attacker can predict future outputs based only on observing past
>>>>>>> outputs. "Reproducible" in the sense of seedable or not is trivially
>>>>>>> noted by inspection of the published API, and it is rather immaterial
>>>>>>> to writing secure code.
>>>>>>
>>>>>> Cryptographically secure is not a property that I'm comfortable applying
>>>>>> to an algorithm. You cannot say that you've made a cryptographically
>>>>>> secure thing just because you've used all the right algorithms: you also
>>>>>> have to use them right, and one of the most critical components of a
>>>>>> cryptographically secure PRNG is its seed.
>>>>>>
>>>>>> A cryptographically secure algorithm isn’t sufficient, but it is
>>>>>> necessary. That’s why it’s important to mark them as such. If I'm a
>>>>>> careful developer, then it is absolutely important to me to know that
>>>>>> I’m using a PRNG with a cryptographically secure algorithm, and that the
>>>>>> particular implementation of that algorithm is correct and secure.
>>>>>>
>>>>>> It is a *feature* of a lot of modern CSPRNGs that you can't seed them:
>>>>>>
>>>>>> * You cannot seed or add entropy to std::random_device
>>>>>>
>>>>>> Although std::random_device may in practice be backed by a software
>>>>>> CSPRNG, IIUC, the intention is that it can provide access to a hardware
>>>>>> non-deterministic source when available.
>>>>>>
>>>>>> * You cannot seed or add entropy to CryptGenRandom
>>>>>> * You can only add entropy to /dev/(u)random
>>>>>> * You can only add entropy to BSD's arc4random
>>>>>>
>>>>>> Ah, I see. I think we mean different things when we say PRNG. A PRNG is
>>>>>> an entirely deterministic algorithm; the output is non-random and the
>>>>>> algorithm itself requires no entropy.
>>>>>> If a PRNG is seeded with a random
>>>>>> sequence of bits, its output can "appear" to be random. A CSPRNG is a
>>>>>> PRNG that fulfills certain criteria such that its output can be
>>>>>> appropriate for use in cryptographic applications in place of a truly
>>>>>> random sequence *if* the input to the CSPRNG is itself random.
>>>>>>
>>>>>> The examples you give above *incorporate* a CSPRNG, environment entropy,
>>>>>> and a set of rules about when to mix in additional entropy in order to
>>>>>> produce output indistinguishable from a random sequence, but they are
>>>>>> *not* themselves really *pseudorandom* generators because they are not
>>>>>> deterministic. Not only do such sources of random numbers not require an
>>>>>> interface to allow seeding, they do not even have to be publicly
>>>>>> instantiable: Swift need only expose a single thread-safe instance (or
>>>>>> an instance per thread) of a single type that provides access to
>>>>>> CryptGenRandom/urandom/arc4random, since after all the output of
>>>>>> multiple instances of that type should be statistically
>>>>>> indistinguishable from the output of only one.
>>>>>>
>>>>>> What I was trying to respond to, by contrast, is the design of a
>>>>>> hierarchy of protocols CSPRNG : PRNG (or, in Alejandro's proposal,
>>>>>> UnsafeRandomSource : RandomSource) and the appropriate APIs to expose on
>>>>>> each. This is entirely inapplicable to your examples. It stands to
>>>>>> reason that a non-instantiable source of random numbers does not require
>>>>>> a protocol of its own (a hypothetical RNG : CSPRNG), since there is no
>>>>>> reason to implement (if done correctly) more than a single publicly
>>>>>> non-instantiable singleton type that could conform to it. For that
>>>>>> matter, the concrete type itself probably doesn't need *any* public API
>>>>>> at all.
>>>>>> Instead, extensions to standard library types such as Int that
>>>>>> implement conformance to the protocol that Alejandro names
>>>>>> "Randomizable" could call internal APIs to provide all the necessary
>>>>>> functionality, and third-party types that need to conform to
>>>>>> "Randomizable" could then in turn use `Int.random()` or
>>>>>> `Double.random()` to implement their own conformance. In fact, the
>>>>>> concrete random number generator type doesn't need to be public at all.
>>>>>> All public interaction could be through APIs such as `Int.random()`.
>>>>>>
>>>>>> Just because we can expose a seed interface doesn't mean we should, and
>>>>>> in this case I believe that it would go against the prime objective of
>>>>>> providing secure random numbers.
>>>>>>
>>>>>> If we're talking about a Swift interface to a non-deterministic source
>>>>>> of random numbers like urandom or arc4random, then, as I write above,
>>>>>> not only do I agree that it doesn't need to be seedable, it also does
>>>>>> not need to be instantiable at all, does not need to conform to a
>>>>>> protocol that specifically requires the semantics of a non-deterministic
>>>>>> source, does not need to expose any public interface whatsoever, and
>>>>>> doesn't itself even need to be public. (Does it even need to be a type,
>>>>>> as opposed to simply a free function?)
>>>>>>
>>>>>> In fact, having reasoned through all of this, we can split the design
>>>>>> task into two. The most essential part, which definitely should be part
>>>>>> of the stdlib, would be an internal interface to a cryptographically
>>>>>> secure platform-specific entropy source, a public protocol named
>>>>>> something like Randomizable (to be bikeshedded), and the appropriate
>>>>>> implementations on Boolean, binary integer, and floating point types to
>>>>>> conform them to Randomizable so that users can write `Bool.random()` or
>>>>>> `Int.random()`.
>>>>>> The second part, which can be a separate proposal or
>>>>>> even a standalone core library or third-party library, would be the
>>>>>> protocols and concrete types that implement pseudorandom number
>>>>>> generators, allowing for reproducible pseudorandom sequences. In other
>>>>>> words, instead of PRNGs and CSPRNGs being the primitives on which
>>>>>> `Int.random()` is implemented, `Int.random()` should be the standard
>>>>>> library primitive which allows PRNGs and CSPRNGs to be seeded.
>>>>>>> If your attacker can observe your seeding once, chances are that they
>>>>>>> can observe your reseeding too; then, they can use their own
>>>>>>> implementation of the PRNG (whether CSPRNG or non-CSPRNG) and reproduce
>>>>>>> your pseudorandom sequence whether or not Swift exposes any particular
>>>>>>> API.
>>>>>>
>>>>>> On Linux, the random devices are initially seeded with machine-specific
>>>>>> but rather invariant data that makes /dev/urandom spit out predictable
>>>>>> numbers. It is considered "seeded" after a root process writes POOL_SIZE
>>>>>> bytes to it. On most implementations, this initial seed is stored on
>>>>>> disk: when the computer shuts down, it reads POOL_SIZE bytes from
>>>>>> /dev/urandom and saves them in a file, and the contents of that file are
>>>>>> loaded back into /dev/urandom when the computer starts. A scenario where
>>>>>> someone can read that file is certainly not less likely than a scenario
>>>>>> where /dev/urandom was deleted. That doesn't mean that they have kernel
>>>>>> code execution or that they can pry into your process, but they have a
>>>>>> good shot at guessing your seed and subsequent RNG results if no
>>>>>> stirring happens.
>>>>>>
>>>>>> Sorry, I don't understand what you're getting at here. Again, I'm
>>>>>> talking about deterministic algorithms, not non-deterministic sources of
>>>>>> random numbers.
>>>>>>
>>>>>>> Secondly, I see no reason to justify the notion that, simply because a
>>>>>>> PRNG is cryptographically secure, we ought to hide the seeding
>>>>>>> initializer (because one has to exist internally anyway) from the
>>>>>>> public. Obviously, one use case for a deterministic PRNG is to get
>>>>>>> reproducible sequences of random-appearing values; this can be useful
>>>>>>> whether the underlying algorithm is cryptographically secure or not.
>>>>>>> There are innumerably many ways to use data generated from a CSPRNG in
>>>>>>> non-cryptographically secure ways and omitting or including a public
>>>>>>> seeding initializer does not change that; in other words, using a
>>>>>>> deterministic seed for a CSPRNG would be a bad idea in certain
>>>>>>> applications, but it's a deliberate act, and someone who would
>>>>>>> mistakenly do that is clearly incapable of *using* the output from the
>>>>>>> PRNG in a secure way either; put a third way, you would be hard pressed
>>>>>>> to find a situation where it's true that "if only Swift had not made
>>>>>>> the seeding initializer public, this author would have written secure
>>>>>>> code, but instead the only security hole that existed in the code was
>>>>>>> caused by the availability of a public seeding initializer mistakenly
>>>>>>> used." The point of having both explicitly instantiable PRNGs and a
>>>>>>> layer of simpler APIs like "Int.random()" is so that the less
>>>>>>> experienced user can get the "right thing" by default, and the
>>>>>>> experienced user can customize the behavior; any user that instantiates
>>>>>>> his or her own ChaCha20Random instance is already calling for the power
>>>>>>> user interface; it is reasonable to expose the underlying primitive
>>>>>>> operations (such as seeding) so long as there are legitimate uses for
>>>>>>> it.
>>>>>>
>>>>>> Nothing prevents us from using the same algorithm for a CSPRNG that is
>>>>>> safely pre-seeded and a PRNG that people seed themselves, mind you.
>>>>>> However, especially when it comes to security, there is a strong
>>>>>> responsibility to drive developers into a pit of success: the most
>>>>>> obvious thing to do has to be the right one, and suggesting to
>>>>>> cryptographically-unaware developers that they have everything they need
>>>>>> to manage their own seed is not a step in that direction.
>>>>>>
>>>>>> I'm not opposed to a ChaCha20Random type; I'm opposed to explicitly
>>>>>> calling it cryptographically-secure, because it is not unless you know
>>>>>> what to do with it. It is emphatically not far-fetched to imagine a
>>>>>> developer who thinks that they can outdo the standard library by using
>>>>>> their own ChaCha20Random instance after it's been seeded with time() if
>>>>>> we let them know that it's "cryptographically secure". If you're a power
>>>>>> user and you don't like the default, known-good CSPRNG, then you're
>>>>>> hopefully good enough to know that ChaCha20 is considered a
>>>>>> cryptographically-secure algorithm without help labels from the
>>>>>> language, and you know how to operate it.
>>>>>>
>>>>>>> I'm fully aware of the myths surrounding /dev/urandom and /dev/random.
>>>>>>> /dev/urandom might never run out, but it is also possible for it not to
>>>>>>> be initialized at all, as in the case of some VM setups. In some older
>>>>>>> versions of iOS, /dev/[u]random is reportedly sandboxed out. On systems
>>>>>>> where it is available, it can also be deleted, since it is a file. The
>>>>>>> point is, all of these scenarios cause an error during seeding of a
>>>>>>> CSPRNG. The question is, how to proceed in the face of inability to
>>>>>>> access entropy. We must do something, because in that case we cannot
>>>>>>> return a cryptographically secure answer.
>>>>>>> Rare trapping on invocation
>>>>>>> of Int.random() or permanently waiting for a never-to-be-initialized
>>>>>>> /dev/urandom would be terrible to debug, but returning an optional or
>>>>>>> throwing all the time would be verbose. How to design this API?
>>>>>>
>>>>>> If the only concern is that the system might not be initialized enough,
>>>>>> I'd say that whatever returns an instance of a global, framework-seeded
>>>>>> CSPRNG should return an Optional, and the random methods that use the
>>>>>> global CSPRNG can trap and scream that the system is not initialized
>>>>>> enough. If this is a likely error for you, you can check if the CSPRNG
>>>>>> exists or not before jumping.
>>>>>>
>>>>>> Also note that there is only one system for which Swift is officially
>>>>>> distributed (Ubuntu 14.04) on which the only way to get entropy from the
>>>>>> OS is to open a random device and read from it.
>>>>>>
>>>>>> Again, I'm not only talking about urandom. As far as I'm aware, every
>>>>>> API to retrieve cryptographically secure sequences of random bits on
>>>>>> every platform for which Swift is distributed can potentially return an
>>>>>> error instead of random bits. The question is, what design for our API
>>>>>> is the most sensible way to deal with this contingency? On rethinking, I
>>>>>> do believe that consistently returning an Optional is the best way to go
>>>>>> about it, allowing the user to either (a) supply a deterministic
>>>>>> fallback; (b) raise an error of their own choosing; or (c) trap--all
>>>>>> with a minimum of fuss. This seems very Swifty to me.
>>>>>>
>>>>>>>> * What should the default CSPRNG be? There are good arguments for
>>>>>>>> using a cryptographically secure device random. (In my proposed
>>>>>>>> implementation, for device random, I use Security.framework on Apple
>>>>>>>> platforms (because /dev/urandom is not guaranteed to be available due
>>>>>>>> to the sandbox, IIUC).
>>>>>>>> On Linux platforms, I would prefer to use
>>>>>>>> getrandom() and avoid using file system APIs, but getrandom() is new
>>>>>>>> and unsupported on some versions of Ubuntu that Swift supports. This
>>>>>>>> is an issue in and of itself.) Now, a number of these facilities
>>>>>>>> strictly limit or do not guarantee availability of more than a small
>>>>>>>> number of random bytes at a time; they are recommended for seeding
>>>>>>>> other PRNGs but *not* as a routine source of random numbers.
>>>>>>>> Therefore, although device random should be available to users, it
>>>>>>>> probably shouldn’t be the default for the Swift standard library as it
>>>>>>>> could have negative consequences for the system as a whole. There
>>>>>>>> follows the significant task of implementing a CSPRNG correctly and
>>>>>>>> securely for the default PRNG.
>>>>>>>
>>>>>>> Theo gave a talk a few years ago
>>>>>>> <https://www.youtube.com/watch?v=aWmLWx8ut20> on randomness and how
>>>>>>> these problems are approached in LibreSSL.
>>>>>>>
>>>>>>> Certainly, we can learn a lot from those like Theo who've dealt with
>>>>>>> the issue. I'm not in a position to watch the talk at the moment; can
>>>>>>> you summarize what the tl;dr version of it is?
>>>>>>
>>>>>> I saw it three years ago, so I don't remember all the details. The gist
>>>>>> is that:
>>>>>>
>>>>>> * OpenBSD's random is available from extremely early in the boot process
>>>>>>   with reasonable entropy
>>>>>> * LibreSSL includes OpenBSD's arc4random, and it's a "good" PRNG (which
>>>>>>   doesn't actually use ARC4)
>>>>>> * That implementation of arc4random is good because it is fool-proof and
>>>>>>   it has basically no failure mode
>>>>>> * Stirring is good, having multiple components take random numbers from
>>>>>>   the same source probably makes results harder to guess too
>>>>>> * Getrandom/getentropy is in all ways better than reading from random
>>>>>>   devices
>>>>>>
>>>>>> Vigorously agree on all points. Thanks for the summary.
>
> _______________________________________________
> swift-evolution mailing list
> [email protected]
> https://lists.swift.org/mailman/listinfo/swift-evolution
