The security of random number generators (part 1)
Cryptography as we know it today was born in the seventies, when Diffie and Helmann invented public key cryptosystems. The gist of such systems can be summarized in two closely related constructs: one-way functions and trap doors.
Trap door are used for asymmetrical operations (encryption and signature) and one-way functions are used for generating pseudorandom numbers.
Pseudorandom numbers are instrumental to all cryptosystems, symmetrical or not: we need them for picking large primes, generating secrets (symmetrical keys, initialization vectors, ...) , authenticating (pre-shared secrets, passwords, ...), preventing replayability (nonces, seeds), padding blocks in a seamless way, enforcing uniqueness (for correlations...)
In most post-quantum cryptography schemes selected by the NIST, their role is essential for sampling vectors securely in high-dimensional lattices.
Beyond cryptography, pseudorandom numbers have many other mission-critical security benefits: network protocols security (TCP sequences for instance), side-channel attacks prevention, chaos engineering...
It might be counterintuitive, but poor randomness does also help mitigate IT security risks at times, as I recently advocated in the first law of generative AI.
Given such huge responsibilities, how do we know if a pseudorandom number generator (PRNG) is secure?
When Diffie and Helmann published their paper, we didn't know. The authors where fully aware of this, and they acknowledged the need of a proper cryptanalysis [2]. But at that time, the problem was not an urgent one... Why? Because we didn't even know how to build a practical pubic key cryptosystem that could meet real-life problems.
For that, we had to wait until 1977, when Rivest, Shamir and Adleman published[1] their famous RSA algorithm: finally, we had an implementation of a real cryptosystem. The math behind it was clear (the assumptions and limitations were understood, in particular the assumption that factoring primes was hard) and foolproof, but we were still unsure of the robustness of the underlying machinery: one-way functions and trap doors.
In 1982, Yao formulated a test [3] to determine whether a one-way function generating pseudorandom numbers was undistinguishable from a uniform random source in polynomial time. Stating the test in terms of algorithmic constraints was a major breakthrough that framed the "strength" of randomness in a realistic way, because we could restrict the search to the one class of generators which make sense: the class producing random sequences of length n in reasonable time poly(n), which couldn't be distinguished from uniform random sequences in poly(n) time.
Yao's test was theoretical and not constructive, but the key point to understand is that
we can safely use imperfect random generators, as long as the imperfection can't be spotted by a polynomial time algorithm.
(By imperfection, we follow Yao's formulation and we mean that the generator is distinguishable from a random uniform distribution)
The second point to understand from Yao is that, from a practical standpoint, no single test is foolproof. In fact,
an infinite number of polynomial time tests is required to make sure that a pseudorandom generator is absolutely secure.
To overcome this big limitation, the standard practice since Yao has been to submit PRNGs to a battery of statistical tests (running in polynomial time, of course). One very popular and interesting test is the Birthday spacings tests by Marsiglia [4].
A fresh take on Yao's theory of pseudorandom numbers
When I read Yao's 1982 paper for the first time a few weeks ago, I was immediately struck be something which I think went mostly unnoticed for 42 years. Let me share this finding with you today!
The fifth chapter of the paper is dedicated to a theory of pseudorandom numbers. In this chapter, Yao claims that such theory covers two main concerns:
(a) defining what constitutes a random sequence, a topic he believes has been thoroughly explored,
(b) generating pseudorandom numbers, which he considers an entirely new area.
While this division appears logical, I believe that, in retrospect, it led to a focus primarily on the latter (generation) without sufficient attention to the former (definition). This perspective is supported by another statement in the same section:
Here is a confusion: on one hand, in introduction, Yao asserts that generation is distinct from definition, yet on the other hand, in the image above, he acknowledges that generation is not the primary concern—the property of the source is.
This raises the question: what is the property of the source, if not the definition of a random sequence?
To me, we can only conclude that (a) and (b) are one and the same thing, which Yao failed to realize somehow. Then, Yao proceeds into developing (b) as the core of his theory. But he actually reinvents the wheel, because (a) what already thoroughly explored as we saw.
What wheel, exactly? As I explained in depth last year in my tribute to Per Martin-Löf, the Swedish mathematician gave a formal characterization of random sequences based on Kolmogorov complexity in 1966. This is what Yao is reinventing.
Yao clearly acknowledges Martin-Löf's work, but fails to realize that what Martin-Löf framed in terms of Kolmogorov complexity, he reframes 16 years later in terms of Shannon entropy.
Here is the takeaway from this finding:
Yao's tests are all included in Martin-Löf's universal test of randomness.
One can also define a Martin-Löf universal test of randomness for polynomial statistical tests, and call it a universal poly(n) test of randomness, capturing all of Yao's tests.
The main differentiator between Martin-Löf's formulation and Yao's is that Yao focuses on poly(n) algorithms whereas Martin-Löf doesn't make any assumption on algorithmic complexity.
In the eighties, the hierarchy of algorithmic class complexities was just beginning to take shape, and Yao played a prominent role in its understanding and construction. Poly(n) fits naturally into this hierarchy, which Martin-Löf was unaware of.
Another important differentiator is that Yao's notion of indistinguishability between two distributions opened the door to a practical approach to building and cryptanalyzing one way functions.
The rest of the story is the same:
Finally, and for the anecdote, allow me to report that, in his 1966 paper, Martin-Löf provided an example of a statistical test, this test was the simplest one could imagine and it can be implemented using an algorithm of linear [5] complexity (which is part of poly(n)).
Recommended by LinkedIn
The wake of quantum computers
This situation could have persisted forever if it was not for quantum computing... Over the past few years, we've seen a small, but increasing number of vendor solutions selling QRNG: quantum-based random generators.
What threat are these solutions trying to mitigate?
Since you've read this article with utmost attention, you know that PRNGs are one-way functions.
There is a problem with one-way functions: they are invertible. Well... they aren't on classical computers, of course. They are invertible on quantum computers, because of a very well-known algorithm called Grover search.
To be more precise, we must say that one-way functions (considered in the context of algorithmic complexity classes that we discussed earlier) are invertible on classical computer in poly(n) time or more, whereas they are invertible with a quadratic boost using Grover on a quantum computer.
Can QRNGs help mitigate the function inversion risk brought by Grover search?
Here is the perfect illustration of Yao's test utility (despite the fact that it is not constructive): it gives a straightforward answer to that question: "Yes, QRNGs can solve that problem".
Can you see why? In QRNGs, randomness is (supposedly) pure, meaning that the source distribution is random uniform. It is therefore indistinguishable from... itself. The quadratic boost brought by Grover is not a concern anymore. QED.
More importantly, can we prevent quantum computers from inverting pseudorandom sequences without resorting to QRNGs?
As far as we know, the answer is: "Yes, we can. We don't need QRNGs to solve that problem".
Here again, Yao's tests provide this invaluable and immediate answer.
To understand why, suppose there exists an algorithm which is able to distinguish a classical PRNG from a random distribution. There are 2 cases to consider:
Conclusion
The security of pseudorandom numbers is greatly indebted to Martin-Löf and Yao. 42 years after Yao devised his theory of pseudorandom numbers, the test he provided is still completely up-to-date and relevant to answer cutting-edge questions like: do we need Quantum RNGs?
And we've seen that,
even if Quantum RNGs are a possible mitigation against the quantum threat, so are classical RNGs!
In Part 2, we'll explore some key refinements brought to PRNG's security after Yao's seminal work. In particular, we'll look at the all-important concept of seeding.
References
[2] New directions in cryptography (Diffie & Helmann), 1976, chapter VII (historical perspective).
[5] Below is an overview of the test in Python: for any given m, it is of linear complexity. When looping over all possible m, it if quadratic. Please remember that Martin-Löf tests are fit for asymptotically large sequences, so the example below is only given for educational purposes
p_m= 2^(-m) is the probability of a specific pattern of length m occurring in a random sequence
For a sequence of length n, the expected number of occurrences of a pattern of length m is E_m=p_m×(n−m+1)
sigma_m calculates how many times any pattern of length m is expected to deviate from the mean given by the central limit theorem.
The significance level ϵ dictates how strict the test is, with smaller ϵ (larger m) leading to stricter criteria.
f(m,n) determines the max allowed deviation, depending on the significance level.
Here is a sample run:
Senior Principal Security Certifications Expert at nCipher Security
4moNice read. In my experience, after years of designing and certifying random number generators for HSMs: For cryptographically secure PRNGs, the NIST deterministic random bit generators in SP 800-90A are based on well-understood primitives and are the industry standard today ... ok, despite the Dual EC DRBG incident in the past 😊 The PRNG should be re-seeded often and, whenever possible, using a well designed hardware entropy source. This way, your system has two security anchors: the computational strength from the PRNG and the information theoretical strength from the entropy source. I am not yet convinced that a QRNG is needed over a classical entropy source, but looking forward to reading part 2!
Senior Security Specialist at Microsoft - aka.ms/gsd = Get Security Deployed
4moRichard Diver - this might be of interest?
Vice President Global Chief Information Security Officer 2018 Global CISO of the year
4moDear Christophe Parisel, thank you for this great reading. You point (as usual) a good topic and an emerging security challenge. QRNGs should provide a better assurance of "true randomness". Making them (QRNG) crucial for generating crypto keys which require the highest level of security. As technology advances, our need for QRNGs may expand into even more areas. In a few words: "the post quantum RNG, will be the Quantum RNGs". Some experts may say that everyday applications will survive with classical RNG... However, I'm not quite convinced. As we know, the cyber threats work now adopting the 'detect and attack' opportunistic mode, with less focus on big, or small stakes on the victims' side. As Jean de la Fontaine would have said: "Whether you are powerful or miserable" ...we are all potential cyber targets. In this respect, like for any weakness hard to fix, RNG will be a security challenge.
Chief Executive Officer at Light Rider Inc
4moWe shouldn't fear them. We should embrace this technology as it will advance all aspects of our lives. ANU is a great starting point to learn more about QRNG and it's applications. https://qrng.anu.edu.au/
Senior Security Specialist at Microsoft - aka.ms/gsd = Get Security Deployed
4moAnother very intriguing article Christophe Parisel, thanks for sharing