The distribution of maximal prime gaps in Cramér's probabilistic model of primes

In the framework of Cramér's probabilistic model of primes, we explore the exact and asymptotic distributions of maximal prime gaps. We show that the Gumbel extreme value distribution exp(-exp(-x)) is the limit law for maximal gaps between Cramér's random primes. The result can be derived from a general theorem about intervals between discrete random events occurring with slowly varying probability monotonically decreasing to zero. A straightforward generalization extends the Gumbel limit law to maximal gaps between prime constellations in Cramér's model.


Introduction
In this paper we apply extreme value theory to Cramér's probabilistic model of primes [4]. But first let us say a few words about two mathematicians who pioneered these topics in mathematics.

The Swedish mathematician Harald Cramér (1893-1985) made long-lasting contributions to statistics and number theory. His model of primes continues to serve as a heuristic tool leading to new insights into the distribution of primes. Cramér wrote:

In investigations concerning the asymptotic properties of arithmetic functions, it is often possible to make an interesting heuristic use of probability arguments. If, e. g., we are interested in the distribution of a given sequence S of integers, we then consider S as a member of an infinite class C of sequences, which may be concretely interpreted as the possible realizations of some game of chance. It is then in many cases possible to prove that, with a probability = 1, a certain relation R holds in C, i. e. that in a definite mathematical sense "almost all" sequences of C satisfy R. Of course we cannot in general conclude that R holds for the particular sequence S, but results suggested in this way may sometimes afterwards be rigorously proved by other methods. [4, p. 25]

It is difficult to ascertain whether Harald Cramér ever met in person his contemporary, the German-American mathematician Emil Julius Gumbel (1891-1966), one of the founders of extreme value theory [8,9], which is used today for describing phenomena in vastly diverse areas, ranging from actuarial science to hydrology to number theory. In Statistics of Extremes (1958) Gumbel observed:

. . . many engineers and practical statisticians . . . are inclined to believe that, after all, nearly everything should be normal, and whatever turns out not to be so can be made normal by a logarithmic transformation. This is neither practical nor true. [9, p. 345]

In Les valeurs extrêmes des distributions statistiques (1935) Gumbel showed that extreme values taken from a sequence of i.i.d. random variables with an exponential distribution obey the double exponential limit law (now known as the Gumbel distribution) [8]. He reconfirmed the earlier result of Fisher and Tippett [5] that the same limit law also holds for extreme values of i.i.d. random variables with a normal distribution, and generalized it to a much wider class of initial distributions, the so-called exponential type. To wit:

For extreme values . . . one arrives at the doubly exponential distribution provided that the initial distribution belongs to the exponential type . . . This theory lends itself to numerous applications, since, in particular, the Gaussian distribution and the exponential distribution belong to the exponential type. [8, p. 154-155; translated from the French]

We will draw upon the work of both Cramér and Gumbel: in Section 4.2 we show that the limiting distribution of maximal prime gaps in Cramér's probabilistic model is the Gumbel extreme value distribution. In hindsight, the result is not very surprising: we know from computation and distribution fitting that the actual maximal prime gaps are indeed nicely approximated by the Gumbel distribution [12]. (What is somewhat surprising is that neither Cramér nor Gumbel published a similar result long ago, perhaps in the 1930s or 1940s. Did they possibly deem it obvious? We might never know.)

Once we establish the existence of a limit law for maximal prime gaps in Cramér's model with n urns, additional questions naturally arise (they are to be addressed elsewhere):
• Is the Gumbel distribution, after proper rescaling, also an (almost sure) limit law for record prime gaps observed in a single infinite sequence of Cramér's random primes? (Cf. [18] and [2, p. 193].)
• Is the Gumbel distribution, after proper rescaling, also a limit law for record gaps between true primes?
(The latter question appears particularly difficult.)

The existence of the limiting Gumbel distribution for the (non-identically distributed) maximal gaps between Cramér's random primes is also interesting in view of Mejzler's theorem [10, p. 201]. For non-identically distributed independent random variables, Mejzler's theorem states that the limiting distribution (if it exists at all) can be any distribution with a log-concave cdf. Thus, prime gaps in Cramér's model give us an example of non-identically distributed random variables which nevertheless possess a limit law that is allowed in i.i.d. situations as well.

Notation
$P_k$     the k-th random "prime" (RP) in Cramér's model
$U_n$     the n-th urn, producing RPs with probability $1/\log n$ in Cramér's model
$R_n$     a random variable: the longest uninterrupted run of RCs $\le n$
$G_n$     a random variable: the maximal gap between RPs $\le n$; $G_n := R_n + 1$
$\pi(x)$  the prime counting function, the total number of primes $p_k \le x$
$\Pi(x)$  the RP counting function, the total number of RPs $P_k \le x$

Definitions of gaps and runs
Prime gaps are distances between two consecutive primes, $p_k - p_{k-1}$ [15, 22, OEIS A001223].
In the context of Cramér's model (see next section) "prime" gaps refer to distances between consecutive RPs, $P_k - P_{k-1}$.
Maximal gaps between primes are usually defined as gaps that are strictly greater than all preceding gaps [15, 22, A005250]. However, for Cramér's model, we will use the term maximal gap in the statistical sense defined below (see footnote 1). Note that Cramér's model does not guarantee that there are any "primes" $P_k > 2$ at all. To make sure that maximal gaps are defined in all cases, we first define the longest run of random "composites" (RCs) $\le n$:

$R_n$ = the longest run of consecutive RCs $\le n$ (allowing runs of length 0).

We now define the maximal gap between RPs $\le n$ simply as

$$G_n = R_n + 1.$$

Compare our definition of $G_n$ to the definition of maximal prime gaps for true primes [15]:

$$\text{maximal prime gap up to } n = \max_{p_k \le n} (p_k - p_{k-1}).$$

Clearly, the same gap/run relation holds for true primes as well:

maximal prime gap = 1 + the longest run of composites below $n$.

Of course, in the probabilistic model, both $R_n$ and $G_n$ are random variables. In what follows, we will investigate the distribution of maximal gaps $G_n$.

Footnote 1. Number theorists may use the terms maximal gaps and record gaps as synonyms [3], while statisticians make a distinction between maximal and record values [2,14,18]. Resnick [18, Theorem 8] shows that the distinction is quite profound: in the i.i.d. case, the limit laws for records and maxima cannot be the same. Nevertheless, for a wide class of sequences of non-identically distributed random variables (the $F^\alpha$ model) the same extreme value distribution (e.g. the Gumbel distribution) can be the limit law for both records and maxima [2, p. 193]. Clearly, Cramér's maximal "prime" gaps near, say, urn $U_{10}$ are not distributed identically to those near urn $U_{100}$. In Sect. 4.2 we show that the limiting distribution of Cramér's maximal "prime" gaps is indeed the Gumbel distribution. Computational evidence suggests that the Gumbel distribution is also the a.s. limit law for record gaps in Cramér's model; we will discuss the distribution of records elsewhere.
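The gap/run relation for true primes can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names are illustrative, and the run of composites is counted only up to the largest prime not exceeding n, which is an assumption made here so that a trailing run of composites does not distort the comparison.

```python
def primes_up_to(n):
    """Return all primes <= n (n >= 2) using the sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(range(i * i, n + 1, i))
    return [k for k, is_prime in enumerate(sieve) if is_prime]


def max_prime_gap(n):
    """Largest difference between consecutive primes <= n."""
    ps = primes_up_to(n)
    return max(q - p for p, q in zip(ps, ps[1:]))


def longest_composite_run(n):
    """Longest run of consecutive composites between 2 and the largest prime <= n."""
    ps = set(primes_up_to(n))
    best = run = 0
    for k in range(2, max(ps) + 1):
        run = 0 if k in ps else run + 1   # a prime ends the current run
        best = max(best, run)
    return best


n = 100
print(max_prime_gap(n), 1 + longest_composite_run(n))
```

For n = 100 both quantities are determined by the primes 89 and 97 and the seven composites between them.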

Setting up the model
Cramér [4] sets up the model of primes as follows:

With respect to the ordinary prime numbers, it is well known that, roughly speaking, we may say that the chance that a given integer $n$ should be a prime is approximately $1/\log n$. This suggests that by considering the following series of independent trials we should obtain sequences of integers presenting a certain analogy with the sequence of ordinary prime numbers $p_n$. Let $U_1, U_2, U_3, \ldots$ be an infinite series of urns containing black and white balls, the chance of drawing a white ball from $U_n$ being $1/\log n$ for $n > 2$, while the composition of $U_1$ and $U_2$ may be arbitrarily chosen. We now assume that one ball is drawn from each urn, so that an infinite series of alternately black and white balls is obtained. If $P_n$ denotes the number of the urn from which the n-th white ball in the series was drawn, the numbers $P_1, P_2, \ldots$ will form an increasing sequence of integers, and we shall consider the class C of all possible sequences $(P_n)$. Obviously the sequence S of ordinary prime numbers $(p_n)$ belongs to this class. We shall denote by $\Pi(x)$ the number of those $P_n$ which are $\le x$, thus forming an analogy to the ordinary notation $\pi(x)$ for the number of primes $p_n \le x$. [4, pp. 25-26]

Cramér's model, as stated, is underdetermined: the content of urns $U_1$ and $U_2$ is arbitrary. To compute exact distributions of maximal gaps, we will assume that
• urn $U_1$ is empty: it produces neither "primes" nor "composites";
• urn $U_2$ always produces white balls (i.e. the number 2 is certain to be "prime").
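The urn model with the two conventions above can be simulated directly. The sketch below is an illustration only, not the paper's code: the function names, the seed, and the choice n = 1000 are assumptions, and the longest run of random composites is counted over urns 3 to n, including a trailing run that is not closed by a random prime.

```python
import math
import random


def cramer_random_primes(n, rng):
    """Draw one ball from each urn U_2..U_n; return the random primes <= n.

    U_1 is skipped (empty), U_2 always yields a white ball, and U_k
    (k >= 3) yields a white ball with probability 1/log k.
    """
    rps = [2] if n >= 2 else []
    for k in range(3, n + 1):
        if rng.random() < 1.0 / math.log(k):
            rps.append(k)
    return rps


def longest_rc_run(n, rps):
    """Longest run of consecutive random composites <= n (0 if there are none)."""
    is_rp = set(rps)
    best = run = 0
    for k in range(3, n + 1):
        run = 0 if k in is_rp else run + 1
        best = max(best, run)
    return best


rng = random.Random(2024)          # fixed seed so the run is reproducible
n = 1000
rps = cramer_random_primes(n, rng)
R_n = longest_rc_run(n, rps)
G_n = R_n + 1                      # maximal gap between random primes <= n
print(len(rps), "random primes <= n; maximal gap G_n =", G_n)
```

Repeating the draw many times with different seeds gives an empirical distribution of G_n that can be compared with the exact and limiting distributions discussed later.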
Computations of Oliveira e Silva, Herzog, and Pardi [16] have verified that [...]. Thus the conjecture appears quite likely to be true.

4 Maximal gaps between Cramér's random primes

4.1 The exact distribution of maximal gaps
We continue where Cramér left off. To obtain exact distributions of maximal gaps between Cramér's random primes, for now we will restrict ourselves to finite sets of $n$ consecutive urns $U_1, U_2, \ldots, U_n$. When our set of urns is small, we can compute the exact distributions of maximal gaps by hand.
For example, for $n = 3$ we have only three urns: $U_1$, $U_2$, $U_3$. Of these, only $U_3$ produces random results:
• a white ball (RP) with probability $q_3 = 1/\log 3 \approx 0.91$, or
• a black ball (RC) with probability $1 - q_3 = 1 - 1/\log 3 \approx 0.09$.
Thus, for the longest run of RCs, $R_3$, we have $R_3 = 0$ with probability 0.91, and $R_3 = 1$ with probability 0.09. Consequently, for the maximal gap between RPs, we have $G_3 = 1$ with probability 0.91, and $G_3 = 2$ with probability 0.09. (Recall that, by definition, $G_n = R_n + 1$.) One can visualize the exact distributions of maximal gaps in the form of histograms. With the help of a computer, we can find the exact distributions up to, say, $n = 40$ urns. Figure 1 shows the respective computer-generated distributions of maximal gaps.
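One way to obtain such exact distributions by computer is a small dynamic program over the urns. This is a sketch under the stated urn conventions ($U_1$ empty, $U_2$ always white), not necessarily the algorithm used for Figure 1; the function name and the state layout are illustrative assumptions.

```python
import math


def exact_gap_distribution(n):
    """Return {g: P(G_n = g)} for Cramér's model with urns U_1..U_n (n >= 2)."""

    def prob_run_at_most(r):
        # state[c] = probability that the current run of RCs has length c
        # and that no run so far has exceeded r
        state = [0.0] * (r + 1)
        state[0] = 1.0                     # urn U_2 is white: the run starts at 0
        for k in range(3, n + 1):
            q = 1.0 / math.log(k)          # chance of a white ball from U_k
            new = [0.0] * (r + 1)
            new[0] = q * sum(state)        # a white ball resets the run
            for c in range(r):
                new[c + 1] = (1.0 - q) * state[c]   # a black ball extends it
            state = new                    # mass that would reach r + 1 is dropped
        return sum(state)

    max_run = max(n - 2, 0)
    cdf = [prob_run_at_most(r) for r in range(max_run + 1)]   # P(R_n <= r)
    pmf = {1: cdf[0]}
    for r in range(1, max_run + 1):
        pmf[r + 1] = cdf[r] - cdf[r - 1]   # P(G_n = r + 1) = P(R_n = r)
    return pmf


for gap, prob in exact_gap_distribution(3).items():
    print(gap, round(prob, 4))
```

For n = 3 this reproduces the hand computation above: the two probabilities are $1/\log 3$ and $1 - 1/\log 3$. The cost grows roughly like $n^3$, which is negligible for n = 40.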

The limiting distribution of maximal gaps
In Figure 1, the exact distributions (histograms) of maximal gaps between RPs appear to approach the pdf curves of the Gumbel distribution with scale $\alpha_n = n/\operatorname{li} n$ and mode $\mu = \alpha_n \log \operatorname{li} n$. We can restate this observation in a more precise form:

Theorem 1. In Cramér's model with $n$ urns, the Gumbel distribution $\exp(-e^{-z})$ is the limiting distribution of maximal gaps $G_n$ between RPs: there exist $a_n > 0$ and $b_n$ such that

$$\lim_{n\to\infty} P(G_n \le x \equiv a_n z + b_n) = \exp(-e^{-z}),$$

where $a_n \sim \alpha_n = n/\operatorname{li} n$ and $b_n \sim \alpha_n \log \operatorname{li} n$.

Equivalently, for $R_n$ (the longest runs of RCs) we have

$$\lim_{n\to\infty} P(R_n \le a_n z + b_n) = \exp(-e^{-z}).$$
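To get a feel for the rescaling parameters in Theorem 1, the scale $n/\operatorname{li} n$ and the mode $(n/\operatorname{li} n)\log\operatorname{li} n$ can be evaluated numerically. The sketch below computes li via the classical series $\operatorname{li} x = \gamma + \log\log x + \sum_{k\ge1} (\log x)^k/(k\cdot k!)$; the helper names are illustrative and not from the paper.

```python
import math

EULER_GAMMA = 0.5772156649015329


def li(x):
    """Logarithmic integral li(x) for x > 1, summed from its power series."""
    u = math.log(x)
    total = EULER_GAMMA + math.log(u)
    term = 1.0
    for k in range(1, 500):
        term *= u / k                 # term = u^k / k!
        total += term / k
        if term / k < 1e-17 * abs(total):
            break                     # remaining terms are below double precision
    return total


def gumbel_parameters(n):
    """Scale alpha_n = n / li n and mode alpha_n * log(li n)."""
    alpha = n / li(n)
    return alpha, alpha * math.log(li(n))


def gumbel_cdf(x, scale, mode):
    """Gumbel cdf exp(-exp(-(x - mode) / scale))."""
    return math.exp(-math.exp(-(x - mode) / scale))


for n in (40, 10**3, 10**6):
    alpha, mu = gumbel_parameters(n)
    print(n, round(alpha, 3), round(mu, 3), round(gumbel_cdf(mu + alpha, alpha, mu), 4))
```

The last printed column is the Gumbel cdf one scale unit above the mode, which equals $\exp(-e^{-1})$ regardless of n; this is what the rescaling $z = (x - b_n)/a_n$ expresses.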
We will sketch two proofs of Theorem 1. The first proof will use the following lemmas.
Lemma of Common Median. Suppose two Gumbel distributions have a common median and different scales $a \pm \varepsilon$, where $0 < \varepsilon < a/2$. Then the cdfs of these Gumbel distributions differ by no more than $\varepsilon/a$.

Lemma of Common Scale. Suppose two Gumbel distributions have a common scale $a \ge 1$ and medians $M \pm \delta$. Then these Gumbel cdfs differ by no more than $\delta/a$.

Denote by $F_n(x)$ the cdf of the exact distribution of maximal gaps in Cramér's model with $n$ urns $U_1, \ldots, U_n$ ($n \ge 2$), as defined in Section 3.1. Denote by $I_n$ ($n \ge 10$) the largest interval of the x axis such that for all $x \in I_n$ we have

$$(\log n)^{-1} \le F_n(x) \le 1 - (\log n)^{-1}.$$

In the Squeeze Lemma one can take $\varepsilon = 3/2$, $\delta = 1$. (See Fig. 2. The sequence $\{M_n\}$ is OEIS A235492 [22].)

Hereafter we often use the following expressions giving the Gumbel distribution cdf in terms of its scale $a$, mode $\mu$ and median $M$:

$$\mathrm{Gumbel}(x; a, \mu) = \exp\left(-e^{-(x-\mu)/a}\right) = 2^{-e^{-(x-M)/a}}, \qquad M = \mu - a \log\log 2 \approx \mu + 0.3665\,a.$$
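The two lemmas on Gumbel cdfs can be sanity-checked on a grid. This sketch assumes the median form of the Gumbel cdf, $2^{-\exp(-(x-M)/a)}$, and reads "scales $a \pm \varepsilon$" as the pair $a - \varepsilon$, $a + \varepsilon$ (likewise for the medians); the sample values of $a$, $M$, $\varepsilon$, $\delta$ are arbitrary illustrations, and a grid check is of course not a proof.

```python
import math


def gumbel_cdf_median_form(x, scale, median):
    """Gumbel cdf written via its median M: 2^(-exp(-(x - M) / scale))."""
    return 2.0 ** (-math.exp(-(x - median) / scale))


def max_cdf_difference(params1, params2, lo, hi, steps=20000):
    """Largest |F1(x) - F2(x)| over an evenly spaced grid on [lo, hi]."""
    best = 0.0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        d = abs(gumbel_cdf_median_form(x, *params1)
                - gumbel_cdf_median_form(x, *params2))
        best = max(best, d)
    return best


a, M, eps, delta = 10.0, 50.0, 1.5, 1.0      # arbitrary sample values
lo, hi = M - 8 * a, M + 15 * a

# Common median M, scales a - eps and a + eps: bound eps / a
diff_median = max_cdf_difference((a - eps, M), (a + eps, M), lo, hi)
# Common scale a, medians M - delta and M + delta: bound delta / a
diff_scale = max_cdf_difference((a, M - delta), (a, M + delta), lo, hi)

print(diff_median, "<=", eps / a)
print(diff_scale, "<=", delta / a)
```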

First proof of Theorem 1
Suppose the number of urns $n$ is large. We will examine two cases: (a) $x \in I_n$; (b) $x \notin I_n$.
Case (a). First, we consider $F_n(x)$ for $x \in I_n$, i.e., $(\log n)^{-1} \le F_n(x) \le 1 - (\log n)^{-1}$. Let us estimate $M_n$, the median of $F_n(x)$. As before, let $\alpha_n = n/\operatorname{li} n$ and $\varepsilon = 3/2$. We can approximate Cramér's proportions of white balls in urns using two different procedures:
(i) set the white-to-black-balls ratio in all urns to $1/(\alpha_n + \varepsilon)$, then increase the percentages of white balls in all urns or, alternatively,
(ii) set the white-to-black-balls ratio in all urns to $1/(\alpha_n - \varepsilon)$, then reduce the percentages of white balls in all but a small subset of urns.
Observe that increasing the percentage of white balls pushes the median of maximal gaps to the left, while reducing this percentage pushes the median to the right. Therefore, $M_n$ must be somewhere between the medians of $(\mathrm{Exp}(x; \alpha_n - \varepsilon))^{\operatorname{li} n}$ and $(\mathrm{Exp}(x; \alpha_n + \varepsilon))^{\operatorname{li} n}$:

$$\operatorname{median}\,(\mathrm{Exp}(x; \alpha_n - \varepsilon))^{\operatorname{li} n} \le M_n \le \operatorname{median}\,(\mathrm{Exp}(x; \alpha_n + \varepsilon))^{\operatorname{li} n}.$$
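Since $\alpha_n = O(\log n)$ while $\varepsilon = O(1)$, the above lower and upper bounds are asymptotic to each other and to the median of $(\mathrm{Exp}(x; \alpha_n))^{\operatorname{li} n}$; therefore we must have

$$M_n \sim \operatorname{median}\,(\mathrm{Exp}(x; \alpha_n))^{\operatorname{li} n} \quad \text{as } n \to \infty.$$

That is to say, as $n \to \infty$, the median $M_n$ of $F_n(x)$ must be asymptotic to the median of the limiting distribution of maxima of $\lfloor \operatorname{li} n \rfloor$ i.i.d. random variables with the exponential distribution $\mathrm{Exp}(x; \alpha_n)$. But this limiting distribution is precisely the Gumbel distribution $\mathrm{Gumbel}(x; \alpha_n, \alpha_n \log \operatorname{li} n)$ [6,8,11], so

$$M_n \sim \operatorname{median}\,\mathrm{Gumbel}(x; \alpha_n, \alpha_n \log \operatorname{li} n) \approx \alpha_n \log \operatorname{li} n + 0.3665\,\alpha_n.$$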
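The closeness of the two medians in this step can be illustrated numerically. The sketch assumes that $\mathrm{Exp}(x; \alpha)$ denotes the exponential cdf $1 - e^{-x/\alpha}$ with scale $\alpha$, so that the maximum of $m$ i.i.d. such variables has cdf $(1 - e^{-x/\alpha})^m$; the function names and the sample values of $\alpha$ and $m$ are illustrative.

```python
import math


def median_of_max_exponentials(alpha, m):
    """Median of the maximum of m i.i.d. exponentials with scale alpha,
    i.e. the x solving (1 - exp(-x / alpha))^m = 1/2."""
    p = -math.expm1(-math.log(2.0) / m)     # p = 1 - 2^(-1/m), computed stably
    return -alpha * math.log(p)


def gumbel_median(alpha, m):
    """Median of Gumbel(x; alpha, alpha * log m): mode - alpha * log(log 2)."""
    return alpha * math.log(m) - alpha * math.log(math.log(2.0))


alpha = 12.7                                # sample scale, roughly n / li n for n = 10^6
for m in (10, 1000, 78627):
    exact = median_of_max_exponentials(alpha, m)
    approx = gumbel_median(alpha, m)
    print(m, round(exact, 5), round(approx, 5), round(exact - approx, 6))
```

The constant 0.3665 in the text is $-\log\log 2$, and the difference between the two medians shrinks roughly like $\alpha \log 2 / (2m)$ as $m$ grows.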
On the other hand, it follows from the lemmas that $F_n(x)$ is squeezed in the $\Delta_n$-neighborhood of the Gumbel cdf $2^{-e^{-(x-M_n)/\alpha_n}}$, where $\varepsilon$ and $\delta$ are defined as in the Squeeze Lemma, $\Delta_n = (\varepsilon + \delta)/\alpha_n$, and $\alpha_n = n/\operatorname{li} n \sim \log n \to \infty$ as $n \to \infty$.
To sum it up: the median $M_n$ is asymptotic to the median of $\mathrm{Gumbel}(x; \alpha_n, \alpha_n \log \operatorname{li} n)$, while the limiting shape of $F_n(x)$ is dictated by the fact that $F_n(x)$ is in the $\Delta_n$-neighborhood of the Gumbel cdf $2^{-e^{-(x-M_n)/\alpha_n}}$, with $\lim_{n\to\infty} \Delta_n = 0$. This completes the proof for case (a).
Taking into account the monotonicity of cdfs, we can conclude from Lemmas that, for large n,

Second proof of Theorem 1
We begin by proving the theorem for longest runs $R_n$ of "composites". In Cramér's model, urns $U_n$ ($n \ge 3$) produce white balls with a monotonically decreasing, slowly varying probability $1/\log n \to 0$ as $n \to \infty$. We observe that the more general Theorem 2 (see Appendix) is applicable to our situation. Theorem 2 tells us that, as $n \to \infty$, the limiting distribution of longest runs exists; it is the Gumbel distribution with the scale and mode parameters determined by $E\Pi(n)$, the expected total number of RPs $\le n$:

$$\text{scale } a_n \sim \frac{n}{E\Pi(n)} = \frac{n}{\operatorname{li} n + O(1)} \sim \frac{n}{\operatorname{li} n},$$

$$\text{mode } b_n \sim \frac{n \log E\Pi(n)}{E\Pi(n)} = \frac{n \log(\operatorname{li} n + O(1))}{\operatorname{li} n + O(1)} \sim \frac{n}{\operatorname{li} n} \log \operatorname{li} n.$$

Here we have used the fact that, for $n \ge 3$, the expected total number of RPs $\le n$ is

$$E\Pi(n) = 1 + \sum_{k=3}^{n} \frac{1}{\log k} = \operatorname{li} n + O(1).$$

Thus we can use the above scale and mode as rescaling parameters $a_n$ and $b_n$, to obtain the standard Gumbel distribution $\exp(-e^{-z})$. Note that $1 \ll a_n \ll b_n$ as $n \to \infty$; so the distribution rescaling formula $z = (x - b_n)/a_n$ will produce approximately equal values of $z$, no matter whether we are rescaling the longest runs $R_n$ or maximal gaps $G_n \equiv R_n + 1$.
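The estimate $E\Pi(n) = \operatorname{li} n + O(1)$ can be justified by a standard integral comparison. The derivation below is a sketch under the urn conventions adopted earlier ($U_1$ empty, $U_2$ always white, $U_k$ white with probability $1/\log k$ for $k \ge 3$); it is not quoted from the source.

```latex
\begin{align*}
E\Pi(n) &= 1 + \sum_{k=3}^{n} \frac{1}{\log k}
  && \text{(linearity of expectation; the 1 comes from urn } U_2\text{)} \\
\int_{3}^{n+1} \frac{dt}{\log t}
  &\le \sum_{k=3}^{n} \frac{1}{\log k}
  \le \int_{2}^{n} \frac{dt}{\log t}
  && \text{(since } 1/\log t \text{ is decreasing for } t > 1\text{)} \\
\int_{2}^{n} \frac{dt}{\log t} &= \operatorname{li} n - \operatorname{li} 2,
\qquad
\int_{3}^{n+1} \frac{dt}{\log t} \ge \operatorname{li} n - \operatorname{li} 3 \\
\Longrightarrow\quad
\operatorname{li} n - \operatorname{li} 3 + 1 &\le E\Pi(n) \le \operatorname{li} n - \operatorname{li} 2 + 1,
\quad\text{i.e.}\quad E\Pi(n) = \operatorname{li} n + O(1).
\end{align*}
```

Since $\operatorname{li} n \to \infty$, the bounded term does not affect the asymptotics of $n/E\Pi(n)$ or of $\log E\Pi(n)$.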

Properties of the distribution of maximal gaps
Let us look at the properties of the distribution of maximal gaps. We can readily see that the exact distribution of maximal gaps is discrete and bounded, while the limiting Gumbel distribution is continuous, smooth, and unbounded. Below we discuss two properties that are common to the exact and asymptotic distributions of maximal gaps: log-concavity and unimodality.

Log-concavity
Log-concavity of the limiting distribution of maximal gaps. It is well known (and easy to check given the formulas for the distribution's pdf and cdf) that the Gumbel distribution pdf and cdf are log-concave. Note that, in general, if $f(x)$ is a continuous distribution pdf and $F(x)$ is the corresponding cdf, then log-concavity of $f(x)$ implies log-concavity of $F(x)$.

Log-concavity of exact distributions of maximal gaps. A direct computational check shows that all the exact distribution functions we have computed in Section 4.1 are also log-concave. (However, as expected, the log-concavity is not preserved if we use a faster Monte-Carlo algorithm rather than the exact formula for computing the finite distributions.)
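A computational check of log-concavity on a grid can be sketched as follows. The test $v_k^2 \ge v_{k-1} v_{k+1}$ is the usual discrete criterion for a positive sequence; the function names, the small relative tolerance, and the sample Gumbel parameters are assumptions made for illustration, and the same helper could be applied to any computed cdf or pmf.

```python
import math


def is_log_concave(values, rel_tol=1e-9):
    """Check v[k]^2 >= v[k-1] * v[k+1] for all interior k, up to rounding."""
    return all(
        values[k] ** 2 >= values[k - 1] * values[k + 1] * (1.0 - rel_tol)
        for k in range(1, len(values) - 1)
    )


def gumbel_pdf(x, scale, mode):
    z = (x - mode) / scale
    return math.exp(-z - math.exp(-z)) / scale


def gumbel_cdf(x, scale, mode):
    return math.exp(-math.exp(-(x - mode) / scale))


scale, mode = 4.0, 12.0                      # arbitrary sample parameters
grid = range(0, 60)
pdf_values = [gumbel_pdf(x, scale, mode) for x in grid]
cdf_values = [gumbel_cdf(x, scale, mode) for x in grid]

print(is_log_concave(pdf_values), is_log_concave(cdf_values))
print(is_log_concave([0.1, 0.5, 0.1, 0.5]))  # a sequence that is not log-concave
```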

Unimodality
The Gumbel distribution $\mathrm{Gumbel}(x; a, \mu)$ is unimodal for any $\mu$ and any $a > 0$: it has a unique mode (most probable value), namely, $\mu$. What about the exact distribution of maximal gaps between RPs in Cramér's model with $n$ urns? By Theorem 1, the Gumbel distribution is the limiting distribution of maximal gaps between RPs in Cramér's model; therefore it is reasonable to expect that, for large $n$, in Cramér's model with $n$ urns the exact distribution of maximal gaps between RPs is also unimodal. On the other hand, for moderate values of $n$ one can check by direct computation that each of the exact distributions of maximal gaps between RPs is unimodal. Here we will state without a formal proof the following

Unimodality Lemma. In Cramér's model with $n$ urns, the exact distribution of maximal gaps between RPs is unimodal for each $n > 1$.

5 Appendix: Maximal intervals between random events occurring with slowly varying probability $\ell(t) \to 0$

In this appendix we give a general theorem whose particular case ($\ell(t) = 1/\log t$ for $t \ge 3$) has been used in Section 4.2.2. We say that a function $\ell(t) > 0$ is slowly varying if it is defined for positive $t$, and

$$\lim_{t\to\infty} \frac{\ell(\lambda t)}{\ell(t)} = 1 \quad \text{for any fixed } \lambda > 0$$

[10, p. 362].
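The slow variation of $\ell(t) = 1/\log t$ is easy to see numerically: the ratio $\ell(\lambda t)/\ell(t) = \log t / \log(\lambda t)$ creeps toward 1 as $t$ grows. A minimal sketch, with illustrative function names and $\lambda = 10$ chosen arbitrarily:

```python
import math


def ell(t):
    """The probability used in Cramér's model: 1 / log t (for t >= 3)."""
    return 1.0 / math.log(t)


def variation_ratio(t, lam):
    """The ratio ell(lam * t) / ell(t) from the definition of slow variation."""
    return ell(lam * t) / ell(t)


for exponent in (2, 10, 100, 300):
    t = 10.0 ** exponent
    print(exponent, variation_ratio(t, 10.0))
```

With $\lambda = 10$ and $t = 10^e$ the ratio is exactly $e/(e+1)$, so the convergence to 1 is slow, of order $1/\log t$.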
Theorem 2. Consider biased coins with tails probability $\ell(k)$ at the k-th toss, $0 < \ell(k) < 1$, where $\ell(t)$ is a smooth, slowly varying, monotonically decreasing function, and $\lim_{t\to\infty} \ell(t) = 0$. Then, after a large number $n$ of tosses, the asymptotic distribution of the longest runs of heads $R_n$ is the Gumbel distribution: there exist $a_n > 0$ and $b_n$ such that

$$\lim_{n\to\infty} P(R_n \le a_n z + b_n) = \exp(-e^{-z}),$$

where $a_n \sim \dfrac{n}{E\Pi(n)}$ and $b_n \sim \dfrac{n \log E\Pi(n)}{E\Pi(n)}$.

Here $E\Pi(n)$ is the mathematical expectation of the total number of tails $\Pi(n)$ observed during the first $n$ tosses:

$$E\Pi(n) = \sum_{k=1}^{n} \ell(k).$$

It may be surprising that the asymptotic distribution exists at all for discrete events occurring with a slowly varying probability $\ell(t) \to 0$. In contrast, for a biased coin with a constant positive probability of tails, the limiting distribution of the longest run of heads does not exist [20, p. 203].
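For a concrete tails probability, the rescaling parameters of Theorem 2 can be evaluated directly, using the fact that by linearity of expectation $E\Pi(n)$ is the sum of the tails probabilities of the first $n$ tosses. In the sketch below the function names are illustrative, and the example $\ell(k) = 1/\log(k+2)$ is a hypothetical choice, shifted so that $0 < \ell(k) < 1$ already holds at $k = 1$.

```python
import math


def expected_tails(ell, n):
    """E Pi(n): sum of the tails probabilities ell(k) for k = 1..n."""
    return math.fsum(ell(k) for k in range(1, n + 1))


def gumbel_rescaling(ell, n):
    """Leading-order scale n / E Pi(n) and mode n * log(E Pi(n)) / E Pi(n)."""
    e_pi = expected_tails(ell, n)
    a_n = n / e_pi
    b_n = a_n * math.log(e_pi)
    return a_n, b_n


def ell_log(k):
    # hypothetical example of a slowly varying, decreasing tails probability
    return 1.0 / math.log(k + 2)


for n in (10**3, 10**4, 10**5):
    a_n, b_n = gumbel_rescaling(ell_log, n)
    print(n, round(a_n, 3), round(b_n, 3))
```

The printed values show the scale growing slowly (roughly like $\log n$) while the mode grows faster, in line with $1 \ll a_n \ll b_n$.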
In the lemmas below, we assume the conditions of Theorem 2 and take λ to be arbitrarily large (λ ≫ 1). The lemmas are easy to prove using the theory of regularly varying functions [10, p. 361-367].
However, it is well known that for the geometrically distributed runs of heads with a constant-bias coin (i.e. the tails probability $q$ = const) the longest runs $R$ do not have a limiting distribution [1,20], with the geometric-to-exponential approximation error $O(q)$ preventing the convergence of the exact longest run distributions to the Gumbel distribution. Nevertheless, in our original setup with slowly varying bias of the coin, instead of a constant $q$ we have

$$q_n \sim a_n^{-1} \sim \ell(n) \sim \frac{E\Pi(n)}{n} \to 0 \quad \text{as } n \to \infty.$$

For the sequence $S(t, \lambda t)$ to contain the absolute longest run of heads $R_n$ up to $n = \lfloor \lambda t \rfloor$, we must take $\lambda$ very large, so $(\lambda - 1)/\lambda \to 1$, and from equation (1) we have $\log m \approx \log E\Pi(n)$. Now eqs. (2) and (3) yield the limiting distribution of $R_n$ under the theorem conditions:

$$\lim_{n\to\infty} P(R_n \le a_n z + b_n) = \exp(-e^{-z}),$$

where $a_n \sim \dfrac{n}{E\Pi(n)}$ and $b_n \sim \dfrac{n \log E\Pi(n)}{E\Pi(n)}$.