Chapter 14: Random Generation #
By a small sample we may judge of the whole piece.
— Miguel de Cervantes (Don Quixote de la Mancha)
Note: This chapter can be downloaded as a Sage Jupyter notebook from this link (or run in the browser at this link).
Instead of analytically determining asymptotics, another way to study the behaviour of combinatorial classes is to generate random objects of large size.
Uniform Random Generation #
A uniform generation algorithm for a combinatorial class $\mathcal{C}$ is a randomized algorithm that takes $n \in \mathbb{N}$ and returns an element in $\mathcal{C}$ of size $n,$ where every element of $\mathcal{C}_n$ is returned with probability $1/c_n.$
How do we generate randomness? Typically, we assume that we have access to a function rand()
that returns a random real number in $[0,1).$
It is possible to construct rand
using random coin flips (this is a topic for a computer science course). Whether there exists “true randomness” is also a question for computer science (or maybe philosophy).
We can use rand
to generate a random integer in $\{0,\dots,n\}$ by defining
$$\text{rand}(0\dots n) = \lfloor (n+1) \cdot \text{rand}() \rfloor$$
Our goal is to generate random algorithms that are correct, efficient, and easy to understand.
Below we discuss random generation using

Direct algorithms

Bijections

Recursive sampling

Boltzmann sampling (which is no longer uniform random generation)
Other common approaches to random generation include rejection sampling (randomly generate objects in a subset of the class you want until desired property holds) and ranking / unranking algorithms (efficiently order the elements in your class and then compute the $k$th element where $k$ random).
Direct Algorithms #
If the class is simple enough, we can directly implement a random generation algorithm.
Example: Find a uniform generation algorithm to compute a random binary string of length $n.$
Click for Answer
We directly generate binary digits one by one. In Sage code,
def rbin(n): # Every string appears with probability 1/2^n return [floor(2*RR.random_element(0,1)) for k in [1..n]]
where
RR.random_element(0,1)
returns a random real number in $(0,1).$ For instance, runningrbin(10)
returns (on the run saved in the Sage notebook corresponding to this chapter)[0, 1, 0, 1, 1, 0, 1, 0, 0, 0]
.
Bijections #
If we can randomly generate objects in a class $\mathcal{A},$ and know a bijection from $\mathcal{A}$ to another class $\mathcal{B},$ then we can randomly generate objects in the class $\mathcal{B}$ using the bijection.
Example: Find a uniform generation algorithm to compute a random subset of $\{1,\dots,n\}.$
Click for Answer
Recall we have a bijection $f$ from binary strings of length $n$ to subsets of $\{1,\dots,n\}$ where $f(b_1\cdots b_n) = \{k : b_k = 1\}.$ Thus, we can randomly generate subsets using the following code
def rset(n): b = rbin(n) return [k+1 for k in range(n) if b[k]==1]
For instance, running
rset(10)
returns (on the run saved in the Sage notebook corresponding to this chapter)[1, 2, 4, 5, 8, 10]
.
Random Sampling #
How can we use combinatorial specifications to randomly generate elements?
As usual, we start with our base cases. In pseudocode,
def genϵ(n):
if n = 0 return ϵ
else return NULL
def genZ(n):
if n = 1 return Z
else return NULL
Combinatorial Sum #
Suppose $\mathcal{A} = \mathcal{B} + \mathcal{C}$ and we have uniform generation algorithms genB(n)
and genC(n)
. How do we build genA(n)
?
If $\alpha \in \mathcal{A}_n$ then $\mathbb{P}[\alpha \in \beta_n] = \frac{b_n}{a_n} = \frac{b_n}{b_n+c_n}$ so we define
def genBplusC(n):
x = RR.random_element(0,1)
if x < b(n)/(b(n) + c(n)) return genB(n)
else return genC(n)
If $\alpha \in \mathcal{B}$ then it is returned with probability $$\frac{b_n}{b_n+c_n} \cdot \frac{1}{b_n} = \frac{1}{b_n+c_n} = \frac{1}{a_n},$$ and a similar argument works when $\alpha \in \mathcal{C},$ so we have a uniform random generation algorithm.
Combinatorial Product #
Suppose $\mathcal{A} = \mathcal{B} \times \mathcal{C}$ and we have uniform generation algorithms genB(n)
and genC(n)
. How do we build genA(n)
?
The probability that $\alpha \in \mathcal{A}_n$ is $\alpha = (\beta,\gamma)$ and $\beta=k$ is $\frac{b_kc_{nk}}{a_n}.$ Thus, we first randomly generate the size that $\beta$ should have, then generate objects of the appropriate size from $\mathcal{B}$ and $\mathcal{C}.$
def genBtimesC(n):
x = RR.random(0,1)
k = 0
a(n) = sum(b(k)*c(nk), k=0..n)
s = b(0)c(n)/a(n)
while x>s:
k = k + 1
s = s + b(k)*c(nk)/a(n)
return [genB(k),genC(nk)]
The probability that $(\beta,\gamma) \in \mathcal{B}_k \times \mathcal{C}_{nk}$ is returned is $$ \frac{b_k \cdot c_{nk}}{a_n} \cdot \frac{1}{b_k} \cdot \frac{1}{c_{nk}} = \frac{1}{a_n},$$ so we have a uniform random generation algorithm.
Sequence #
If $\mathcal{B}$ has no objects of size $0$ then $A = \text{SEQ}(B)$ is equivalent to $$\mathcal{A} = \epsilon + \mathcal{B} \times \mathcal{A},$$ so the sequence construction can be captured by sum and product using a recursive algorithm.
Uniform Random Binary Trees #
Let’s combine the above discussion to create a uniform generation algorithm to for (unlabelled) binary rooted trees. Translating the specification $$\mathcal{B} = \epsilon + \mathcal{Z} \times \mathcal{B} \times \mathcal{B}$$ into Sage code gives the following.
# Recursive Generation Algorithm
# B = E + Z x B x B
def genTree(n):
# Return ϵ from first part of +
if n==0:
return 'ϵ'
Bn = binomial(2*n,n)/(n+1) # Counting sequence
x = RR.random_element(0,1) # Random real in (0,1)
# Find the size k of the first element in a pair of trees of size n1
k = 0
s = (binomial(2*k,k)/(k+1) * binomial(2*(nk1),nk1)/(nk))/Bn
while x > s:
k = k+1
s = s + (binomial(2*k, k)/(k+1) * binomial(2*(nk1), nk1)/(nk))/Bn
# Return a tuple from the product
return ['Z', genTree(k), genTree(nk1)]
For instance, running genTree(3)
gives
Out: ['Z', ['Z', 'ϵ', 'ϵ'], ['Z', 'ϵ', 'ϵ']]
which corresponds to a binary tree whose root has two leaves for children.
Note: All outputs of our randomized algorithms are from the run saved in the Sage notebook corresponding to this chapter. It is highly recommend that readers run the code themselves to interactively play around with these algorithms!
In order to visualize our random binary trees, we rewrite the above code to use Sage’s builtin BinaryTree
data structure.
# Change code to output 'BinaryTree' data structure
# which we use for plotting with Sage commands
def genTreePlot(n):
# Return ϵ from first part of +
if n==0:
return None
Bn = binomial(2*n,n)/(n+1) # Counting sequence
x = RR.random_element(0,1) # Random real in (0,1)
# Find the size k of the first element in a pair of trees of size n1
k = 0
s = (binomial(2*k,k)/(k+1) * binomial(2*(nk1),nk1)/(nk))/Bn
while x > s:
k = k+1
s = s + (binomial(2*k, k)/(k+1) * binomial(2*(nk1), nk1)/(nk))/Bn
# Return a tuple from the product
return BinaryTree([genTreePlot(k), genTreePlot(nk1)])
# Code to plot tree (with various plot options set)
def plotTree(BT):
BT.graph().show(layout='tree',tree_orientation='down',figsize=(10,10),vertex_size=20, fig_tight=False, vertex_labels=False)
Running ascii_art(genTreePlot(10))
gives the following tree
A larger tree, with more advanced plotting, is obtained with the command plotTree(genTreePlot(100))
In addition to generating pretty pictures, we can also use this random generation algorithm to approximate statistics on binary trees.
For instance, recall that we previously used bivariate generating functions and LIFT to prove that the average number of leaves among the binary trees of size $n$ approaches $n/4$ as $n\rightarrow\infty.$ Running
from numpy import mean
N = 160
pt = [str(genTreePlot(N)).count('[., .]') for k in range(100)]
mean(pt)
returns the average number of leaves
Out: 40.7
among 100 randomly generated trees of size $160.$
In this manner, the automatic translation of specifications into random sampling algorithms gives a powerful tool for estimating the behaviour of combinatorial parameters (or a sanity check for analytically determined results). For instance, computing a histogram for the number of leaves in the trees computed above
plot(histogram(pt,bins=max(pt)  min(pt), density=True, range=[min(pt),max(pt)]))
suggests that the number of leaves approaches a normal distribution (a result that can also be proven analytically, using more advanced methods).
Boltzmann Sampling #
Recursive sampling is powerful, but can be slow for complicated objects. Instead of saying we need an object of size $n,$ we can look for an object of size (say) $[0.9n,1.1n]$ and (in general) do better. The elegant framework for handling this is Boltzmann sampling.
Recall that if $F(z) = \sum f_nz^n$ has radius of convergence $R$ then $F(p)$ converges to a complex number for $p < R.$ We call the points $v \in (0,R)$ the admissible values of $F.$
Let $\mathcal{A}$ be a combinatorial class and $v$ an admissible value for $A(x).$
A Boltzmann model at $v$ assigns each $\alpha \in \mathcal{A}$ the probability $$\mathbb{P}_v(\alpha) = \frac{v^{\alpha}}{A(v)}.$$
Note that $$ \sum_{\alpha \in \mathcal{A}} \mathbb{P}_v(\alpha) = \frac{\sum_{\alpha \in \mathcal{A}} v^{\alpha}}{A(v)} = \frac{A(v)}{A(v)}=1.$$ and $\mathbb{P}_v(\alpha)$ depends only on $\alpha.$ A Boltzmann generation algorithm with parameter $v$ is a randomized algorithm that returns $\alpha \in \mathcal{A}$ with probability $\mathbb{P}_v(\alpha).$
The value $v$ is a parameter: we can “tune” it to get objects of roughly the size we want. The expected size of the Boltzmann algorithm with parameter $v$ is $$ \sum_{\alpha \in \mathcal{A}} \alpha \mathbb{P}_v(\alpha) = \frac{\sum_{\alpha \in \mathcal{A}} \alpha v^{\alpha}}{A(v)} = \frac{vA^\prime(v)}{A(v)},$$ so we typically solve $n=\frac{vA^\prime(v)}{A(v)}$ for $v\in(0,R),$ if possible.
Example: For rooted binary trees with generating function $$B(z) = \frac{1\sqrt{14z}}{2z}$$ we have seen in past chapters that $R = 1/4.$ The following table shows the probabilities of generating trees of size $n=0,1,2,3$ for different values of $v$ approaching $R.$ The expected sizes of the trees generated are approximately
 $0.14$ if $v = 0.1$
 $0.62$ if $v = 0.2$
 $24.5$ if $v = 0.2499$
$v$ \ $n$  0  1  2  3 

0.1  0.88  0.09  0.01  0.001 
0.2  0.72  0.14  0.02  0.005 
0.2499  0.51  0.13  0.03  0.01 
Building Boltzmann Samplers #
The neutral class $\epsilon$ has generating function $1$ so its Boltzmann generator always returns a single $\epsilon$ with probability $v^0/1 = 1.$ The atomic class has generating function $z$ so its Boltzmann generator always returns a single atom with probability $v^1/v = 1.$
BOLTϵ(v): return ϵ
BOLTZ(v): return Z
What about our constructions?
Combinatorial Sum #
Suppose $\mathcal{A} = \mathcal{B} + \mathcal{C}$ and we have Boltzmann generation algorithms BOLTB(v)
and BOLTC(v)
. How do we build BOLTA(v)
?
The probability that $\alpha \in \mathcal{A}$ comes from $\mathcal{B}$ should be $$ \sum_{\beta \in \mathcal{B}} \frac{v^{\beta}}{A(v)} = \frac{B(v)}{A(v)} \in (0,1).$$
Note: Any admissible value for $A(z)$ is also admissible for $B(z).$
def BOLTA(v):
u = RR.random(0,1)
if u < B(v)/A(v) return BOLTB(v)
else return BOLTC(v)
Combinatorial Product #
Suppose $\mathcal{A} = \mathcal{B} \times \mathcal{C}$ and we have Boltzmann generation algorithms BOLTB(v)
and BOLTC(v)
. How do we build BOLTA(v)
?
If $\alpha = (\beta,\gamma)$ then $$\frac{v^{\alpha}}{A(v)} = \frac{v^{\beta + \gamma}}{A(v)} = \frac{v^{\beta}}{A(v)} \cdot \frac{v^{\gamma}}{A(v)}.$$
This means each component is independent.
def BOLTA(v):
return [BOLTB(v), BOLTC(v)]
This is much easier than recursive sampling.
Sequence #
If $\mathcal{A}=\text{SEQ}(B)$ we can still use the recursive trick discussed above, but can also be more direct.
The probability that a specific tuple $(\beta_1,\dots,\beta_k)$ appears is $$\frac{v^{\beta_1 + \cdots + \beta_k}}{1/(1B(v))} = \frac{v^{\beta_1}}{B(v)} \cdots \frac{v^{\beta_k}}{B(v)} \cdot B(v)^k \cdot (1B(v)),$$ so, the probability that a $k$tuple appears should be $$\sum_{\beta_1,\dots,\beta_k \in \mathcal{B}} \frac{v^{\beta_1}}{B(v)} \cdots \frac{v^{\beta_k}}{B(v)} \cdot B(v)^k \cdot (1B(v)) = B(v)^k (1B(v)).$$
This is known as a geometric distribution.
def BOLTA(v):
k = Geometric(B(v))
return ktuple [BOLTB(v),...,BOLTB(v)]
We can sample from geometric distributions using a simple algorithm.
# Output natural number k with P(k returned) = λ^k*(1λ)
def Geometric(λ):
# Check if λ in (0,1)
if not (λ > 0 and λ <1):
return "Error (λ must be in (0,1))"
u = RR.random_element(0,1) # Random real in (0,1)
k = 0
s = 1  λ
while s < u:
k = k+1
s = s + λ^k*(1λ)
return k
Boltzmann Binary Trees #
Translating the specification $$\mathcal{B} = \epsilon + \mathcal{Z} \times \mathcal{B} \times \mathcal{B},$$ into a Sage Boltzmann sampler immediately gives the following.
def BoltzmannTree(x):
u = RR.random_element(0,1)
T = (1sqrt(14*x))/(2*x)
if u < 1/T: return 'ϵ' # Sum term
return ['Z', BoltzmannTree(x), BoltzmannTree(x)] # Product term
Note how much simpler this is compared to the random sampling algorithm above. Running
[BoltzmannTree(0.2) for k in range(10)]
gives 10 randomly generated trees
[
'ϵ',
['Z', ['Z', 'ϵ', 'ϵ'], 'ϵ'],
['Z', 'ϵ', 'ϵ'],
['Z', 'ϵ', 'ϵ'],
['Z',['Z', 'ϵ',['Z',['Z', 'ϵ', 'ϵ'],['Z',['Z', 'ϵ', 'ϵ'],['Z', 'ϵ',['Z', 'ϵ',['Z', 'ϵ',['Z', 'ϵ', 'ϵ']]]]]]],'ϵ'],
'ϵ',
'ϵ',
['Z', 'ϵ', 'ϵ'],
'ϵ',
'ϵ'
]
of which five have size 0, three have size 1, one has size 2, and one has size 10. Although the size of the tree returned is not fixed, recall that the output is always uniformly random with respect to all the binary trees of the same size as the output.
The fact that we see many trees of size 0 is not a coincidence. Indeed, randomly generating 1000 trees with parameter value $v=0.24$ relatively close to $R=0.25$ gives the probability histogram
xx = 0.24
pt = [str(BoltzmannTreePlot(xx)).count('[') for k in range(1000)]
pexp = plot(histogram(pt,bins=max(pt), density=True, range=[min(pt),max(pt)]))
pexp
Thus, close to 60% of the generated trees will have size 0!
Boltzmann Theory and Marked Binary Trees #
The problem in our last example is that the distribution for the expected size of a generated binary tree, defined by $$ E(z) = \frac{zB^\prime(z)}{B(z)} $$ is too concentrated: the expected size $E(v)$ goes to infinity as $v\rightarrow 0.25,$ but only because there is a very small chance of generating a massive tree.
The theory of Boltzmann sampling was worked out in a wonderful paper of Duchon, Flajolet, Louchard, and Schaffer, from which we have adapted much of our presentation. In particular, under some natural assumptions we can repeatedly call a Boltzmann sampling algorithm until we get close to a desired size (within any constant fraction) and this approach has linear complexity in the desired size.
Theorem (DuchonFlajoletLouchardSchaffer 2004): Fix $\epsilon>0$ and let $\mathcal{C}$ be a combinatorial class such that $C(z)$ has a finite radius of convergence $R>0,$ $$\lim_{z\rightarrow R^} \frac{zC^\prime(z)}{C(z)} = \infty,$$ and a technical “variance condition” holds.
Let $z_n$ be the smallest positive root of $n = \frac{zC^\prime(z)}{C(z)}$ and repeatedly call
BOLTC(z_n)
until it returns something with size in $[(1\epsilon)n,(1+\epsilon)n].$ This happens with probability going to $1$ as $n\rightarrow\infty$ and on average takes $O(n)$ total operations.
Remark: Random generation has lots of practical applications (for instance, bugs in the Glasgow Haskell Compiler have been found by randomly generating large “λterms”).
To improve Boltzmann sampling for a specific class, we can consider related classes with less peaked distributions.
For instance, a marked rooted binary tree is a binary tree where one node is “marked”. Since all trees of size $n$ are marked in $n$ ways, we can randomly generate a rooted binary tree by generating a marked tree and forgetting which node is marked. Marking is useful for randomly generating object of large size, because objects of larger size can be marked in more ways.
If $\mathcal{B}^*$ is the class of marked rooted binary trees then $$ \mathcal{B}^* = \underbrace{\mathcal{Z}^* \times \mathcal{B} \times \mathcal{B}}_\text{root marked} \;+\; \underbrace{\mathcal{Z} \times \mathcal{B}^* \times \mathcal{B}}_\text{marker in left subtree} \;+\; \underbrace{\mathcal{Z} \times \mathcal{B} \times \mathcal{B}^*}_\text{marker in right subtree} $$ where, as usual, $\mathcal{B} = \epsilon + \mathcal{Z} \times \mathcal{B} \times \mathcal{B}.$
This leads to the following Boltzmann sampling algorithm (written using the Sage BinaryTree data structure for plotting).
def BoltzmannPointedTreePlot(x):
var('z')
u = RR.random_element(0,1)
T = (1sqrt(14*z))/(2*z)
pointedT = z*T.diff(z)
T = T.subs(z=x)
pointedT = pointedT.subs(z=x)
# handle union
if u < x*T^2/pointedT:
return BinaryTree([BoltzmannTreePlot(x), BoltzmannTreePlot(x)])
if u < x*T^2/pointedT + x*pointedT*T/pointedT:
return BinaryTree([BoltzmannPointedTreePlot(x), BoltzmannTreePlot(x)])
return BinaryTree([BoltzmannTreePlot(x), BoltzmannPointedTreePlot(x)])
Now we have a size distribution that is much less bunched around the origin.
xx = 0.24
pt = [str(BoltzmannPointedTreePlot(xx)).count('[') for k in range(1000)]
plot(histogram(pt,bins=max(pt), density=True, range=[min(pt),max(pt)]))
The following Boltzmann sampler repeats until it is within 10% of the input size.
# Helper function
def treeSize(BT): return str(BT).count('[')
# Repeatedly call Boltzmann sampler until tree of right size found
def BoltzmannReject(n):
T = (1sqrt(14*z))/(2*z)
pointedT = z*T.diff(z)
eq = (z*pointedT.diff(z)/pointedT) == n
xx = eq.find_root(0,1/4)
BT = BoltzmannPointedTreePlot(xx)
while treeSize(BT)<0.9*n or treeSize(BT)>1.1*n:
BT = BoltzmannPointedTreePlot(xx)
return BT
For instance, running
plotTree(BoltzmannReject(300))
returns the following tree of size 311 in around 250 milliseconds
We also note that our (marked) Boltzmann sampling algorithm is more efficient than uniform random sampling. For instance, running
timeit('BoltzmannReject(1000)')
timeit('genTreePlot(1000)')
shows that Boltzmann sampling takes approximately 765 milliseconds to generate a tree of size within 10% of 1000, while recursive sampling takes approximately 2 seconds. This difference grows as the size of the trees gets larger.