Chapter 10: Labelled Constructions #

One reason why Combinatorics has been slow to become accepted as part of mainstream Mathematics is the common belief that it consists of a bag of isolated tricks, a number of areas ... with little or no connexion between them. We shall see that they have numerous threads weaving them together into a beautifully patterned tapestry.

— Richard K. Guy

I wonder which is preferable, to walk around all your life swollen up with your own secrets until you burst from the pressure of them, or to have them sucked out of you, every paragraph, every sentence, every word of them, so at the end you’re depleted of all that was once as precious to you as hoarded gold, as close to you as your skin ... and must spend the rest of your days like an empty sack flapping in the wind, an empty sack branded with a bright fluorescent label so that everyone will know what sort of secrets used to be inside you?

— Margaret Atwood (The Blind Assassin)

Previously we’ve seen how to enumerate combinatorics objects by expressing them recursively using empty (neutral) elements, atoms, and various constructions (sum, product, sequence, etc.). In combinatorial contexts, it is often desirable to label the atomic elements in a class: many natural objects admit easy specifications using labelled atoms, and labelled enumeration is often easier than the unlabelled case as labelling breaks symmetries.

A combinatorial object of size $n$ is labelled by giving each of its atomic parts a distinct label from $[n]=\{1,\dots,n\}.$

Example: Let $\mathcal{G}$ be the combinatorial class of abstract graphs with size given by number of vertices (note that unlike most past examples, here we consider graphs defined only by their vertices and edge relations) and consider the unlabelled graph

with size 4. To assign the numbers $\{1,2,3,4\}$ to the vertices of this graph it is sufficient to give a number to the unique vertex of degree 3, as all other vertices have the same edge relations (they are all connected to this unique vertex and not each other). Thus, this unlabelled graph corresponds to the four labelled graphs

Similarly, the unlabelled graph

corresponds to the three labelled graphs

since each labelled graph is uniquely determined by which labelled vertices are adjacent to 1.

A combinatorial class is labelled by labelling each of its objects, where every object is repeated with all non-equivalent labellings.

Example: The unlabelled class $\text{SEQ}(\mathcal{Z})$ consists of all tuples of atoms $$ \text{SEQ}(\mathcal{Z}) = \{\epsilon, \; \bullet, \; \bullet \bullet, \; \bullet\bullet \bullet, \; \dots \}$$ so the labelled version contains the objects

The labelled class $\text{SEQ}(\mathcal{Z})$ represents all permutations (in 1-line notation): the objects of size $n$ are all sequences of $\{1,\dots,n\}$ where each number appears exactly once.

Note that as an unlabelled class $\mathcal{C} = \text{SEQ}(\mathcal{Z})$ has counting sequence $c_n = 1$ while as a labelled class it has counting sequence $c_n=n!.$ Corresponding to this difference, we define a modified generating function for labelled objects. More precisely, the exponential generating function of a sequence $(c_n)$ is the formal power series $$ C(z) = \sum_{n \geq 0} \frac{c_n}{n!}z^n $$ whose coefficients are scaled by $n!.$

Example: The exponential generating function for the class $\mathcal{S}$ of permutations is $$ S(z) = \sum_{n \geq 0} \frac{n!}{n!} z^n = \sum_{n \geq 0}z^n = \frac{1}{1-z}.$$ Thus, the ordinary generating function of $\text{SEQ}(\mathcal{Z})$ as an unlabelled class equals its exponential generating function as a labelled class. We show below that this holds much more generally.

Factorials appear frequently in labelled enumeration due to the ability to permute labels, however as seen in our first example above you cannot in general just multiply the number of unlabelled objects of size $n$ by $n!$ to get the number labelled objects. The number of labelled objects in a class depends on how many labellings are equivalent, which corresponds to the symmetries present in each object.

Labelled Constructions #

As for unlabelled objects, we can build labelled classes recursively. One major takeaway from this section is that the exponential generating functions of labelled classes behave the same as the ordinary generating functions of unlabelled classes under our basic constructions.

The labelled atomic class $\mathcal{Z}=\{ 1 \}$ is the combinatorial class containing one object of size $1$ and no objects of any other size. The neutral class $\epsilon$ is the combinatorial class containing one labelled object of size $0$ and no objects of any other size.

Analogously to the unlabelled case, we have that the exponential generating functions of the atomic class and neutral class are $z$ and $1,$ respectively.

Labelled Sum #

As in the unlabelled case, the (labelled) sum $\mathcal{A} + \mathcal{B}$ of two labelled combinatorial classes $\mathcal{A}$ and $\mathcal{B}$ is the new labelled class whose objects are the disjoint union of the objects in $\mathcal{A}$ and $\mathcal{B},$ and whose size function is inherited from $\mathcal{A}$ and $\mathcal{B}.$

Example: As labelled classes we have

Lemma 1: If $\mathcal{C} = \mathcal{A} + \mathcal{B}$ as labelled classes then their exponential generating functions satisfy $C(z) = A(z) + B(z).$

Proof of Lemma 1

An element of $\mathcal{C}$ with size $n$ is either an element of $\mathcal{A}$ with size $n$ or an element of $\mathcal{B}$ with size $n.$ Since we take the disjoint union, the counting sequences of these classes satisfy $c_n = a_n + b_n$ and $$ \begin{aligned} C(z) &= \sum_{n \geq 0} \frac{c_n}{n!}z^n \\[+2mm] &= \sum_{n \geq 0} \frac{\left(a_n + b_n\right)}{n!} z^n \\[+2mm] &= \sum_{n \geq 0} \frac{a_n}{n!} z^n + \sum_{n \geq 0}\frac{b_n}{n!} z^n \\[+4mm] &= A(z) + B(z). \end{aligned} $$

Labelled Product #

Let $\mathcal{A}$ and $\mathcal{B}$ be two labelled combinatorial classes. To mirror the unlabelled case, we want the objects of the labelled product $\mathcal{A} \times \mathcal{B}$ to consist of pairs $(a,b)$ with $a\in \mathcal{A}$ and $b\in \mathcal{B}.$ One immediate problem, however, is that we need to “merge” the labellings of $a$ and $b$ to get a proper labelling of $(a,b).$

Example: The pair of elements

is not properly labelled, as the label 1 appears twice.

A consistent relabelling of a labelled pair $(a,b)$ of size $|a|+|b|$ is an assignment of $\{1,2,\dots,|a|+|b|\}$ to the atomic elements in $a$ and $b$ such that each number is used exactly once, and

the new labels on the elements of $a$ are in the same relative order as they were in the original labelling of $a$
the new labels on the elements of $b$ are in the same relative order as they were in the original labelling of $b$

We let $a \star b$ denote the set of all consistent relabellings of $(a,b).$

Example: The set

equals

It does not, for instance, have the element

since the labels in the second object are not in increasing order.

The labelled product $\mathcal{A} \times \mathcal{B}$ of the labelled classes $\mathcal{A}$ and $\mathcal{B}$ is the class whose objects form the union $$ \mathcal{A} \times \mathcal{B} = \bigcup_{\substack{a \in \mathcal{A} \ b \in \mathcal{B}}} (a \star b)$$ of all consistent relabellings of the pairs of objects in $\mathcal{A}$ and $\mathcal{B},$ with size function $$|(\alpha,\beta)| = |\alpha|_\mathcal{A} + |\beta|_\mathcal{B}.$$

Example: The labelled product of the classes

is the class $\mathcal{A} \times \mathcal{B}$ containing the objects

Lemma 2: If $\mathcal{C} = \mathcal{A} \times \mathcal{B}$ as labelled classes then their exponential generating functions satisfy $C(z) = A(z) B(z).$

Proof of Lemma 2

For any $0 \leq k \leq n$ there are $a_k$ objects in $\mathcal{A}$ with size $k$ and $b_{n-k}$ objects in $\mathcal{B}$ with size $n-k.$ If $\alpha$ has size $k$ and $\beta$ has size $n-k$ then, because we keep the same relative order of labels, a consistent relabelling of $(\alpha,\beta)$ is uniquely defined by assigning $k$ elements of $\{1,\dots,n\}$ to the atoms in $\alpha.$ Thus, there are $\binom{n}{k}$ ways to consistently relabel such a pair of objects, and $$ \begin{aligned} C(z) = \sum_{n \geq 0} c_nz^n &= \sum_{n \geq 0}\left(\sum_{k=0}^n \binom{n}{k}a_kb_{n-k}\right)\frac{z^n}{n!} \\[+4mm] &= \sum_{n \geq 0}\left(\sum_{k=0}^n \frac{n!}{k!(n-k)!} \; a_kb_{n-k}\right)\frac{z^n}{n!} \\[+4mm] &= \sum_{n \geq 0}\left(\sum_{k=0}^n \frac{a_k}{k!} \; \frac{b_{n-k}}{(n-k)!}\right)z^n \\[+4mm] &= A(z)B(z). \end{aligned} $$

Labelled Power and Sequence #

As in the unlabelled case, for any positive integer $k$ and labelled class $\mathcal{A}$ we define the labelled power class $$\mathcal{A}^k = \underbrace{\mathcal{A} \times \mathcal{A} \times \cdots \times \mathcal{A}}_{k \text{ times}}$$ and, when $\mathcal{A}$ has no objects of size zero, the labelled sequence class $$\text{SEQ}(\mathcal{A}) = \epsilon + \mathcal{A} + \mathcal{A}^2 + \mathcal{A}^3 + \cdots. $$ Since the labelled power and sequences classes are defined in terms of the labelled sum and product, Lemmas 1 and 2 above imply that the exponential generating functions of these classes equal the ordinary generating functions $A(z)^k$ and $\frac{1}{1-A(z)}$ of their unlabelled counterparts.

Labelled Set #

Let $\mathcal{A}$ be a labelled combinatorial class. For any positive integer $k$ the class $\text{SET}_k(\mathcal{A})$ consists of all objects in $\mathcal{A}^k$ where two tuples are considered equal if they are equal up to a permutation.

Example: As elements of $\text{SEQ}_3(\mathcal{A})$ the objects $$ \begin{aligned} &(a_1,a_2,a_3) \quad (a_1,a_3,a_2) \quad (a_2,a_1,a_3) \\[+2mm] &(a_2,a_3,a_1) \quad (a_3,a_1,a_2) \quad (a_3,a_2,a_1) \end{aligned} $$ are all considered equal.

If $\mathcal{C} = \text{SET}_k(\mathcal{A})$ then, since there are $k!$ permutations of a $k$-tuple $(a_1,\dots,a_k)$ and no elements $a_i$ and $a_j$ can be equal because their atoms must have different labels, we have the exponential generating function relation $$C(z) = \frac{A(z)^k}{k!}.$$ When $\mathcal{A}$ has no objects of size zero then we define $$ \text{SET}(\mathcal{A}) = \epsilon + \text{SET}_1(\mathcal{A}) + \text{SET}_2(\mathcal{A}) + \text{SET}_3(\mathcal{A}) + \cdots $$ The definition of power series composition and the series $\exp(z) = \sum_{k \geq 0}\frac{z^k}{k!}$ immediately gives the following result.

Lemma 3: If $\mathcal{C} = \text{SET}(\mathcal{A})$ then the exponential generating functions of these classes satisfy $$C(z) = \sum_{k \geq 0} \frac{A(z)^k}{k!} = \exp(A(z)).$$

Remark: Recall that in the unlabelled case the power set construction resulted in the much more complicated ordinary generating function $$ \exp\left( \sum_{k\geq 1} \frac{(-1)^{k-1}}{k}A(z^k) \right).$$ This increased complexity comes from the fact that symmetries in the unlabelled case make going from an ordered tuple to a set difficult, and these are broken by adding labels.

Labelled Cycle #

Let $\mathcal{A}$ be a labelled combinatorial class. For any positive integer $k$ the class $\text{CYC}_k(\mathcal{A})$ consists of all objects in $\mathcal{A}^k$ where two tuples are considered equal if they are equal up to a cyclic shift.

Example: As elements of $\text{CYC}_3(\mathcal{A})$ the objects $$ (a_1,a_2,a_3) \quad (a_2,a_3,a_1) \quad (a_3,a_1,a_2) $$ are all considered equal, but these are not equal to $(a_1,a_3,a_2)$ since it is not a cyclic shift of these elements.

If $\mathcal{C} = \text{SEQ}_k(\mathcal{A})$ then, since there are $k$ cyclic shifts of a $k$-tuple $(a_1,\dots,a_k)$ and no elements $a_i$ and $a_j$ can be equal because their atoms must have different labels, we have the exponential generating function relation $$C(z) = \frac{A(z)^k}{k}.$$ When $\mathcal{A}$ has no objects of size zero then we define $$ \text{CYC}(\mathcal{A}) = \text{CYC}_1(\mathcal{A}) + \text{CYC}_2(\mathcal{A}) + \text{CYC}_3(\mathcal{A}) + \cdots $$

Remark: Unlike for $\text{SEQ}$ and $\text{SET}$ we do not include an object of size zero in the $\text{CYC}$ construction.

The definition of power series composition and the series $\log\left(\frac{1}{1-z}\right) = \sum_{k \geq 1}\frac{z^k}{k}$ immediately gives the following result.

Lemma 4: If $\mathcal{C} = \text{CYC}(\mathcal{A})$ then the exponential generating functions of these classes satisfy $$C(z) = \sum_{k \geq 1} \frac{A(z)^k}{k} = \log\left(\frac{1}{1-A(z)}\right).$$

Example: Recall from Chapter 2 that a permutation can be viewed in disjoint cycle notation as a set of disjoint labelled cycles. Thus, if $\mathcal{S}$ denotes the class of permutations then $$ \mathcal{S} = \text{SET}(\text{CYC}(\mathcal{Z})).$$ Note that $$ S(z) = \exp\left(\log\left(\frac{1}{1-z}\right)\right) = \frac{1}{1-z}$$ matching the exponential generating function seen above (in fact, this gives a combinatorial proof of this identity for formal power series!).

Example: A non-planar rooted labelled tree of size $n$ is a rooted tree on $n$ vertices whose nodes are labeled with $\{1,\dots,n\}$ (each number appears exactly once) where the order of a node’s children don’t matter.

If $\mathcal{T}$ is the labelled class of non-planar rooted trees then $$ \mathcal{T} = \mathcal{Z} \times \text{SET}(\mathcal{T}), $$ where we use the $\text{SET}$ construction to represent the (possibly empty) unordered collection of children of a node.

Example: Use LIFT to find the number of non-planar rooted labelled trees on $n \geq 1$ nodes. Then, find the number of non-planar non-rooted labelled trees on $n$ nodes.

Click for Answer

The specification above implies that the EGF $T(z)$ for the class of non-planar rooted labelled trees satisfies $$ z = \frac{T(z)}{e^{T(z)}} = \frac{T(z)}{G(T(z))} $$ where $G(u)=e^u.$ LIFT thus implies that the number of non-planar rooted labelled trees on $n$ nodes is $$ n! [z^n]T(z) = \frac{n!}{n}[u^{n-1}]G(u)^n = (n-1)![u^{n-1}]e^{un} = n^{n-1}. $$ Because there are $n$ ways to root a trees on $n$ nodes, the number of non-planar non-rooted labelled trees on $n$ nodes is thus $n^{n-2}$ when $n \geq 1.$

Restricted Constructions #

As in the unlabelled case, we use the notation $$\text{SEQ}_{\text{property}}(\mathcal{A}), \quad \text{SET}_{\text{property}}(\mathcal{A}), \;\; \text{and} \;\; \text{CYC}_{\text{property}}(\mathcal{A})$$ to denote the sum of terms in the sequence, set, and cycle constructions where a certain property holds. For instance,

$\text{SET}_{\leq r}(\mathcal{A})$ contains the sets of elements in $\mathcal{A}$ with at most $r$ elements, with generating function $$ \sum_{k=0}^r \frac{A(z)^k}{k!} $$
$\text{CYC}_{\leq r}(\mathcal{A})$ contains the cycles of elements in $\mathcal{A}$ with at most $r$ elements, with generating function $$ \sum_{k=1}^r \frac{A(z)^k}{k} $$
$\text{SET}_{\text{even}}(\mathcal{A})$ is the union of $\text{SET}_k(\mathcal{A})$ for even $k$ and contains all sets of elements in $\mathcal{A}$ with even size
$\text{CYC}_{\text{odd}}(\mathcal{A})$ is the union of $\text{CYC}_k(\mathcal{A})$ for odd $k$ and contains all cycles of elements in $\mathcal{A}$ with odd size

The generating function of a restricted sequence class is the same in the labelled case as in the unlabelled case. The generating functions of set and cycle constructions with parity restrictions are given by the following exercises.

Exercise: Define the series $$\cosh(z) = \frac{e^z+e^{-z}}{2} \qquad\text{and}\qquad \sinh(z) = \frac{e^z-e^{-z}}{2}$$ (which may be familiar to the reader from the study of differential equations) and let $\mathcal{A}$ be a combinatorial class with no objects of size $0.$

Prove that the generating function of $\mathcal{B} = \text{SET}_{\text{even}}(\mathcal{A})$ is $$B(z) = \cosh(A(z)).$$

Prove that the generating function of $\mathcal{C}=\text{SET}_{\text{odd}}(\mathcal{A})$ is $$C(z) = \sinh(A(z)).$$

Exercise: Let $\mathcal{A}$ be a combinatorial class with no objects of size $0.$

Prove that the generating function of $\mathcal{B} = \text{CYC}_{\text{even}}(\mathcal{A})$ is $$B(z) = -\frac{1}{2}\log \left(1-A(z)^2\right).$$

Prove that the generating function of $\mathcal{C} = \text{CYC}_{\text{odd}}(\mathcal{A})$ is $$ C(z) = \frac{1}{2}\log \left(\frac{1+A(z)}{1-A(z)}\right).$$

Hint: Recall the property $\log (PQ) = \log(P) + \log(Q)$ of the formal logarithm from Chapter 4.

Applications of Labelled Enumeration #

We now survey some of the interesting combinatorial objects that can be studied using labelled enumeration.

Restricted Permutations #

As seen above, permutations can be viewed both as a labelled sequence (corresponding to 1-line notation) and a set of cycles (corresponding to disjoint cycle notation). Our labelled constructions allow us to study permutations with various restrictions.

Example: A fixed point in a permutation is a cycle of length 1 (equivalently, when viewing a permutation as a map from $[n]$ to $[n]$ a fixed point is a point that is sent to itself) and a derangement is a permutation with no fixed points. Find the exponential generating function for the class $\mathcal{D}$ of derangements.

Click for Answer

By definition $\mathcal{D}$ consists of permutations with no cycles of length one, with specification $$ \mathcal{D} = \text{SET}(\text{CYC}_{\geq 2}(\mathcal{Z})).$$ Since the generating function of $\text{CYC}_{\geq 2}(\mathcal{Z})$ is $$ \sum_{k \geq 2} \frac{z^k}{k} = \log\left(\frac{1}{1-z}\right)-z$$ we have $$D(z) = \exp\left(\log\left(\frac{1}{1-z}\right)-z\right) = \frac{e^{-z}}{1-z}.$$

Exercise: Find the EGF for the class of permutations whose cycles have odd lengths only, then find the EGF for the class of permutations whose cycles have even lengths only.

Our setup allows us to be flexible with the restrictions we consider.

Example: Fix a positive integer $r.$

(a) Find the EGS for permutations with no cycles of length $r.$

(b) Find the EGS for permutations where all cycles have length larger than $r.$

Click for Answer

(a) If $\mathcal{A}$ is the class of permutations with no $r$-cycles then $$ \mathcal{A} = \text{SET}(\text{CYC}_{\neq r}(\mathcal{Z})).$$ Since $\text{CYC}_{\neq r}(\mathcal{Z})$ has EGF $$ \sum_{\substack{j \geq 1 \\[+0.5mm] j \neq r}} \frac{z^j}{j} = \sum_{j \geq 1}\frac{z^j}{j} - \frac{z^r}{r} = \log\left(\frac{1}{1-z}\right)-\frac{z^r}{r},$$ we have $$A(z) = \exp\left(\log\left(\frac{1}{1-z}\right)-\frac{z^r}{r}\right) = \frac{e^{-z^r/r}}{1-z}.$$

(b) Similarly, if $\mathcal{B} = \text{SET}(\text{CYC}_{>r}(\mathcal{Z}))$ then since $\text{CYC}_{>r}(\mathcal{Z})$ has EGF $$ \sum_{j >r}\frac{z^j}{j} = \log\left(\frac{1}{1-z}\right)-\sum_{j=1}^r\frac{z^j}{j}$$ we have $$B(z) = \frac{e^{-z-z^2/2-\cdots-z^r/r}}{1-z}.$$

Remark: Although extracting coefficients in the last example can be difficult, the results in Part 4 of these notes immediately allow us to deduce asymptotic behaviour from these EGFs. In particular, the probability that a random permutation of size $n$ has no $r$ cycles approaches $e^{-1/r}$ as $n\rightarrow\infty,$ and the probability that a random permutation of size $n$ has no cycles of length at most $r$ is $e^{-1-1/2-\cdots-1/r}$ as $n\rightarrow\infty.$

Set Partitions #

Recall from Chapter 5 that a set partition of size $n$ is a decomposition of $[n] = \{1,\dots,n\}$ into a disjoint union of non-empty sets called blocks.

Example: The five set partitions of size 3 are $$ \begin{aligned} \{1,2,&3\} \qquad \{1,2\} \cup \{3\} \qquad \{1,3\} \cup \{2\}\\[+3mm] &\{1\} \cup \{2,3\} \qquad \{1\} \cup \{2\} \cup \{3\} \end{aligned} $$ One of these set partitions has a single block, three of the set partitions have two blocks, and one of the set partitions has three blocks.

Although set partitions can be enumerated using ordinary generating functions, it is much more natural to study them using labelled classes and exponential generating functions.

Example: Find the EGF for the class of set partitions.

Click for Answer

A set partition is a set of non-empty labelled sets. Thus, if $\mathcal{S}$ is the labelled class of set partitions then $$ \mathcal{S} = \text{SET}(\text{SET}_{\geq 1}(\mathcal{Z}))$$ so $S(z) = e^{e^z - 1}.$

Example: Fix a positive integer $r.$ Find the EGF for set partitions with $r$ blocks, then use this EGF to prove that the number $\left\{ n \atop r \right\}$ of set partitions of size $n$ with $r$ blocks is $$ \left\{ n \atop r \right\} = \frac{1}{r!} \sum_{j=0}^r (-1)^{r-j}\binom{r}{j} j^n. $$

Click for Answer

If $\mathcal{S}^{(r)}$ is the labelled class of set partitions with $r$ blocks then $$ \mathcal{S}^{(r)} = \text{SET}_r(\text{SET}_{\geq 1}(\mathcal{Z}))$$ so $$ S^{(r)}(z) = \frac{(e^z - 1)^r}{r!}.$$ Thus, $$ \begin{aligned} \left\{ n \atop r \right\} &= n! [z^n]S^{(r)}(z) \\[+3mm] &= n! [z^n]\frac{(e^z - 1)^r}{r!} \\[+3mm] &= \frac{n!}{r!} [z^n] \sum_{j \geq 0} \binom{r}{j}(-1)^{r-j}e^{zj} \\[+3mm] &= \frac{1}{r!}\sum_{j \geq 0} \binom{r}{j}(-1)^{r-j}j^n \end{aligned} $$ as claimed.

Remark: Recall that we derived the Stirling number of the second kind $\left\{ n \atop r \right\}$ in Chapter 5 using its ordinary generating function $$\frac{z^r}{(1-z)(1-2z)\cdots (1-rz)}$$ derived using a bijection to a certain class of strings. In addition to a faster proof of this result, our labelled constructions allow us to derive additional information, such as the average number of blocks in a set partition (see the next section).

Functional Graphs #

A mapping of size $n$ is a function $f:\{1,\dots,n\} \rightarrow \{1,\dots,n\}.$ A direct counting argument proves that there are $n^n$ mappings of size $n$ (since each of the $n$ inputs gets sent to one of $n$ possible values), but interpreting mappings using labelled constructions allows for a more refined study.

The key to enumerating restricted mappings is that a mapping can be viewed as a directed graph on the nodes $[n]=\{1,\dots,n\}$ with an edge $x \rightarrow y$ if and only if $f(x)=y.$

Example: The mapping $f:[6]\rightarrow[6]$ defined by

$z$ 1 2 3 4 5 6

$f(z)$ 5 2 5 2 6 3

corresponds to the directed graph

$z$	1	2	3	4	5	6
$f(z)$	5	2	5	2	6	3

The class of functional graphs constructed by applying this process to all mappings can be expressed in terms of rooted trees.

Map Enumeration Theorem: Let $\mathcal{M}$ be the class of mappings, $\mathcal{K}$ be the class of connected functional graphs, and $\mathcal{T}$ be the class of rooted labelled trees. Then $\mathcal{M} = \text{SET}(\mathcal{K})$ and $\mathcal{K} = \text{CYC}(\mathcal{T})$ so that $$ M(z) = \frac{1}{1-T(z)}$$ where $T(z)$ satisfies $T(z) = ze^{T(z)}.$

Proof of Mapping Enumeration Theorem

The equality $\mathcal{M} = \text{SET}(\mathcal{K})$ follows from the fact that a general functional graph is a set of its connected components, each of which is a connected functional graph.

Because a mapping has a unique output for any input, every node in a functional graph has out degree 1. Thus, a connected functional graph has exactly one cycle (if there were two then there must be a path from one to the other, meaning one vertex in one of the cycles would have out degree at least two). Coming into the cycle at any vertex is a (possibly empty) connected cycle-free graph – i.e., a tree, so $\mathcal{K} = \text{CYC}(\mathcal{T}).$ This labelled tree is rooted as the element in the cycle is distinguished (note that the directions of the edges in the underlying functional graph go from leaves to root).

Example: The functional graph in our last example can be decomposed as

which corresponds to the set

Our rules for labelled constructions thus imply $$ M(z) = \exp\left(\log \frac{1}{1-T(z)} \right) = \frac{1}{1-T(z)} $$ where the exponential generating function $T(z)$ for rooted labelled trees was shown above to satisfy $T(z)=zT(z).$

Modifying the constructions used in this specification allows us enumerate mappings with various restrictions.

Example: An idempotent mapping $f$ is a mapping that satisfies $f(f(z))=f(z)$ for all $z.$ Find the exponential generating function of the class $\mathcal{I}$ of idempotent maps.

Click for Answer

The class of functional graphs corresponding to idempotent maps consist of connected components that are stars: a (potentially empty) set of single vertices pointed at a loop.

Since the class $\mathcal{X}$ of stars has the specification $\mathcal{X} = \mathcal{Z} \times \text{SET}(\mathcal{Z}),$ we see that $\mathcal{I} = \text{SET}(\mathcal{Z} \times \text{SET}(\mathcal{Z})),$ so $$ I(z) = e^{ze^z}.$$

Exercise: Find the EGF for the class of mappings with no fixed points (a fixed point for a mapping $f$ is an element $i$ such that $f(i)=i$).

Interestingly, the Map Enumeration Theorem also gives a semi-combinatorial proof of a binomial identity.

Corollary: For any positive integer $n,$ $$\sum_{k \geq 0}\binom{n-1}{k-1} k! \, n^{-k} = 1.$$

Proof of Corollary

We prove that there are $$ m_n = n^n \sum_{k \geq 0}\binom{n-1}{k-1} k! \, n^{-k} $$ mappings of size $n,$ and the stated identity follows from the fact that $m_n=n^n$ by direct counting. Since $z = \frac{T(z)}{G(T(z))}$ with $G(u)=e^u,$ the general version of LIFT implies that $$ [z^n]T(z)^k = \frac{k}{n}\left[u^{n-k}\right]e^{un} = \frac{k}{n} \cdot \frac{n^{n-k}}{(n-k)!}$$ for any natural number $k.$ Thus, $$ \begin{aligned} m_n = n![z^n]M(z) &= [z^n] \frac{1}{1-T(z)} \\[+3mm] &= n! \sum_{k \geq 0}[z^n]T(z)^k \\[+3mm] &= n! \sum_{k \geq 0} \frac{k}{n} \cdot \frac{n^{n-k}}{(n-k)!} \\[+3mm] &= n^n \sum_{k \geq 0}n^{-k}k!\binom{n-1}{k-1}. \end{aligned} $$

Labelled Classes and Parameters #

Just like for unlabelled objects, we can use multivariate EGFs to track parameters in a combinatorial class. To simplify our presentation we focus on bivariate generating functions tracking 1-dimensional parameters that take natural number values, but the general framework from the unlabelled case translates over analogously.

The (labelled) bivariate EGF of a labelled class $\mathcal{C}$ with respect to a parameter $p : \mathcal{C} \rightarrow \mathbb{N}$ is $$ C(u,z) = \sum_{\sigma \in \mathcal{C}} u^{p(\sigma)}\frac{z^{|\sigma|}}{|\sigma|!} = \sum_{n,k \geq 0} \frac{c_{k,n}}{n!} u^kz^n, $$ where $c_{k,n}$ is the number of objects in $\mathcal{C}$ with size $n$ and parameter value $k.$

Remark: Although we still divide $c_{k,n}$ by $n!$ we do not divide by $k!.$ Roughly, dividing by $n!$ accounts for permutations between the labels of the atoms, and simply tracking a parameter does not require any further compensation. Some sources call $C(u,z)$ a semi-exponential generating function to stress this fact, but we do not use this additional terminology.

Just as in the unlabelled class, we consider inherited parameters that can be specified using a parameterized neutral class $\mu$ that has size zero but represents an increase of one in the parameter value. We can derive bivariate generating functions by adding this parameterized neutral class to the labelled specifications described above to get (labelled) parameterized specifications.

Example: The labelled specification $$\mathcal{S} = \text{SET}(\text{CYC}(\mathcal{Z}))$$ for permutations interpreted as a set of disjoint cycles gives the parameterized labelled specification $$\mathcal{S} = \text{SET}(\mu \times \text{CYC}(\mathcal{Z}))$$ tracking number of cycles as a parameter. This gives the bivariate generating function $$ S(u,z) = e^{u \log \frac{1}{1-z}} = (1-z)^{-u}$$ enumerating permutations by size and number of cycles, where the final expression follows from log rules but formally is just a compact way to represent the composition $e^{u \log \frac{1}{1-z}}.$

Similar to unlabelled objects, we have $$ \frac{d}{du}C(u,z){\Big|}_{u=1} = C_u(1,z) = \sum_{n,k \geq 0} \frac{k c_{k,n}}{n!} z^n $$ so $$ \frac{[z^n]C_u(1,z)}{[z^n]C(1,z)} = \frac{\frac{1}{n!}\sum_{k \geq 0}kc_{k,n}}{\frac{1}{n!}c_n} = \sum_{k \geq 0}k\frac{c_{k,n}}{c_n} $$ gives the expected value $\mathbb{E}_n[p]$ of a tracked parameter, just like in the unlabelled case.

Very Useful Remark: The observation from the unlabelled case that the formal derivatives of our generating functions match their usual calculus derivative still holds. In particular, calculus rules for derivatives and compositions of exponential and logarithmic functions are valid.

Example: Find the average number of cycles among the permutations of size $n.$

Click for Answer

The chain rule implies that the generating function $S(u,z) = e^{u \log \frac{1}{1-z}}$ satisfies $$ \frac{d}{du}S(u,z){\Big|}_{u=1} = e^{u \log \frac{1}{1-z}} \log \frac{1}{1-z}{\Big|}_{u=1} = \frac{1}{1-z}\log \frac{1}{1-z}. $$ Since $\log \frac{1}{1-z} = \sum_{k \geq 1} \frac{z^k}{k},$ this means that the average number of cycles among the permutations of size $n$ is $$ \frac{[z^n]S_u(1,z)}{[z^n]S(1,z)} = [z^n] \frac{1}{1-z}\log \frac{1}{1-z} = \sum_{k=1}^n \frac{1}{k}. $$ We note that $H_n = \sum_{k=1}^n \frac{1}{k}$ is the $n$th Harmonic number. We will see in Part 4 of these notes that $H_n$ behaves like $\log n$ as $n$ goes to infinity, so the average number of cycles grows logarithmically. A refined analysis using more advanced techniques even allows one to prove that the number of cycles approaches a normal distribution as $n$ goes to infinity.

Example: Find the average root degree among the rooted labelled trees with $n$ vertices.

Click for Answer

If $\mathcal{T}$ is the class of rooted labelled trees then we recall the labelled specification $$ \mathcal{T} = \mathcal{Z} \times \text{SET}(\mathcal{T})$$ interpreting a rooted labelled tree as a root followed by a set of children, which are also rooted labelled trees. If $$ \mathcal{R} = \mathcal{Z} \times \text{SET}(\mu \times \mathcal{T})$$ then $R(u,z)$ is the bivariate EGF counting size and number of children of the root (since $\mathcal{R}$ marks only the children in the root and recursively considers children in $\mathcal{T}$ where nothing is marked).

We see that $$ T(z) = ze^{T(z)} \qquad R(u,z) = ze^{uT(z)}, $$ so that $$ \frac{d}{du}R(u,z){\Big|}_{u=1} = zT(z)e^{uT(z)}{\Big|}_{u=1} = zT(z)e^{T(z)} = T(z)^2.$$ Thus, LIFT implies $$ [z^n]R_u(1,z) = [z^n]T(z)^2 = \frac{2}{n}[t^{n-2}]e^{nt} = \frac{2}{n} \cdot \frac{n^{n-2}}{(n-2)!}, $$ so the average root degree is $$ \frac{\frac{2}{n} \cdot \frac{n^{n-2}}{(n-2)!}}{\frac{n^{n-1}}{n!}} = \frac{2(n-1)}{n}. $$ Note that as $n\rightarrow\infty$ this average approaches $2.$

Example: The $n$th Bell number $B_n$ is defined by the exponential generating function $$ e^{e^z-1} = \sum_{n \geq 0} \frac{B_n}{n!} z^n. $$ The Bell numbers also count the number of set partitions of size $n,$ meaning that $B_n$ is the sum $\sum_{r=0}^n\left\{ n \atop r \right\}$ of Stirling numbers of the second kind. Find the average number of blocks among the set partitions of size $n$ in terms of $B_n.$

Click for Answer

Above we have seen the specification $$ \mathcal{S} = \text{SET}(\text{SET}_{\geq 1}(\mathcal{Z}))$$ for the class of set partitions, interpreting a set partition as a set of blocks (which themselves are non-empty sets). Marking blocks gives the parameterized labelled specification $$ \mathcal{S} = \text{SET}(\mu \times \text{SET}_{\geq 1}(\mathcal{Z}))$$ corresponding to the bivariate generating function $$ S(u,z) = e^{u(e^z-1)},$$ which implies $$ \frac{d}{du}S(u,z){\Big|}_{u=1} = (e^z-1)e^{u(e^z-1)}{\Big|}_{u=1} = e^ze^{e^z-1} - e^{e^z-1}.$$ Since $[z^n]S(1,z) = \frac{B_n}{n!}$ we see that $$ [z^n]e^ze^{e^z-1} = [z^n]\frac{d}{dz}S(1,z) = \frac{B_{n+1}}{n!}, $$ so the average number of blocks is $$ \frac{[z^n]e^ze^{e^z-1} - [z^n]e^{e^z-1}}{[z^n]e^{e^z-1}} = \frac{B_{n+1}}{B_n} -1 .$$ We note that although there is no nice closed formula for this ratio, the saddle-point method can be used to show that this grows like $n/\log n$ as $n\rightarrow\infty.$

Additional Problems #

Problem 1: Find the EGFs of the following labelled classes. Express your answer in closed form.
(a) Permutations that are involutions (applying the permutation twice gives the identity).

(b) Derangements with an even number of cycles (i.e., permutations with no fixed points that contain an even number of cycles).

(c) Permutations where every cycle has even length and there are no cycles of size less than $4.$

(d) The class of permutations where one element in each cycle is coloured red (when thinking of a permutation as a set of disjoint cycles).

Problem 2: A rooted labelled forest is a set of rooted labelled trees.
(a) Determine a specification for the combinatorial class of rooted labelled forests and use LIFT to prove that the number of rooted labeled forests on $n$ vertices is $(n+1)^{n-1}.$

(b) Recall that there are $(n+1)^{n-1}$ (non-rooted) labelled trees on $n+1$ vertices. Give a bijective proof of your formula in (a) by giving a bijection between rooted labelled forests on $n$ vertices and (non-rooted) labelled trees on $n+1$ vertices.

Problem 3: Prove the identity $$ \sum_{k=1}^n \binom{n-1}{k-1}n^{n-k} = (n+1)^{n-1} $$ by decomposing the class of rooted labelled forests into a disjoint union of easily described sets counted by the left-hand side.

Problem 4: Let $\mathcal{S}$ be the class of permutations, $\mathcal{I}$ be the class of identity permutations (with one element of every size), and $\mathcal{D}$ be the class of derangements (permutations with no fixed points). Give a specification relating $\mathcal{S},\mathcal{I},$ and $\mathcal{D},$ and use this to derive the EGF $D(z).$

Problem 5: The class $\mathcal{L}$ of binary labelled hierarchies consists of rooted trees whose internal nodes (non-leaves) are unlabelled and have two children, and whose leaves are labelled. The size of a labelled hierarchy is its number of leaves (so the leaves in a hierarchy of size $n$ have the labels $1,\dots,n$).
(a) Prove the labelled combinatorial specification $\mathcal{L} = \mathcal{Z} + \text{SET}_2(\mathcal{L}).$

(b) Prove that the number of binary labelled hierarchies of size $n \geq 2$ is $$(2n-3)!!=(2n-3)(2n-5)(2n-7)\cdots(3)(1).$$

Problem 6: Fix a positive integer $r.$ Find the average number of cycles of length $r$ among the permutations of size $n.$

Problem 7: A surjection of size $n$ is any map from $\{1,\dots,n\}$ onto a set of the form $\{1,\dots,r\}$ (i.e., the image of the map hits every number from $1$ to $r$). Express the average size of the image of a surjection of size $n$ as a ratio of coefficient extractions.

Problem 8: A mapping $f:A\rightarrow B$ is called 2-to-1 if every element in $B$ either never gets mapped to or gets mapped to exactly twice. Let $\mathcal{F}_2$ be the class of 2-to-1 mappings and let $\mathcal{T}_2$ be the class of rooted labelled trees where every node is a leaf or has exactly two children.
(a) Explain why $\mathcal{F}_2 = \text{SET}(\text{CYC}(\mathcal{Z}\times\mathcal{T}_2)).$

(b) Derive the EGF of $\mathcal{T}_2$ and use it to prove that there are $$\binom{2n}{n}n!(2n-1)!!$$ 2-to-1 mappings of size $2n,$ where $$(2n-1)!! = (2n-1)(2n-3)\cdots(3)(1).$$

Problem 9: A triangle tree is a labelled connected graph in which every edge is in exactly one cycle, and this cycle has length $3.$

Show that the number of triangle-trees on $n$ vertices is 0 if $n$ is even and $$ \frac{(2k)!(2k+1)^{k-1}}{k!2^k} $$ is $n=2k+1$ is odd.
Hint: Decompose a triangle tree recursively by removing a vertex.

Problem 10: Prove that the number $p_n$ of permutations of $\{1,\dots,2n\}$ which have no cycles of length larger than $n$ is $$ p_n = (2n)! \left(1 - \sum_{k=n+1}^{2n}\frac{1}{k}\right).$$

A Logic Puzzle Connected to Problem 10

The following puzzle originally appeared in a theoretical computer science paper of Gál and Miltersen. It has since become a famous application of permutation statistics due to the solution presented below (Gál and Miltersen attribute this solution to Sven Skyum).

$2n$ prisoners, each uniquely identified by a number between 1 and $2n,$ are sentenced to death. The director of the prison gives them one last chance to live by playing a game. He has a room in his prison with exactly $2n$ lockers, and will put cards with the numbers from 1 to $2n$ inside the lockers (each number from 1 to $2n$ appears in exactly one locker).

The game is the following. The prisoners are called one by one into the room with the lockers. Each of them can open $n$ of the $2n$ lockers. If a prisoner finds the locker with his number inside, then the prisoner leaves the room and the game continues with the next prisoner (all opened lockers are closed and the room is left in exactly the same state as when the prisoner entered, meaning that no hints can be left). If any of the prisoners fails to recover his number after opening $n$ lockers, they all lose and get killed. If all prisoners find the locker with their number within $n$ guesses, then the prisoners get to live.

They can agree before the beginning of the game on a common strategy, but after that they cannot communicate or leave any hint to other prisoners. If each prisoner randomly picks $n$ drawers then each prisoner has a 1/2 chance of finding their number, and together the prisoners have a $(1/2)^n$ change of staying alive. But, there is a strategy that succeeds for all $n$ with greater than 30% chance! Find such a strategy.

Answer to Logic Puzzle

Try to think about the puzzle above before reading this.

The prisoner identified with the number $i$ should start by opening the locker $i.$ If they find number $j$ inside locker $i$ then either $i=j$ (and they can go) or $i\neq j$ and they should open locker $j$ next. Repeat this process of opening lockers and using what’s inside to determine the next locker to open until their number $i$ is found, or $n$ boxes have been opened. The mapping that takes the prisoners’ numbers to lockers gives a permutation of $1,\dots,2n$ and the strategy succeeds if and only if the permutation has no cycles of length larger than $n.$

There are $(2n)!$ permutations of size $2n,$ so by Question 10 the probability they succeed is $1 - \sum_{k=n+1}^{2n}\frac{1}{k}.$ As we will see in detail in Part 4 of these notes, the Taylor series of the natural logarithm implies that $$1 - \sum_{k=n+1}^{2n}\frac{1}{k} \geq 1 - \log 2 > 0.3$$ for all $n.$

Note that even if the warden knows the prisoners will use this strategy they can still win by agreeing on a uniform random permutation of their numbers beforehand.