Shuffling

From formal point of view, the shuffle product of two sets of strings is just the Cartesian product of labeled structures. We utilize a special symbol, ш, for this operation. This symbol is the Russian letter "sha," which also resembles the Hebrew "shin." Two sets L1 and L2 are shuffled by creating all the word pairs from both: they merged but keeping internal order.

Example: Suppose that we want to shuffle the pair (abs, de), which produces:
\[ \left\{ abcde, \ abdce, \ adbce, \ dabce, \ abdec, \ adbec, \ dabec, \ deabc, adebc, daebc \right\} . \]

Given a deck of n cards, how many times must we shuffle it to make it “random”? Of course, the answer depends upon the method of shuffling which is used and what we mean by “random.” We shall begin the study of this question by considering a standard model for the riffle shuffle.

We begin with a deck of n cards, which we will assume are labelled in increasing order with the integers from 1 to n. A riffle shuffle consists of a cut of the deck into two stacks and an interleaving of the two stacks. For example, if n = 6, the initial ordering is (1,2,3,4,5,6), and a cut might occur between cards 2 and 3. This gives rise to two stacks, namely (1,2) and (3,4,5,6). These are interleaved to form a new ordering of the deck. For example, these two stacks might form the ordering (1,3,4,2,5,6). In order to discuss such shuffles, we need to assign a probability distribution to the set of all possible shuffles. There are several reasonable ways in which this can be done. We will give several different assignment strategies, and show that they are equivalent. (This does not mean that this assignment is the only reasonable one.) First, we assign the binomial probability b(n,1/2, k) to the event that the cut occurs after the k-th card. Next, we assume that all possible interleavings, given a cut, are equally likely. Thus, to complete the assignment of probabilities, we need to determine the number of possible interleavings of two stacks of cards, with k and n - k cards, respectively.

We begin by writing the second stack in a line, with spaces in between each pair of consecutive cards, and with spaces at the beginning and end (so there are n − k + 1 spaces). We choose, with replacement, k of these spaces, and place the cards from the first stack in the chosen spaces. This can be done in

\[ \binom{n}{k} \]
ways. Thus, the probability of a given interleaving should be
\[ \binom{n}{k}^{-1} . \]
Next, we note that if the new ordering is not the identity ordering, it is the result of a unique cut-interleaving pair. If the new ordering is the identity, it is the result of any one of n + 1 cut-interleaving pairs.

We define a rising sequence in an ordering to be a maximal subsequence of consecutive integers in increasing order. For example, in the ordering

\[ ( 2, 3, 5, 1, 4, 7, 6) , \]
there are 4 rising sequences; they are (1), (2,3,4), (5,6), and (7). It is easy to see that an ordering is the result of a riffle shuffle applied to the identity ordering if and only if it has no more than two rising sequences. (If the ordering has two rising sequences, then these rising sequences correspond to the two stacks induced by the cut, and if the ordering has one rising sequence, then it is the identity ordering.) Thus, the sample space of orderings obtained by applying a riffle shuffle to the identity ordering is naturally described as the set of all orderings with at most two rising sequences.

It is now easy to assign a probability distribution to this sample space. Each ordering with two rising sequences is assigned the value

\[ \frac{b(n, 1/2, k)}{\binom{n}{k}} = \frac{1}{2^n} , \]
and the identity ordering is assigned the value
\[ \frac{n+1}{2^n} . \]
There is another way to view a riffle shuffle. We can imagine starting with a deck cut into two stacks as before, with the same probabilities assignment as before i.e., the binomial distribution. Once we have the two stacks, we take cards, one by one, off of the bottom of the two stacks, and place them onto one stack. If there are k1 and k2 cards, respectively, in the two stacks at some point in this process, then we make the assumption that the probabilities that the next card to be taken comes from a given stack is proportional to the current stack size. This implies that the probability that we take the next card from the first stack equals
\[ \frac{k_1}{k_1 + k_2} , \]
and the corresponding probability for the second stack is
\[ \frac{k_2}{k_1 + k_2} . \]
We shall now show that this process assigns the uniform probability to each of the possible interleavings of the two stacks.

Suppose, for example, that an interleaving came about as the result of choosing cards from the two stacks in some order. The probability that this result occurred is the product of the probabilities at each point in the process, since the choice of card at each point is assumed to be independent of the previous choices. Each factor of this product is of the form

\[ \frac{k_i}{k_1 + k_2} , \]
where i = 1 or 2, and the denominator of each factor equals the number of cards left to be chosen. Thus, the denominator of the probability is just n!. At the moment when a card is chosen from a stack that has i cards in it, the numerator of the corresponding factor in the probability is i, and the number of cards in this stack decreases by 1. Thus, the numerator is seen to be k!(n−k)!, since all cards in both stacks are eventually chosen. Therefore, this process assigns the probability
\[ \frac{1}{2^n} \]
to each possible interleaving.

We now turn to the question of what happens when we riffle shuffle s times. It should be clear that if we start with the identity ordering, we obtain an ordering with at most 2s rising sequences, since a riffle shuffle creates at most two rising sequences from every rising sequence in the starting ordering. In fact, it is not hard to see that each such ordering is the result of s riffle shuffles. The question becomes, then, in how many ways can an ordering with r rising sequences come about by applying s riffle shuffles to the identity ordering? In order to answer this question, we turn to the idea of an a-shuffle.

  1. V. Dobrushkin, Methods in Algorithmic Analysis, CRC Press, 2010.
  2. M. Lothaire, Combinatorics on Words, Cambridge University Press (1997) ISBN 0-521-59924-5
  3. Shuffle Algebra Encyclopedia of Mathematics.
  4. Shuffle product of words Sage manual.
  5. Marni Mishna, Mike Zabrocki. Analytic aspects of the shuffle product. Susanne Albers, Pascal Weil. STACS 2008, Feb 2008, Bordeaux, France. IBFI Schloss Dagstuhl, pp.561-572, 2008.
  6. Brad Mann, How many times should you shuffle a deck of cards?, Harvard University.
  7. Seven shuffles math Fun Facts.
  8. Dave Bayer and Persi Diaconis, Trailing the Dovetail Shuffle to its Lair, The Annals of Applied Probability, 1992, Vol. 2, No. 2, 294--313.