Saturday, July 30, 2011

Secret Santa and Derangements

Here's an interesting question. Suppose you're holding a Secret Santa - you write down the names of your friends and yourself and put all the names in a hat. Each person pulls a name out of the hat and they must buy a present for the person whose name they pull out. What is the chance that someone will get their own name?

I recently discovered that if we permute an ordered set in such a way that no element is in its original position, this is called a derangement. So, we can rephrase the question in the first paragraph as what is the chance of a derangement.

If we have one name in our hat, then the chance of a derangement is 0. If we have two names, the chances are 50%. If we have 3 names, then there are 6 permutations. Of these, 4 of them are derangements, so we have a 2/3 chance of a derangement. As we have more names, things get a little complicated. We should be able to use mathematics to solve this, but since I am more adept with writing software than with mathematics, I thought I would write a simulation in Python to calculate the probability and then ask my friend and excellent mathematician, Alasdair, to do the mathematical analysis. So, without further ado, here is a Python program which simulates a Secret Santa for 10, 20, 30, 40, 50 and 100 people and runs
each simulation 10,000 times.

Write a Monte Carlo simulation to estimate the chance
of a derangement.

import random

def event(n):
    Given n people, create an array of n elements
    shuffle the array and then return True if it's
    a derangement, else False.
    a = [i for i in range(n)]

    for i in xrange(n):
        if i == a[i]:
            return False
    return True

simulations = 10000
for people in [10, 20, 30, 40, 50, 100]:
    derangements = 0
    for i in range(simulations):
        if event(people):
            derangements += 1

print 'People: %d  Derangements: %f' % (
    people, float(derangements) / simulations)

Running this program, we get (maybe, because we get different results each time we run it):
People: 10  Derangements: 0.371900
People: 20  Derangements: 0.370300
People: 30  Derangements: 0.370500
People: 40  Derangements: 0.368200
People: 50  Derangements: 0.360000
People: 100  Derangements: 0.372100

It is hard to tell from these results whether we are converging to a value. Ideally we should be doing about a million events for each simulation and using maybe a million people (we are trying to find out where the series converges when the amount of people approaches infinity). I am sure that there will be some very simple value for the final answer. So, over to Alasdair...

To look at the mathematics behind derangements, lets call the number of derangements of n objects D(n). A bit of fiddling will show that D(2)=1, D(3)=2 and D(4)=9. For example, to show that D(4)=9, you can list all the permutations of 4 objects, and find which of those are derangements:

1 2 3 4
1 2 4 3
1 3 2 4
1 3 4 2
1 4 2 3
1 4 3 2
2 1 3 4
2 1 4 3 *
2 3 1 4
2 3 4 1 *
2 4 1 3 *
2 4 3 1
3 2 1 4
3 2 4 1
3 1 4 2 *
3 1 2 4
3 4 1 2 *
3 4 2 1 *
4 1 2 3 *
4 1 3 2
4 2 1 3
4 2 3 1
4 3 1 2 *
4 3 2 1 *

If you like a hard time, then you can find that D(5)=44. Beyond that the amount of work becomes too big for this sort of brute force calculation.

So then, without listing all permutations, how could we go about finding D(5)? Well, start with five objects

1 2 3 4 5

Clearly we can't keep just one object in place and permute the other four. So we have to change the positions of at least two objects, and permute the other three. For example, we could swap 1 and 2:

2 1 * * *

and then there are two derangements of the other three. And in fact this gives as a method of counting all derangements. The number 1 can also swapped with three other numbers:
3 1 * * *
4 1 * * *
5 1 * * *

and in each case the derangements of the rest can be counted. How might we count the number of derangements in the case where say, 4 is in the first place? Like this:

4 * * * *

We consider two cases:
  1. The 4 and the 1 are swapped: 4 * * 1 *
  2. The 1 goes into some other place, such as 4 * 1 * *

Now to count the number of derangements in each case.
  1. The other numbers 2, 3, and 5, have not been moved. So in this case the total number of derangements is the number of derangements of three objects: D(3)=2.
  2. Suppose we try to count the number of derangements now where the 1 is not in the fourth place. This means we have:

    Original: 1 2 3 4 5
    After the swap: 4 2 3 1 5

    Forgetting about the initial four, the number of derangements is the number of derangements of the other four: this will ensure that 1 does not end up in the fourth place. And this number is D(4)=9.

What we have shown now is that the number of derangements with 4 in the first place is D(3)+D(4)=2+9=11. But there are four possible values for the first place (every number other than 1), so the total number of derangements is 4 x (D(3)+D(4)) = 4 x 11 =44. And this is the value of D(5). Applying this argument to the general case leads to the recursive formula


The next few values are easily determined: D(6)=5(D(4)+D(5))=5(9+44)=265. Then D(7)=6(D(5)+D(6))=6(44+265)=1854. And so on.

The question now is to determine a non-recursive formula. And for this we are going to use the inclusion-exclusion principle, which you may have seen in relation to the sizes of set intersections and unions. For two sets, it says that: $$|A\cup B|=|A|+|B|-|A\cap B|$$ That is, the number of elements in a set union is the sum of the set cardinalities, minus the cardinality of the set intersection. The reason for subtracting the set intersection is that when the cardinalities of the two sets are added, each element in the union is counted twice.

For three sets, we have $$|A\cup B\cup C|=|A|+B|+|C|-|A\cap B|-|A\cap C|-|B\cap C|+|A\cap B\cap C|$$ So first the (cardinalities of the) sets are added; then the intersections of all pairs of sets subtracted, and finally the intersection of all sets added. And this principle can be extended to an arbitrary number of sets. First add the cardinalities of all the sets, then subtract the cardinalities of all intersections of pairs of sets, then add the cardinalities of all intersections of triples of sets, then subtract the cardinalities of all intersections of quadruples of sets, then add... you get the idea!

What does this have to do with derangements? Well, consider all the derangements of four objects, and let $$A$$ be the set in which 1 is in its correct place, $$B$$ the set for which 2 is in the correct place, and $$C$$ and $$D$$ set sets with 3 and 4 in the correct place respectively. We can now use the inclusion-exclusion principle to determine the number of permutation where at least one element is in the correct place: that is, the size of the union of $$A$$, $$B$$, $$C$$ and $$D$$. Now the size of each individual set is $$3!$$. The size of each intersection of two sets is $$2!$$ (two elements are fixed, leaving two to move), and there are

$${4\choose 2}=6$$
possible pairs. The size of each intersection of three sets is $$1!$$ (three elements are fixed, leaving only one to move), and there are

$${4\choose 3}=4$$
possible triples. Thus we have

$$A\cup B\cup C\cup D = 4(3!)-{4\choose 2}2!+{4\choose 3}1!+{4\choose 4}0!$$.
Writing each binomial coefficient in terms f factorials and simplifying, this sum becomes

or as

But this value (call it $$v$$) is the number of permutations in which at least one element is fixed. The number of derangements is $$4!-v$$, or

In general, then:

$$D(n)=n!(1-\frac{1}{1!}+\frac{1}{2!}-\frac{1}{3!}+\frac{1}{4!}-\cdots +(-1)^n\frac{1}{n!}).$$
If we divide both sides by $$n!$$, and take the limit as n rockets off to infinity, we have

and the right hand side can be seen to be $$e^{-1}\approx 0.36787944$$, which is what was obtained above by experiment.