You're a fan of pen & paper RPGs and have finally arranged a new playing group, recruited amongst your friends from the mathematical faculty.
As a GameMaster (GM) you've assumed that all your players will be fully equipped with the necessary dice: 4-sided (d4), 6-sided (d6), 8-sided (d8), 10-sided (d10), 12-sided (d12) and 20-sided (d20). So your disappointment is great when you show up for the game and all they have brought with them is a near-infinite supply of ordinary, six-sided dice.
While luckily you don't need all of those dice in your set, the RPG system you want to play is based on the d100 (usually realized by rolling two d10), and you have to find a way, using any number of d6 and any rule you can think of, to realize a value range of 0-99 (or 1-100).
One mathematician suggests:
We simply roll twenty of our d6, sum the rolls and subtract 20; that gives us the range (20;120)-20 = (0;100), and we consider both the 0 and the 100 to be 100.
This works, but the chances for the various values are rather far from being an equal "fair" distribution, so soon the bunch of players start a heated discussion on how they can 'best' achieve a fair distribution. They agree on the following metric:
"Ideally, each value has a chance of 0.01, so the best rule is the one which minimizes $\Sigma(( P_i - 0.01 )^2)$" - it does not matter how many (finite) rolls of d6 are used.
Under these conditions, what is the "fairest" method of getting to a range 0..99 (or 1..100) just using six-sided dice?
Notes:
- I do not know a "best" solution for this (yet)
- the 'algorithm' has to be able to produce every value of the d100 range
- The "fairest" algorithm wins. Only if two algorithms are equally fair does the one with fewer dice rolls win.
- the 'algorithm' can consist of any mathematical rule you can think of, but it must be reasonably "doable" by a bunch of mathematicians with nothing but pen & paper (and d6)
- Any finite number of d6 ( or rolls thereof) may be used in the 'algorithm'
Note that the last rule excludes all algorithms which do not guarantee that a result is reached in a finite number of rolls. (In particular, you cannot re-roll until a certain value is rolled. You may reroll 1e106 times, though.)
As xnor points out in his answer, this question is basically asking for the way to most evenly distribute $6^n$ results among $100$ bins; his answer gives a very brief description of the solution. I'll go into a bit more detail here. If you're not interested in the proofs, skip to section 2.3.
Summary
In order to construct a "best" algorithm for converting rolls of one die into another, there are two types of optimality to consider:
- Minimum "score." BmyGuest's metric: $$\sum_i ( p_i - 1/100 )^2$$ is basically the chi-squared statistic for goodness-of-fit, between the distribution resulting from the algorithm and a uniform distribution. I'll call the metric applied to a particular algorithm that algorithm's score.
- Minimum number of rolls. I'll consider only the expected number of rolls for an algorithm.
Note I'll be diverging a bit from typical RPG terminology: when I refer to $\mathsf{3d6}$, for example, I mean that you should roll three six-sided dice without adding them, keeping track of the order the dice were rolled in (usually the dice are treated as indistinguishable from each other, i.e. rolled all at once). This is important because, as you noted, the results in a traditional roll are not equiprobable.
I also consider a $\mathsf{d}n$ to be numbered from $0$ to $n-1$, to make some of the math a little easier. If you're following my algorithms with real dice, just map $n\to 0$ for the input and vice-versa for the output.
In the general case, we want to map $m$ "results" into $n$ "bins." (In the original case, we have $m=6^k$, $n=100$.) Consider an algorithm that maps $r_i$ results to bin $i$.
The probability of landing in bin $i$ is $r_i/m$, so the score is:
$$ \sum_{i=0}^{n-1}\left(\frac{r_i}{m} - \frac{1}{n}\right)^2 $$
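As a quick sanity check, here is a minimal Python sketch (the name `score` and the "3d6 mod 100" example are mine, not from the question) that evaluates this sum for an arbitrary assignment of results to bins:

```python
from fractions import Fraction

def score(r, m):
    """Score of an algorithm that sends r[i] of the m equally likely
    results to bin i: sum over bins of (r_i/m - 1/n)^2."""
    n = len(r)
    assert sum(r) == m, "every result must land in exactly one bin"
    return sum((Fraction(ri, m) - Fraction(1, n)) ** 2 for ri in r)

# "3d6 mod 100" (read the three dice as base-6 digits, take the value mod 100):
# bins 0..15 receive 3 of the 216 results, bins 16..99 receive 2.
print(float(score([3] * 16 + [2] * 84, 216)))  # ~2.88e-4
```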
1.1 Distribution of an Optimal Algorithm
Since the score is a sum of independent terms we can easily isolate the effect of changing a pair of elements of the solution, $(r_i,r_j)$. Let's decrease $r_i$ by $1$ and increase $r_j$ by $1$, and see under what conditions the score decreases:
$$ \begin{align} \left(\frac{r_i}{m} - \frac{1}{n}\right)^2 + \left(\frac{r_j}{m} - \frac{1}{n}\right)^2 &> \left(\frac{r_i-1}{m} - \frac{1}{n}\right)^2 + \left(\frac{r_j+1}{m} - \frac{1}{n}\right)^2 \\ \frac{r_i^2 + r_j^2}{m^2} - 2\frac{r_i + r_j}{m n} + \frac{2}{n^2} &> \frac{r_i^2 + r_j^2 - 2(r_i - r_j) + 2}{m^2} - 2\frac{r_i + r_j}{m n} + \frac{2}{n^2} \\ 0 &> \frac{2}{m^2}(1 - r_i + r_j) \\ r_i &> 1 + r_j \end{align} $$
This means that if the solution contains two values which differ from each other by more than one, it cannot be optimal (since moving the two values closer together will decrease the score). Therefore, in an optimal solution $r_i$ will take on at most two values, separated from each other by $1$. Let's call them $r$ and $r+1$. Calling the number of bins with $r$ results $x$, and the number of bins with $r+1$ results $x'$, we can write a system of equations:
$$ n = x + x' \\ m = rx + (r+1)x' $$
We can solve for $x$ and $x'$ in terms of $r$:
$$ x' = n - x \\ m = rx + (r+1)(n-x) \\ m = rx + rn - rx + n - x \\ m = (r+1)n - x \\ x = (r+1)n-m $$
Given that $0\leq x \leq n$:
$$ 0 \leq (r+1)n-m \leq n \\ 0 \leq (r+1) - \frac{m}{n} \leq 1 \\ \frac{m}{n} \leq r+1 \leq \frac{m}{n} + 1 \\ \frac{m}{n} - 1 \leq r \leq \frac{m}{n} $$
Since $r$ must be an integer, we have:
$$ r = \left\lfloor \frac{m}{n} \right\rfloor $$
Substituting back in, we see that:
$$ \begin{align} r + 1 &= \left\lfloor \frac{m}{n} \right\rfloor + 1 \\ x &= \left(\left\lfloor \frac{m}{n} \right\rfloor +1 \right)n - m \\ x' &= m - \left\lfloor \frac{m}{n} \right\rfloor n \end{align} $$
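As a small sketch (the helper name `optimal_bins` is mine), these formulas translate directly into code:

```python
def optimal_bins(m, n):
    """Optimal distribution of m equally likely results among n bins:
    x bins receive r results each, x' bins receive r + 1 each."""
    r = m // n
    x = (r + 1) * n - m        # bins holding r results
    x_prime = m - r * n        # bins holding r + 1 results
    assert x + x_prime == n and r * x + (r + 1) * x_prime == m
    return r, x, x_prime

print(optimal_bins(216, 100))   # (2, 84, 16): the 3d6 case
print(optimal_bins(1296, 100))  # (12, 4, 96): the 4d6 case used in section 2.1
```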
1.2 Score of an Optimal Algorithm
The score of the optimal algorithm described above gives us a way to compute the minimum possible score for a given $m$ and $n$:
$$ \sum_{i=0}^{n-1}\left(\frac{r_i}{m} - \frac{1}{n}\right)^2 = x\left(\frac{r}{m}-\frac{1}{n}\right)^2+x'\left(\frac{r+1}{m}-\frac{1}{n}\right)^2 \\ = \left[\left(\left\lfloor \frac{m}{n} \right\rfloor +1 \right)n - m\right]\left(\frac{\lfloor m/n\rfloor}{m}-\frac{1}{n}\right)^2+\left[m - \left\lfloor \frac{m}{n} \right\rfloor n\right]\left(\frac{\lfloor m/n\rfloor + 1}{m}-\frac{1}{n}\right)^2 $$
To simplify this expression, we can make use of the identity:
$$ \left\lfloor \frac{m}{n} \right\rfloor = \frac{m}{n} - \frac{m\ \mathrm{mod}\ n}{n} $$
To prevent the equations from getting too long, I'll use $m' = m\ \mathrm{mod}\ n$:
$$ \left[\left(\frac{m-m'}{n} +1 \right)n - m\right]\left(\frac{m-m'}{mn}-\frac{1}{n}\right)^2+\left[m - \frac{m-m'}{n} n\right]\left(\frac{m-m'+n}{mn}-\frac{1}{n}\right)^2 \\ = (m-m' + n - m)\left(\frac{m-m' - m}{mn}\right)^2+(m - m+m')\left(\frac{m-m'+n-m}{mn}\right)^2 \\ = \frac{(n-m')(-m')^2+(m')(n-m')^2}{(mn)^2} \\ = \frac{(n-m')(m')}{(mn)^2}(m' + n-m') \\ = \frac{1}{n}\left(\frac{m'}{m}\right)\left(\frac{n-m'}{m}\right) $$
Thus we can see that the score decreases as $m$ increases, and has local minima when $m'$ or $n-m'$ are close to zero (i.e. $m$ close to divisible by $n$). This agrees with our intuition.
1.3 Optimal Scores for Mapping a $k\mathsf{d6}$ to a $\mathsf{d100}$
Using the expression above, here's a table of the best possible scores for each $k \le 10$:
```
 k           m  m mod n  score
-------------------------------
 0           1        1  0.990
 1           6        6  0.157
 2          36       36  0.0178
 3         216       16  2.88e-4
 4       1,296       96  2.29e-6
 5       7,776       76  3.02e-7
 6      46,656       56  1.13e-8
 7     279,936       36  2.94e-10
 8   1,679,616       16  4.76e-12
 9  10,077,696       96  3.78e-14
10  60,466,176       76  4.99e-15
```
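If you want to reproduce the table, here is a short sketch using the closed-form score $\frac{1}{n}\left(\frac{m'}{m}\right)\left(\frac{n-m'}{m}\right)$ from section 1.2 (the function name is mine):

```python
from fractions import Fraction

def optimal_score(m, n):
    """Minimum possible score for distributing m results among n bins."""
    mp = m % n  # m' = m mod n
    return Fraction(1, n) * Fraction(mp, m) * Fraction(n - mp, m)

n = 100
for k in range(11):
    m = 6 ** k
    print(f"{k:2d}  {m:>10,}  {m % n:3d}  {float(optimal_score(m, n)):.3g}")
```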
An unstated assumption was that the assignment of results to bins was the same on each roll. Some algorithms achieve a better average score by allowing the results of one roll to influence the next. However, the score of each individual roll is still limited by the same optimum.
An algorithm can have an expected number of rolls less than the maximum number of rolls, while maintaining the same score. In fact, an algorithm can have a finite expected number of rolls even with an infinite maximum number. This is possible when the algorithm allows "early termination."
An algorithm maps each input roll (previously "result") to a corresponding output (previously "bin"). If an algorithm maps all rolls beginning with a certain prefix sequence to the same output, once that prefix has been rolled we know the output will be the same regardless of the subsequent rolls and we can stop rolling.
Pay careful attention to the terminology in the following sections: a roll is the roll of a single die, and a sequence or input is a list of rolls input to an algorithm.
2.1 Bounding the Expected Number of Rolls
In order to compute the minimum expected number of rolls, I make the assumption that each roll is made with the same type of die. That is, if the first roll is a $\mathsf{d6}$, all the subsequent rolls will also be $\mathsf{d6}$s. Essentially, instead of rolling a large number of different dice, we roll one die over and over. (Note this restriction means that we cannot analyze algorithms that use another source of randomness such as the orientation of the die.) Assume we are using a $\mathsf{d}b$.
The expected number of rolls is the sum, over all inputs, of the probability of each input times its length. If the length of sequence $i$ is $k_i$, then the expected number of rolls $E$ is:
$$ E = \sum_i k_i b^{-k_i} $$
Consider replacing an input $\{x_1, x_2, x_3, \ldots, x_k\}$ with the $b$ inputs of length $k+1$ that extend it, all mapping to the same output as the original:
$$ \{x_1, x_2, x_3, \ldots, x_k, 0\} \\ \{x_1, x_2, x_3, \ldots, x_k, 1\} \\ \vdots \\ \{x_1, x_2, x_3, \ldots, x_k, b-1 \} $$
The algorithm is still valid, since all possible rolls are mapped to outputs, and the probability of each output is the same. The change in $E$ is:
$$ b\left[(k+1)b^{-(k+1)}\right]- k b^{-k} \\ = (k+1) b^{-k} - k b^{-k} \\ = b^{-k} $$
Thus, splitting an input increases $E$. Conversely, merging the set of inputs with a given prefix will decrease $E$. Therefore, choosing how many inputs of each length to assign to a given output can be solved by a greedy algorithm.
Imagine that the output space of the algorithm is represented by an interval of length $1$, and that each output is a sub-interval of length $p_i$ (the probability of that output). Each input is an interval of length $b^{-k}$, and the problem is to cover each output interval with a number of input intervals, with no overlap. The greedy algorithm does so optimally by repeatedly inserting the largest input (smallest $k$) that fits in the remaining space in the output.
Example: find the optimal distribution of inputs for $b=6$, $k=4$, $n=100$. We have:
$$ m = 1296 \\ \left\lfloor \frac{m}{n} \right\rfloor = 12 \\ m' = 96 $$
Remember that $p_i=r_i/m$, and $r_i=\lfloor m/n \rfloor + (0~\text{or}~1)$, so:
$$ p = \frac{12}{1296}~\text{or}~\frac{13}{1296} $$
$6^{-1}=1/6$ and $6^{-2}=1/36$ are both too big, but $6^{-3}=1/216$ is small enough to fit in both $p_i$:
$$ \begin{align} p &= 2\times 6^{-3} + \frac{0}{1296} \\ &~\text{or}~2\times 6^{-3} + \frac{1}{1296} \end{align} $$
Finally, we fill up the remainder with multiples of $6^{-4}$:
$$ \begin{align} p &= 2\times 6^{-3} \\ &~\text{or}~2\times 6^{-3} + 1\times 6^{-4} \end{align} $$
We can see that this process is just the expression of $p$ as a decimal in base $b$! With this in mind, I'll skip an analysis of the minimum $E$ for optimally-scoring algorithms of finite $k$, and jump straight to algorithms where $k$ is infinite.
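Here is a sketch of that greedy covering (the name `greedy_inputs` and the use of `Fraction` are my choices); it simply reads off the base-$b$ digits of $p$ and reproduces the $\mathsf{4d6}$ example above:

```python
from fractions import Fraction

def greedy_inputs(p, b, max_len):
    """Greedily cover an output of probability p with inputs of length at
    most max_len, shortest (largest) inputs first.  Returns {length k:
    number of length-k inputs}, i.e. the nonzero base-b digits of p."""
    counts = {}
    for k in range(1, max_len + 1):
        size = Fraction(1, b) ** k
        counts[k] = p // size      # how many length-k inputs still fit
        p -= counts[k] * size
    assert p == 0, "p must be a multiple of b**-max_len"
    return {k: c for k, c in counts.items() if c}

# Outputs of the optimal 4d6 -> d100 algorithm have probability 12/1296 or 13/1296:
print(greedy_inputs(Fraction(12, 1296), 6, 4))  # {3: 2}
print(greedy_inputs(Fraction(13, 1296), 6, 4))  # {3: 2, 4: 1}
```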
2.2 Optimal Algorithm with Exact Distribution
With an infinite maximum $k$, the value of $m$ is also infinite, and the score of the algorithm drops to zero. This means that we can achieve an exactly uniform output distribution. This allows us to forget about $m$, $m'$, $x$, and other values that complicated the analysis. $p=1/n$ for all the outputs. We can find the decimal expansion of $p$ by using long division to divide $n$ into $1$.
Example: $b=6$, $n=100=244_6$. The long division is done in base-6.
```
       .0020543...
      ____________
244 )  1
       .532
       ----
       .024
       .02152
       ------
       .00204
       .001504
       -------
       .000132
       .0001220
       --------
       .0000100
```
The final row is a power of the base, signifying that the decimal expansion repeats. Note that if we round the expansion to four digits, we get the same result as in the example for $k=4$.
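The same long division is easy to mechanize; here is a sketch (the name `expand_reciprocal` is mine) that returns the base-$b$ digits of $1/n$ along with the integer remainders, stopping as soon as a remainder repeats or reaches zero:

```python
def expand_reciprocal(n, b):
    """Base-b long division of 1/n.  Returns (digits, remainders), where
    remainders[k-1] is the integer remainder after the k-th digit; stops
    once a remainder repeats (periodic expansion) or reaches 0."""
    digits, remainders, seen, r = [], [], {}, 1
    while r and r not in seen:
        seen[r] = len(digits)
        r *= b
        digits.append(r // n)
        r %= n
        remainders.append(r)
    return digits, remainders

digits, rems = expand_reciprocal(100, 6)
print(digits)  # [0, 0, 2, 0, 5, 4, 3]       -> 1/100 = 0.00(20543) in base 6
print(rems)    # [6, 36, 16, 96, 76, 56, 36] -> 36 recurs, so the expansion repeats
```

The nonzero steps produce exactly the remainders from the hand division: 16, 76, 56 and 36 are $24_6$, $204_6$, $132_6$ and $100_6$.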
With the expansion in hand, we can write an expression for $E$:
$$ E = n\left[\sum_k d_k \times k \times b^{-k} \right] + (E + k_r)\times b^{-k_r} \\ E = 100\left[2\times 3\times 6^{-3} + 5\times 5\times 6^{-5} + 4\times 6\times 6^{-6} + 3\times 7\times 6^{-7} \right] + (E+5)\times 6^{-5} \\ E = 14738/4665 \approx 3.159\ldots $$
The factor of $n$ appears because there are $n$ identical outputs contributing to $E$. $d_k$ is the $k$-th digit in the base-$b$ expansion of $1/n$ (the sum runs over the pre-period plus the first full period of digits), and $k_r$ is the number of repeating digits; here $k_r=5$, and $b^{-k_r}=6^{-5}$ is the probability of having to continue into the repeating part.
I don't know of a way to simplify the formula for $E$, since the decimal expansion can vary wildly with $b$ and $n$. For example, although $1/100=0.00\overline{20543}_6$ is a fairly short expansion, changing $n$ by $1$ results in $1/101=0.\overline{0020455351}_6$.
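As a check on the arithmetic, here is a sketch that solves the recursion above exactly with `Fraction`s (the function name is mine, and the long division is repeated inside so the snippet is self-contained):

```python
from fractions import Fraction

def expected_rolls(n, b):
    """Solve E = n * sum(d_k * k * b^-k) + (E + k_r) * b^-k_r, where the
    sum runs over the pre-period plus one full period of 1/n in base b."""
    # Long division of 1/n in base b, as in the previous sketch.
    digits, rems, seen, r = [], [], {}, 1
    while r and r not in seen:
        seen[r] = len(digits)
        r *= b
        digits.append(r // n)
        r %= n
        rems.append(r)
    if rems[-1] == 0:                       # terminating expansion: no loop
        k_r, loop_p = 0, Fraction(0)
    else:
        k_r = len(digits) - seen[rems[-1]]  # number of repeating digits
        loop_p = Fraction(1, b) ** k_r      # chance of looping (here 6^-5)
    first_pass = n * sum(d * k * Fraction(1, b) ** k
                         for k, d in enumerate(digits, start=1))
    return (first_pass + k_r * loop_p) / (1 - loop_p)

print(expected_rolls(100, 6))  # 14738/4665, about 3.159
```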
2.3 An Implementation of an Optimal Algorithm
Here I will describe one class of optimal algorithm, based on the interpretation of the input as a base-$b$ decimal such that the $k$-th roll is the $k$-th digit after the decimal point. For example, the input:
$$ \{0, 4, 2, 2, 3\} $$
(with $b=6$) is equivalent to the decimal:
$$ .04223_6 $$
Note that inputs are uniformly distributed over the interval $[0,1)$.
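A tiny helper (the name is mine) makes the interpretation explicit:

```python
from fractions import Fraction

def rolls_to_fraction(rolls, b=6):
    """Interpret a list of rolls (each a digit 0..b-1) as the base-b
    decimal .d1 d2 d3 ..., i.e. a point in [0, 1)."""
    return sum(d * Fraction(1, b) ** k for k, d in enumerate(rolls, start=1))

print(rolls_to_fraction([0, 4, 2, 2, 3]))  # 317/2592 = 951/7776 = 0.04223 in base 6
```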
Unlike the previous section where we grouped the inputs together based on their output, here I'll group the inputs based on size. The total amount of the interval occupied by inputs of length $k$ is $n d_k b^{-k}$. Note that this quantity is equivalent to the amount that we subtracted from the remainder at step $k$ of the long division, for example:
```
       .0020543...
      ____________
244 )  1
       .532       <= 100 * 2 * 6^-3
       ----
       .024
       .02152     <= 100 * 5 * 6^-5
       ------
       .00204
       .001504    <= 100 * 4 * 6^-6
       -------
       .000132
       .0001220   <= 100 * 3 * 6^-7
       --------
       .0000100
```
Thus, the remainder at each stage is simply the amount of the interval remaining after all the inputs of length $k$ or less have been removed. This leads us to the algorithm, which I'll demonstrate for $b=6$, $n=100$.
Setup: put the remainders from dividing $n$ into $1$ in base $b$ into a table, along with the number of new digits for each step.
```
roll 3
  .024      =>  24 <--+
roll 2                |
  .00204    => 204    |
roll 1                |
  .000132   => 132    |
roll 1                |
  .0000100  => 100    |
roll 1 ---------------+
```
For every output you need, start at the top of the table. Alternate between two steps:
- Roll the number of dice listed, and append their digits to the current value (which starts at zero).
- If the current value is greater than or equal to the number on the current row, return the difference of the two $\mathrm{mod}\ n$; otherwise continue to the next row.
- If you reach the end of the table, go back to the beginning, skipping the first row. (Note that if the decimal expansion terminates, the last remainder will be $0$ and you will never reach the end of the table.)
An example:
- Start with a current value of $0$.
- The first row is `roll 3`, so roll $\mathsf{3d6}$ and append their digits: $$ \{6, 1, 5\} \to 015_6 $$
- The next row is `24`. Since $15_6 < 24_6$, continue to the next row.
- The next row is `roll 2`, so roll $\mathsf{2d6}$: $$ \{4, 6\} \to 1540_6 $$
- The next row is `204`. Since $1540_6 \geq 204_6$, return the difference between the two, $\text{mod}\ 100$: $$ 1540_6 - 204_6 = 1332_6 = 344 \to \boxed{44} $$
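Putting it all together, here is a sketch that encodes the table as data and runs the whole procedure with simulated dice. The function name, the use of Python's `random` module, and the convention that a rolled 6 counts as the digit 0 are my choices; the thresholds 16, 76, 56 and 36 are just $24_6$, $204_6$, $132_6$ and $100_6$ written in base 10.

```python
import random

# Each table row: (number of new d6 to roll, threshold to compare against).
FIRST_PASS  = [(3, 16), (2, 76), (1, 56), (1, 36)]  # roll 3 / 24, roll 2 / 204, ...
REPEAT_PASS = [(1, 16), (2, 76), (1, 56), (1, 36)]  # "skip the first row": only 1 new die

def roll_d100(rng=random, n=100, b=6):
    """Produce a uniform result in 0..n-1 from repeated d6 rolls."""
    value = 0
    rows = FIRST_PASS
    while True:
        for new_rolls, threshold in rows:
            for _ in range(new_rolls):
                value = value * b + rng.randrange(b)  # append one base-6 digit
            if value >= threshold:
                return (value - threshold) % n
        rows = REPEAT_PASS  # fell off the end of the table (probability 1/7776)

# Empirical check of uniformity:
counts = [0] * 100
for _ in range(200_000):
    counts[roll_d100()] += 1
print(min(counts), max(counts))  # both should be close to 2000
```

Comparing the raw, growing value against the same thresholds keeps working on repeat passes because whenever the table loops, the current value is uniform on $0\ldots35$, exactly as if the first two of the three initial digits had already been rolled; that is what the "skip the first row" rule captures.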
Side note: The probability that you have to repeat the table at least once is equal to the last remainder: $.00001_6=1/7776\approx 0.013\%$. Assume that the DM and your entire party of five (six people total) make one roll per minute, every minute, for the next 10 years. That's a total of around $3\cdot 10^{7}$ rolls. Therefore the most times you'll ever have to loop through the table on a single roll will be around $\log_{7776}(3\cdot 10^{7}) \approx 1.9$!