Statistical Analysis of Rolling Dice



HMS Invincible
2009-04-23, 07:55 PM
I know the distribution and average for rolling one die (uniform, with avg = half the number of sides + 0.5) and for multiple dice (bell-shaped), but what is the average when you roll multiple dice and drop the lowest one? I know it's still a bell-shaped curve. What would the formula be? How would you calculate this without rolling a large number of dice?

If you can't tell, this question came from people claiming point buy had a higher average than rolling 4d6 and dropping the lowest die.

Jack_Simth
2009-04-23, 08:16 PM
Not sure of a formula, but it's easy enough to brute-force in OpenOffice:


Result 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
Count 1 4 10 21 38 62 91 122 148 167 172 160 131 94 54 21
Weight 3 16 50 126 266 496 819 1220 1628 2004 2236 2240 1965 1504 918 378
Total Rolls 1296
Weighted Average 12.24

Haeleth
2009-04-23, 08:36 PM
If you count all rolls of 3-8 as being worth 0 points in the standard point-buy system, then your characters will average just over 29 points (29.13, to be exact).
So if you're on a 28-point buy, your characters come out essentially even on average once you allow for the 3's, 4's, and 5's that will come up (a score of 3-7 will show up at least once in about 30% of your characters), but with less customization and more randomness. If I had the choice between roll 4d6, keep top 3, and order by stat, or 28 points, I'm not going to gamble, but others might.
(The math: take Jack_Simth's counts for each result, weight them by the number of points required to buy that score, and divide by 1296, the number of possible rolls. Then multiply by six for the six stats.)
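
A minimal Python sketch of that calculation, for anyone who wants to check it. The counts are the ones from Jack_Simth's table. The cost table is an assumption, since nobody in the thread lists it (it is the usual 3.5-edition point-buy scale, where an 8 costs 0 and an 18 costs 16), and the names COUNTS, POINT_COST, and expected_points are made up for this example.

```python
from fractions import Fraction

# Ways to roll each total on 4d6 drop lowest (Jack_Simth's table, 1296 rolls).
COUNTS = {3: 1, 4: 4, 5: 10, 6: 21, 7: 38, 8: 62, 9: 91, 10: 122,
          11: 148, 12: 167, 13: 172, 14: 160, 15: 131, 16: 94, 17: 54, 18: 21}

# Assumed point-buy costs (not given in the thread); anything under 8 counts as 0.
POINT_COST = {8: 0, 9: 1, 10: 2, 11: 3, 12: 4, 13: 5, 14: 6,
              15: 8, 16: 10, 17: 13, 18: 16}

def expected_points(num_stats=6):
    """Average point-buy value of a character with num_stats rolled scores."""
    total_rolls = sum(COUNTS.values())
    weighted = sum(n * POINT_COST.get(score, 0) for score, n in COUNTS.items())
    return num_stats * Fraction(weighted, total_rolls)

print(float(expected_points()))
```

Under those assumed costs this comes out to about 29.13, the same figure quoted above.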

Dhavaer
2009-04-23, 08:36 PM
Curse you, ninja!

I got the same results in Excel.

Here, for posterity, is the if function I used to find the lowest roll:

=IF(AND(D3<=A3, D3<=B3, D3<=C3),D3,IF(AND(C3<=A3, C3<=B3, C3<=D3),C3,IF(AND(B3<=A3, B3<=C3, B3<=D3),B3,IF(AND(A3<=B3, A3<=C3, A3<=D3),A3,0))))

If someone has a better way (and they do, I know), don't tell me.

Jack_Simth
2009-04-23, 09:04 PM
... I take it, then, that you're not familiar with the "MIN" and "SUM" functions? It's a very simple:
=SUM(A3:D3)-MIN(A3:D3)
to get the total without the lowest value in any given row.

dspeyer
2009-04-23, 09:23 PM
Really klunky Python code:


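# count[t] = number of ways to roll a total of t on 4d6, dropping the lowest die
count = [0 for x in range(19)]
for d1 in range(1, 7):
    for d2 in range(1, 7):
        for d3 in range(1, 7):
            for d4 in range(1, 7):
                count[d1 + d2 + d3 + d4 - min(d1, d2, d3, d4)] += 1
print(count)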

Results:
[table]Value|Ways to get
3|1
4|4
5|10
6|21
7|38
8|62
9|91
10|122
11|148
12|167
13|172
14|160
15|131
16|94
17|54
18|21[/table]

Mean: 12.2446

Graph: http://chart.apis.google.com/chart?cht=lc&chs=200x125&chd=t:0,0,0,1,4,10,21,38,62,91,122,148,167,172,160,131,94,54,21&chds=0,181&chxt=x,y&chxl=0:|0|3|6|9|12|15|18|1:|0%|7%|14%
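
The same enumeration can be stretched to any "roll x dice of y sides, keep the highest z" scheme. The sketch below is one way to do it with itertools.product. The function names are hypothetical, and it still visits every one of the y^x rolls, so it is only practical for small dice pools.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def keep_highest_counts(num_dice, sides, keep):
    """Count the ways to reach each total when only the `keep` highest dice count.

    Assumes 1 <= keep <= num_dice.
    """
    counts = Counter()
    for roll in product(range(1, sides + 1), repeat=num_dice):
        counts[sum(sorted(roll)[-keep:])] += 1
    return counts

def mean_total(counts):
    """Exact weighted mean of a {total: ways} table."""
    return Fraction(sum(t * n for t, n in counts.items()), sum(counts.values()))

counts = keep_highest_counts(4, 6, 3)  # 4d6, keep the highest 3
print(sorted(counts.items()))
print(float(mean_total(counts)))
```

For 4d6 keep 3 this reproduces the table above (for instance 172 ways to roll a 13) and a mean of 15869/1296.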

Dhavaer
2009-04-23, 09:28 PM
Dammit! I knew there was a function that gave you the smallest in a group, but I kept searching for things like 'least' and 'less'.

Stephen_E
2009-04-23, 10:06 PM
Just a note.

To compare to point buy you have to take the point cost of the various rolls and then work out the average of those point costs.

For example, an 18, 16, 14, 7, 6, 4 is 32 pts + any mods for the stats lower than 8. Note that no effective comparison is possible unless you're running point buy with an ability to go below 8. You then multiply the chance of getting each result by the point cost of that result and add those up to get the average.

You can probably do this simply by taking the individual roll outcomes rather than calculating out the character sets. Thus "18" = (21 x 16 pts)/1296 + "17" = (54 x 13)/1296 + etc., and then multiply by 6.

I haven't double-checked my general equation, and I used dspeyer's numbers. I'll repeat the point that no effective comparison can be made with the standard point buy because it doesn't allow stats below "8".

Stephen E

Dhavaer
2009-04-23, 10:33 PM
If you make it so every point below 8 gives -1, it works out to 28.52778 points.
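
A rough sketch of that house rule in Python, walking every possible roll instead of using the counts table. The costs for 8 through 18 are an assumption (the usual 3.5-edition point-buy scale, which the thread never spells out), and point_cost is a made-up helper name.

```python
from fractions import Fraction
from itertools import product

def point_cost(score):
    """Assumed cost scale: standard costs for 8-18, and -1 per point below 8."""
    standard = {8: 0, 9: 1, 10: 2, 11: 3, 12: 4, 13: 5, 14: 6,
                15: 8, 16: 10, 17: 13, 18: 16}
    return standard[score] if score >= 8 else score - 8

# Add up the cost of every one of the 6^4 equally likely 4d6-drop-lowest rolls.
rolls = list(product(range(1, 7), repeat=4))
total_cost = sum(point_cost(sum(r) - min(r)) for r in rolls)
per_character = 6 * Fraction(total_cost, len(rolls))  # six ability scores
print(float(per_character))
```

With those assumed costs the result rounds to 28.52778, so the penalty for low scores shaves a bit over half a point off the zero-floor figure.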

kjones
2009-04-23, 10:52 PM
Really klunky Python code:


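# count[t] = number of ways to roll a total of t on 4d6, dropping the lowest die
count = [0 for x in range(19)]
for d1 in range(1, 7):
    for d2 in range(1, 7):
        for d3 in range(1, 7):
            for d4 in range(1, 7):
                count[d1 + d2 + d3 + d4 - min(d1, d2, d3, d4)] += 1
print(count)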



Dude, it's Python. That's how it's supposed to work...

My friend calculated this once, I'll have to see if I can find his maths.

Also, this guy did it: linky (http://klubkev.org/~ksulliva/ralph/dnd-stats)

HMS Invincible
2009-04-24, 01:18 AM
I'm not sure if it matters, but I play 4th edition currently.

Anyway, pointbuy seems the most complex out of all the systems to guess the average "strength" of ability scores because of the min/maxing that occurs.

Josh the Aspie
2009-04-24, 08:42 AM
Hello all!

Interesting thread.

I was already aware of the tabular and algorithmic methods of getting this information, and would have shared had you not all beaten me to it.

I was wondering if anyone knew a formulaic method for figuring out the statistical average of Roll xdy keep z high (or keep z low), rather than an algorithmic one.

-Josh

HMS Invincible
2009-04-24, 11:21 AM
I too, want a mathematical formula instead of rolling large amounts of dice. And coding doesn't count.

adanedhel9
2009-04-24, 04:26 PM
I was wondering if anyone knew a formulaic method for figuring out the statistical average of Roll xdy keep z high (or keep z low), rather than an algorithmic one.


I've searched for formulaic methods several times over the past 10 years, and I've never seen anything. The closest I ever found was a formula specifically for 4d6 drop lowest. The author pointed out that it would be possible to derive similar formulae for other combinations, but that every additional die (and every additional die dropped) would add another layer of complexity. At the time even adding a single die was beyond my abilities; I might be able to do it now, but I would have no idea where to start looking for that article...

Josh the Aspie
2009-04-24, 06:32 PM
I too, want a mathematical formula instead of rolling large amounts of dice. And coding doesn't count.

To be clear, the above methods are different from "just rolling lots and lots of dice".

What many of the folks above have done is to list all of the possible totals of the dice roll, along with the number of ways each total can come about, in a tabular format. Multiplying each total by the number of rolls that produce it, adding up those products, then dividing by the total number of possible rolls yields the weighted mean, also known as the "expected value" in statistics. This method assumes that each roll is equally likely to come up (that the dice are not loaded).
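
As a small illustration of that arithmetic, here is a sketch in Python using the 4d6-drop-lowest counts posted earlier in the thread. The variable names are just for this example.

```python
# Totals 3..18 and the number of 4d6-drop-lowest rolls producing each one,
# copied from the table earlier in the thread.
totals = list(range(3, 19))
ways = [1, 4, 10, 21, 38, 62, 91, 122, 148, 167, 172, 160, 131, 94, 54, 21]

total_rolls = sum(ways)                                  # every possible roll of 4d6
weighted_sum = sum(t * w for t, w in zip(totals, ways))  # each total times its ways
expected_value = weighted_sum / total_rolls

print(total_rolls, weighted_sum, round(expected_value, 4))
```

With these counts the three numbers printed are 1296, 15869, and 12.2446.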

The coding examples above are ways of using a computer program to implement the algorithm that many people would use by hand to make sure that they had listed each and every possible roll, and its value.

This is different from rolling the dice many many many many times, recording the results, totaling the results, then dividing by the number of results. That method of getting an approximate expected value takes into account any loading of the particular set of dice used, but also requires an impractically large number of rolls to get the accuracy for your pair of dice as high as the theoretical average presented above would be.
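
To get a feel for how slowly the experimental method closes in on the exact figure, here is a sketch that simulates the rolls in Python. It assumes fair dice (a pseudo-random generator cannot tell you anything about loading in your real dice), and the function names and the seed are arbitrary choices for this example.

```python
import random

def roll_4d6_drop_lowest(rng):
    """Roll four six-sided dice and total the highest three."""
    dice = [rng.randint(1, 6) for _ in range(4)]
    return sum(dice) - min(dice)

def simulated_mean(num_rolls, seed=0):
    """Average of num_rolls simulated rolls (fair dice, seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(roll_4d6_drop_lowest(rng) for _ in range(num_rolls)) / num_rolls

exact = 15869 / 1296  # theoretical mean from the counts table
for n in (100, 10_000, 200_000):
    estimate = simulated_mean(n)
    # columns: number of rolls, estimated mean, distance from the exact mean
    print(n, round(estimate, 4), round(abs(estimate - exact), 4))
```

The error generally shrinks only with the square root of the number of rolls, which is why the table is the better tool when you trust the dice.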

A formulaic example is the traditional mathematical formula z = f(x,y), where z is your result, and f is a function that takes the parameters x and y.

So there are 3 ways of getting expected value results: Experimental (rolling lots and lots of dice), tabular (often using code to generate the table for large tables), and formulaic.

If the algorithm and formula are both correct, they will both produce equally valid and reliable results. In fact, many formulas have come about as a way of generalizing the results that were first found and proven with tables, and then are compared against those tables to check and make sure that they match, before the rigorous proof is derived. In fact, the tabular method is often easier to understand than the formula, so if you wish to 'show' your players the roughly equal values, you can use the above tabular method, and it will likely be easier to show.

As for the comparative value of the individual scores, your players may not accept that the point buy scale allows them to get the values they want for a reasonable number of points, which would be a separate argument from whether the distribution curve, combined with the point buy value associated with that, produce a mean number of points equal to the 28 point point buy.


If you happen to find that formula, I would be most grateful, and would be very interested in working together with you to expand on the formula. Do you remember anything else about the article?

Dogmantra
2009-04-24, 07:05 PM
I found this discussion of calculating the average on another forum.
I haven't really looked over it that much, because I started to get confused after a few lines, but if I understand it correctly, it's to do with calculating the average of n dice with d sides each, dropping the lowest k:
(spoilered for length)

An approach to an analytic solution:

Let us assume we have n dice with d sides each, numbered from 1...d. (In your case n = 5, d = 6). We want to drop the lowest k dice (k = 2 in your case).

How many ways are there to roll these dice? In total, there are d^n = d * ... * d (n times) outcomes.

What we do now is to add up the totals (highest n - k) of all the d^n outcomes. Afterwards, we have to divide by d^n to get the average.

To be able to do this summation, let us first classify all the outcomes.

First of all, we classify them by a number l which shall be the highest number of the outcome that is to be dropped.

This way, we get d classes as l can run from 1 to d. In the example 5d6, drop lowest two, we have for example:

l(1 2 5 3 2) = 2, l(6 6 6 6 6) = 6, etc.

We now subdivide this classification further by the number r of dice that are lower than l. Since a die showing l is itself among the k dropped dice, r can run from 0 to k - 1. In our examples:

r(1 2 5 3 2) = 1, r(6 6 6 6 6) = 0.

Now, all the outcomes are classified by a pair (l, r). This classification is still too coarse. We introduce another number u, the number of dice that show l. The r dice below l and the u dice showing l must together cover all k dropped dice, so u can run from k - r to n - r. Examples:

u(1 2 5 3 2) = 2, u(6 6 6 6 6) = 5.

Our outcomes are now classified by triples (l, r, u). This is fine enough to proceed.

Our summation would be now as follows: we sum over all possible classes given by a triple (l, r, u) the sum S(l, r, u) of all totals of all outcomes belonging to the class (l, r, u).

So let us fix (l, r, u), l = 1...d, r = 0...k-1, and u = k-r...n-r.

What is the sum S(l, r, u)? To answer this, let us take a look at what an outcome belonging to the class (l, r, u) looks like.

It looks like:

A A A A B B B B B C C C C C C

where the A are dice lower than l, the B mark the dice that show l and the C are dice higher than l. We have r times A, u times B, and therefore n - u - r times C. However, we have been a little bit inexact here! The dice usually don't come sorted this way when they are rolled one after the other! But our row as given above is sorted: at the beginning all the dice lower than l, at the end all the dice higher. Anyway, the total doesn't depend on the sort order, so we can work with sorted outcomes like the one above. However, then, we have to take the number of outcomes that can be sorted into something like the above into account.

Let us define the binomial coefficient

(a over b)

by

(a over b) := 1*2*3*4*5*...*a/(1*2*3*...*b*1*2*...*(a-b))

Combinatorics tells us that the number of ways to decide which of the n dice are A's, which are B's, and which are C's is given by (n over r) * ((n - r) over u). Let us call this number M1(r, u), the M standing for multiplicity.

All the r dice below l don't count for the total so let us neglect them. However, this introduces another multiplicity in our formula: each A die can show any of the values 1...l-1, so there are (l-1)^r ways to fill them in, all leading to the same total. Let us denote this number by M2(l, r).

So, now we have to deal only with rows like:

B B B B C C C C C C C,

where the number of B's is u, the number of C's being n - u - r.

What is the total T(l, u, r) of all these rows? Each C die can show any of the d - l values from l + 1 to d, so there are (d - l)^(n - u - r) such rows. In every one of them the dice B contribute l * (u - (k - r)) to the total, as (k - r) dice showing l shall be dropped. For the dice C, fix one of them: each value from l + 1 to d appears on it in (d - l)^(n - u - r - 1) rows, once for every way of filling in the other C dice. Recall that there are (n - u - r) dice of type C.

By the Gaußian summation formula, (l + 1) + ... + d = (d + 1) * d / 2 - (l + 1) * l / 2, so we have

T(l, u, r) = (d - l)^(n - u - r) * l * (u - (k - r)) + (n - u - r) * (d - l)^(n - u - r - 1) * ((d + 1) * d / 2 - (l + 1) * l / 2),

where the second term is zero when there are no C dice (n - u - r = 0) and 0^0 counts as 1.

Now, let us throw everything together:

The average is

1/d^n * [the sum over l = 1...d of the sum over r = 0...k-1 of the sum over u = k-r...n-r of (n over r) * ((n - r) over u) * (l - 1)^r * T(l, u, r)].

As a check, for 4d6 drop lowest (n = 4, d = 6, k = 1) the bracket sums to 15869, giving 15869/1296 = 12.24.


It was here (http://www.enworld.org/forum/general-rpg-discussion/56352-5d6-drop-lowest-two-math.html)
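
For anyone who would rather test the (l, r, u) summation than follow it by hand, here is a hedged Python sketch. It writes the sum with the number of ways to fill in the higher dice made explicit, takes r from 0 to k - 1 and u from k - r to n - r, and compares the result against plain enumeration. The function names are invented for this example, and math.comb needs Python 3.8 or later.

```python
from fractions import Fraction
from itertools import product
from math import comb

def mean_formula(n, d, k):
    """Mean of n d-sided dice with the lowest k dropped, via the (l, r, u) sum."""
    total = 0
    for l in range(1, d + 1):
        higher_sum = d * (d + 1) // 2 - l * (l + 1) // 2  # (l+1) + ... + d
        for r in range(0, k):
            for u in range(k - r, n - r + 1):
                c = n - u - r  # number of dice showing more than l
                ways = comb(n, r) * comb(n - r, u) * (l - 1) ** r
                # kept dice showing l, counted once per way of filling the c higher dice
                kept_l = (d - l) ** c * l * (u + r - k)
                # the higher dice themselves; no such term when c == 0
                kept_c = c * (d - l) ** (c - 1) * higher_sum if c > 0 else 0
                total += ways * (kept_l + kept_c)
    return Fraction(total, d ** n)

def mean_brute_force(n, d, k):
    """Same mean by listing all d**n rolls and dropping the k lowest dice."""
    total = sum(sum(sorted(roll)[k:]) for roll in product(range(1, d + 1), repeat=n))
    return Fraction(total, d ** n)

print(mean_formula(4, 6, 1), mean_brute_force(4, 6, 1))
print(float(mean_formula(5, 6, 2)), float(mean_brute_force(5, 6, 2)))
```

The triple loop does on the order of d * k * n steps instead of d^n, which is where the savings over enumeration come from for larger pools.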

Josh the Aspie
2009-04-24, 08:08 PM
Wow.

That was an incredibly interesting, but unfortunately dense proof. At several points I had to replicate intermediate steps to be able to verify that... but... it works. And while it is hardly likely to be as easily understood, it will undoubtedly take FAR less processing time than the recursive algorithmic method I would otherwise have had to use.

My congratulations go to the one who originally posted that, though I doubt I will register in another forum to express them (especially as that might violate their standards of post necromancy there). Still, in case such a person ever reads this, my comments are below.

Well done indeed! I would personally have included more intermediary steps, such as the formulation of d * (d + (l+1)) / 2 – l * (d + (l+1)) / 2, and the extraction and cancellation of l * d / 2 – l * d / 2 to gain d * (d +1) / 2 - l * (l+1) / 2. Also, including M1, M2, and T(u,l,r) in the formula before expanding for the final formula would have aided in clarity.

Still, Bravo!

Thank you, Dogmantra, for cross-posting that wonderful proof over here. As a capstone to a self-taught course in Java, I intend to make a dice rolling program that will also calculate min, max, and statistical averages (including for keep high and keep low). This will save a huge amount of processing time on those latter two.

If you need any help understanding the proof, I would be happy to help.

-Josh

Dogmantra
2009-04-24, 08:36 PM
Thank you, Dogmantra, for cross-posting that wonderful proof over here.

No problem, I live to serve, after all :smallwink: