  1. - Top - End - #1
    Bugbear in the Playground
     
    Devil

    Join Date
    Oct 2005
    Location
    Terra Ephemera

    Default Pseudo-Percentiles, Calculated for All Stat Arrays

    I have just calculated the pseudo-percentile of every possible DnD stat array. Not the percentiles for highest-stat, next-highest, next-highest, et cetera... but for every possible array.

    *watches every jaw drop out of sheer and undiluted apathy*

    Nonono, this is useful: it lets you say to your players "We're playing a 65th-percentile game," and all your players are then free to choose any stat array at the 65th percentile or lower from this list. This avoids some of the problems of standard point buy (everyone having an 8 in one thing and an 18 in another, the WotC-mandated downward adjustment "to counteract the lack of uncertainty", et cetera). I'm sure it'll have some problems -- I've already thought of a few -- but it's nice to have alternatives.

    The output was about fifty-seven THOUSAND lines long (that's how many possible stat arrays there are), so I'm not going to post it all here for fear of the mods banning me for flooding. Instead, I hope to recruit someone to make a nice stat-generating GUI out of this.

    The next post is a bunch of dry math stuff about methodology (very interesting math to me, very boring to most everyone sane), and the one after that is dry stuff about me trying to get someone else to program a GUI for stat generation (ditto, except it's programming stuff). If there aren't at least three posts when you first read this thread, wait a minute and hit refresh.

    After that, we'll play around a bit. Give me the stat arrays from some of your favorite characters and I'll tell you what pseudo-percentile they fall in. (Note: No more than three per poster per each of my responses, no more than ten posters per day guaranteed.) We'll get a feel for this system, see how we like it, and how it's broken. Hopefully that'll last us until someone makes a GUI and we can REALLY experiment.

    ... or everyone can ignore this thread just like they ignore most every other undertaking of mine. I'm not picky.
    I'm not an evil GM! Honest!

  2. - Top - End - #2
    Bugbear in the Playground
     
    Devil

    Join Date
    Oct 2005
    Location
    Terra Ephemera

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Math Stuff! (Math-phobes, you may want to skip this post.)

    Calculating the "average" stat array from the standard 4d6-drop-lowest method is fairly easy, and unless I'm mistaken has been done several times.

    But there's another, less often used measure of central tendency, called the median. Despite some disadvantages compared to averages, medians have their advantages too... specifically, they tend to be more representative of the data set. The "average" roll in DnD would strike most people as a bit low, because a few REALLY low rolls counterbalance a lot of higher rolls. Medians reflect the large number of higher rolls and ignore the extremeness of lower rolls. Their extensions (percentiles) are also far less of a headache to deal with than those of averages (standard deviations). I remember that someone on this board calculated the "median" stat array, maybe about six months ago or so.

    One of the problems associated with both, though, is that you need some way to put the arrays into a proper order. That is to say, you need a way of saying that having all 15s is better or worse (okay, fine, worse) than having three 14s and three 16s. If there isn't some way of ordering them, you don't really have a median; and if there isn't some way of adding and dividing, you don't really have an average. (That's why I put them in quotes.) The two methods mentioned above did this by calculating the average/median highest stat, the average/median second-highest stat, and so on, and then combining them into a stat array. While somewhat enlightening, that approach loses a lot of data.

    All this is why the math geeks reading this thread (and screaming "I KNOW THIS ALREADY!" during the last three paragraphs) probably smelled a rat when I started talking about percentiles for stat arrays. That's why I called it a pseudo-percentile. The name should tip you off that it's the same rat I smelled around the time that I (a vegetarian) learned that the "natural flavoring" in McDonald's fries was, in fact, beef.

    In other words, I invented a measure which I am claiming is SORT OF like a percentile and called it a pseudo-percentile, and everyone should be looking at me askew and asking, "Okay, but what is it REALLY?"

    So where's the beef? What do I mean when I say pseudo-percentile?

    First, I treat the set of all stat arrays as a partially ordered system. For the laymen out there, that means that you have four, not three, possibilities when you compare items. You can have some that are clearly superior (such as all 15s to all 14s), some that are inferior (the same arrays compared in the opposite direction), some that are equal (3,3,3,18,18,18 versus 3,18,3,18,3,18 -- effectively the same thing, since you assign at will), and finally some that just don't lend themselves to clear-cut comparisons (all 15s versus three 14s and three 16s). Most of us should be familiar with greater-than, less-than, and equal; the last category is called "non-comparable," and it's what makes partially-ordered systems partially ordered.

    To be explicit, I arrange the stats in ascending order. (Or descending; both work.) Then array A = array B if and only if A[n] = B[n] for n from one to six. A is less than or equal to B if and only if A[n] is less than or equal to B[n] for all n from one to six. Likewise, A is greater than or equal to B exactly when B is less than or equal to A. They are non-comparable if A[n] is less than B[n] for some value of n from 1 to 6, and greater than it for another. Finally, less-than, greater-than, equal, and non-comparable are mutually exclusive qualities.

    Okay, so that's our comparator. Now, each stat array has a weighting: The odds of rolling THAT PARTICULAR stat array. For example, the odds of rolling all threes is 1 out of 4,738,381,338,321,616,896. (That's worse than 1 in 4.7 million trillion. If you've ever rolled a character with all threes, you've got bad dice by any sane scientific standard of certainty. Dispose of them in an appropriately gruesome manner.)

    To calculate the pseudo-percentile of whatever stat array you are looking at, calculate the sum weight of EVERY stat array greater than it (call this value q), of EVERY stat array less than it (call this value p), and the stat array's own weight (call this value x). Ignore the non-comparable stat arrays. The pseudo-percentile is p / (p + x + q). I also calculated an alternative pseudo-percentile of (p + x) / (p + x + q), and the average is also attractive in an ugly sort of way: (2p + x) / 2(p + x + q). My output file contains the first two.

    (My apologies for the lack of credit if anyone else has used this method before me. I'm not that good at research and didn't come across this method while trying to find one that'd work.)

    Anyone interested in the raw data can PM me with an email address to send it to. It's 2.3 megs of text file. Feel free to redistribute it at will, but also to pore over it and try to extract any good approximating formulae out of it.

    (By the way, my hand is REALLY cramped after doing the arithmetic for all those stat arrays.)
    Last edited by Reltzik; 2007-03-22 at 05:33 PM.
    I'm not an evil GM! Honest!

  3. - Top - End - #3
    Bugbear in the Playground
     
    Devil

    Join Date
    Oct 2005
    Location
    Terra Ephemera

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Computer Stuff! (Technophobes may wish to skip this post and, for that matter, leave off the internet entirely.)

    I'm hoping that someone, or perhaps several someones, will turn this into a stat-generation GUI similar to the point-buy one at Invisible Castle. The output file is about 2.3 megs of text and organized by stats; too much to post here, but very nice for building a program around. At the very least, you should be able to put in six stats and get a percentile out. (Actually, my word processor's "find" op does that nicely.) Better yet, something where you can easily increment or decrement individual stats one at a time and see a color-coded warning if a certain operation takes you over your limit... well, be creative. I'm asking for volunteers; I can't be picky.

    The format of the output file is one stat array per line. The format is: <statArray> <space> <p/(p + x + q) percentile, in scientific notation> <space> <(p + x)/(p + x + q) percentile, in scientific notation> <new line>. <statArrays> are six numbers, each separated by a lone space, between 3 and 18, arranged in ascending order. (So 18 3 3 3 3 3 isn't there, but 3 3 3 3 3 18 is.) PM me an email to send the file to if you're interested.

    Also, I'd like to have someone independent of me write their own program to double-check my numbers (description of problem in the math post above). I THINK I got them right, but I may well have messed up. Goodness knows there were a dozen bugs when I first compiled the program. (Then I added a curly brace I'd left out, and it went up to a few hundred.) Kudos if you can figure out how I did it in O(n) time (n being number of possible stat arrays). Extra kudos if you figure out a DIFFERENT O(n) algorithm.


    I can also send you the source file of my own program if you want, but be warned of the following:

    1) It's in C. Not C++, not Objective-C. C.

    2) The very first line of main() is the declaration:
    STAT_NODE****** raggedArray;
    If you don't like pointers, you won't like my programming.

    3) I did not mean for my code to be reusable, or understandable. Much of it is slapdash. There are exactly three comments in the entire thing. One of them is simply me gloating over the unholy existence of a sixth level pointer.

    So if you're going to ask for my code, consider yourself warned. Oh, and no kudos to anyone who figures out my O(n) algorithm AFTER seeing my code.
    I'm not an evil GM! Honest!

  4. - Top - End - #4
    Barbarian in the Playground
     
    Krellen's Avatar

    Join Date
    Jan 2007

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    A six level pointer?

    You, sir, are going to programmer hell.

  5. - Top - End - #5
    Dwarf in the Playground
     
    PMDM's Avatar

    Join Date
    Jun 2006
    Location
    Around town
    Gender
    Male

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Why don't you post the % results here?
    I'll race you to the top of the spire.

    HOW TO ROLL ON THE FORUM SITE:
    Quote Originally Posted by Jacob Orlove View Post
    For example: [roll=Spot ]1d20+5[/roll ] would show up (without the extra spaces) as a normal roll.
    There's also [rollv=NameOfRoll]xdy+z[/rollv] which will show you all the individual rolls, eg: [rollv=strength ]4d6[/rollv ] gives you 4 rolls, and their sum.
    [roll=strength ]4d6b3[/roll ] gives you the best 3 of the 4 rolls

  6. - Top - End - #6
    Dwarf in the Playground
     
    Deus Mortus's Avatar

    Join Date
    Jan 2007
    Location
    The Netherlands, Deventer
    Gender
    Male

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Send me the text file to DeusMortus[at]Gmail.com and I'll slap a nice gui around it.

    (I won't have my pc back until end next week, so if someone can do it now, be my guest)
    Last edited by Deus Mortus; 2007-03-22 at 06:07 PM.

  7. - Top - End - #7
    Bugbear in the Playground
     
    Devil

    Join Date
    Oct 2005
    Location
    Terra Ephemera

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Quote Originally Posted by PMDM View Post
    Why don't you post the % results here?
    Because there's (if I remember right) something like 57 THOUSAND lines of data. (I'm lazy and don't want to count again.) My verbose manner notwithstanding, I don't wish to flood the boards.
    I'm not an evil GM! Honest!

  8. - Top - End - #8
    Dwarf in the Playground
     
    Deus Mortus's Avatar

    Join Date
    Jan 2007
    Location
    The Netherlands, Deventer
    Gender
    Male

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    At least post the % for things like all 3s, all 4s, all 5s, etc. Give us something to work with ;)

  9. - Top - End - #9
    Bugbear in the Playground
     
    Devil

    Join Date
    Oct 2005
    Location
    Terra Ephemera

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Quote Originally Posted by Deus Mortus View Post
    At least post the % for things like all 3s, all 4s, all 5s, etc. Give us something to work with ;)
    Okie-dokie. First six numbers are the stats. After that, first is the p measure, and the second is the p + x measure.

    18 18 18 18 18 18 9.904326e-001 1.000000e+000
    17 17 17 17 17 17 9.312169e-001 9.422059e-001
    16 16 16 16 16 16 8.528005e-001 8.716578e-001
    15 15 15 15 15 15 7.367228e-001 7.569749e-001
    14 14 14 14 14 14 6.055284e-001 6.294270e-001
    13 13 13 13 13 13 4.689773e-001 4.908002e-001
    12 12 12 12 12 12 3.417128e-001 3.625905e-001
    11 11 11 11 11 11 2.342769e-001 2.511505e-001
    10 10 10 10 10 10 1.491553e-001 1.627524e-001
    9 9 9 9 9 9 8.743759e-002 9.728782e-002
    8 8 8 8 8 8 4.700571e-002 5.351215e-002
    7 7 7 7 7 7 2.281668e-002 2.675059e-002
    6 6 6 6 6 6 9.361591e-003 1.157197e-002
    5 5 5 5 5 5 3.102378e-003 4.136505e-003
    4 4 4 4 4 4 7.730963e-004 1.159644e-003
    3 3 3 3 3 3 0.000000e+000 1.286836e-004


    Also, the Elite Array:

    8 10 12 13 14 15 2.049263e-001 2.085460e-001

    (That's about 20th percentile. Pathetic! EDIT: On the other hand, this system ranks it inferior to all 11s, when we know that just isn't true.)

    EDIT: (I'm getting the urge to recalculate it all, taking into account the "reroll for bad stats" rule.)
    Last edited by Reltzik; 2007-03-22 at 06:56 PM.
    I'm not an evil GM! Honest!

  10. - Top - End - #10
    Barbarian in the Playground
     
    Jacob Orlove's Avatar

    Join Date
    Jan 2007
    Location
    Davis, California Avatar by Ceika

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Those are super rare, though. It'd be much more interesting to see what a couple of "typical" stat arrays look like.

    Say,
    11, 11, 11, 10, 10, 10;
    13, 12, 11, 10, 9, 8; (default NPC array, 15 point buy, 'average' 3d6 roll)
    15, 14, 13, 12, 10, 8; (elite array, 25 point buy, 'average' 4d6 roll)

    edit: whoops, simu-posted with the above. This was supposed to be a reply to Deus Mortus.
    Last edited by Jacob Orlove; 2007-03-22 at 07:30 PM.

  11. - Top - End - #11
    Bugbear in the Playground
     
    Devil

    Join Date
    Oct 2005
    Location
    Terra Ephemera

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    10 10 10 11 11 11 1.712570e-001 1.874345e-001
    8 9 10 11 12 13 1.154418e-001 1.215573e-001


    17-18th and 11-12th percentile, respectively. Elite Array was covered at the end of my last post: 20th percentile.
    I'm not an evil GM! Honest!

  12. - Top - End - #12
    Dwarf in the Playground
    Join Date
    Jun 2005
    Location

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    p is basically all rolls that are LESS THAN, q GREATER THAN, and x undetermined, right? If so, you might have a problem.

    If you look at:

    http://d20.jonnydigital.com/2006/10/what-are-the-odds

    There's a pretty good rundown on the odds of 4d6 there. You can get the percentiles from it with some easy addition. Now, for your stat arrays of X X X X X X, the only way for an array to be less than it is for ALL of the stats in the array to be less than it (and that's just bad English). Put another way,

    p(less than) = p(each less than)^6 (which is much less)
    however
    q(greater than) = q(each greater than)^6 (which is again much less)

    therefore, any fuzzy probability should enclose the percentile of X on 4d6.

    Here are the percentiles (cumulative chance of rolling each value or lower on a single 4d6-drop-lowest stat):

    3: 0.1%, 4: 0.4%, 5: 1.2%, 6: 2.8%, 7: 5.7%, 8: 10.5%, 9: 17.5%, 10: 26.9%, 11: 38.4%, 12: 51.2%, 13: 64.5%, 14: 76.9%, 15: 87.0%, 16: 94.2%, 17: 98.4%, 18: 100.0%
    You see the closest roll to the median is 12. So, 12 12 12 12 12 12 should include the percentile 51 whatever. Instead it's 34-36%.

    Even if your method excludes all fuzzy percentiles (which I think it does, if I understand it correctly), p(x)^6 should still be a median value if you are normalizing it by p(x)^6 + q(x)^6 (they should be roughly equal).

    On the whole, I think you are on the right track; it sounds like your six-level pointer is some implementation of Dynamic Programming, and if you set up a DP grid going both ways, you can likely calculate the solution soundly in O(n) time (it's the charm of the demon, right). And your results look sane; they just don't seem RIGHT, you know?

  13. - Top - End - #13
    Bugbear in the Playground
     
    Devil

    Join Date
    Oct 2005
    Location
    Terra Ephemera

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    If I understand what you're saying, CP, I was using a somewhat different method. The percentiles of the individual stats were never calculated, just those of the entire stat ARRAY.

    x refers to the weight (odds) of the stat array in question, not the non-comparables. Those get ignored.


    The fact that 12 is a median on a SINGLE stat suggests, with some thought, that the median on 6 identical stats will be higher. Look at the frequency table that you linked. Now take each of those entries and raise it to the 6th power. Those are the odds of getting, say, all 15s.

    (EDIT: Reworded next paragraph for clarity purposes)

    The values below 12 are less-frequent numbers; their strength is that there are more of them. The values above 12 are more-frequent numbers. When you raise their frequency to an exponent, you magnify that advantage; meanwhile, the frequencies of the lower values don't increase by nearly as much, and THEIR advantage of, well, numbers remains unchanged.
    Last edited by Reltzik; 2007-03-22 at 08:12 PM.
    I'm not an evil GM! Honest!

  14. - Top - End - #14
    Orc in the Playground
    Join Date
    Mar 2007
    Location
    MONSTER. VAULT.

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Holy crap, awesome.

    If you send me the file (downward_spiral(at)mit(dot)edu) I'll write a GUI into an executable .jar... you know, when I get around to it. It won't be too fast, but it should run on anybody's comp that has a recent Java install.

  15. - Top - End - #15
    Ettin in the Playground
     
    Yakk's Avatar

    Join Date
    Nov 2006

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    You should print out your chances in percentage-with-3-decimal places form. Sure, you'll get some 0.000% entries, but they will be far more readable.

    Cute metric. As you noted, it has issues, with the elite array being considered less than an all-11 build.

    You might want to remove your 6-level pointer. You can deal with that kind of thing pretty easily using a 1-level pointer and a simple function that does the array lookups for you.

    Just write:
    Code:
    struct index {
      int dim[6];
    };
    
    double* get_entry( double* table, index i );
    Now your index struct contains 6 index values, and the arrangement of the table is coded (manually) into get_entry.

    Far more readable, and all of your pointer math is put into one function (where you can write checks-for-bugs and boundary tests).

  16. - Top - End - #16
    Bugbear in the Playground
     
    Devil

    Join Date
    Oct 2005
    Location
    Terra Ephemera

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    That's sort of what I did, except I didn't write a function for access. The ragged array was "keyed" by an array of 6 ints. Once I built the array, I was accessing it with that key array's members all the time. (Actually, I'd access it once, then assign the location to a first-level pointer called cursor.)

    But since I didn't know what the ragged array was going to look like (and I wouldn't have hard-coded it without 6 nested for loops even if I HAD), I had to allocate it dynamically. And in C, a dynamically-allocated 6-dimensional array starts off life as a 6th level pointer. I suppose I could have wrapped it in a pretty-looking typedef, but C's not strongly typed and I prefer the reminder of what's ACTUALLY going on.

    As for a lookup function, I considered it, but I'd have only USED it about six places in my code. And I could cut-and-paste those lines.

    Oh, and Erom, you should have the data now. And Deus Mortus, you should have gotten it yesterday. However, hotmail's acting a bit quirky when I send this, so tell me if you didn't get it.
    I'm not an evil GM! Honest!

  17. - Top - End - #17
    Orc in the Playground
    Join Date
    Mar 2007
    Location
    MONSTER. VAULT.

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Got it and working on it.

    EDIT: OK, here's what the first version looks like. I'll try and get a download link up soon. It was compiled under Java 5.0, so as long as you have that it should run. Ignore the code in the background -- pay no attention to the man behind the curtain.

    Last edited by Erom; 2007-03-23 at 07:40 PM.

  18. - Top - End - #18
    Dwarf in the Playground
    Join Date
    Jun 2005
    Location

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Quote Originally Posted by Reltzik View Post
    The values below 12 are less-frequent numbers; their strength is that there are more of them. The values above 12 are more-frequent numbers. When you raise their frequency to an exponent, you magnify that advantage; meanwhile, the frequencies of the lower values don't increase by nearly as much, and THEIR advantage of, well, numbers remains unchanged.
    My point was that if you exclude everything except p(x) and q(x) and take both of them to the sixth power, the ratio p(x)/q(x) moves closer to 0 or 1, since what you are examining is p(x)^6 / (p(x)^6 + q(x)^6).

    Anyway, I got bored and wrote a version of it: http://www.sfu.ca/~dbruvall/. It's in the percentiles files. .NET Framework is required, since it's in C#.

    By the way, I was apparently high when I was rambling on about dynamic programming. That's over-engineering for you.

    I still have some concerns; by eliminating the fuzzy probabilities, you are reducing the data set by around 90%. That 90% could be very significant.
    Last edited by CharPixie; 2007-03-23 at 08:18 PM.

  19. - Top - End - #19
    Orc in the Playground
    Join Date
    Mar 2007
    Location
    MONSTER. VAULT.

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Alright! It works, at least on my machine. Use at your own risk, of course, although I didn't do anything deliberately bad in the code. I've hosted it at rapidshare. I know, annoying, but free.

    http://rapidshare.com/files/22485729/Roller.jar.html

    Note also, that so long as you keep the .jar file's file structure intact, you should be able to swap out the text file for a new/custom version without breaking the code.
    Last edited by Erom; 2007-03-23 at 08:40 PM.

  20. - Top - End - #20
    Bugbear in the Playground
     
    Devil

    Join Date
    Oct 2005
    Location
    Terra Ephemera

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    SWEET. That's beautiful work, Erom.

    Okay I've got it running, and I decided to test it out on a barbarian at equal to or below 50th percentile.

    I got 16, 14, 15, 10, 14, 10 as being 47.7th percentile. I actually had Wisdom at 12 and wanted to increase Con to 16, but I couldn't; that put me over. But bumping the Wisdom up 2 points, that was workable. Hmm. Something doesn't feel quite right about that.

    But wait! Most of us horribly abuse this concept called DUMP STATS. Let me try this here with oh, I dunno, Charisma.

    16 16 16 16 16 9 is 49.6th percentile.
    17 16 17 15 16 8 is 49.9th percentile.
    18 16 16 14 16 7 is 50th percentile.
    18 17 18 17 17 6 is 49th percentile.
    18 18 18 18 18 5 is 45.6th percentile. o.0

    Each time I dropped charisma, the percentile went from somewhere around 50 to somewhere around 35. A single low stat frees up a HUGE amount of percentilage for elsewhere. It's like the opposite of point buy, where changes in low stats are meaningless, but increasing stats is increasingly expensive.

    So, I liked the idea, but it looks like this isn't a good system to use.

    Thanks for the time and effort of everyone here (Especially Erom) in critiquing it.

    EDIT: This data might STILL be of interest to probability junkies, so I'll still email it to anyone who PMs me. Oh, and I just got an idea for a different system but, in honor of the recently departed, good taste demands I wait a day before putting it up.

    EDIT to EDIT: Oh, and no offense intended in not mentioning yours, CP. I don't have .NET and so I couldn't look at it. Still, I'm sure it's good work.
    Last edited by Reltzik; 2007-03-24 at 11:46 AM.
    I'm not an evil GM! Honest!

  21. - Top - End - #21
    Bugbear in the Playground
     
    Devil

    Join Date
    Oct 2005
    Location
    Terra Ephemera

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    DAMMIT!

    Okay, I was thinking, which is never advisable, and I was trying to figure out how the hell 18 18 18 18 18 5 was below 50th percentile. And I was thinking back to the code and... yup, I realized I had a bug. I'd implemented my algorithm wrong.

    So this data? It's all bad.

    Here's what went wrong:

    My algorithm constructed the ragged array holding the stat arrays upwards -- first all 3s, then 5 3s and a 4, so on. I guaranteed that by the time I got to any given stat, everyone below it was already constructed. At the same time, I used basic probability theory to calculate the weighting of that particular stat. So far so good. And I also gathered up the data I needed to calculate the weight BELOW the stat.

    This was the tricky part of the algorithm, and my solution to it is what I was (and still am) proud of. Actually adding up everything below with a tree search is, itself, an O(n) operation. Repeated n times, once for every stat array, that becomes O(n-squared). On such a huge data set, this was just unacceptable to me. (In retrospect, the machine probably could have handled it.) And I can't just add up the below-sums of all the arrays below me; that creates an overlap. So what I came up with instead was this:


    Imagine that you've got one of those thousand-cubes from kindergarten/elementary school.

    (If you didn't have these things in your school, they're devices to teach counting. Take a bunch of block cubes just small enough to be a choking hazard. You count these out one at a time. When you get to ten, you replace them with a group glued together in a line of ten. At 15 you've got a line of 10 and 5 singletons. When you get to twenty, you replace your line and 9 singletons with two lines. When you get to a hundred, you've got a square of ten lines glued together. And, when you get to a thousand, you've got a cube of ten hundred-squares stacked atop each other. That's a thousand cube.)

    Now imagine that we select one corner of this thousand-cube; say, the near, upper-right one, and declare it to be the highest. Any move to the left, to the rear, or down is considered a decrement; any move right, to the front, or up is an increment. How can we add up the number below that cube, excluding it? One way is to just take the volume of the cube by multiplying how far you are into it on each dimension, and subtract 1. So if you're on the third rank forward, the fifth file to the right, and the seventh file up, then there are 3 * 5 * 7 - 1 = 104 below you. But if all the cubes are weighted funny, this becomes a bit tricky.

    So here's another method. You take that part-cube that you're interested in, and then you look at that cube EXCLUDING the level that you're currently on. Take THAT thing's weight. Then, you take the weight of the level you're on, EXCLUDING the row you're on. Add that to the stuff below your level. And then, you take the weight of the row you're in, excluding yourself, and add that to your sum. There, you're done.

    Do we HAVE those values handy to add up? Yes, if we go at it in the right order. The weight of all those values below your present level? That's the "weight below" value of the guy one step down from you, plus that guy's weight. How 'bout the current level, excluding your row? The guy one step back from you had to calculate that in order to figure out HIS weight, and you can save that data for later access by the guy in front of you. Similarly, the guy to your left will save the data for your present row.

    Now expand this concept into six dimensions. For every stat array, you've got thirteen values you're interested in. You've got your own weight. Then you've got the weight of everything below you, PRESERVING THE FIVE HIGHEST STATS. That is, you only calculate the stuff below you by dropping your lowest stat; those 5 18s must remain 18s. Then you have a value for everything below you preserving the 4 highest, three highest, et cetera, right down to preserving nothing at all; that's the total weight below you. A similar set of six values exists for the weights above.

    All right, now there are a couple of ways to trip up here. First, the initial value for the recursion: If you want to know what's below you preserving the n highest values, look at your n + 1th highest stat. If that's 3, then there's nothing below you. Set that weight below that stat to 0. In other words, there's no way to make 14 12 10 3 3 3 worse while still preserving the 3 highest stats. That part I was okay on.

    But suppose you have 16 15 15 12 10 9. That repetition there is a problem, because I was organizing these things in descending order. I couldn't look at what happened if you decrement the second stat, because then you'd have 16 14 15 12 10 9, which is not part of my array. However, since it was equivalent to 16 15 14 12 10 9, and I'd just calculated the weight below-and-including for THAT, it was the same value and I could just copy it over, right?

    WRONG. Because what I'd just calculated was the weight below-and-including that stat array PRESERVING THE TWO HIGHEST VALUES. What I needed now was the below-and-including weight of that stat array PRESERVING ONLY THE HIGHEST VALUE. What I needed to do was go back to 16 15 14 12 10 9 and grab its preserve-highest-1 value, and instead I was using its preserve-highest-2 value that I'd grabbed earlier.

    Oops.

    So that's why 18 18 18 18 18 5 is recorded as 45.6th percentile. I grab the weight below-and-including 18 18 18 18 18 4, preserving the five highest. (It's not much.) Then I grab the weight below-and-including 18 18 18 18 17 5, preserving the four highest. So far so good. But then, I incorrectly evaluate the next value as the same thing, when it should have been preserving the three highest (a much, much heavier weight). This problem multiplies through the repetitions until I'm left with this absurd outcome. This is why 18 18 18 18 18 5 ranks as 45.6th percentile, and 18 18 18 18 18 3 is evaluated as 6.51st percentile. EDIT: And 16 16 16 16 16 13 is evaluated as 0.54th percentile!

    Like I said, oops.

    Anyhow, when break's over (in a bit over a week) I'll go back into school, fix the bug, and run it again. Hopefully we'll have something a bit saner this time around.
    Last edited by Reltzik; 2007-03-24 at 01:27 PM.
    I'm not an evil GM! Honest!

  22. - Top - End - #22
    Orc in the Playground
    Join Date
    Mar 2007
    Location
    MONSTER. VAULT.

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Oh man, I remember the thousand-cubes! Wow... anyway, have fun over break. If you manage to fix the problem, send me a copy of the updated data file and I'll upload an updated GUI.
    Last edited by Erom; 2007-03-24 at 10:41 PM.

  23. - Top - End - #23
    Dwarf in the Playground
    Join Date
    Jun 2005
    Location

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Yup, that sounds a lot like dynamic programming. You should look it up; it's a good general solution to any problem where you calculate the same value over and over.

    However, there's a way of calculating the percentile on demand. Let's say p(x) is the chance that on 4d6 you roll LESS THAN OR EQUAL to x, and q(x) is the chance you roll greater than or equal to x. Then the percentile for any array is given by six p(x) terms and six q(x) terms and the probability of that particular array, corrected for the number of permutations that the array can have. Why? Because if you were to mix p(x) and q(x) terms, you'd be counting rolls that are part of the data set you've thrown out. So, if you precalc p and q for 3 to 18, you can generate any particular percentile by the following:

    (p(STR) * p(DEX) * ... p(CHA)) /
    ((p(STR) * ... p(CHA)) + (q(STR) * ... q(CHA)))

    In my code I was multiplying both P* and Q* by the number of permutations, but looking at the math I realize it cancels straight up.

    EDIT: By the way, if you don't order your arrays while calculating and then deal with the amalgamation later, I think you'd get a simpler algorithm, if one that uses a BIT more memory.
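
    The p/q construction above can be sketched like this (a rough Python illustration; the helper names are mine, and whether P / (P + Q) is a true percentile is exactly what gets debated later in the thread, so treat it as making the proposal concrete rather than verified):

    ```python
    from itertools import product
    from math import prod

    # Distribution of a single 4d6-drop-lowest stat: weight[s] counts the
    # ordered 4-die rolls whose best-three total is s (out of 6^4 = 1296).
    weight = {}
    for dice in product(range(1, 7), repeat=4):
        s = sum(dice) - min(dice)
        weight[s] = weight.get(s, 0) + 1

    TOTAL = 6 ** 4
    # p(x): chance one stat rolls <= x;  q(x): chance it rolls >= x.
    p = {x: sum(w for s, w in weight.items() if s <= x) / TOTAL for x in range(3, 19)}
    q = {x: sum(w for s, w in weight.items() if s >= x) / TOTAL for x in range(3, 19)}

    def pseudo_percentile(array):
        """The proposed on-demand formula: P / (P + Q)."""
        P = prod(p[x] for x in array)
        Q = prod(q[x] for x in array)
        return P / (P + Q)
    ```

    As expected, all 3s comes out near 0 and all 18s near 1; the interesting question is what happens in between.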
    Last edited by CharPixie; 2007-03-24 at 07:05 PM.

  24. - Top - End - #24
    Halfling in the Playground
    Join Date
    Jan 2005
    Location
    Portland, Oregon

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    WOW! I have no understanding of what you have done, beyond that it is quite impressive - not string-theory earth-shattering, but still quite impressive.

    If you are ever up for another challenge, you should try fitting poker hand possibilities onto a table for dice rolling.

  25. - Top - End - #25
    Halfling in the Playground
    Join Date
    Apr 2006
    Location

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    It may not be string theory earth shattering, but it's way more than string theory relevant. =P

    Anyway, I'd be interested in possibly looking at different metrics for deciding comparisons. For example, I think we can say that 10 10 10 10 11 11 is worse than 10 10 10 10 10 12. Why? Well, the second array's total modifier is higher. It might be worth considering total modifier, rather than just individual numbers. Additionally, 10 10 10 10 12 12 should be worse than 10 10 10 10 10 14, if we want to line up roughly with how point buy works.

    I haven't put a huge amount of thought into this yet, but it's probably worth thinking about. All the algorithms in the world aren't worth anything if your metric isn't good.
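
    The total-modifier comparison above is easy to make concrete (a quick sketch using the standard d20 modifier formula, (score - 10) / 2 rounded down; the function names are mine):

    ```python
    def modifier(score):
        """Standard d20 ability modifier: (score - 10) / 2, rounded down."""
        return (score - 10) // 2

    def total_modifier(array):
        return sum(modifier(s) for s in array)

    total_modifier([10, 10, 10, 10, 11, 11])  # 0
    total_modifier([10, 10, 10, 10, 10, 12])  # +1: higher, so arguably better
    total_modifier([10, 10, 10, 10, 12, 12])  # +2
    total_modifier([10, 10, 10, 10, 10, 14])  # +2: a tie, so this metric alone
                                              # can't rank the second comparison
    ```

    Total modifier orders the first pair the way the post suggests, but ties on the second pair, which is where a point-buy-style cost would have to break the tie.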

  26. - Top - End - #26
    Bugbear in the Playground
     
    Devil

    Join Date
    Oct 2005
    Location
    Terra Ephemera

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    CharPixie, I had to mull that one over for a few days to remember why it wouldn't work. Then I had to drive 600 miles in a single day, for what we may assume were unrelated reasons.

    The problem with your system is that it includes ordering when ordering doesn't matter. In other words, if they're all 15s, you're including odds for 14 15 15 15 15 15 as below, when no such stat array exists (assuming that we're "naming" these things by descending order). If you instead try to say that this is a valid array, you're saying that ordering does matter... in which case, you can't add up the weights for 14 15 15 15 15 15 and 15 15 15 15 15 14 and say that it's the sum of that equivalence class, because you have overlap at all 14s (and one or two other places, as well).
    I'm not an evil GM! Honest!

  27. - Top - End - #27
    Ettin in the Playground
     
    Yakk's Avatar

    Join Date
    Nov 2006

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Two stat case:
    V(A,B) = V(A-1, B) + V(A, B-1) - V(A-1, B-1) + X(A,B)

    Ie, the number of rolls under (A,B) is the number of rolls under (A-1, B) plus the number of rolls under (A, B-1), minus their overlap (the rolls under (A-1, B-1)), plus the number of rolls that hit (A,B) exactly.

    Three stat case:
    V(A,B,C) = V(A-1, B, C) + V(A, B-1, C) - V(A-1, B-1, C) + V(A, B, C-1) - V(A-1, B, C-1) - V(A, B-1, C-1) + V(A-1, B-1, C-1) + X(A,B,C)

    ... I think.

    Rolls under 2,2,2 are rolls under 1,2,2 and 2,1,2 and 2,2,1.

    The rolls under 1,1,2 and 1,2,1 and 2,1,1 are counted twice, and the rolls under 1,1,1 are counted three times.

    Subtracting out the rolls under the (1,1,2) etc, we remove the rolls under (1,1,1) three times -- so we have to add it back in.

    This gets more complicated, but there does appear to be a pattern.

    Odds are one could make such a recursive definition for 6 stats.

    Then a dynamic programming solution falls out: so long as the recursive definition doesn't get too crazy, our problem is solvable in O(number of different rolls) space and O(number of different rolls * complexity of the recursive definition) time.
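
    The two-stat recursion is easy to sanity-check on a toy model (my construction, not from the thread: each "stat" is a single d6 and every ordered pair gets weight 1, so the cumulative count should come out to exactly A * B):

    ```python
    from functools import lru_cache

    def X(a, b):
        # Toy weight function: exactly one roll hits each (a, b) pair of d6 stats.
        return 1 if 1 <= a <= 6 and 1 <= b <= 6 else 0

    @lru_cache(maxsize=None)
    def V(A, B):
        # Rolls at-or-below (A, B): the two shifted counts, minus their
        # double-counted overlap, plus the rolls that hit (A, B) exactly.
        if A < 1 or B < 1:
            return 0
        return V(A - 1, B) + V(A, B - 1) - V(A - 1, B - 1) + X(A, B)
    ```

    With real 4d6-drop-lowest stats you'd replace X(a, b) with the product of the two single-stat weights; the recursion itself doesn't change, and memoizing V is the dynamic programming part.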

  28. - Top - End - #28
    Dwarf in the Playground
    Join Date
    Jun 2005
    Location

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    It doesn't matter if it's ordered or unordered. Those terms actually cancelled when computing P/(P+Q). I was including them up until the point where I realized (A) I had made a mistake in the program that counted the number of orderings and (B) once I fixed it, it yielded the same results.

    Ah. And Yakk demonstrates why I should handwave less while doing problem formulation. I was giving it some thought, and I think you could use:

    V(A..F) = V(A-1, B..F) + X(A..F) where A > 3

    and if you switch to B-1 once A-1 is depleted, you could reduce the complexity of the recursive definition to something that's always two terms (and six if statements).

  29. - Top - End - #29
    Bugbear in the Playground
    Join Date
    May 2006
    Location

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    How would this be practical and useful for players?

  30. - Top - End - #30
    Barbarian in the Playground
     
    KIDS's Avatar

    Join Date
    Mar 2006
    Location
    Croatia
    Gender
    Male

    Default Re: Pseudo-Percentiles, Calculated for All Stat Arrays

    Though my knowledge of statistical analysis is limited and I don't know much about programming either, I do see the potential in your work. It was an interesting read nevertheless; keep it up!
    There is no good and evil. There is only more and less.
    - Khorn'Tal
    -----------------------------------------
    Kalar Eshanti
