
View Full Version : Math Problem



Crow
2014-04-22, 05:52 PM
We have a program at my job which is supposed to populate work rosters at random.

There are 31 spots on the roster, and a person may be assigned only one each day.

You work 15 days. What is the probability of being placed in the same roster position on 6 of those days?

Livius
2014-04-22, 06:49 PM
Is the question at least 6 days of the 15 or exactly 6 days of the 15?

For at least 6 of 15, just choose 6 of the 15 days to get the same job. Probability = ( 15 choose 6 ) / ( 2^15 ) = 0.1527 = 15.27%

For exactly 6 of 15 days, it's the same as above, but multiplied by 30/31 for each of the 9 other days you get a different job. Probability = ( 15 choose 6 ) / ( 2^15 ) * (30/31)^9 = 0.1137 = 11.37%

Crow
2014-04-22, 07:01 PM
I'm looking for exactly 6, or greater than/equal to 6.

To get any specific spot, I'm already looking at 1 in 31 to start, so .032

I'm not sure your figures are correct. I guess another way to put it is if I had a roulette wheel with 31 slots, what are my chances of landing on the same number 6 times in 15 attempts?

Razanir
2014-04-22, 08:15 PM
You're interested in a binomial distribution.

Suppose you have n trials, and the probability of a success (getting placed in said roster position, in this case) is p.

The probability of x successes is C(n,x)*p^x*(1-p)^(n-x), where C(n,x)=n!/(x!(n-x)!) is combinations of n choose x.

P(Binom(15,1/31) = 6) = C(15,6) * (1/31)^6 * (30/31)^9 = 5005*30^9/31^15 = .000004198

Or if you want at least 6 days, the probability is .000004384

Either way, it's incredibly unlikely.
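These binomial figures are easy to sanity-check with a few lines of Python (a sketch using only the standard library; `math.comb` needs Python 3.8+):

```python
from math import comb

n, p = 15, 1 / 31  # 15 days, 31 equally likely roster spots

def binom_pmf(k: int) -> float:
    """P(exactly k successes) for Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_exactly_6 = binom_pmf(6)
p_at_least_6 = sum(binom_pmf(k) for k in range(6, n + 1))

print(f"{p_exactly_6:.9f}")   # 0.000004198
print(f"{p_at_least_6:.9f}")  # 0.000004384
```

Note this is the probability for one specific post, a point that comes up again later in the thread.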

Livius
2014-04-22, 08:21 PM
You're interested in a binomial distribution.

Suppose you have n trials, and the probability of a success (getting placed in said roster position, in this case) is p.

The probability of x successes is C(n,x)*p^x*(1-p)^(n-x), where C(n,x)=n!/(x!(n-x)!) is combinations of n choose x.

P(Binom(15,1/31) = 6) = C(15,6) * (1/31)^6 * (30/31)^9 = 5005*30^9/31^15 = .000004198

Or if you want at least 6 days, the probability is .000004384

Either way, it's incredibly unlikely.

This is right. Somehow when I did it I decided to divide by 2^15 instead of 31^15. I'm not actually sure why I thought that worked, since, by inspection, it shouldn't be very likely.

Knaight
2014-04-23, 12:08 AM
Or if you want at least 6 days, the probability is .000004384

Either way, it's incredibly unlikely.

It is worth noting that the probability goes up substantially if it's about somebody doing this over some 15 day stretch, and it happening in one of many 15 day intervals. I don't know what the underlying question behind wanting to know the statistics is, but that could be a factor.

Khedrac
2014-04-23, 01:25 AM
For exactly 6 of 15 days, it's the same as above, but multiplied by 30/31 for each of the 9 other days you get a different job. Probability = ( 15 choose 6 ) / ( 2^15 ) * (30/31)^9 = 0.1137 = 11.37%
Not necessarily: because 6 is less than half of 15, if you just use (30/31)^9 you could end up with 6 to 9 of the other days landing on some different assignment that also repeats.

The question therefore gets more complex - do we mean "exactly 6 days, with every other assignment occurring on fewer days" or what?

Fortuna
2014-04-23, 01:29 AM
Not necessarily: because 6 is less than half of 15, if you just use (30/31)^9 you could end up with 6 to 9 of the other days landing on some different assignment that also repeats.

The question therefore gets more complex - do we mean "exactly 6 days, with every other assignment occurring on fewer days" or what?

While this is true, to the OP: If you're interested in this for any practical purpose, you can ignore this post, because the contribution of that factor to your probability is going to be negligible.

Radar
2014-04-23, 05:22 AM
In the given situation we need to consider a few more factors:
1. There is more than one person within the work roster.
2. It is randomised every few weeks (I guess from the 15 days of work).

In essence, we are not looking at a single random arrangement, but at an outlier in a larger statistical sample. To make the predictions accurate, we would need to calculate the probability of anyone in the company getting a given task at least 6 times during the full work period. Then we could also predict how often it would happen in a longer time span.

Thus: does the number of tasks match the number of workers?

Crow
2014-04-23, 02:47 PM
The roster is supposed to be populated randomly each day. Every day is separate from the next. Overall, we have somewhere on the order of about 200 people.

I'm not sure how many have gotten 6 of the same post in 15 days, I know of only one. Several have gotten the same post 4 to 5 times in the 15 day period though. The company says the program is working properly, and that rosters are being populated at random. I contend that it is not.

factotum
2014-04-23, 03:17 PM
I contend that it is not.

On what basis? The critical thing to remember about randomness is that it tends to clump together. If you stand there flipping coins, getting head-tail-head-tail all the time might be the most *even* distribution of results, but it sure as heck ain't the most random! That being the case, seeing occasional clumps where some guy gets the same post 5 or 6 times in a row certainly doesn't prove it's not random--technically, even someone getting the same post 15 times in a row wouldn't prove that, because the probability of that actually occurring is non-zero (albeit very small).

Radar
2014-04-23, 03:37 PM
Ok, so the proper question is: what is the probability of at least one person out of 200 people getting any task (out of 31) 6 times in 15 days?

It is important to stress that it could be anyone and it could be any given task. I assume that each task is equally probable or, in other words, requires the same number of workers.

1. The probability of a specific person getting any given task 6 times in 15 days is obtained as follows:
the total number of possible arrangements is (disregarding the ordering) (15+31-1)!/(15! (31-1)!). The total number of combinations (with repetitions) of situations with any task repeating itself at least 6 times is: (10+31-1)!/(10! (31-1)!). You basically need to force the outcome on 5 days. By dividing the latter by the former we obtain the probability: 13/5289 or 0.25%. Not as low as estimated earlier.

2. The probability of at least a single person out of 200 getting any task 6 times in a 15 day period is most easily calculated as "1 - probability of no one getting 6 identical tasks". This gives us 1 - (1-13/5289)^200, which is about 39%.

Answer: it is quite likely for someone out of 200 workers to get 6 identical tasks in a 15 day period, and this is for a single 15 day period.
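The second step above is just the complement rule. A sketch that takes the 13/5289 per-person figure at face value purely to check the arithmetic (the figure itself is questioned later in the thread), and assumes the 200 schedules are independent:

```python
from fractions import Fraction

# Per-person probability as stated (taken at face value for this check):
p_single = Fraction(13, 5289)

# Complement rule: P(at least one of 200) = 1 - P(none of the 200),
# assuming each person's schedule is independent of the others.
p_any_of_200 = 1 - (1 - p_single) ** 200

print(float(p_single))      # ≈ 0.00246
print(float(p_any_of_200))  # ≈ 0.39
```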

Crow
2014-04-23, 04:13 PM
On what basis? The critical thing to remember about randomness is that it tends to clump together. If you stand there flipping coins, getting head-tail-head-tail all the time might be the most *even* distribution of results, but it sure as heck ain't the most random! That being the case, seeing occasional clumps where some guy gets the same post 5 or 6 times in a row certainly doesn't prove it's not random--technically, even someone getting the same post 15 times in a row wouldn't prove that, because the probability of that actually occurring is non-zero (albeit very small).

You can't prove conclusively that it isn't occurring at random, but when this is happening with several people at the same time, you can make the case that there is something wrong with the random algorithm. I get what you're saying, but even with a short sample time (15 days), with so many officers, we should be seeing something close to an even distribution without so many statistical outliers.

I am getting what this fellow got using binomial distribution:


You're interested in a binomial distribution.

Suppose you have n trials, and the probability of a success (getting placed in said roster position, in this case) is p.

The probability of x successes is C(n,x)*p^x*(1-p)^(n-x), where C(n,x)=n!/(x!(n-x)!) is combinations of n choose x.

P(Binom(15,1/31) = 6) = C(15,6) * (1/31)^6 * (30/31)^9 = 5005*30^9/31^15 = .000004198

Or if you want at least 6 days, the probability is .000004384

Either way, it's incredibly unlikely.

I'm not sure your .25 number is accurate.

ChristianSt
2014-04-23, 05:52 PM
The roster is supposed to be populated randomly each day. Every day is separate from the next. Overall, we have somewhere to the order of about 200 people.

I'm not sure how many have gotten 6 of the same post in 15 days, I know of only one. Several have gotten the same post 4 to 5 times in the 15 day period though. The company says the program is working properly, and that rosters are being populated at random. I contend that it is not.

Have all 31 posts equal probability? And are different peoples posts independent from each other (e.g. is it possible that on a given day all people could get the same post)?

Crow
2014-04-23, 06:02 PM
They are supposed to have equal probability. One man, one post.

Radar
2014-04-24, 01:16 AM
I'm not sure your .25 number is accurate.
It's much more accurate for a number of reasons:

1. The quoted calculations assume that you get a specific task 6 times - not any task 6 times.
2. They also assume you get this task exactly 6 times - not at least 6 times.
3. I'm not even sure the binomial distribution is the right tool for the job here, because there are 31 possible outcomes and one should not simplify the system like that. That's exactly why I calculated the probability from first principles.

I might have even underestimated the probability, because there is still a degree of freedom in the 5 forced outcomes, but it's difficult to retrieve, so I neglected it.

ChristianSt
2014-04-24, 05:01 AM
You can't prove conclusively that it isn't occurring at random, but when this is happening with several people at the same time, you can make the case that there is something wrong with the random algorithm. I get what you're saying, but even with a short sample time (15 days), with so many officers, we should be seeing something close to an even distribution without so many statistical outliers.

The problem is that 15 is a really low number, while 200 (though still not that large) is larger. The only useful test you could make, imo, is to see whether posts are distributed with equal probability. So if you ask all people and count how often each post gets distributed, you should get roughly equal amounts.

To compare that: say you have a d100. Rolling 100 twice in a row with it has rather small odds (0.0001 to be precise). Yet if you have enough people, some of them are likely to roll two 100s in a row. If you have 200 people, the chance to have at least one such person is 1-(0.9999)^200, which is 0.0198, so around 2%.




I am getting what this fellow got using binomial distribution:

I'm not sure your .25 number is accurate.
It is not .25 but .25%, which is 0.0025. To be honest I have no clue what Radar is trying to do. Or rather, it is clear what he wants to do (since he says it), but I have no idea how he arrives at the numbers he uses. The amount of different post distributions is rather trivial (31^15 = 2.34x10^22). The amount of valid combinations is not that easy to figure out. (And yes, 0.0025 seems a bit high to me)


I'm also sceptical of the numbers Razanir arrived at. At least the first, "P(Binom(15,1/31) = 6) = C(15,6) * (1/31)^6 * (30/31)^9 = 5005*30^9/31^15 = .000004198", is certainly correct. But "Or if you want at least 6 days, the probability is .000004384" seems rather low to me.
Mainly for the same reasons Radar gave in his most recent post: it is only the probability of getting one specific post exactly 6 times. (It might be that .000004384 is correct for getting one specific post 6 or more times, but imo it seems too low.)

I'm not sure how to arrive at the exact number, but as an estimate I would multiply .000004198 by 31, resulting in 0.000130138, which is an even higher probability than my example of rolling two 100s on a d100. Yes, I know that this is wrong (since I don't factor in getting the exact task on more than 6 days, and some distributions are counted twice (6 days post 1 and 6 days post 2, for example), but I think the effects should at least partly cancel each other out), but imo it should be in the right order of magnitude, which would result in a single-digit % chance of it happening.


Unfortunately your numbers are too big for an enumeration :smallfrown:. Otherwise this would be the easiest way to get the result.
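Both back-of-envelope numbers in this post are easy to reproduce (a sketch; the ×31 shortcut is the rough estimate described above, not an exact answer):

```python
# The d100 example: chance that at least one of 200 people rolls 100 twice in a row.
p_two_hundreds = 1 - (1 - 1 / 100**2) ** 200
print(round(p_two_hundreds, 4))  # 0.0198

# Rough roster estimate: multiply the "one specific post exactly 6 times"
# probability by 31 posts (over-counts some cases, ignores others).
p_specific = 0.000004198
print(p_specific * 31)  # ≈ 0.000130138
```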

Radar
2014-04-24, 09:29 AM
It is not .25 but .25% which is 0.0025. To be honest I have no clue what Radar is trying to do. Or it is clear what he wants to do (since he says it), but I have no idea how he arrives at the numbers he uses. The amount of different post distributions is rather trivial (31^15 = 2,34x10^22]). The amount of valid combination is not that easy to figure out. (And yes, 0.0025 seems a bit high for me)
I simply calculated the number of combinations with repetitions for the total number of relevant states - that's why I didn't use 31^15 (which would be valid only if we were concerned with the ordering) as the total number of states. I don't think I made a mistake in that part, since it's pretty much a textbook example, but if I did make one, I sure would like to know.

My estimation of the situations involving a single task being given at least 6 times is obtained in a similar way, but it's not entirely accurate - it's the lower bound on the total number of relevant situations. The idea behind the estimation is that to ensure that a task is given at least 6 times, we need to enforce the outcome 5 times to be equal to some task already given. Therefore I again calculated the number of combinations with repetitions for picking 10 random tasks out of 31, since the remaining 5 won't be random. What I did not take into account is that we can most often choose between a number of possible outcomes to enforce, which would make the overall probability even higher.

After that I did falsely assume that the work schedule for each person is an independent random variable, because the real situation (a constant number of workers for each job) is a real pain to consider in full detail.

At any rate, if you see any fault in my reasoning, then I would be grateful for an explanation.

NichG
2014-04-24, 10:02 AM
I wrote up a quick simulation and I got Radar's numbers for 5 days, not for 6 days. For 5+ days I measure a cumulative chance of 0.00249 out of 100k trials, which means a 40% chance that at least one employee would have the same task for 5 days out of 15. For 6+ days I measure a cumulative chance of 0.00014 (out of 100k trials) - consistent with ChristianSt's result - which means that for 200 employees there's about a 3% chance that at least one of them would get the same task for 6 days out of 15.

3% is not that rare, really. Also, given that we're not taking into account hidden constraints that a business might have, I think it's more likely that the algorithm isn't precisely what you think it is, but may also be trying to satisfy some constraint like 'at least 3 people are doing each task each day', which would give different results than these numbers.
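A minimal version of this kind of simulation (a sketch assuming each day's post is an independent uniform draw for one person, as in the binomial model; seed and trial count are arbitrary):

```python
import random

DAYS, POSTS, TRIALS = 15, 31, 100_000
rng = random.Random(42)  # fixed seed so the run is reproducible

hits = 0  # trials in which some post came up 6+ times for this one person
for _ in range(TRIALS):
    counts = [0] * POSTS
    for _ in range(DAYS):
        counts[rng.randrange(POSTS)] += 1  # each day: an independent uniform post
    if max(counts) >= 6:
        hits += 1

p_person = hits / TRIALS  # the exact value is about 0.000136
p_any_of_200 = 1 - (1 - p_person) ** 200
print(p_person, p_any_of_200)
```

With 100k trials the per-person estimate is noisy (only ~14 expected hits), which is why an exact computation is preferable for numbers this small.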

Chen
2014-04-24, 10:24 AM
The roster is supposed to be populated randomly each day. Every day is separate from the next. Overall, we have somewhere to the order of about 200 people.

I'm not sure how many have gotten 6 of the same post in 15 days, I know of only one. Several have gotten the same post 4 to 5 times in the 15 day period though. The company says the program is working properly, and that rosters are being populated at random. I contend that it is not.

Wait you have 200 people but only 31 posts? Does this mean some people don't get a post? Or can posts have multiple people on them? Is it possible some of the spots are constrained in the number of people that can work at them? That might throw the probability off some.

ChristianSt
2014-04-24, 11:19 AM
I simply calculated the number of combinations with repetitions for the total number of relevant states - that's why I didn't use 31^15 (which would be valid only if we were concerned with the ordering) as the total number of states. I don't think I made a mistake in that part, since it's pretty much a textbook example, but if I did make one, I sure would like to know.

My estimation of the situations involving a single task being given at least 6 times is obtained in a similar way, but it's not entirely accurate - it's the lower bound on the total number of relevant situations. The idea behind the estimation is that to ensure that a task is given at least 6 times, we need to enforce the outcome 5 times to be equal to some task already given. Therefore I again calculated the number of combinations with repetitions for picking 10 random tasks out of 31, since the remaining 5 won't be random. What I did not take into account is that we can most often choose between a number of possible outcomes to enforce, which would make the overall probability even higher.

After that I did falsely assume that the work schedule for each person is an independent random variable, because the real situation (a constant number of workers for each job) is a real pain to consider in full detail.

At any rate, if you see any fault in my reasoning, then I would be grateful for an explanation.

Ah, I see. Honestly I still have no clue how your numbers are supposed to work. But in any case I can say that your approach is not correct.

Because the approach "# valid/# total" is only a simplification/shorthand of P(valid)/P(total)=P(valid), iff all events have equal probability.
While ordering is certainly not relevant to the question, it is relevant for the odds.

Let's consider the similar (at least in style, not in numbers) problem of flipping a coin three times (basically 3 days with 2 posts), interested in the events of 3 times heads or 3 times tails.
If you count the total number of cases without accounting for ordering, there are 4 cases: "3 heads" / "2 heads, 1 tails" / "1 heads, 2 tails" / "3 tails". There are 2 interesting cases, yielding a "result" of 50%.

Yet that number is just wrong! (And hopefully it should be pretty clear that the probability to throw 3 times heads or 3 times tails in a row is not 50%)

The problem is that P("3 heads") and P("2 heads, 1 tails") aren't equal. There are 8 different results, each falling into one of those 4 cases:



0 0 0 -> "3 heads"
0 0 1 -> "2 heads, 1 tails"
0 1 0 -> "2 heads, 1 tails"
0 1 1 -> "1 heads, 2 tails"
1 0 0 -> "2 heads, 1 tails"
1 0 1 -> "1 heads, 2 tails"
1 1 0 -> "1 heads, 2 tails"
1 1 1 -> "3 tails"



This table enumerates all possible outcomes (0 = heads, 1 = tails), each having the same probability (=1/8), yielding the result P(having 3 heads or 3 tails with 3 coin flips) = 0.25.
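The table is small enough to enumerate directly in code (a sketch):

```python
from itertools import product

# Enumerate all 2^3 equally likely flip sequences (0 = heads, 1 = tails).
outcomes = list(product((0, 1), repeat=3))
all_same = [o for o in outcomes if len(set(o)) == 1]  # "3 heads" or "3 tails"

p = len(all_same) / len(outcomes)
print(len(outcomes), len(all_same), p)  # 8 2 0.25
```

The same enumeration over 31^15 outcomes is what's infeasible for the roster problem, which is why the state-space grouping below matters.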




@NichG: While I like the idea of simulating, if the OP is concerned with the randomness of the "original experiment", I don't think the randomness of (probably another) PRNG is much better :smallwink:. And yes, I have the feeling that it is rather likely that we use assumptions that are not valid (hence the questions about it I had earlier).



It is also possible to solve this using a Markov Chain (http://en.wikipedia.org/wiki/Markov_chain), though I'm not sure yet which states to use. I'm tinkering a bit and will post it if I have something.

Radar
2014-04-24, 11:35 AM
Ah, I see. Honestly I have still not clue how your numbers are supposed to work. But in any case I can say that your approach is not correct.

Because the approach "# valid/# total" is only a simplification/shorthand of P(valid)/P(total)=P(valid), iff all events have equal probability.
While ordering is certainly not relevant to the question, it is relevant for the odds. (things of relevance)
Thank you! :smallsmile:

That's what I get for oversimplifying the situation.

Crow
2014-04-24, 12:13 PM
Wait you have 200 people but only 31 posts? Does this mean some people don't get a post? Or can posts have multiple people on them? Is it possible some of the spots are constrained in the number of people that can work at them? That might throw the probability off some.

These people work different shifts and days off. There is never more than one person assigned to perform a task. 31 posts, 31 people, no exceptions. Every person is eligible to work every post.

Basically, we're looking for the chance of any one person working the same post 6 times in one 15-day period....since we've only been running this program for 15 days (minus days off, so about 21, at time of posting).

The Walrus
2014-04-24, 12:57 PM
So, if I'm understanding correctly, on any day where a person is working, that person has an equal chance of being assigned any given post (theoretically), and will only work at that one post for that day, but that person won't necessarily be working all 15 days?

Crow
2014-04-24, 02:30 PM
They will be working all 15 days.

Razanir
2014-04-24, 03:44 PM
it is relevant for the odds.

PROBABILITY! (Sorry, I'm practically majoring in stats, so it irritates me to no end when people use the two interchangeably)


They will be working all 15 days.

I'm even more confused now. So there are 200 people to be assigned to 31 posts, with only one person assigned to each post, and yet we're saying that this arbitrary person IS working all 15 days?

It's two different probabilities. Working the same post 6/15 days is easy. It's just a binomial probability. Being chosen out of 200 people to work, and STILL working the same post 6/15 days is a bit tougher. That one (I think) would require a hypergeometric probability to see if you're even chosen a particular day.

NichG
2014-04-24, 05:37 PM
@NichG: While I like the idea of simulating, if the OP is concerned with the randomness of the "original experiment", I don't think the randomness of (probably another) PRNG is much better :smallwink:. And yes, I have the feeling that it is rather likely that we use assumptions that are not valid (hence the questions about it I had earlier).


I don't believe in encouraging superstition. People really need to get over their issues with PRNGs. If it were this easy to tell that the data came from a PRNG rather than a true random source, you wouldn't need to generate trillions of numbers and run them through exhaustive statistical suites. An actual software bug is more likely, but I'd posit that, given the span of answers given by people in this thread, it's easier to write a short program that gets it right than to actually do the math correctly :smallsmile:

But yes, invalid assumptions cannot be fixed by code or math.

Crow
2014-04-24, 06:08 PM
On any given shift, there are 31 slots that a person can be assigned, one person, one slot, no exceptions. Forget the 200 people part. First we need to find the actual probability of a single person being assigned the same post 6 times in a 15 day period.

There are only 31 people to choose from during a single shift (the others work different shifts/days off/etc). On this single shift, one person has gotten the six. However several have gotten 5 of the same position in this 15 days as well, but I'm interested in the probability of the six.

Anyhow, once we have determined that, rather than trying to figure out the intricacies of our scheduling and shift setup (which is a mess anyways), let us just treat the 200 people as 200 separate 15-day periods and work from there to find out how likely it is for the one person from earlier to work 6 given 200 15-day periods.

Does this make sense?

I appreciate everybody's help. If this thing is working, that is fine. It just seems very suspicious, especially taking into account how many people have managed to do 5 of the same post so far.

Razanir
2014-04-24, 06:21 PM
Anyhow, once we have determined that, rather than trying to figure out the intricacies of our scheduling and shift setup (which is a mess anyways), let us just treat the 200 people as 200 separate 15-day periods and work from there to find out how likely it is for the one person from earlier to work 6 given 200 15-day periods.

Except it doesn't quite work that way. If the person has to be chosen out of 200 to work, then that changes the probabilities.

NichG
2014-04-24, 06:24 PM
Okay, I've used the following algorithm to simulate that.

Take 31 people. For each day of the 15 day period, I have 31 slots. For each slot, pick a random person to fill that slot, but if I pick a person who already is assigned to a slot, pick again randomly until I pick someone who is not yet assigned.

Out of that set of 31 people, the measured chance of one of them getting the same slot 6 times is 0.0037. So that's a little smaller than what you'd get if you assumed that people were being assigned completely randomly (e.g. slots could be left open).
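That algorithm can be sketched as follows (assuming each day's roster is an independent uniform random permutation of the 31 people over the 31 slots; seed and trial count are arbitrary):

```python
import random

DAYS, N, TRIALS = 15, 31, 20_000
rng = random.Random(0)  # fixed seed for reproducibility

hits = 0  # trials where at least one of the 31 people gets some slot 6+ times
for _ in range(TRIALS):
    counts = [[0] * N for _ in range(N)]  # counts[person][slot]
    for _ in range(DAYS):
        slots = list(range(N))
        rng.shuffle(slots)  # random permutation: one person per slot, none open
        for person, slot in enumerate(slots):
            counts[person][slot] += 1
    if any(max(c) >= 6 for c in counts):
        hits += 1

print(hits / TRIALS)  # NichG measured about 0.0037 with this setup
```

Shuffling a list is equivalent to the rejection sampling described above ("pick again until unassigned"), just without the retries.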

Razanir
2014-04-24, 06:41 PM
Okay, I've used the following algorithm to simulate that.

Take 31 people. For each day of the 15 day period, I have 31 slots. For each slot, pick a random person to fill that slot, but if I pick a person who already is assigned to a slot, pick again randomly until I pick someone who is not yet assigned.

Out of that set of 31 people, the measured chance of one of them getting the same slot 6 times is 0.0037. So that's a little smaller than what you'd get if you assumed that people were being assigned completely randomly (e.g. slots could be left open).

You just brought up an important point. Are these people distinguishable or not? Are we looking for any single person gets assigned 6 days in a row, or a particular person gets assigned 6 days in a row?

Crow
2014-04-25, 03:18 AM
Except it doesn't quite work that way. If the person has to be chosen out of 200 to work, then that changes the probabilities.

Except they don't. There are only 31 available to work because all those other people work different shifts with different days off. ;)


You just brought up an important point. Are these people distinguishable or not? Are we looking for any single person gets assigned 6 days in a row, or a particular person gets assigned 6 days in a row?

I'm not sure I understand the question. It isn't 6 in a row though. I can't imagine the probability of something like that.

ChristianSt
2014-04-25, 03:57 AM
While I'm not sure it really solves your problem (since the post assignments of multiple persons are not independent from each other - at least if I understand correctly), I wrote a Python program to figure out the exact probabilities using a Markov Chain. The only rounding that is done is the division at the end to get the final result (so I have "# of successful states"/"# of total states", each state having the same probability).

The probability of doing any task 6 or more times is 0.000135914578989 (which isn't far off from what I previously thought).
[The exact result is 102879716273711701/756943935220796320321 ]

I have also a list of doing a task exactly N times:
Probability of doing the most frequent(s) post(s) exactly 1 times after 15 days is 0.0167485327755
Probability of doing the most frequent(s) post(s) exactly 2 times after 15 days is 0.662140058327
Probability of doing the most frequent(s) post(s) exactly 3 times after 15 days is 0.286857498906
Probability of doing the most frequent(s) post(s) exactly 4 times after 15 days is 0.0317756885253
Probability of doing the most frequent(s) post(s) exactly 5 times after 15 days is 0.00234230688761
Probability of doing the most frequent(s) post(s) exactly 6 times after 15 days is 0.000130146041241
Probability of doing the most frequent(s) post(s) exactly 7 times after 15 days is 5.57769641972e-06
Probability of doing the most frequent(s) post(s) exactly 8 times after 15 days is 1.85923214985e-07
Probability of doing the most frequent(s) post(s) exactly 9 times after 15 days is 4.82023149962e-09
Probability of doing the most frequent(s) post(s) exactly 10 times after 15 days is 9.64046299925e-11
Probability of doing the most frequent(s) post(s) exactly 11 times after 15 days is 1.46067621201e-12
Probability of doing the most frequent(s) post(s) exactly 12 times after 15 days is 1.6229735689e-14
Probability of doing the most frequent(s) post(s) exactly 13 times after 15 days is 1.24844120684e-16
Probability of doing the most frequent(s) post(s) exactly 14 times after 15 days is 5.94495812783e-19
Probability of doing the most frequent(s) post(s) exactly 15 times after 15 days is 1.32110180618e-21


So getting a task 5 times is roughly 20 times more likely than getting a task 6 times.


The important thing is that I try to set up a number of states which correspond to the things that can happen in the experiment.
Since we don't care about which post gets assigned, but only about how many posts get assigned how many times, I can reduce the number of states by a pretty large factor. I don't need to compute all 31^15 possibilities; after 15 days there is only a space of 176 states, which is manageable. And after arriving at those states I can simply group them together by categories (a post assigned exactly N times).

Basically the important part is how the state space is set up, and to figure out how you transition from state to state.

I use a list; the first number is how often this state has occurred.
Each cell N with content x after that says "this state has x posts which were assigned N times".
To get a probability you can simply divide by the number of total outcomes (which equals posts^days).


>>> After 1 days:
>>> [31, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

So after the first day there are 31 different outcomes. Each contains a single distinct post.

>>> After 2 days:
>>> [930, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
>>> [31, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

After the second day the single state "splits" into two:
The more likely outcome is to get a different post: 30*31=930 cases of 2 different posts.
The other outcome is that in 1 of those 31 cases you get the same post, yielding 1*31=31 outcomes with the same post twice.

>>> After 3 days:
>>> [26970, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
>>> [2790, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
>>> [31, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Now it starts to get "messier":
From the first state of day two it is possible to hit the first two states of day three:
In 29 cases (so 29*930=26970) you have 3 different posts.
In 2 cases (so 2*930=1860) you have one post 2 times and one post 1 time.

So why are there 2790 outcomes and not 1860? Because it is also possible to hit this outcome after getting the same post twice in the first two days: in 30 cases you don't hit that post again, so you add another 30*31=930 outcomes to that case. 930+1860=2790.
Also, in 1 case you still have a single post assigned, yielding 31 outcomes with 3 times a single post.

To turn this into a computation, all that is needed is a rather simple transition rule for the state space, which can be found easily and is this:
Basically from <X, Y, Z, 0>=<X post assigned 1 time, Y post assigned 2 times, Z post assigned 3 times, 0 post assigned 4 or more times> you go

into the state <X+1, Y, Z, 0> with probability [P-(X+Y+Z)]/P
into the state <X-1, Y+1, Z, 0> with probability X/P
into the state <X, Y-1, Z+1, 0> with probability Y/P
into the state <X, Y, Z-1, 1> with probability Z/P

Where P is the total number of posts (so 31).
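The transition rule above can be implemented in a few lines (a sketch, not ChristianSt's actual program; it tracks, for each state, the integer number of ordered outcome sequences reaching it, so nothing is rounded until the final division):

```python
from collections import defaultdict
from fractions import Fraction

POSTS, DAYS = 31, 15

# State: tuple s of length DAYS+1, where s[i] = number of posts assigned exactly
# i times so far. Value: integer count of ordered sequences reaching that state.
states = {(0,) * (DAYS + 1): 1}
for _ in range(DAYS):
    nxt = defaultdict(int)
    for s, ways in states.items():
        distinct = sum(s[1:])  # posts assigned at least once
        # Today's post is a brand-new one: POSTS - distinct choices.
        if distinct < POSTS:
            t = list(s)
            t[1] += 1
            nxt[tuple(t)] += ways * (POSTS - distinct)
        # Today's post is one already assigned i times: s[i] choices, level i -> i+1.
        for i in range(1, DAYS):
            if s[i]:
                t = list(s)
                t[i] -= 1
                t[i + 1] += 1
                nxt[tuple(t)] += ways * s[i]
    states = dict(nxt)

total = POSTS**DAYS
assert sum(states.values()) == total  # sanity check: every sequence counted once

ge6 = sum(w for s, w in states.items() if any(s[6:]))  # some post 6+ times
print(len(states))                   # 176 states after 15 days
print(float(Fraction(ge6, total)))   # ≈ 0.000135914578989
```

The 176 states are exactly the partitions of 15, which is why the space stays so small.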

Theoretically it is possible to continue this without a computer, but it gets harder with each step. For example the complete state space after day 15 is this:


(From there you can compute the probability for each of those states, by dividing the first number by the number of total outcomes (=31^15).
You can/should ignore the "L"; it only signals that this is a long and not an int (because the number is too big for a regular int).)
After 15 days:

[393008709555221760000L, 15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[2427406735488134400000L, 13, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[5259381260224291200000L, 11, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[5074841566883088000000L, 9, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[2283678705097389600000L, 7, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[456735741019477920000L, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[34601192501475600000L, 3, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[644742717418800000, 1, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[584375695580476800000L, 12, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[2029936626753235200000L, 10, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[2283678705097389600000L, 8, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1014968313376617600000L, 6, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[173005962507378000000L, 4, 4, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[9026398043863200000, 2, 5, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[62683319749050000, 0, 6, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[169161385562769600000L, 9, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[289990946679033600000L, 7, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[138404770005902400000L, 5, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[20058662319696000000L, 3, 3, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[626833197490500000, 1, 4, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[10252205185622400000L, 6, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[6686220773232000000, 4, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[835777596654000000, 2, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[11143701288720000, 0, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[92864177406000000, 3, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[11143701288720000, 1, 1, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[28573593048000, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[92269846670601600000L, 11, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[253742078344154400000L, 9, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[217493210009275200000L, 7, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[69202385002951200000L, 5, 3, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[7521998369886000000, 3, 4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[188049959247150000, 1, 5, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[36248868334879200000L, 8, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[46134923335300800000L, 6, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[15043996739772000000L, 4, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1253666394981000000, 2, 3, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[12536663949810000, 0, 4, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[2005866231969600000, 5, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[835777596654000000, 3, 1, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[50146655799240000, 1, 2, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[11143701288720000, 2, 0, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[428603895720000, 0, 1, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1647675833403600000, 7, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1504399673977200000, 5, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[313416598745250000, 3, 2, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[12536663949810000, 1, 3, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[104472199581750000, 4, 0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[25073327899620000, 2, 1, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[482179382685000, 0, 2, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[321452921790000, 1, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1392962661090000, 3, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[160726460895000, 1, 1, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1984277295000, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[10149683133766176000L, 10, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[21749321000927520000L, 8, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[13840477000590240000L, 6, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[3008799347954400000, 4, 3, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[188049959247150000, 2, 4, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1504399673977200, 0, 5, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[2636281333445760000, 7, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[2407039478363520000, 5, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[501466557992400000, 3, 2, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[20058662319696000, 1, 3, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[83577759665400000, 4, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[20058662319696000, 2, 1, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[385743506148000, 0, 2, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[171441558288000, 1, 0, 3, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[200586623196960000, 6, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[125366639498100000, 4, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[15043996739772000, 2, 2, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[192871753074000, 0, 3, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[6686220773232000, 3, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[771487012296000, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[4762265508000, 0, 0, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[96435876537000, 2, 0, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[3571699131000, 0, 1, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[5014665579924000, 5, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[2005866231969600, 3, 1, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[115723051844400, 1, 2, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[77148701229600, 2, 0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[2857359304800, 0, 1, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1428679652400, 1, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[3401618220, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[805530407441760000, 9, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1318140666722880000, 7, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[601759869590880000, 5, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[83577759665400000, 3, 3, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[2507332789962000, 1, 4, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[133724415464640000, 6, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[83577759665400000, 4, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[10029331159848000, 2, 2, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[128581168716000, 0, 3, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[2228740257744000, 3, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[257162337432000, 1, 1, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1058281224000, 0, 0, 3, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[8357775966540000, 5, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[3343110386616000, 3, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[192871753074000, 1, 2, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[128581168716000, 2, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[4762265508000, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1190566377000, 1, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[334311038661600, 4, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[77148701229600, 2, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1428679652400, 0, 2, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1904906203200, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[17008091100, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[4286038957200, 3, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[476226550800, 1, 1, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[5669363700, 0, 0, 1, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[47076452382960000, 8, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[57310463770560000, 6, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[17909519928300000, 4, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[1432761594264000, 2, 3, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[13776553791000, 0, 4, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[4775871980880000, 5, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[1910348792352000, 3, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[110212430328000, 1, 2, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[36737476776000, 2, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[1360647288000, 0, 1, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[238793599044000, 4, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[55106215164000, 2, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[1020485466000, 0, 2, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[1360647288000, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[6074318250, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[7347495355200, 3, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[816388372800, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[9718909200, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[136064728800, 2, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[4859454600, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[694207800, 1, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0]
[2046802277520000, 7, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[1790951992830000, 5, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[358190398566000, 3, 2, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[13776553791000, 1, 3, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[119396799522000, 4, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[27553107582000, 2, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[510242733000, 0, 2, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[340161822000, 1, 0, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[4592184597000, 3, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[510242733000, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[6074318250, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[102048546600, 2, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[3644590950, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[1214863650, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[5984550, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
[66331555290000, 6, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[39798933174000, 4, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[4592184597000, 2, 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[56693637000, 0, 3, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[2040970932000, 3, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[226774548000, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[1349848500, 0, 0, 2, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[56693637000, 2, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[2024772750, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[809909100, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[4654650, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[1591957326960, 5, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[612291279600, 3, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[34016182200, 1, 2, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[22677454800, 2, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[809909100, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[404954550, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[2792790, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[27831421800, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
[6184760400, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
[110442150, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
[147256200, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
[1269450, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
[343597800, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
[36814050, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
[423150, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
[2831850, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
[97650, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
[13950, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
[31, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]


If you want to check other parameters, or my code/logic, here it is. (I'm not sure how well it scales: 15 days take less than 1 second and 30 days took about 50 seconds on my PC, so I wouldn't expect 100 days to finish in a reasonable time. Also, the code isn't commented very well, and I'm not sure how good my Python is; this is basically the first program I wrote in Python.)



#!/usr/bin/env python

import copy
import functools
import sys
from math import gcd
from time import time

days = int(sys.argv[1]) if len(sys.argv) > 1 else 15    # First command line argument
posts = int(sys.argv[2]) if len(sys.argv) > 2 else 31   # Second command line argument
success = int(sys.argv[3]) if len(sys.argv) > 3 else 6  # Third command line argument


class state(object):
    """
    Represents a state of the Markov chain.

    internal_state[0] = count (number of outcomes in this state)
    internal_state[n] = number of posts that occur exactly n times; n <= days
    """

    def __init__(self):
        self.internal_state = [1]
        for i in range(days):
            self.internal_state.append(0)

    def get_posts(self, n):
        """Number of posts that occurred exactly n times."""
        if n <= 0 or n > days:
            raise ValueError("invalid n")
        return self.internal_state[n]

    def add_posts(self, n, k):
        if n <= 0 or n > days:
            raise ValueError("invalid n")
        self.internal_state[n] += k

    def get_count(self):
        """How often this state appears."""
        return self.internal_state[0]

    def add_count(self, k):
        self.internal_state[0] += k

    def multiply_count(self, k):
        self.internal_state[0] *= k

    def out(self):
        """Prints the state."""
        print(self.internal_state)

    def equal_posts(self, other):
        """Returns True iff both states have an equal distribution of posts."""
        for i in range(days):
            if self.get_posts(i + 1) != other.get_posts(i + 1):
                return False
        return True

    def compare_posts(lhs, rhs):
        """Compares two states using the post distribution."""
        for i in range(days):
            if lhs.get_posts(days - i) > rhs.get_posts(days - i):
                return 1
            elif lhs.get_posts(days - i) < rhs.get_posts(days - i):
                return -1
        return 0


def main():
    start = time()
    successes = 0
    markov_chain = [[state()]]

    # Main loop: iterates step after step
    for i in range(days):
        markov_chain.append(new_step(markov_chain[i]))
        markov_chain[i + 1] = shrink_states(markov_chain[i + 1])
        print("After " + str(i + 1) + " days:")
        for s in markov_chain[i + 1]:
            s.out()
        print("")

    end_result = compress_states(markov_chain[days])

    # Prints some fun stuff
    print(end_result)
    print("")
    for i in range(1, days + 1):
        if i >= success:
            successes += end_result[i]
        print("Probability of doing the most frequent post(s) exactly " + str(i)
              + " times after " + str(days) + " days is "
              + str(1.0 * end_result[i] / end_result[0]))
    print("")
    print("Probability of doing a post at least " + str(success) + " times after "
          + str(days) + " days is " + str(1.0 * successes / end_result[0]))
    print("Exact result = " + str(successes) + "/" + str(end_result[0]))
    gcd_t = gcd(successes, end_result[0])
    print("Greatest Common Divisor is " + str(gcd_t))
    print("Exact result = " + str(successes // gcd_t) + "/" + str(end_result[0] // gcd_t))
    print("")
    print("--- Variables used ---")
    print("Days = " + str(days))
    print("Posts = " + str(posts))
    print("Success = " + str(success))

    end = time()
    print("")
    print("time elapsed = " + str(end - start))


def new_step(state_array):
    """Returns the next step of the Markov chain."""
    new_step_array = []

    for s in state_array:
        left_posts = posts
        # For each post multiplicity that already occurs in this state
        for i in range(1, days):
            if s.get_posts(i) > 0:
                temp_state = copy.deepcopy(s)
                temp_state.add_posts(i, -1)
                temp_state.add_posts(i + 1, 1)
                temp_state.multiply_count(s.get_posts(i))
                left_posts -= s.get_posts(i)
                new_step_array.append(temp_state)
        # Not all posts have appeared yet
        if left_posts > 0:
            temp_state = copy.deepcopy(s)
            temp_state.add_posts(1, 1)
            temp_state.multiply_count(left_posts)
            new_step_array.append(temp_state)

    return new_step_array


def shrink_states(state_array):
    """Sorts and summarizes the states into a more compact form."""
    state_array.sort(key=functools.cmp_to_key(state.compare_posts))
    found_something = True
    start_i = 0
    while found_something:
        found_something = False
        for i in range(start_i, len(state_array) - 1):
            # States with the same post distribution are merged
            if state_array[i].equal_posts(state_array[i + 1]):
                state_array[i].add_count(state_array[i + 1].get_count())
                state_array.remove(state_array[i + 1])
                found_something = True
                start_i = i
                break
    return state_array


def compress_states(state_array):
    """
    Compresses the states to a single list whose first cell is the total
    number of combinations. Each cell N after that holds the number of
    combinations whose most frequent post appears exactly N times.
    """
    state_array = shrink_states(state_array)
    end_result = [0]
    for i in range(days):
        end_result.append(0)
    i = days
    while i > 0 and len(state_array) > 0:
        if state_array[-1].get_posts(i) > 0:
            end_result[0] += state_array[-1].get_count()
            end_result[i] += state_array[-1].get_count()
            state_array.pop()
            continue
        i -= 1
    return end_result


if __name__ == "__main__":
    main()


So from that, for independent drawings, the chance of that event (working some post on 6 or more days) happening to at least one of 200 people is around 2-3%. [My desktop calculator says that 1−(1−102879716273711701/756943935220796320321)^200=0.026818583]

I have no clue how the dependency factors in. I would assume it increases that percentage, but from what NichG said it could also be smaller.

Edit: Yeah, it doesn't scale too well; 35 days takes nearly 5 minutes. It would probably save a decent amount to introduce a final, absorbing state if you are only interested in "done a post X or more times", because that would greatly reduce the number of states: for the initial problem it would cut the state count at day 15 from 176 to 86. That should vastly improve performance in the long run. (But since this isn't the question, I'm unlikely to optimize the program.)

Waar
2014-04-25, 06:45 AM
We have a program at my job which is supposed to populate work rosters at random.

There are 31 spots on the roster, and a person may be assigned only one each day.

You work 15 days. What is the probability of being placed in the same roster position on 6 of those days?

A rough estimate would be about 10^(-4) (one in 10 000)

edit: the problem can be solved with 31*(1 − binomcdf(15, 1/31, 5)) = 31*P(X > 5), where X ~ Bin(15, 1/31)
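For anyone without a binomcdf button handy, the same estimate can be reproduced with a hand-rolled binomial CDF (a sketch using the standard formula; `binom_pmf`/`binom_cdf` are just illustrative names):

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k) for X ~ Bin(n, p)."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))

# 31 posts, each hit 6+ times with probability P(X > 5): a union-bound style estimate
estimate = 31 * (1 - binom_cdf(15, 1 / 31, 5))
print(estimate)  # roughly 1.36e-4
```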

ChristianSt
2014-04-25, 07:07 AM
A rough estimate would be about 10^(-4) (one in 10 000)

While it is in the right ballpark (as shown above), I think giving estimates for combinatorics problems without any reasoning is basically without value.

Yes, you could guess correctly, but humans are generally bad at guessing probabilities. Just throwing random numbers into such a thread doesn't help. It also makes it basically impossible to engage with your reasoning (or to find a problematic spot, or a really good idea of yours that could be applied to another problem).

(Also "rough estimate" != probability in the first place)


Edit: To your edit


edit: the problem can be solved with 31*(1 − binomcdf(15, 1/31, 5)) = 31*P(X > 5), where X ~ Bin(15, 1/31)

This is only an approximation, too, since it counts events twice (namely those having two or more elements appearing 6 or more times).

Razanir
2014-04-25, 08:08 AM
Except they don't. There are only 31 available to work because all those other people work different shifts with different days off. ;)

I'm even more lost. Point is, the probability differs depending on whether we're sorting 31 people into 31 slots or 200 people into 31 slots. Even the probability of working a given slot on a given day changes, because you first need to be chosen to work that shift at all.

But as I think I understand it now, we're assuming that you are, in fact, working the shift in question all 15 days? This is just assigning 31 people to 31 slots. Right?


I'm not sure I understand the question. It isn't 6 in a row though. I can't imagine the probability of something like that.

Let's say we're both working the same shift. The probability of me getting any particular slot is 1/31. The probability of you getting any particular slot is 1/31. But when we track two people's positions throughout the two-week period, those probabilities change. Instead of having 31²=961 possibilities (31 slots for me times 31 slots for you), we only have 31*30=930 possibilities (those 961 minus the 31 where we work the same slot).

So to know the probability of someone working the same slot 6 times, it matters if we're looking at you getting the same slot 6 times (easy, almost trivial compared to stuff I've seen) or anyone getting the same slot 6 times (much more complicated).
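The 961-vs-930 effect is easy to verify by enumerating every possible roster on a shrunken example (a sketch with 4 slots standing in for the 31; the ratio generalizes):

```python
from itertools import permutations

SLOTS = 4  # shrunken stand-in for the 31 roster spots

# One day's roster is a permutation of slots over the workers; record the
# (my slot, your slot) pair for workers 0 and 1 across every possible roster.
joint = {(p[0], p[1]) for p in permutations(range(SLOTS))}

print(len(joint))  # 4*3 = 12 ordered pairs, not 4*4 = 16
```

No pair where both workers hold the same slot ever appears, which is exactly the dependency described above.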

Waar
2014-04-25, 08:08 AM
This is only an approximation, too, since it counts events twice (namely those having two or more elements appearing 6 or more times).

Sure, but the odds of that happening are very small (31*(1 − binomcdf(15, 1/31, 5)) * 31*(1 − binomcdf(15−6, 1/31, 5)) ≈ 3.66*10^−10) and it was as such left out.
(Oh, and how could you possibly have three or more 6-day groups, all with different days, in 15 days? :smalltongue:)

ChristianSt
2014-04-25, 08:47 AM
Sure, but the odds of that happening are very small (31*(1 − binomcdf(15, 1/31, 5)) * 31*(1 − binomcdf(15−6, 1/31, 5)) ≈ 3.66*10^−10) and it was as such left out.
(Oh, and how could you possibly have three or more 6-day groups, all with different days, in 15 days? :smalltongue:)

They might be very small (or might not); in any case, if you make a reasonable approximation, you should also say that you are only approximating, and why.

Yes, for practical purposes it is imo negligible in this case, but just saying "the answer is this" is wrong.

(I never said that three or more such groups could happen with these numbers. "Two or more" is still correct, even if in this special case it effectively means two. And if you only consider "exactly two", your approach isn't usable for numbers where two or more can happen, e.g. if the OP also wants to know about tasks that happen 5 times, or how it looks on an 18 or 20 day schedule.)


In any case, at least from my standpoint (unless someone can point out an error in my reasoning/code), I would say that the original OP question (the probability of getting some task 6 or more times when assigned 15 tasks independently with equal distribution) is answered exactly.

The second answer (how likely the OP's event is to happen for at least one of 200 people) is trivial given the first [= 1−(1−P)^200].
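As a sketch, plugging the exact fraction from the computation above into that formula:

```python
from fractions import Fraction

# Exact single-person probability from the state-space computation above
p_one = Fraction(102879716273711701, 756943935220796320321)

# Chance that at least one of 200 independent people sees the event
p_any = 1 - (1 - p_one) ** 200
print(float(p_any))  # ~0.0268, i.e. about 2.7%
```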

Factoring in any dependency between people/posts is a whole other beast, where I right now have no clue how to tackle it exactly. (Though it is possible to use the independent answers as approximation - though I have no clue how good they are.)

CarpeGuitarrem
2014-04-25, 09:46 AM
A simpler solution: gather data about what posts people have worked over an extended period of time. Graph it, and see how evenly their posts are distributed. The more people's data you can put into this, the better.

Evenness of distribution is a far better indicator of good randomness than picking out a specific allegedly-random sequence and verifying it.
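One standard way to quantify that evenness is a chi-square goodness-of-fit statistic. A sketch on simulated data (the uniform stand-in history here is an assumption for illustration; the real input would be the logged post assignments):

```python
import random
from collections import Counter

POSTS, DAYS = 31, 450  # e.g. roughly 15 months of one person's assignments
random.seed(1)

# Stand-in data; in practice this list would come from the real roster history
history = [random.randrange(POSTS) for _ in range(DAYS)]

counts = Counter(history)
expected = DAYS / POSTS
chi2 = sum((counts.get(p, 0) - expected) ** 2 / expected for p in range(POSTS))
print(chi2)
```

For 30 degrees of freedom the 95% critical value is about 43.8, so a statistic far above that would be evidence that the assignments are not uniform.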

Waar
2014-04-25, 10:13 AM
They might be very small (or might not), though in all cases if you do make some reasonable approximation you should also say that you are only doing an approximation because of XXX.

Yes for practical concern in this case it is imo negligible, but just saying "the answer is this" is wrong.



Hey, I never claimed it had infinite degrees of accuracy, I merely gave the result of an approximation and "showed" how I got to it. :smallwink:

Crow
2014-04-25, 02:20 PM
But as I think I understand it now, we're assuming that you are, in fact, working the shift in question all 15 days? This is just assigning 31 people to 31 slots. Right?

This is correct.


So to know the probability of someone working the same slot 6 times, it matters if we're looking at you getting the same slot 6 times (easy, almost trivial compared to stuff I've seen) or anyone getting the same slot 6 times (much more complicated).

I think just figuring out the probability of you working the same slot 6 times is sufficient.

@Christian: That is some very nice work!

Jasdoif
2014-04-25, 02:41 PM
OK, assuming the same 31 people in a shift, and one person to a spot (and one spot to a person) each day... Let me throw together some code to model that; this certainly looks like a scenario well-suited to the Monte Carlo method. Let's go for... a million shifts, so we'll have 31 million individual schedules measured. Probably overkill, but oh well!


import collections, random

# random.shuffle doesn't return the shuffled list, so this function will
def Shuffled(sequence):
    """Returns a random.shuffle-ized copy of a sequence."""
    workingSequence = sequence[:]  # We make a copy so we don't alter the original
    random.shuffle(workingSequence)
    return workingSequence

# To keep track of our counts
highestCountDict = collections.defaultdict(int)

# And our base roster
baseRoster = list(range(31))

# Hopefully self-explanatory
numberOfTests = 1000000

# See above
numberOfDays = 15

# Run our batch of tests....
for testCounter in range(numberOfTests):
    # A list of daily rosters, ordered by spots
    currentRosterList = [Shuffled(baseRoster) for dummy in range(numberOfDays)]
    # Zip pulls one spot from each day, giving us individuals' schedules
    currentScheduleList = zip(*currentRosterList)
    for thisSchedule in currentScheduleList:
        # Counter counts the occurrences of each spot
        thisCounter = collections.Counter(thisSchedule)
        # Grab the highest count
        highestCount = max(thisCounter.values())
        # Increment our dictionary
        highestCountDict[highestCount] += 1

# And to show our results!
# Since each test does multiple schedules we explicitly calculate this
print("Total schedules tested: {0}".format(sum(highestCountDict.values())))

for count, occurrences in sorted(highestCountDict.items()):
    print("{0:>2}: {1:>9}".format(count, occurrences))

Results will of course vary somewhat, but my run ended up with...

Total schedules tested: 31000000

1: 519834
2: 20528633
3: 8889969
4: 984878
5: 72540
6: 3980
7: 161
8: 5

So one person's probability of working the same spot six times in 15 days is around... 0.0128%. The chance of one or more people out of 200 working the same spot six times is around 2.55%; not something to bet on, but certainly feasible.

You mentioned earlier that "several" people had the same spot four or five times. With the number of schedules turning up four or five, one person's chance comes out to roughly... 3.41%. From that, I'd expect six or seven people out of 200 to see it, which fits with "several".
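For reference, turning those raw counts into the quoted percentages (counts copied from the run above):

```python
# Highest-count tallies copied from the Monte Carlo run above
counts = {1: 519834, 2: 20528633, 3: 8889969, 4: 984878,
          5: 72540, 6: 3980, 7: 161, 8: 5}
total = sum(counts.values())  # 31 million schedules

p_six = counts[6] / total                        # same spot exactly six times
p_four_or_five = (counts[4] + counts[5]) / total

print(100 * p_six)                      # ~0.0128 (percent, one person)
print(100 * (1 - (1 - p_six) ** 200))   # ~2.5 (percent, any of 200 people)
print(100 * p_four_or_five)             # ~3.41 (percent, one person)
```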


I don't see cause to be suspicious here.


EDIT: Huh, didn't see ChristianSt's posts when I started.


Factoring in any dependency between people/posts is a whole other beast, where I right now have no clue how to tackle it exactly. (Though it is possible to use the independent answers as approximation - though I have no clue how good they are.)

I think your numbers are still correct, actually. While the spots in a day's roster are interdependent with each other, there's no dependency between days. Since the individual schedules we're measuring are composed of single items from multiple days, there should be no dependency there, either.

Also, my rough numbers end up close to your calculated ones... and running a similar test case with independently determined days for each schedule ends up close to both.

So I'm going to go with "yeah, what ChristianSt said."

Slipperychicken
2014-04-29, 07:32 PM
If people being assigned to the same post X or more days in a row is a problem, why not just prevent the program from letting that happen? For example, keep a "ConsecutivePostDays" counter for each employee (which resets when they're assigned a different position) and forbid the program from making assignments that would raise one of those counters to X or higher.
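Something like this sketch, maybe (all names hypothetical; it shuffles each day and simply reshuffles any day that would push someone's run at the same post to the limit):

```python
import random

POSTS = 31
DAYS = 15
MAX_RUN = 3  # forbid reaching MAX_RUN consecutive days on the same post

random.seed(0)

def build_roster(days=DAYS, max_tries=1000):
    prev = [None] * POSTS  # prev[w]: worker w's post yesterday
    run = [0] * POSTS      # run[w]: worker w's current consecutive days on that post
    roster = []
    for _ in range(days):
        for _ in range(max_tries):
            day = list(range(POSTS))
            random.shuffle(day)  # day[w] = post assigned to worker w
            if all(not (day[w] == prev[w] and run[w] + 1 >= MAX_RUN)
                   for w in range(POSTS)):
                break
        else:
            raise RuntimeError("reshuffle limit hit; could not satisfy the constraint")
        for w in range(POSTS):
            run[w] = run[w] + 1 if day[w] == prev[w] else 1
            prev[w] = day[w]
        roster.append(day)
    return roster

roster = build_roster()  # no worker works the same post MAX_RUN days running
```

Rejecting whole days keeps each accepted day a uniform draw over the still-allowed rosters, though as noted below it could in principle get stuck, hence the retry cap.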

NichG
2014-04-29, 09:03 PM
If people being assigned to the same post X or more days in a row is a problem, why not just prevent the program from letting that happen? Like making a "ConsecutivePostDays" counter for each employee (which resets when someone is assigned a different position) and forbid the program from making assignments which would raise one of those counters to X or higher.

It's possible, though unlikely, that you could get into an unsolvable or jammed state with that. I can't tell off the top of my head, but it sounds vaguely like it's heading towards the knapsack problem (using a certain amount of space, in this case 3 or 4 or 5 repeats per worker, to pack a certain set of objects, which would be the assignments that need to be filled).

factotum
2014-04-30, 01:51 AM
If people being assigned to the same post X or more days in a row is a problem, why not just prevent the program from letting that happen?

Because it would make things a great deal more complicated for very little gain. As the posts in this thread show, the chances of someone working the same post 6 days in a row are pretty small overall, so adding code to prevent that seems redundant.