Topic Rybka Support & Discussion / Rybka Discussion / Generating chess 960 starting positions with a 6-sided die

I just figured out that you can do it with 4.45 rolls on average. Is that the optimal solution? Is there an easier way to do it? Did I miss anything?

See attachment :-) .

Attachment: chess960.pdf (386k)

A much simpler*, although definitely not optimal way:

You need a number between 1 and 960, and four dice-rolls will trivially give you a number between 1 and 1296. If you get 961 or higher, discard and re-roll all four.

The chance of needing to re-roll is 336/1296 ≈ 0.2593. The number of re-rolls follows the negative binomial distribution, so you expect 0.2593 / (1 - 0.2593) ≈ 0.35 re-rolls (of four dice each). 4 * 1.35 = 5.4 rolls.

Information-theoretically, only log6(960) ≈ 3.83 dice rolls are needed to select among 960 positions, so if you want a large number of positions, you can most certainly do better.

* Assuming what you want is the position number, not the position itself.
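The numbers are easy to verify; here's a short sketch (plain Python, written for this post, not code from the attachment) that computes the expected roll count of this rejection scheme and the information-theoretic lower bound:

```python
from math import log

SIDES = 6
POSITIONS = 960

# Four d6 rolls give a uniform number in 1..6^4 = 1296.
batch = SIDES ** 4                 # 1296 outcomes per batch of 4 rolls
p_keep = POSITIONS / batch         # probability the batch is accepted

# Batches are accepted with probability p_keep, so the expected number of
# batches is 1 / p_keep (geometric distribution): 1296/960 = 1.35.
expected_rolls = 4 / p_keep        # 5.4 rolls on average

# Entropy bound: at least log_6(960) rolls on average.
lower_bound = log(POSITIONS) / log(SIDES)

print(round(expected_rolls, 2))    # 5.4
print(round(lower_bound, 2))       # 3.83
```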

I brute-forced it with some simple dynamic programming. Assuming you're limited to rejection sampling, the minimum number of required throws is 4.333. (There's no guarantee that this is a minimum over all methods, since there are numbers for which you could use the extra information in which value you're rejecting. However, in our case, the problem is frequently *too much* information, it seems.)

The method is fairly simple: Throw two dice, giving you a number 1..36. This partitions your 1..960 range into 36 nearly equal ranges; 1/3 of them contain 26 numbers and 2/3 contain 27.

For the third die, throw a d5 by doing d6 with rejection. This costs 1.2 throws on average. This splits a 26-range into partitions of 6,5,5,5,5 (1.16 throws on average; just sum over the probabilities) and a 27-range into partitions of 6,6,5,5,5 (1.12 throws on average).

So you have 2 + 1.2 + (1.16 * 1/3 + 1.12 * 2/3) ≈ 4.333 throws.

Here's a little graph of how many d6 rolls you need with this method for a dN:

Obviously, powers of 6 are the cheapest. Values just above a power (e.g. 217) are also fairly cheap, since it's unlikely that you'll hit the rejection range. But the cost climbs steeply to a maximum, and then rolls off gradually as you approach the next power of 6.
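The cost arithmetic above can be double-checked in a few lines (plain Python, my own sketch):

```python
# A d5 via d6 with rejection: one roll per try, success probability 5/6.
d5_cost = 1 / (5 / 6)                  # 1.2 rolls on average

# After the d5, a sub-range of width 6 takes one d6 roll; a width of 5
# takes another rejected d6, i.e. 1.2 rolls on average.
cost_26 = (1 * 1.0 + 4 * 1.2) / 5      # 26 splits as 6,5,5,5,5 -> 1.16
cost_27 = (2 * 1.0 + 3 * 1.2) / 5      # 27 splits as 6,6,5,5,5 -> 1.12

total = 2 + d5_cost + (cost_26 * 1/3 + cost_27 * 2/3)
print(round(total, 3))  # 4.333
```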

Aha, that's clever :-) . "I brute-forced it with some simple dynamic programming." Can you show the code?

Sure; it's pretty straightforward C: https://pastebin.com/St0na7QU (the forum doesn't take well to code)

Of course, for general N, this is obviously just an upper bound on the average number of rolls needed. E.g., nobody in their right mind would do a d2 using rejection; you'd count 1, 3, 5 as one category and 2, 4, 6 as another, getting the answer in a single roll every time. If you force num_rolls_needed[2] = num_rolls_needed[3] = 1.0 (by saying you can do d2 and d3 using one d6 throw), the graph actually becomes much more complicated, and surprisingly large areas are flat. I still won't claim this makes it optimal, though.
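To make the parity trick concrete, here is a tiny sketch (plain Python, my own naming): every d6 face maps to exactly one output, and each output is hit by the same number of faces, so no rejection is ever needed.

```python
import random

def d6():
    return random.randint(1, 6)

def d2():
    # 1,3,5 -> 1; 2,4,6 -> 2: always exactly one roll, still uniform.
    return (d6() - 1) % 2 + 1

def d3():
    # 1,4 -> 1; 2,5 -> 2; 3,6 -> 3: also exactly one roll.
    return (d6() - 1) % 3 + 1

# Sanity check over all six faces: each output is hit equally often.
assert sorted((f - 1) % 2 + 1 for f in range(1, 7)) == [1, 1, 1, 2, 2, 2]
assert sorted((f - 1) % 3 + 1 for f in range(1, 7)) == [1, 1, 2, 2, 3, 3]
```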

> the forum doesn't take well to code

Click "Insert raw text" below the text area and start with

`[hl=c]`

to add highlighting. Here's your code:
[hl=c]
#include <stdio.h>
#include <math.h>

#define NUM_N 960

double num_rolls_needed[NUM_N + 1];

int main(void)
{
	num_rolls_needed[0] = 0.0;
	num_rolls_needed[1] = 0.0;
	for (int i = 2; i <= 6; ++i) {
		double p = (6 - i) / 6.0;
		num_rolls_needed[i] = 1 + p / (1.0 - p);
		printf("%3d %5.3f\n", i, num_rolls_needed[i]);
	}
	for (int i = 7; i <= NUM_N; ++i) {
		double best_needed = HUGE_VAL;
		int best_d = -1;
		for (int d = 2; d <= 6; ++d) {  // Try throwing d2..d6.
			// Sum over all possible outcomes for that die.
			double sum_needed = 0.0;
			for (int outcome = 1; outcome <= d; ++outcome) {
				int range_start = lrint(i * (outcome - 1) / (double)(d));
				int range_end_exclusive = lrint(i * outcome / (double)(d));
				sum_needed += num_rolls_needed[range_end_exclusive - range_start];
			}
			double needed = num_rolls_needed[d] + sum_needed / d;
			if (needed <= best_needed) {
				best_needed = needed;
				best_d = d;
			}
		}
		num_rolls_needed[i] = best_needed;
		printf("%3d %5.3f [best = %d]\n", i, num_rolls_needed[i], best_d);
	}
}

The same in Python

[hl=python]
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat Sep 15 11:21:25 2018

@author: fkling
"""
import matplotlib.pyplot as plt

NUM_N = 960
num_rolls_needed = [0.000]*(NUM_N + 1)

num_rolls_needed[0] = 0.0
num_rolls_needed[1] = 0.0
num_rolls_needed[2] = 1.0
num_rolls_needed[3] = 1.0

for i in range(4, 7):
    p = (6 - i) / 6.0
    num_rolls_needed[i] = 1 + p / (1.0 - p)
    print(i, num_rolls_needed[i])

for i in range(7, NUM_N + 1):
    best_needed = 2e16
    best_d = -1
    for d in range(2, 6 + 1):
        # Try throwing d2..d6.
        # Sum over all possible outcomes for that die.
        sum_needed = 0.0
        for outcome in range(1, d + 1):
            range_start = round(i * (outcome - 1) / d)
            range_end_exclusive = round(i * outcome / d)
            sum_needed += num_rolls_needed[range_end_exclusive - range_start]
        needed = num_rolls_needed[d] + sum_needed / d
        if needed <= best_needed:
            best_needed = needed
            best_d = d
    num_rolls_needed[i] = best_needed
    print(i, num_rolls_needed[i], "[best =", best_d, "]")

plt.plot(num_rolls_needed)
plt.ylabel('number of rolls')
plt.xlabel('number of results')
plt.show()

aha, I understand, this looks pretty elegant to me :-)

Dynamic programming is a useful technique with a very confusing name (basically, it was called "dynamic" because it's a positively loaded word that would be hard to argue against when seeking funding, and "programming" didn't mean the same thing back in the 1950s). Once you have something that's recursive, it's fairly often possible to speed it up in this fashion.

You can change the sequence and start by dividing into 32 results (as 5,5,5,5,6,6) and then continue with 6 and 5. This actually makes it applicable in the following way:

Divide by 32? You mean 36, or reject on 33–36? I don't understand your diagrams at all, sorry.

OK, I'm not good at explaining things sometimes :-) . I meant 32 results first, then 5 and then 6 = 960

Actually I made a mistake, because the distribution is not truly random with this diagram.

If you split the range into nearly equal ranges, the chances for single results are not distributed evenly anymore, are they? For instance, if you divide 960 in 36 ranges, those results in ranges with 26 numbers have the probability 1/36*1/26 and those with 27 have 1/36*1/27 as chance, right?

You're right; it won't work. So the method needs to be adjusted somehow.

Divide the number i (the number of different results) by the number of sides of the die, d. Round up; this is our range width.

Example: i = 32 with d = 6, range is 32/6 = 5.333... -> round up: 6

Now d*range = 6*6 = 36, and 36 - i = 4 values are discarded; if you hit one, you need to reroll. The chance for that is p = (d*range-i)/(d*range) = 4/36. The expectation for the number of tries we need until we succeed is rerolls = 1/(1-p). We need

need = (rolls[d] + rolls[range])*rerolls

rolls total. That's it :-)

In our example:

need = (1 + 1)*36/32 = 2.25
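The worked example can be checked in a couple of lines (my own sketch; rolls[d] = rolls[range] = 1 here, since both are plain d6 throws):

```python
d, i = 6, 32
rng_width = -(-i // d)                       # ceil(32/6) = 6
p = (d * rng_width - i) / (d * rng_width)    # 4/36 chance of a discarded value
rerolls = 1 / (1 - p)                        # expected number of tries: 36/32
need = (1 + 1) * rerolls                     # rolls[d] + rolls[range] = 1 + 1
print(round(need, 2))  # 2.25
```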

[hl=python]
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat Sep 15 11:21:25 2018

@author: fkling
"""
import matplotlib.pyplot as plt

NUM_N = 960
num_rolls_needed = [0.000]*(NUM_N + 1)

num_rolls_needed[0] = 0.0
num_rolls_needed[1] = 0.0
num_rolls_needed[2] = 1.0
num_rolls_needed[3] = 1.0

for i in range(4, 7):
    p = (6 - i) / 6.0
    num_rolls_needed[i] = 1 + p / (1.0 - p)
    print(i, num_rolls_needed[i])

for i in range(7, NUM_N + 1):
    best_needed = 2e16
    best_d = -1
    for d in range(2, 6 + 1):
        # Try throwing d2..d6.
        # Divide i into blocks of equal size; the last block contains "lost" information.
        range_width = -(-i // d)
        range_end_exclusive = d * range_width * 1.0
        # Fraction of values "lost" (higher than the desired range); roll again then.
        discard_p = (range_end_exclusive - i) / range_end_exclusive
        if discard_p > 0:
            # You need to roll 1/(1 - discard_p) times.
            needed = 1/(1 - discard_p) * (num_rolls_needed[range_width] + num_rolls_needed[d])
        else:
            needed = num_rolls_needed[range_width] + num_rolls_needed[d]
        if needed <= best_needed:
            best_needed = needed
            best_d = d
    num_rolls_needed[i] = best_needed
    print(i, num_rolls_needed[i], "[best =", best_d, "]")

plt.plot(num_rolls_needed)
plt.ylabel('number of rolls')
plt.xlabel('number of results')
plt.show()

Btw., this arrives at 4.45 rolls for i = 960 :-) .

Actually one can try to improve that solution by not rerolling forever, but instead using the information of the lost rolls, for instance to divide the number into blocks as well.

Yes, that's about the solution I had in mind. Of course, proving optimality is another thing entirely. I would be surprised if people haven't already considered this problem in the literature, but I couldn't find anything in a quick search.

By the way, you don't need the test for p_discard > 0. The formula will work just fine for p_discard = 0.

Sure, someone will have solved that a long time ago, but who cares, it's fun to rediscover things :-) . At least for chess960 this could be new ;-)

I added some code to use the "lost" rolls, so this could be close to the optimum now. (I'm not sure my formulation of the maximum recursion depth makes sense; increasing it would probably lead to slightly better results, but take much longer.)

[hl=python]
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat Sep 15 11:21:25 2018

@author: fkling
"""
import matplotlib.pyplot as plt
from math import log

NUM_N = 960
num_rolls_needed = [0.000]*(NUM_N + 1)

def test_dices(ir):
    n = 2e16
    d_best = -1
    for d in range(2, 6 + 1):
        depth_max = int(-(-log(ir) // log(d)))
        new = roll(ir, d, False, depth_max)
        d_best = d if (new < n) else d_best
        n = new if new < n else n
    return [n, d_best]

def roll(ir, dr, free, depth):
    # ir is the number of results we want to get, dr the type of die
    global num_rolls_needed
    # Is the first roll free?
    first = 0 if free else num_rolls_needed[dr]
    # Block size we divide the number into, ceil(ir/dr)
    width = -(-ir // dr)
    # Size of all blocks, >= ir
    end = dr * width
    disc = int(end - ir)
    pd = disc / end
    if disc == 0:
        return num_rolls_needed[width] + first
    elif disc == 1:
        return 1/(1 - pd) * (num_rolls_needed[width] + first)
    else:
        needed = 2e16
        # First check the simple rerolling method
        if (1 - pd) > 0:
            needed = 1/(1 - pd) * (num_rolls_needed[width] + first)
        # Try using the discarded rolls now
        for j in range(1, depth + 1):
            ps = 1 - pd
            needed_temp = (num_rolls_needed[width] + first) * sum(ps * pd**k * (k+1) for k in range(j)) + pd**j * (roll(ir, disc**j, True, depth - 1) + j*(num_rolls_needed[width] + first))
            if needed_temp < needed:
                needed = needed_temp
        # Return the smaller value
        return needed

num_rolls_needed[0] = 0.0
num_rolls_needed[1] = 0.0
num_rolls_needed[2] = 1.0
num_rolls_needed[3] = 1.0

for i in range(0, 4):
    print(i, num_rolls_needed[i])

for i in range(4, 7):
    p = (6 - i) / 6.0
    num_rolls_needed[i] = 1 + p / (1.0 - p)
    print(i, num_rolls_needed[i])

for i in range(7, NUM_N + 1):
    result = test_dices(i)
    num_rolls_needed[i] = result[0]
    best_d = result[1]
    print(i, round(num_rolls_needed[i], 8), "[ best =", best_d, "]")

plt.plot(num_rolls_needed)
plt.ylabel('number of rolls')
plt.xlabel('number of results')
plt.show()

The number of rolls for 960 is now 4.4345679... . After two attempts of two rolls each to get one of 32 results, you have 4*4 = 16 "lost" values. Use these to determine the positions of the bishops, and roll the die again to place the knights later on. Then you arrive at that solution.
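If you then want to go from the position number to an actual back rank, here's a sketch of the decoding step using the common Scharnagl-style numbering of the 960 positions (0..959). Note that this may not match the numbering in the attached PDF, and the helper name is mine:

```python
def chess960_position(n):
    """Decode position number n (0..959) into a back-rank string."""
    assert 0 <= n <= 959
    rank = [None] * 8
    # Light-squared bishop: one of files b, d, f, h.
    n, b1 = divmod(n, 4)
    rank[2 * b1 + 1] = 'B'
    # Dark-squared bishop: one of files a, c, e, g.
    n, b2 = divmod(n, 4)
    rank[2 * b2] = 'B'
    # Queen: the q-th remaining free square.
    n, q = divmod(n, 6)
    free = [i for i in range(8) if rank[i] is None]
    rank[free[q]] = 'Q'
    # Knights: n is now 0..9, indexing a pair of the five remaining squares.
    pairs = [(a, b) for a in range(5) for b in range(a + 1, 5)]
    free = [i for i in range(8) if rank[i] is None]
    for idx in pairs[n]:
        rank[free[idx]] = 'N'
    # Rook, king, rook fill the last three squares left to right,
    # which automatically puts the king between the rooks.
    free = [i for i in range(8) if rank[i] is None]
    for i, piece in zip(free, 'RKR'):
        rank[i] = piece
    return ''.join(rank)

print(chess960_position(518))  # RNBQKBNR, the classical start position
```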

I cleaned up the code a little bit.

[hl=python]
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat Sep 15 11:21:25 2018

@author: fkling
"""
import matplotlib.pyplot as plt
from math import log

NUM_N = 960
num_rolls_needed = [0.000]*(NUM_N + 1)
best_d = [0]*(NUM_N + 1)

def test_dices(ir):
    n = 2e16
    d_best = -1
    for d in range(2, 6 + 1):
        depth_max = int(-(-log(ir) // log(d)))
        new = roll(ir, d, False, depth_max)
        d_best = d if (new < n) else d_best
        n = new if new < n else n
    return [n, d_best]

def roll(ir, dr, free, depth):
    # ir is the number of results we want to get, dr the type of die
    global num_rolls_needed
    # Is the first roll free?
    first = 0 if free else num_rolls_needed[dr]
    # Block size we divide the number into, ceil(ir/dr)
    width = -(-ir // dr)
    # Size of all blocks, >= ir
    end = dr * width
    # Discarded values in the last block
    disc = int(end - ir)
    # Chance to get discarded values
    pd = disc / end
    # Case 1: ir can be divided by dr
    if disc == 0:
        return num_rolls_needed[width] + first
    # Case 2: only one value left, can't use that information
    elif disc == 1:
        return 1/(1 - pd) * (num_rolls_needed[width] + first)
    # Case 3: more than one value left
    else:
        # First check the simple rerolling method
        needed = 1/(1 - pd) * (num_rolls_needed[width] + first)
        # Try using the discarded rolls now
        for j in range(1, depth + 1):
            # Check if the number of divisions is much higher than the number; then don't use it
            if disc**j > 2*ir:
                break
            # Chance of success
            ps = 1 - pd
            # Now check all the possibilities and try to use the discarded rolls.
            # Basically it is the first two rolls times the success rate ps, and this
            # is repeated with the chance of no success pd. When there was no success
            # (chance for that is pd**j), you can use the information of the discarded
            # rolls (but don't forget the rolls you needed to get there).
            needed_temp = (num_rolls_needed[width] + first) * sum(ps * pd**k * (k+1) for k in range(j)) + pd**j * (roll(ir, disc**j, True, depth - 1) + j*(num_rolls_needed[width] + first))
            if needed_temp < needed:
                needed = needed_temp
        # Return the smaller value
        return needed

num_rolls_needed[0] = 0.0
num_rolls_needed[1] = 0.0
num_rolls_needed[2] = 1.0
num_rolls_needed[3] = 1.0
best_d[0] = 0
best_d[1] = 1
best_d[2] = 2
best_d[3] = 3
best_d[4] = 6
best_d[5] = 6
best_d[6] = 6

for i in range(0, 4):
    print(i, num_rolls_needed[i])

for i in range(4, 7):
    p = (6 - i) / 6.0
    num_rolls_needed[i] = 1 + p / (1.0 - p)
    print(i, num_rolls_needed[i])

for i in range(7, NUM_N + 1):
    result = test_dices(i)
    num_rolls_needed[i] = result[0]
    best_d[i] = result[1]
    print(i, num_rolls_needed[i], best_d[i])

lines = plt.plot(best_d)
plt.plot(num_rolls_needed)
plt.setp(lines, linewidth=0.1, marker='o', markersize=1, linestyle="None")
plt.ylabel('number of rolls')
plt.xlabel('number of results')
plt.show()

I don't find Python so readable, so I haven't gone into your code, but the usual tactic for when you have an extra variable (in this case, the number of leftover values) is simply making a 2D table instead of a 1D one.

What language do you find most readable? I'm biased because all my familiarity is with C/C++ : /

Plain C is usually the simplest to understand for these kinds of things. C++ if you want data structures (e.g. using std::unordered_map is usually a lot more readable than rolling your own hashtable).

My professional life is C++ with the occasional Perl or Python.

Mmh, why should Python be less readable? To me it looks like programming without the "technical" stuff, while C++ made me feel I had a better understanding of (and control over) what was really going on. For writing small scripts where performance is not critical, I always considered Python the simplest and most elegant solution. But of course I'm not a professional.

Btw., I never worked with Perl, is there any advantage over Python?

> Mmh, why should Python be less readable?

It's hard to see where each block begins and ends. Also, lack of static typing.

In terms of speed (especially for chess) does it go: C then C++ then Python?

I've only programmed in Fortran, but I don't know if it's possible to make a chess engine in this language. Apparently it's fast too.

Yes. In general Python is slower than C, first of all because it is not compiled to machine code before execution; when you run it, an interpreter (itself written in C, at least by default) executes the code. As far as I know, the performance difference between C and C++ is not big.

Fortran is horrible (although performance should be OK). If you want a positive experience, learn C++ and wonder why you ever wrote anything in Fortran :-)

Generally, C and C++ are the same speed. There are C++ techniques that have costs (especially when used indiscriminately), but also C++ techniques that make for large practical speed advantages in some cases. If you compile C code as C++, you can expect exactly the same speed.

Python is far behind; think a factor 10–100.

I did not know about the negative binomial distribution, but now that I see it, I understand at least how to derive the mean value :-) (actually I did not know much about infinite sums before either... there's little math in school and university if you study chemistry). I'll have a look at your program now :-)
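For the record, the mean in question: if each batch of rolls succeeds with probability q = 1 - p, the expected number of batches is the infinite sum over k of k * q * p^(k-1), which telescopes to 1/q. A quick numerical check for the four-dice scheme from earlier in the thread (my own sketch):

```python
# For the four-dice scheme: a batch is rejected with probability p = 336/1296.
p = 336 / 1296
q = 1 - p

# E[batches] = sum_{k>=1} k * q * p^(k-1); truncating at k = 1000 is plenty,
# since p^1000 is astronomically small.
E = sum(k * q * p ** (k - 1) for k in range(1, 1000))

print(round(E, 2))            # 1.35 batches, i.e. 4 * 1.35 = 5.4 rolls
print(abs(E - 1 / q) < 1e-9)  # True: the sum matches 1/q
```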

Okay! I see you're using your PhD to go for the big bucks!

Powered by mwForum 2.27.4 © 1999-2012 Markus Wichitill