While browsing an old backup CD, I found data from my (2nd) bishop pair experiment from 2002:
What is the value of having the advantage of the two bishops? Can it be expressed in figures, i.e. evaluations, or as an %-impact on the performance percentage?
The previous experiment:
In an old and quite large experiment I did in 1996, I let chess programs play against themselves from 8 special starting positions, in each of which only one side had the bishop pair (i.e. without Nb1, Bc8 etc.). To avoid having rating bonuses still included in the evaluation, I waited until one of the two bishops had disappeared, so I could read a kind of "real rating gain" which the 2 bishops had achieved, as the programs evaluated it.
The average "rating success" so to speak, of the 2 bishops was
+0.46 pawns (1996)
-----------
with white and also with black (IOW, there was no advantage of the first move visible). These games were not finished.
New approach:
Meanwhile, GUIs have improved a lot. For a new two bishops experiment, I used the same old 8 starting positions (EPD see below), but had more games played, and with engines playing against each other (not against themselves). Also, I decided to use only final results for the statistics, no evaluations.
5 engines played each of the 8 positions twice (w/b) against each of the 4 opponents, which results in 160 games total.
Results (from the 2 bishops' viewpoint!):
                            W /D /L    %
----------------------------------------
White with the 2 bishops:   38/20/22   60.0%
Black with the 2 bishops:   29/19/32   48.1%  -> advantage of first move visible!
----------------------------------------
2 bishops total:            67/39/54   54.1%
Based on the previous experiment (and on other experiments without opening theory), I thought that in computer chess the advantage of the first move would only matter when opening books or theory positions are used. Without books, no White advantage had been visible to me so far. This seems to have changed. The results above indicate a first-move advantage of ~ +6% compared to a theoretical 50% result. It can also be expressed by the w/b points ratio, which was 89.5:70.5 = 1.27:1 (or ~56:44).
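The percentages and the w/b points split above follow directly from the W/D/L tallies. A minimal sketch of that arithmetic, assuming the usual 1 / 0.5 / 0 scoring; the type and function names are made up for illustration:

```cpp
#include <cassert>

// Win/draw/loss tally from one side's point of view.
struct Tally {
    int wins;
    int draws;
    int losses;
};

// Number of games in a tally.
int games(const Tally& t) {
    return t.wins + t.draws + t.losses;
}

// Points scored: a win counts 1, a draw 0.5, a loss 0.
double points(const Tally& t) {
    return t.wins + 0.5 * t.draws;
}

// Score as a percentage of the games played.
double score_percent(const Tally& t) {
    assert(games(t) > 0);
    return 100.0 * points(t) / games(t);
}

// Points collected by White over both halves of the experiment.
// Both tallies are given from the bishop pair's viewpoint, so in the
// second half White gets whatever the bishop-pair side did not score.
double white_points(const Tally& pair_as_white, const Tally& pair_as_black) {
    return points(pair_as_white) + (games(pair_as_black) - points(pair_as_black));
}
```

With the tallies from the table, `score_percent({38, 20, 22})` gives 60.0 and `white_points({38, 20, 22}, {29, 19, 32})` gives 89.5, leaving 70.5 points for Black.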
For comparison: In human games (2400+ Elo), White scores ~55.2% according to my database. Chess programs score similarly; I got 54.0% and 55.7% for White from two large comp-comp databases which are partially identical (a few comp-human games are included). The main difference, btw., is the share of drawn games, which is ~51% among strong humans, but only ~24%...25% among computers.
The 2 bishops fought with White and with Black, in 80 games each.
Main result:
Having the 2 bishops advantage during the experiment led to a score plus of
~ +4 %, or 86.5:73.5 points = 1.177:1 (or ~54:46)
------
The figures varied between the engines: Hiarcs 7.32 - seemingly* - showed the strongest impact of the 2 bishops (17:12 points), while Junior 5 scored 11.0 with, and 11.5 without the bishop pair.
*) The results are unavoidably a mixture of (A) having the 2 bishops or not, (B) having the first move or not, and (C) the relative strength of the opponents. (A) and (B) were separated, see above. (C) shouldn't be a problem because full round robins were played. - But more important to consider is: The 2 bishops were the only difference at the start, but many games were probably decided by other factors. For example, in a few games I watched live, the bishop pair disappeared soon and/or the advantage seemed to change sides several times.
Addendum:
---------
Start positions, e.g. for use in opening databases for engine matches:
rn1qkbnr/pppppppp/8/8/8/8/PPPPPPPP/R1BQKBNR w KQkq - 0 1
rnbqk1nr/pppppppp/8/8/8/8/PPPPPPPP/R1BQKBNR w KQkq - 0 1
rn1qkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKB1R w KQkq - 0 1
rnbqk1nr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKB1R w KQkq - 0 1
r1bqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RN1QKBNR w KQkq - 0 1
rnbqkb1r/pppppppp/8/8/8/8/PPPPPPPP/RN1QKBNR w KQkq - 0 1
r1bqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQK1NR w KQkq - 0 1
rnbqkb1r/pppppppp/8/8/8/8/PPPPPPPP/RNBQK1NR w KQkq - 0 1
Programs' total scores:
1 Chess Tiger 14.0   42.0/64
2 Fritz 7            41.0/64
3 Hiarcs 7.32        29.0/64
4 Shredder 5.32      25.5/64
5 Junior 5.0         22.5/64
(PGN attached)
Attachment: 2bishops.pgn (154k)
This definitely sounds like the correct way to determine the answer to piece-value questions. But you really need one or two orders of magnitude more games to get statistically significant results valid to, say, 0.1 Pawn. (By introducing other differences in the initial positions than just B-pair or not, the same games can be used to measure multiple piece values.) I once played some 20,000 games using the method of deleting pieces from the opening position for each side and counting results. I always used self-play, but using a mix of approximately equally strong engines would probably be better (had they been available). By combining it with Pawn odds (deleting f2/f7), the scores can be translated to Pawn values. (E.g. if the B-pair wins (color-averaged) by 59%, but the side with the B-pair scores only 41% once you additionally delete one of its Pawns, the B-pair is apparently worth 0.5 Pawn.)
My conclusions were:
*) The result score is quite independent of the quality of play. For the cases I tested, I got nearly the same percentages whether I used 40 moves/10 sec or 40 moves/5 min.
*) The white advantage seems to be 53-54%
*) A B-pair was indeed worth 0.50 Pawn
*) Putting Bishops on the same color (e.g. for one side swapping B-N on one wing) destroys the pair bonus
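The Pawn-odds translation described above (59% with the B-pair, 41% with the B-pair but a Pawn less) can be sketched as a linear interpolation. This is only an illustration under the assumption that the score is roughly linear in material near 50%; the function name is made up:

```cpp
#include <cassert>

// Estimate an advantage in Pawns from two colour-averaged scores (in percent):
//   score_with_advantage : the side has the advantage only
//   score_minus_pawn     : the same side, with one of its Pawns deleted as well
// The difference between the two scores is taken as the effect of one Pawn,
// and the surplus over 50% is expressed in units of that effect.
double advantage_in_pawns(double score_with_advantage, double score_minus_pawn) {
    const double pawn_effect = score_with_advantage - score_minus_pawn;
    assert(pawn_effect > 0.0);  // deleting a Pawn must lower the score
    return (score_with_advantage - 50.0) / pawn_effect;
}
```

For the example figures, `advantage_in_pawns(59.0, 41.0)` gives 9 / 18 = 0.5.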
> This definitely sounds like the correct way to determine the answer to piece-value questions.
Thanks. - Meanwhile, I am aware of the number of games issue. Oh these satanic statistical requirements...
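To put a rough number on the games issue, here is a minimal sketch of the standard error of a match score computed from a W/D/L tally. It assumes independent games, which is a simplification for games replayed from the same start positions; the function names are made up for illustration:

```cpp
#include <cassert>
#include <cmath>

// Standard error of the score, in percentage points, from a W/D/L tally.
// Each game is treated as an independent draw from {1, 0.5, 0}.
double score_std_error_percent(int wins, int draws, int losses) {
    const int n = wins + draws + losses;
    assert(n > 0);
    const double mean = (wins + 0.5 * draws) / n;      // average result per game
    const double mean_sq = (wins + 0.25 * draws) / n;  // average squared result
    const double variance = mean_sq - mean * mean;     // per-game variance
    return 100.0 * std::sqrt(variance / n);
}

// Games needed to reach a target standard error, given the per-game
// standard deviation (both in percentage points).
double games_needed(double per_game_sd_percent, double target_se_percent) {
    assert(target_se_percent > 0.0);
    const double ratio = per_game_sd_percent / target_se_percent;
    return ratio * ratio;
}
```

For the 67/39/54 tally of the 160-game experiment this gives a standard error of roughly 3.4 percentage points, so the measured +4% is only a little more than one standard error away from 50%. Because the error shrinks with the square root of the number of games, halving it takes four times as many games.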

> *) A B-pair was indeed worth 0.50 Pawn
Amazing! My first, methodically less convincing bishop pair experiment of 1996 resulted in an "eval success" of +0.46. This fits together.
> *) Putting Bishops on the same color (e.g. for one side swapping B-N on one wing) destroys the pair bonus
An interesting observation, and I would always appreciate it if an engine included this detail of knowledge. But I guess most engine programmers will wave this aside as being too sophisticated and of no practical use.
OTOH, an engine should know that KBB-K with both bishops on the same square color cannot mate. I wonder if studies exist to test engine knowledge of this (it would probably require switching off tablebase support).
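The check itself is cheap. A minimal standalone sketch, assuming squares are numbered 0..63 with a1 = 0, b1 = 1, ..., h8 = 63 (an assumption for illustration, not any particular engine's code):

```cpp
#include <cassert>

// True if the square is dark. With a1 = 0, file and rank are both
// zero-based, and a1 (file 0, rank 0) is a dark square.
bool is_dark_square(int sq) {
    assert(sq >= 0 && sq < 64);
    const int file = sq % 8;
    const int rank = sq / 8;
    return (file + rank) % 2 == 0;
}

// A "real" bishop pair needs one bishop on each square color.
bool bishops_on_opposite_colors(int sq1, int sq2) {
    return is_dark_square(sq1) != is_dark_square(sq2);
}
```

For example, bishops on c1 (2) and f1 (5) form a real pair, while bishops on c1 (2) and a3 (16) are both on dark squares.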
In Stockfish's material imbalance code it is easy (well, I hope I got it somewhat right) to make a small correction so that if there are two bishops on the same color, the bishop pair bonus is not awarded. But I doubt the effect can be measured the normal way; in games you will rarely see a bishop underpromotion with one of the original bishops gone. Of course, if you only test situations where this has actually occurred, a same-colour bishop pair compared to a normal bishop pair, it is possible. But I have not actually done this for Rainbow Serpent.
I have increased the value of the bishop pair bonus (different-colour bishops) a bit. It is also found here in the material imbalance table, as the first number in this array:
const int LinearCoefficients[6] = { 1690, -162, -1172, -190, 105, 26 };
In Stockfish:
const int LinearCoefficients[6] = { 1617, -162, -1172, -190, 105, 26 };
The 1690 coefficient (1617 in Stockfish) is for the bishop pair. Originally the 6 coefficients are for no piece type, pawn, knight, bishop, rook and queen respectively, but the NO_PIECE_TYPE slot is used here for the bishop pair. See also the comments below, and the small changes I added to this piece of Stockfish's material imbalance code to correct for bishops on the same color:
Rainbow Serpent:
  // Evaluate the material imbalance. We use NO_PIECE_TYPE as a place holder
  // for the bishop pair "extended piece", this allow us to be more flexible
  // in defining bishop pair bonuses.
  const bool bishoppairWhite =   pos.piece_count(WHITE, BISHOP) > 1
                              && opposite_colors(pos.piece_list(WHITE, BISHOP)[0], pos.piece_list(WHITE, BISHOP)[1]);
  const bool bishoppairBlack =   pos.piece_count(BLACK, BISHOP) > 1
                              && opposite_colors(pos.piece_list(BLACK, BISHOP)[0], pos.piece_list(BLACK, BISHOP)[1]);
  const int pieceCount[2][8] = {
  { bishoppairWhite               , pos.piece_count(WHITE, PAWN), pos.piece_count(WHITE, KNIGHT),
    pos.piece_count(WHITE, BISHOP), pos.piece_count(WHITE, ROOK), pos.piece_count(WHITE, QUEEN) },
  { bishoppairBlack               , pos.piece_count(BLACK, PAWN), pos.piece_count(BLACK, KNIGHT),
    pos.piece_count(BLACK, BISHOP), pos.piece_count(BLACK, ROOK), pos.piece_count(BLACK, QUEEN) } };
In Stockfish master it is (only in this comment) still referred to as PIECE_TYPE_NONE, but in the functional code Marco changed this name to NO_PIECE_TYPE:
  // Evaluate the material imbalance. We use PIECE_TYPE_NONE as a place holder
  // for the bishop pair "extended piece", this allow us to be more flexible
  // in defining bishop pair bonuses.
  const int pieceCount[2][8] = {
  { pos.piece_count(WHITE, BISHOP) > 1, pos.piece_count(WHITE, PAWN), pos.piece_count(WHITE, KNIGHT),
    pos.piece_count(WHITE, BISHOP)    , pos.piece_count(WHITE, ROOK), pos.piece_count(WHITE, QUEEN) },
  { pos.piece_count(BLACK, BISHOP) > 1, pos.piece_count(BLACK, PAWN), pos.piece_count(BLACK, KNIGHT),
    pos.piece_count(BLACK, BISHOP)    , pos.piece_count(BLACK, ROOK), pos.piece_count(BLACK, QUEEN) } };

  mi->value = (int16_t)((imbalance<WHITE>(pieceCount) - imbalance<BLACK>(pieceCount)) / 16);
  return mi;
}
A value of 1690 is later divided by 16 (you can see that in the last line of the above code block, the "/ 16" division), so that is ~= 105 (decimal), and the value of a pawn in the endgame is indeed about twice that:
const Value PawnValueEndgame = Value(0x102);
0x102 hexadecimal is 258 decimal. I think this means that 1690 (mostly a guess from my side) is still a bit low, but you would have to factor in all the other corrections in the material imbalance table. 1690 is only the linear term involving the bishop pair; there are also higher-order corrections that are impossible to tune by hand because there are too many. (There are some more changes in the material imbalance table of RS, so the bishop pair values can't be compared directly to Stockfish's bishop pair coefficients without also looking at the other imbalance table changes, and at changes to bishop eval elsewhere.)
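As a rough check of this arithmetic, here is a small sketch that scales the linear bishop pair coefficient and expresses it in endgame pawns. The constant and function names are made up for illustration; only the numbers come from the code above. In the real code the division by 16 is applied to the whole imbalance difference, not to a single term, so this isolates the linear term only:

```cpp
#include <cassert>

constexpr int kBishopPairLinearRS = 1690;  // Rainbow Serpent coefficient
constexpr int kBishopPairLinearSF = 1617;  // Stockfish coefficient
constexpr int kImbalanceDivisor   = 16;    // the "/ 16" in the last line
constexpr int kPawnValueEndgame   = 0x102; // 258 decimal

// Linear term after scaling, using integer division as the engine code does.
constexpr int scaled_linear_term(int coefficient) {
    return coefficient / kImbalanceDivisor;
}

// The scaled term expressed as a fraction of an endgame pawn.
constexpr double in_endgame_pawns(int scaled_value) {
    return static_cast<double>(scaled_value) / kPawnValueEndgame;
}
```

`scaled_linear_term(kBishopPairLinearRS)` is 105 and `scaled_linear_term(kBishopPairLinearSF)` is 101, so taken on its own the linear term is worth about 0.41 and 0.39 of an endgame pawn respectively, before the higher-order corrections.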
Regards, Eelco
I agree that the bishop pair bonus appears to be unrelated to time control. The mean value of the bonus is approximately 0.5 pawns for long time control games.
>What is the value of having the advantage of the two bishops?
A large, big, huge +9% in wins percentage for Rybka!
Rybka 4.1 played 432 games against herself, TC=10 ply.
all 432 games (first turn bonus =2.31%) +91/=270/-71 52.31%
216 games (white bonus two bishops 61.34-50-2.31=9.03%) +67/=131/-18 61.34%
216 games (white opposing two bishops 43.29-50-2.31=-9.02%) +24/=139/-53 43.29%
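The subtraction used in the lines above amounts to a simple additive model: White's score = 50% + first-move bonus, plus or minus the bishop pair bonus. A minimal sketch of that decomposition, with made-up names, assuming the two effects simply add up:

```cpp
#include <cassert>

struct Effects {
    double first_move;   // percentage points gained by moving first
    double bishop_pair;  // percentage points gained by owning the pair
};

// white_with_pair    : White's score (%) in games where White has the pair
// white_against_pair : White's score (%) in games where Black has the pair
// Averaging the two scores cancels the bishop pair effect; halving their
// difference cancels the first-move effect.
Effects split_effects(double white_with_pair, double white_against_pair) {
    assert(white_with_pair >= 0.0 && white_with_pair <= 100.0);
    assert(white_against_pair >= 0.0 && white_against_pair <= 100.0);
    Effects e;
    e.first_move = (white_with_pair + white_against_pair) / 2.0 - 50.0;
    e.bishop_pair = (white_with_pair - white_against_pair) / 2.0;
    return e;
}
```

`split_effects(61.34, 43.29)` returns a first-move bonus of about 2.3 and a bishop pair bonus of about 9.0 percentage points, matching the figures above.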
PGN (I doubt anyone needs it)
Your math seems right. My first idea, or conclusion, is that Rybka 4.1 may be able to use the bishop pair more effectively (against herself...) than other engines. But to investigate that, test games of different engines against Rybka, with and without the bishop pair, would be required.
The problem: If you set a fixed depth level, Rybka 4.1 will have a huge advantage calculating +3 plies deeper internally. I suggest to set a normal time control level, instead.
>calculating +3 plies deeper internally
Why, how? I think it is not Rybka; rather, some GUIs swallow the first lines while the display still shows zero kN.
Others do not, see the attachment...
And yes, different engines must play at a timed control.
Attachment: rybkainfinity.png (65k)
go depth -2
info depth -2 score cp 48 time 1 nodes 5 nps 5000 pv b1c3
bestmove b1c3