Rybka Chess Community Forum
Up Topic Rybka Support & Discussion / Rybka Discussion / "Opposite Colored Bishops Endgame Penalty millipawns" fail!
- - By mindbreaker (****) [us] Date 2013-03-19 01:22 Edited 2013-03-19 01:47
I was having some issues with the opposite color bishop settings during attempts to optimize them.  I decided to do a more careful assessment of them.

What I found is that "Opposite Colored Bishops Endgame Penalty millipawns" does nothing.  Well, it does nothing in LittleBlitzer.  It could be that the name of the setting is incorrect.

I tried: Opposite Colored Bishops Endgame Penalty millipawns and Opposite Colored Bishops Endgame Penalty.

In each case the name was followed immediately by = and the value, as per the LittleBlitzer directions.

I ran matches between default setting and each Exp setting.

A match Exp 480 vs Default was running, so I just made modified versions of the 480 setting and kept everything else like the tournament conditions and such the same.

I started with 480b; the relevant changes were:
Bishops Are Opposite Colored Penalty millipawns=22
Opposite Colored Bishops Endgame Penalty millipawns=15

I decided that would take too long so I decided to make 480c:
Bishops Are Opposite Colored Penalty millipawns=9000
Opposite Colored Bishops Endgame Penalty millipawns=9000
(I figured that if they were massive and functional, they would have to affect the rating by a large amount and make the result quick to recognize.) It took a ratings belly flop. So that means it was working.

Next 480d:
Bishops Are Opposite Colored Penalty=9000
Opposite Colored Bishops Endgame Penalty=9000
(Just because 480c works does not mean that omitting "millipawns" necessarily means no function, so I checked that. It does fail. Fine, it's a result, and embarrassingly I found my error in previous testing.)

Next 480e:
Bishops Are Opposite Colored Penalty millipawns=9000
Opposite Colored Bishops Endgame Penalty millipawns=0
(just because c proved functional does not mean both lines need the "millipawns"...right?) Well, indeed another belly flop.

Next 480f:
Bishops Are Opposite Colored Penalty millipawns=0
Opposite Colored Bishops Endgame Penalty millipawns=9000
And the very shocking result...FAIL!!!  Exp 480f acted exactly as if Opposite Colored Bishops Endgame Penalty millipawns=9000 did absolutely nothing!!

Next 480g:
Bishops Are Opposite Colored Penalty millipawns=0
Opposite Colored Bishops Endgame Penalty=9000
(I figured...if I can make the error of omitting the "millipawns" part of the name, so could the author. No, that was not it. Parameter fail!)

Inescapable conclusion: "Opposite Colored Bishops Endgame Penalty millipawns" does absolutely nothing.

Well, that is, barring anything incredibly embarrassing like entering the info wrong...but I did check letter for letter, character for character.

It might still work in Fritz?  I doubt it.

3/18/2013 6:21:28 PM :

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Rybka 4.1 Exp 480f             : 3226   15  15  1309    53.2 %   3203   33.6 %
  2 Rybka 4.1 Exp 480g             : 3226   11  11  2478    53.2 %   3203   35.4 %
  3 Rybka 4.1 Exp 480d             : 3222   12  12  2001    52.7 %   3203   34.9 %
  4 Rybka 4.1 Exp 480              : 3221    3   3 26324    52.5 %   3203   34.6 %
  5 Rybka 4.1 Exp 480b             : 3218   13  13  1758    52.1 %   3203   32.8 %
  6 Rybka 4.1 Default              : 3203    3   3 38025    48.7 %   3213   34.2 %
  7 Rybka 4.1 Exp 480e             : 3145   11  11  2712    41.6 %   3203   30.8 %
  8 Rybka 4.1 Exp 480c             : 3140   15  15  1443    41.0 %   3203   31.4 %
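
A rough way to read the table above is to ask which variants have error bars that overlap those of the plain Exp 480 setting. The sketch below does only that. It assumes the +/- columns are symmetric error margins, and an overlap check is a crude screen, not a proper significance test. The function names are made up for illustration.

```python
# Elo estimates and +/- margins copied from the table above.
RESULTS = {
    "480f": (3226, 15),
    "480g": (3226, 11),
    "480d": (3222, 12),
    "480": (3221, 3),
    "480b": (3218, 13),
    "480e": (3145, 11),
    "480c": (3140, 15),
}


def intervals_overlap(a, b):
    """Return True if two (elo, margin) intervals share at least one rating."""
    (elo_a, err_a), (elo_b, err_b) = a, b
    return abs(elo_a - elo_b) <= err_a + err_b


def indistinguishable_from(baseline, results=RESULTS):
    """Names of variants whose interval overlaps the baseline's interval."""
    base = results[baseline]
    return sorted(
        name
        for name, res in results.items()
        if name != baseline and intervals_overlap(res, base)
    )


if __name__ == "__main__":
    # Variants that cannot be told apart from Exp 480 by this crude check.
    print(indistinguishable_from("480"))  # ['480b', '480d', '480f', '480g']
```

By this check 480b, 480d, 480f and 480g are all within noise of Exp 480, while 480c and 480e are clearly separated from it.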
Parent - - By mindbreaker (****) [us] Date 2013-03-19 21:39
No replies? Hard to understand the post? Here is the crux:

I made a Rybka 4.1 setting where "Opposite Colored Bishops Endgame Penalty millipawns" was set to 9000.  That means that it would rather hand a queen over for nothing than go into an opposite colored bishop ending.  The result of the testing of the engine was no effect on rating.  The only way that is possible is if that option is broken.

The other one: "Bishops Are Opposite Colored Penalty millipawns" does work.  Verified with an identical test.
Parent - - By bob (Bronze) [us] Date 2013-03-21 18:10
This is not that uncommon.  The program simply avoids opposite-bishop endings, but in doing so, does not have to give up anything significant to do so.  There are MANY evaluation terms that are effective with small values, and the bigger you make them, the more puzzled you become because it has no measurable effect on actual games...  If you could set up a position where the program MUST either get mated or trade into an opposite bishop ending, you would at least see the score change.   Or if you could set up one where it has three choices:  (a) get mated;  (b) lose rook;  (c) enter opposite bishop ending, you should see the effect of your change, since it will now choose to give up the rook to avoid either getting mated or entering that opposite endgame with the -9 score...
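
The three-choice position described above can be mimicked with a toy chooser. This is only an illustration of the argument, not anything from Rybka: the mate sentinel, the 5000 mp rook value and the assumption that the BOC ending is otherwise level are all made-up numbers.

```python
MATE_SCORE_MP = -100_000  # hypothetical sentinel for "gets mated", in millipawns
ROOK_LOSS_MP = -5_000     # assumed cost of giving up a rook, in millipawns


def best_choice(boc_endgame_penalty_mp):
    """Pick the option with the highest score for the side to move."""
    options = {
        "get mated": MATE_SCORE_MP,
        "lose rook": ROOK_LOSS_MP,
        # Assume the BOC ending is otherwise level, so its score is just
        # the negated penalty.
        "enter BOC ending": -boc_endgame_penalty_mp,
    }
    return max(options, key=options.get)


if __name__ == "__main__":
    print(best_choice(15))    # enter BOC ending
    print(best_choice(9000))  # lose rook
```

Only in a position this forced does the size of the penalty change the move, which is why ordinary games may show no rating difference.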
Parent - - By mindbreaker (****) [us] Date 2013-03-21 19:43
Maybe you are right.  Maybe that is why there are two opposite colored bishop variables...maybe he made this one first and it seemed to do little to nothing.

I think I am not going to sweat it. It is what it is. The other one has an effect, I guess that is enough.
Parent - By bob (Bronze) [us] Date 2013-03-22 04:54
Note that they appear to be two different things.  One says BOC in an endgame is bad, the other says any BOC position is bad.  Making that one too big might begin to produce some strange play.  Particularly if you play one version with a big penalty for that term against the one with a big bonus.  One will try to force something the other hates.  Would be interesting how far the "hater" would go to avoid the BOC positions in the middlegame.  :)
Parent - - By h.g.muller (***) [nl] Date 2013-04-03 21:32
Are you sure you understand how this option is supposed to work?

The name seems to make no sense: one side's penalty is the other side's bonus. So it does not make sense to give a penalty to the side that enters an unlike Bishops end-game when he is two Pawns behind. It is actually in his advantage. It should only be a penalty for the leading side. And it also makes no sense to ever subtract more than the lead you have. If the leading side is only 20 centi-Pawn ahead because of a better centralized King, it makes no sense to ever subtract more than 20 cP. The 20 cP might not be worth much in the presence of unlike Bishops, but it should certainly not be seen as a disadvantage. Otherwise you would get an engine that in a nearly equal situation would intentionally seek an objectively inferior position just to let the opponent get the penalty and revert the score.

So unless this option was designed to wreck things for any non-zero setting, it must be a bit more subtle than its name suggests. E.g. the specified value could just be a maximum that is deducted from the lead, towards equality, but never beyond it. In that case setting it to a Queen might not have as much impact as you think. It would only avoid unlike-Bishop end-games where it is ahead, because it would always think it loses all advantage by doing so. But if it doesn't have an advantage, it wouldn't be scared off much.
Parent - - By mindbreaker (****) [us] Date 2013-04-04 05:14
I would think that it should be willing to give up its queen to prevent the opponent from having a bishop of the opposing color from its bishop in an endgame since the difference between the value of the queen and the opponent's bishop is something like 6000mp.  That means suicidal queen for bishop looks good to it.  It is possible that that only happens when up a queen anyway.  So the outcome is less likely to be different.  It probably matters what "endgame" means to the program.

The option's valid range is 10,000 to -10,000.  If that means anything.
Parent - - By h.g.muller (***) [nl] Date 2013-04-06 20:53 Edited 2013-04-06 21:05
The point is that it is impossible to indiscriminately penalize unlike-Bishop endings. Whatever you do to the score, there is always one side that will benefit from it. Because an unlike-Bishops ending is an unlike-Bishop ending for BOTH sides. So when you set the ridiculously high bonus there must be some unlike-Bishops end-games the engine will start to like very much. Namely those where its opponent gets the penalty.

If you really want to know if the option works, you should not play games, but just set up an unlike-Bishop position, play with the option value, and look how it affects the score reported when you analyze the position. E.g. you could set up B vs B+4P. That should already give a huge advantage, as all Pawns are passers and probably count as 200cP this late in the end-game. Say you are at +800 when the option value is 0. Then you can set the option to 3000 mP, and the score should drop to +500. If you set it to +8000 it should drop to 0. But I doubt it would drop to -200 if you set the option to 10000. Because it just does not make any sense to put the side without Pawns in the lead. He is in an end-game with unlike Bishops too.
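
The arithmetic in this example can be written out as a small sketch. It contrasts a plain additive penalty with one that is clamped at equality. Whether Rybka clamps at all is not known; the clamped version is only the guess made in this post, and both function names are invented for illustration.

```python
def uncapped_penalty(score_cp, penalty_mp):
    """Subtract the full penalty from the leader's score (10 mP = 1 cP).

    Written for a positive score, i.e. from the leading side's point of view.
    """
    return score_cp - penalty_mp // 10


def capped_penalty(score_cp, penalty_mp):
    """Pull the score toward 0 by the penalty, but never past equality."""
    penalty_cp = penalty_mp // 10
    if score_cp > 0:
        return max(0, score_cp - penalty_cp)
    if score_cp < 0:
        return min(0, score_cp + penalty_cp)
    return 0


if __name__ == "__main__":
    # B vs B+4P, assumed to score +800 cP with the option at 0.
    for option_mp in (0, 3000, 8000, 10000):
        print(option_mp, uncapped_penalty(800, option_mp), capped_penalty(800, option_mp))
```

With the option at 10000 the uncapped version gives -200 and the capped version gives 0, so analysing one such position is enough to tell the two behaviours apart.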
Parent - - By bob (Bronze) [us] Date 2013-04-07 15:59
I can't imagine this being done any other way than to basically say "if you are ahead, then BOC is bad, if you are behind, BOC is good", in some form or another.  Mine has always been done like that, but there is no fixed penalty, the term just pulls the score toward "draw" when BOC is detected, to give the side that is winning a chance to avoid going in any deeper toward the draw...  Otherwise, it would seem to me it is a zero penalty, since BOTH sides would always be in a BOC ending if either side is in it.  :)
Parent - - By h.g.muller (***) [nl] Date 2013-04-07 18:29
Indeed, this is what I expect too. And I agree it is more natural to implement this as a multiplier rather than an additive penalty. (E.g. Fruit uses a factor 1/2 for this.) The option value range suggests Rybka does implement it as an additive term, though. But given that it is an additive term, I cannot imagine that it would be allowed to 'reverse the advantage'. I.e. if you set it to 200 milliPawn, and the leading side is only 100 mP ahead, I would expect it to only subtract 100 mP, so you are left with 0. That means ridiculously large penalties like +9000 mP could only manifest themselves if you are at least 9000 mP ahead in an unlike-Bishop ending. And it is not very likely to be an unlike-Bishop ending in that case, unless you have a ridiculously large number of Pawns.
Parent - - By bob (Bronze) [us] Date 2013-04-08 16:43
I just remembered an ugliness that some (myself included) used to use years ago, "asymmetrical evaluation".  IE what is good for white is not necessarily good for black, sides reversed.  Most common example was king safety.  But perhaps here.  The idea being to simply tell the program "BOC is bad for YOU, no matter what, but not bad for the opponent."  Which opens the door for misevaluation as we both mentioned.  A serious approach would be to recognize that such "terms" are relative to the side that is winning (or losing), such as "trade pieces when ahead but not pawns" which translates to "trade pawns when behind but not pieces" which is exactly what is intended.
Parent - - By mindbreaker (****) [us] Date 2013-04-10 18:58
"BOC is bad for YOU, no matter what, but not bad for the opponent."  If you are playing roughly equal opposition I agree that this would not be ideal unless used from long range (applied the whole game or from middlegame on) where very little may need to be given up to achieve.  Closer to the endgame it still may be better though if you are substantially stronger than the opposition.
Parent - - By bob (Bronze) [us] Date 2013-04-11 17:02
There's a solution I have always used.  Ratings.  I know the rating of my program, and that of my opponent, which lets me tune the "contempt" (draw score) appropriately, so that a draw against a better opponent is +, and a draw against a worse opponent is -, and the greater the rating difference the larger the +/- value is set.  Then you would actually favor a BOC ending against a stronger opponent, and dislike it even more than normal against a weaker opponent...
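
A minimal sketch of a rating-based draw score along these lines is shown below. It is not Crafty's code: the linear scale, the cap and the parameter names are assumptions chosen only to show the sign convention (positive draw score against a stronger opponent, negative against a weaker one).

```python
def draw_score_cp(own_rating, opp_rating, cp_per_100_elo=10, cap_cp=50):
    """Score assigned to a draw, in centipawns, from the engine's point of view.

    cp_per_100_elo and cap_cp are hypothetical tuning knobs.
    """
    diff = opp_rating - own_rating          # > 0 means the opponent is stronger
    raw = diff * cp_per_100_elo / 100       # scale linearly with the rating gap
    return max(-cap_cp, min(cap_cp, round(raw)))


if __name__ == "__main__":
    print(draw_score_cp(2600, 2800))  # 20: a draw against a stronger opponent is welcome
    print(draw_score_cp(2800, 2600))  # -20: a draw against a weaker opponent is avoided
```

With a positive draw score, a drawish BOC ending looks better than its raw evaluation. With a negative one, it looks worse.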
Parent - - By donkasand (***) [za] Date 2013-04-11 17:55
You must really LOVE these BOC endings!
Parent - By bob (Bronze) [us] Date 2013-04-11 20:25
funny, I suppose...
Parent - - By mindbreaker (****) [us] Date 2013-04-13 11:08
If you are using your own interface and playing online I suppose that is possible but most of the time an engine does not have access to that information.  Maybe an engine could make guesses as to the strength of the opponent given the moves it has already made in the game deducing who is making errors when there is disagreement on moves?  Some sort of minimal retrograde analysis?  That is slightly different than just making the decision for "+ or -" for the BOC based on whether one is ahead or not.  Different because the opening was just book and one may be playing from an inferior position to begin with.  If ground is being made up and one is close to even it may be preferable to avoid the BOC with the hope that one will continue to make inroads.  Also if you blundered and then your opponent did not notice but you did when it was his move, that won't affect the game but it does indicate that you are seeing a little more than him because you missed it one ply further out (there is the risk that it is a delusion...I suppose).  With humans time may be a factor as well.  The same is probably true with machines but, I can see not playing quite as risky machine vs machine as they are not likely to lose on time or miss anything shallow but with less time there is less depth and the inherent risk of missing something tactically or strategically deep but equally fatal.

Even playing online it may be suboptimal to go with the rating as presented by the server.  It might be better to look at how an opponent has actually performed against your engine...possibly evaluating how they have done in endgames as well vs one another.  If the server value is dramatically higher than it used to be, I would probably still go with the rating as they probably got some improved hardware or new engine. Also with access to the last 20 games and the opponents, you can see if the performance against those opponents is better or worse than you usually get against them.

Probably a lot of bother for an Elo point or two.

It is getting late, so I apologize in advance if this is hard to follow.
Parent - - By bob (Bronze) [us] Date 2013-04-14 05:35
It is built in to xboard/winboard.  This was one of many things Tim Mann added to the xboard protocol back in the 90's when I started the Crafty project and used xboard to connect to FICS/ICC...

Server ratings are certainly a bit "dynamic" in nature, but they provide reasonable guidance as to whether the opponent is better or worse than you, once the ratings are reasonably established.

Xboard also tells me whether I am playing a computer or not, as well, as you are exactly correct.  You DO want to play somewhat differently if you know your opponent.  Crafty has had a "swindle mode" for almost 20 years now.  Idea is that in some drawn endings, like KRP vs KR, against a human it is better to play on and see if he makes a mistake.  But with endgame tables, you are likely to just give up the pawn when in a drawn position, making it easy on the human.  Crafty uses "swindle mode" to avoid that.  It looks at the root move list and saves just the moves that lead to a draw, tossing the moves that outright lose (you can easily lose KRP vs KR if you try).  It then disables the egtbs and does a normal search, but only chooses between the moves that are safe draws.  By doing this it will NEVER give up the pawn voluntarily if it can avoid doing so, giving the human a chance to make a mistake and lose.  There are many such endings that are drawn, but which can easily be lost (KRB vs KR, KBB vs KR, KBB vs KN, etc).

Playing against a computer that almost certainly uses EGTBs as well, disabling this lets games end quicker which speeds up testing.
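
The root-move filtering described above can be sketched as follows. This is not Crafty's implementation. `tb_result_after` and `search_score` are hypothetical callbacks standing in for a tablebase probe and for a normal search with the EGTBs switched off.

```python
from typing import Callable, Iterable, List, TypeVar

Move = TypeVar("Move")


def swindle_root_moves(
    root_moves: Iterable[Move],
    tb_result_after: Callable[[Move], int],
) -> List[Move]:
    """Keep root moves that do not lose according to the tablebase.

    tb_result_after(move) is assumed to return -1 (loss), 0 (draw) or
    +1 (win) for the side making the move. In a drawn position this
    keeps exactly the drawing moves.
    """
    moves = list(root_moves)
    safe = [m for m in moves if tb_result_after(m) >= 0]
    return safe or moves  # if everything loses, there is nothing to filter


def pick_swindle_move(root_moves, tb_result_after, search_score):
    """Choose among the safe moves using an ordinary, tablebase-free search score."""
    candidates = swindle_root_moves(root_moves, tb_result_after)
    return max(candidates, key=search_score)


if __name__ == "__main__":
    # Toy KRP vs KR root position: two drawing moves and one losing move.
    tb = {"keep_pawn": 0, "give_up_pawn": 0, "blunder_rook": -1}
    score = {"keep_pawn": 100, "give_up_pawn": -100, "blunder_rook": 150}
    print(pick_swindle_move(tb, tb.get, score.get))  # keep_pawn
```

The tablebase only vetoes losing moves. The choice between the remaining draws is left to the normal evaluation, which prefers to hang on to the pawn.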
Parent - - By h.g.muller (***) [nl] Date 2013-04-14 11:48
This shows you it is a fundamental mistake to score tablebase draws as 0 (or whatever your contempt factor happens to be). A much more sensible approach would be to only use the tablebase probe to set a multiplier to correct the native eval score, and use a factor 1/8 or 1/16 in theoretically drawn positions. Then the engine would also not willingly move from a draw that is nearly won to a draw that is nearly lost.
Parent - By bob (Bronze) [us] Date 2013-04-15 15:08
My egtb score has never been 0.0 for a draw.  I've always used (assuming basic draw = 0.0) -0.01 for drawn positions where the side on move is down in material, +0.01 for drawn positions where the side on move is ahead in material, and 0.00 where material is even.  But swindle mode still works...  It will prefer avoiding any repetition or 50 move draw if it is ahead in material, so long as it preserves at least a draw, and it will prefer heading to a repetition or 50 move draw if behind in material...

I should obviously add, I don't really use EGTBs after I ran a bunch of cluster tests that showed zero Elo gain with them.  So most of this is a moot point for me, today.
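
The draw-score scheme described above fits in a few lines. A minimal sketch, assuming the material balance is supplied from the point of view of the side on move. The function name and the signature are invented for illustration.

```python
def egtb_draw_score(material_balance, base_draw=0.0):
    """Score for a tablebase draw, in pawns, for the side on move.

    material_balance: side-to-move material minus opponent material.
    base_draw: the basic draw score (0.0 unless contempt is in use).
    """
    if material_balance > 0:
        return base_draw + 0.01   # drawn, but ahead in material
    if material_balance < 0:
        return base_draw - 0.01   # drawn, but behind in material
    return base_draw              # drawn with even material


if __name__ == "__main__":
    print(egtb_draw_score(1))   # 0.01
    print(egtb_draw_score(-1))  # -0.01
    print(egtb_draw_score(0))   # 0.0
```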
Parent - By mindbreaker (****) [us] Date 2013-04-04 05:35
The point of a personality adjustment is to change it from the presumed optimal settings...for most users.  In that context it does not make much sense to make it only work in a very narrow range.  I may be going for optimal (well in this particular case just seeing if it is alive) ultimately, but most players just want to see some interesting style...perhaps have it play like some opponent at their club, or a postal player may simply want to have it give them suggestions to keep a game going...to avoid drawish lines.  Or maybe they want to see what might be some attacking ideas to explore further.

From a user's point of view...it seems odd that a parameter would have a very limited range especially if the UCI says it is -10000 to 10000.
Parent - By Trotsky (****) [fr] Date 2013-04-06 21:19
perhaps this is telling you something already said, but the effect of opposite bishops is to increase the likelihood of a draw. I don't exactly remember what mine did (too long ago now) but I think it just reduced the score by some percentage. +2 would become +1.5 or -2 would become -1.5 etc.

what you want in effect is to tell the winning side not to reduce down to opposite bishops and vice versa for the losing side.
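
The percentage idea can be sketched like this. The 0.75 factor is simply what turns +2 into +1.5 in the example above and is not a recalled engine value. Elsewhere in this thread Fruit is said to use 1/2.

```python
def scale_toward_draw(score, has_opposite_bishops, factor=0.75):
    """Shrink the evaluation toward 0 when opposite-colored bishops are present.

    Because the score is multiplied rather than shifted, the sign never
    flips: the side that was ahead stays ahead, only by less.
    """
    return score * factor if has_opposite_bishops else score


if __name__ == "__main__":
    print(scale_toward_draw(2.0, True))   # 1.5
    print(scale_toward_draw(-2.0, True))  # -1.5
    print(scale_toward_draw(2.0, False))  # 2.0
```

The winning side sees its lead shrink when it heads for opposite bishops, and the losing side sees its deficit shrink. That gives exactly the avoid/seek behaviour described here.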