Rybka Chess Community Forum
Rybka Support & Discussion / Rybka Discussion / Rybka 4 misevaluating endgames
- - By JohnL (***) Date 2010-07-18 18:02
Couldn't resist posting two of today's final positions from the Sparkassen tournament. Rybka sees an advantage of about half a pawn in the first and a whole pawn in the second, which is pretty big if you compare it with the same evaluation in the opening.
But in both cases the players agreed a draw. The better side didn't even bother playing on, despite having no risk.

Mamedyarov-Kramnik

8/5pkp/6p1/r7/8/3R1P2/5P1P/6K1 w - -


[-0.46]  d=20  32.Rd4 Kf6 33.Kg2 Rg5 34.Kh1 Ke6 35.h4 Rh5 36.Kg2 g5 37.hxg5 Rxg5 38.Kh3 h6 39.Re4 Kf5 40.Rg4 Rxg4 41.fxg4 Kf4 42.Kh4 f5 43.gxf5 (0:00:31) 360kN
[-0.48]  d=19  32.Rd4 Kf6 33.Kg2 Rg5 34.Kh1 Ke6 35.h4 Rh5 36.Kg2 g5 37.hxg5 Rxg5 38.Kh3 h6 39.Re4 Kf5 40.Rg4 Rxg4 41.fxg4 Kf4 42.Kh4 f5 43.gxf5 (0:00:26) 321kN
[-0.48]  d=18  32.Rd4 Kf6 33.Kg2 Rg5 34.Kh1 Ke6 35.h4 Rh5 36.Kg2 g5 37.hxg5 Rxg5 38.Kh3 h6 39.Re4 Kf5 40.Rg4 Rxg4 41.fxg4 Kf4 42.Kh4 f5 43.gxf5 (0:00:26) 321kN
[-0.48]  d=17  32.Rd4 Kf6 33.Kg2 Rg5 34.Kh1 Ke6 35.h4 Rh5 36.Kg2 g5 37.hxg5 Rxg5 38.Kh3 h6 39.Re4 Kf5 40.Re7 f6 (0:00:22) 321kN
[-0.49]  d=16  32.Rd4 Kf6 33.h4 Kg7 34.Kg2 Ra7 35.f4 f5 36.Kg3 Kh6 37.Rd3 Kh5 38.Rd1 Ra3 39.f3 Ra7 40.Rb1 Rd7 41.Kg2 (0:00:16) 321kN
[-0.49]  d=15  32.Rd4 Kf6 33.h4 Kg7 34.Kg2 Ra7 35.f4 f5 36.Kg3 Kh6 37.Rd3 Kh5 38.Rd1 Ra3 39.f3 Ra7 40.Rb1 Rd7 41.Kg2 Rc7 (0:00:15) 321kN

Naiditsch-Leko

8/8/5pk1/7p/4P2P/5KP1/4R3/4r3 b - -


[+1.07]  d=20  53...Rf1 54.Kg2 Rd1 55.e5 Rd7 56.Kf3 Kf5 57.exf6 Kxf6 58.Rb2 Rd3 59.Kf4 Rd4 60.Ke3 Rd5 61.Rb6 Kf5 62.Kf3 Rd3 63.Kg2 Rd5 64.Kh3 Ke4 65.Rh6 Rb5 66.Rg6 Ra5 67.Re6 Kf5 68.Rc6 Rb5 (0:01:03) 284kN
[+1.07]  d=19  53...Rf1 54.Kg2 Rd1 55.e5 Rd7 56.Kf3 Kf5 57.exf6 Kxf6 58.Rb2 Rd3 59.Kf4 Rd4 60.Ke3 Rd5 61.Rb6 Kf5 62.Kf3 Rd3 63.Kg2 Rd5 64.Kh3 Ke4 65.Rh6 Rb5 66.Rg6 Ra5 67.Re6 Kf5 68.Rh6 Ke4 (0:00:56) 212kN
[+1.07]  d=18  53...Rf1 54.Kg2 Rd1 55.e5 Rd7 56.Kf3 Kf5 57.exf6 Kxf6 58.Rb2 Rd3 59.Kf4 Rd4 60.Ke3 Rd5 61.Rb6 Kf5 62.Kf3 Rd3 63.Kg2 Rd5 64.Kh3 Ke4 65.Rh6 Rb5 66.Rg6 Ra5 67.Re6 Kf5 68.Rh6 Ke4 (0:00:49) 157kN
[+1.07]  d=17  53...Rf1 54.Kg2 Rb1 55.e5 Rb7 56.exf6 Rb6 57.Rf2 Rxf6 58.Rb2 Rf5 59.Rb6 Rf6 60.Rb5 Rf5 61.Rb6 Rf6 62.Rb5 Rf5 63.Rb6 Rf6 64.Rb5 Rf5 65.Rb6 Rf6 66.Rb5 Rf5 67.Rb6 Rf6 68.Rb5 Rf5 (0:00:30) 107kN
[+1.01]  d=16  53...Rf1 54.Kg2 Rc1 55.Re3 Kf7 56.Rb3 Kg6 57.Rb6 Rc2 58.Kf3 Rc3 59.Kf2 Kf7 60.Rb5 Kg6 61.e5 Rc2 62.Kf3 Rc3 63.Kg2 Rc2 64.Kh3 fxe5 65.Rxe5 Rc6 66.Rg5 Kh6 67.Rd5 Kg6 68.Kg2 Rf6 (0:00:13) 102kN
[+1.01]  d=15  53...Rf1 54.Kg2 Rb1 55.e5 Rb7 56.exf6 Rb6 57.Rf2 Rxf6 58.Rb2 Rf5 59.Rb6 Rf6 60.Rb5 Rf8 61.Rg5 Kh6 62.Rd5 Kg6 63.Kh3 Rf6 64.Rg5 Kh6 65.Rc5 Rb6 66.Kg2 Kg6 67.Rg5 Kh6 68.Rd5 Kg6 (0:00:11) 93kN

A little disappointing; one would think it would not be too difficult to add knowledge about these quite common misevaluations.
Some kind of built-in pattern tablebase, perhaps?
Parent - - By Moz (****) Date 2010-07-19 05:13
On the bright side, this is an ideal position for those who are looking to learn more about Rook Endgame Scaling. :grin:
Parent - - By JohnL (***) Date 2010-07-19 21:44
Maybe :smile:  I haven't tried the "rook endgame scaling". But what is the point? I assume Vas has already optimized this parameter and isn't it specific knowledge that is needed?

And what is special about rook endings? Many people think that rook endgames are particularly drawish. But as far as I know from the statistics, Q-Q endgames are more drawish, and N-N, B-B (same colour) and N-B are about as drawish as R-R.
Parent - - By Moz (****) Date 2010-07-19 22:39 Edited 2010-07-19 22:42
I don't fully understand how RES works but for this particular endgame, reducing RES reduces the evaluation at a ratio of nearly 1 to 1 which is a clear indication that the position is drawn.  Every other engine I used to test this position was similarly over-optimistic.  This isn't a Rybka issue, it's a chess engine issue in general.  At least Rybka gives us a parameter that helps shed light on a certain class of rook endings.
Parent - - By JohnL (***) Date 2010-07-20 19:18
OK, but I don't understand what is difficult. Just pick either or both Mueller/Lamprecht or the Dvoretsky book and just add all principles mentioned.
(No, I have never tried myself :grin: )

And treating two games as one observation, to get rid of the noise variance from advantageous opening positions, would significantly speed up the test procedure, so there should be time for endgame improvements as well :wink:
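
A minimal sketch of the "two games as one observation" idea, assuming the two games are the same opening played once with each colour (that reading, the function name and the sample results are all assumptions, not data from any real test). It compares the standard error of the mean score when every game is counted separately with the standard error when each pair is averaged first:

```python
from math import sqrt
from statistics import mean, stdev

def standard_errors(pairs):
    """pairs: (score_as_white, score_as_black) per opening, from the tested engine's view."""
    games = [score for pair in pairs for score in pair]
    pair_means = [(white + black) / 2 for white, black in pairs]
    # Per-game figure treats all games as independent, which shared openings violate.
    se_per_game = stdev(games) / sqrt(len(games))
    # Per-pair figure uses one averaged observation per opening.
    se_per_pair = stdev(pair_means) / sqrt(len(pair_means))
    return mean(games), se_per_game, se_per_pair

# Made-up results: most openings favour one colour, so each pair tends to split 1-1.
pairs = [(1, 0), (1, 0), (0, 1), (1, 0), (0.5, 0.5), (1, 0.5)]
avg, se_game, se_pair = standard_errors(pairs)
print(f"mean score {avg:.3f}, per-game SE {se_game:.3f}, per-pair SE {se_pair:.3f}")
```

When the opening decides most games, the pair averages cluster near 0.5 and the per-pair standard error comes out smaller, which is the variance reduction the post is pointing at.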
Parent - - By Moz (****) Date 2010-07-21 00:48

> Just pick either or both Mueller/Lamprecht or the Dvoretsky book and just add all principles mentioned.


The problem is that engines don't operate on principles and making the changes necessary to calculate this particular endgame correctly will negatively impact performance and calculations in other areas.  When Vas is scraping the bottom of the Elo barrel I'm guessing we'll see much more focus on this kind of endgame issue.
Parent - - By JohnL (***) Date 2010-07-21 17:04
> The problem is that engines don't operate on principles and making the changes necessary to calculate this particular endgame correctly will negatively impact performance and calculations in other areas

I would say that a basic evaluation function is just a number of quantified principles added together.
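
To make "quantified principles added together" concrete, here is a minimal sketch of an evaluation as a weighted sum. The feature names and weights are hypothetical illustrations, not values from Rybka or any other engine:

```python
# Hypothetical weights in pawns; each entry quantifies one "principle".
WEIGHTS = {
    "material": 1.00,        # pawn-equivalents of extra material
    "passed_pawns": 0.25,    # bonus per extra passed pawn
    "doubled_pawns": -0.15,  # penalty per extra doubled pawn
    "rook_activity": 0.10,   # bonus per extra active rook
}

def evaluate(features):
    """Sum weight * feature difference (own side minus opponent)."""
    return sum(WEIGHTS[name] * value for name, value in features.items())

# One pawn up, but with one more doubled pawn than the opponent.
print(f"{evaluate({'material': 1, 'doubled_pawns': 1}):+.2f}")
```

A sum like this has no notion of "this structure is a known draw", which is the gap the rest of the thread is arguing about.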
Parent - By Moz (****) Date 2010-07-21 23:57

> I would say that a basic evaluation function is just a number of quantified principles added together.


I was referring to the way that humans apply principles and experience to steer games towards favorable endings instead of pure calculation.  Correctly identifying a position as drawn when one side is up a pawn is difficult enough, but doing it early enough to change the outcome of the game without weakening the engine in other areas must be quite difficult (or engine devs just don't like endings and don't want to bother with them!)
Parent - - By onursurme (***) [tr] Date 2010-07-30 09:07
When there are only rooks and pawns on the board (and also kings - this is for Uri), a specific program module can start to operate.
Parent - By dr_zied_haddad (**) [fr] Date 2010-07-30 14:50
Hi
there is a specific thread where we discuss this issue. See the following link
http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=17938

Please feel free to add the positions and games where you judge Rybka plays poorly. We intend to create a test set of such endings for future Rybka development with your help, all of you. Thanks
Parent - By JohnL (***) Date 2010-07-30 20:02
Yes, there is a powerful tool called IF-statement :smile:

IF {only a bunch of pieces left} THEN {use special evaluation}

And done cleverly, the check can cost practically nothing when the position is not an "endgame".
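
A minimal sketch of that IF-statement, assuming the position arrives as a FEN string. The function names and the two evaluator callbacks are hypothetical, and this says nothing about how any real engine does it:

```python
def is_rook_endgame(fen):
    """True if only kings, rooks and pawns remain, with at least one rook."""
    placement = fen.split()[0]  # first FEN field: piece placement
    pieces = {ch.lower() for ch in placement if ch.isalpha()}
    return "r" in pieces and pieces <= {"k", "r", "p"}

def evaluate(fen, general_eval, rook_endgame_eval):
    """Dispatch to the special evaluation only when the cheap test passes."""
    if is_rook_endgame(fen):
        return rook_endgame_eval(fen)
    return general_eval(fen)

# The two positions from the first post.
for fen in ("8/5pkp/6p1/r7/8/3R1P2/5P1P/6K1 w - -",
            "8/8/5pk1/7p/4P2P/5KP1/4R3/4r3 b - -"):
    print(fen, is_rook_endgame(fen))
```

An engine would more likely keep piece counts up to date incrementally rather than rescanning the board, so the test in the non-endgame case is a comparison or two.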

For example, there are different cases of 4 vs. 3 pawns and a rook, each with different chances.
Doubled pawns can be an advantage when stopping an opponent's majority, and a passed pawn is something you generally want to have, etc. You can find this in any book.

Also, if the evaluation is very high/low for a move but doesn't change with depth, why not simply lower it, since the chance of a draw is increasing as well?
But until the chess programmers realize these things, we seem to have to live with crappy endgame evaluations...:cry:
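
The "doesn't change with depth" suggestion could be sketched like this. The window, tolerance and damping factor are hypothetical, and whether such a rule helps playing strength is untested here (a flat score can also mean a stable, real advantage):

```python
def damp_flat_score(scores, window=4, tolerance=0.05, factor=0.5):
    """scores: evaluations in pawns, ordered by increasing search depth.

    If the last `window` scores stay within `tolerance` of each other,
    return the latest score multiplied by `factor`; otherwise return it as is.
    """
    recent = scores[-window:]
    if len(recent) == window and max(recent) - min(recent) <= tolerance:
        return scores[-1] * factor
    return scores[-1]

# Naiditsch-Leko scores from the first post, depths 15 to 20.
print(damp_flat_score([1.01, 1.01, 1.07, 1.07, 1.07, 1.07]))
```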
Parent - - By Uly (Gold) [mx] Date 2010-07-20 00:00

> I assume Vas has already optimized this parameter and isn't it specific knowledge that is needed?


The parameter wasn't optimized at all.
Parent - - By JohnL (***) Date 2010-07-20 19:13
That is hard to believe :surprised: You are sure the default is not the strongest?
Parent - - By Uly (Gold) [mx] Date 2010-07-21 00:56

> You are sure the default is not the strongest?


No, it wasn't tested much, anything can be the strongest, including default, but there being so many possible values, the chance of the default being the strongest is very low.

I think the strongest should be around Rook Endgame Scaling 60.
Parent - - By dr_zied_haddad (**) [fr] Date 2010-07-21 11:11
In your tests, did you try scaling R.E.S to different values (100, 90, 80, 60, 50 ...)? It's clear that getting rid of the R.E.S by setting it to zero is very bad (a loss of 150 rating points from losing many, many endgames).

We need to understand a little bit what is this Rook endgame scaling (only Vas can), to try to see why one value or another seems to work better. So maybe the right scaling is the default, but the knowledge implemented in Rybka is the weak and insufficient part, which tends to be proven by the fact that Rybka doesn't "understand" many rook endgame positions (lack of knowledge). In such cases, brute force (= variation calculation) with the powerful tools of today (fast CPUs) could be more helpful, because it analyses many more positions and so could find the right lines.
Parent - - By Uly (Gold) [mx] Date 2010-07-21 11:30

> We need to understand a little bit what is this Rook endgame scaling


This parameter takes effect when the passed pawn has its own rook in front of it, or in similar drawish structures. The parameter works as a percentage: if such a position is valued at 1.00, a RES value of 50 makes it 0.50 instead and a value of 30 makes it 0.30, while an evaluation of 2.00 with RES at 60 becomes 1.20, and so on.
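
The percentage rule as described can be written out in a few lines. This is only a sketch of the arithmetic in this post; the function name is made up and it is not Rybka's code:

```python
def apply_rook_endgame_scaling(raw_eval, res_percent):
    """Scale an evaluation (in pawns) by the RES percentage."""
    return raw_eval * res_percent / 100.0

# The three examples from the post.
for raw, res in [(1.00, 50), (1.00, 30), (2.00, 60)]:
    scaled = apply_rook_endgame_scaling(raw, res)
    print(f"{raw:.2f} at RES {res} -> {scaled:.2f}")
```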
Parent - By dr_zied_haddad (**) [fr] Date 2010-07-21 11:49
Aha
I understand a little bit more.
For sure, tonight is gonna be a long night.
Parent - By yanquis1972 (****) [us] Date 2010-07-21 06:44
With RES at 60 (my default since I found out what RES does), R4's scores are within a handful of centipawns of X53 with triplebases loaded, and in the second position its eval was lower, by varying amounts, than that of any top engine I tried. It seems to me that while Rybka 4, for whatever reason, blows a common engine problem out of proportion, it can often be corrected, or at least brought more in line with its peers, with a simple adjustment such as this. This doesn't make it better; it's simply that all engines seem to share this problem, and Vas has not yet found or released an answer to it.  The good news is you can create a personality with deflated RES and use it as a primary analysis engine so that the scores won't be quite so wonky.
Parent - By dr_zied_haddad (**) [fr] Date 2010-07-20 06:47
Hi John

There is a special thread for this issue. The main objective is to collect as many positions as possible where Rybka does not play well in endgames, so that we can use these test positions for future Rybka development.
- - By sarciness (***) [gb] Date 2010-07-29 22:41
I heard that all rook endgames are drawn- eval should be 0.00. :smile:
Parent - - By dr_zied_haddad (**) [fr] Date 2010-07-30 14:49
Hi
there is a specific thread where we discuss this issue. See the following link
http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=17938

Please feel free to add the positions and games where you judge Rybka plays poorly. We intend to create a test set of such endings for future Rybka development with your help, all of you. Thanks
Parent - By patrick delaurentis (**) Date 2010-07-31 18:22
If we want better chess engines, we need to stop buying every one that comes out with only slight improvements. With the expertise, money, and hardware Vas has, he has surely made an engine that could probably crack Elo 3700 already, but as the old saying goes, the money is in the medicine, not the cure. He is going to milk it as much as he can if we keep buying. So if he only states there is an increase of 40 Elo points, or makes some elusive statement like "search has been improved" as with Rybka 4, don't buy it. Make him work for his money and I guarantee you will see the true Elos appear.