Rybka Chess Community Forum
Up Topic Rybka Support & Discussion / Rybka Discussion / Rybka4 database of games where Rybka doesn't play good endga
Parent - - By Banned for Life (Gold) Date 2010-07-31 19:45
Vas has strong strengths and strong weaknesses. His major strengths are:

1) He is more knowledgeable than anyone else when it comes to what the keys are to making the highest Elo chess engine, and
2) He is a very hard worker.

His primary weakness is that he is not the type of person who produces a polished product. Instead he turns out products that are 98% done. This isn't uncommon, but it leaves him open to having people hack his programs, include the last 2%, and re-release them as improved products. This was the fate of R3, and as things stand now, I am 100% sure this will also be the fate of R4.

After this occurs, instead of fixing the problems, Team Rybka will concentrate on the symptoms of the problem, and will make a completely futile attempt to keep people from hearing about the polished alternatives by not allowing their names to be mentioned, and by convincing the major test groups not to test them.

As far as endgames go, Vas hasn't spent a large percentage of his time working on endgames because there isn't a large enough associated Elo gain to justify the time. People who don't agree with this tend to be more interested in analysis than engine-engine play. Higher Elo engines are generally better sellers, so there is a business case for this as well.
Parent - - By keoki010 (Silver) [us] Date 2010-07-31 20:10

> 1) He is more knowledgeable than anyone else when it comes to what the keys are to making the highest Elo chess engine, and
> 2) He is a very hard worker.

Not so sure I agree with #2.  If he was, he'd have patched both R3 and R4.  I think all he did for R4 was reprogram R3 cluster.
As far as the child, I'm sure he is capable of taking care of both! :neutral:
Parent - By Banned for Life (Gold) Date 2010-07-31 23:44
Very hard workers don't necessarily finish the jobs they are working on before moving onto something else.
Parent - - By Uly (Gold) [mx] Date 2010-07-31 22:48

> Instead he turns out products that are 98% done.

Probably because in the development phase he jumps from Alpha to release, or well, only having 6 hours of beta testing... (that's more or less what the final compile got)

The sad thing is that even if we had a more decent beta testing stage (say, a version is only released if it has no issues for two weeks), the show-stopping MultiPV bug would have been in anyway, as it was dismissed as "normal" behavior early on. It seems that getting a small pool of REAL users into a beta testing stage (one that lasts more than 6 hours...) would have been necessary.
Parent - - By Banned for Life (Gold) Date 2010-07-31 23:51
I spend most of my time doing backward analysis with MPV, so I knew Rybka 4 was broken the first time I tried to use it. Since I'm much more of an SOB than the people in the testing group, I'm very sure that my reports wouldn't have been confused with normal increased branch factor delays (or maybe Vas would have expelled me from the beta testers group! :lol:).

By the way, the delays I see in single PV mode when I'm analyzing backward and forward only occur when I'm using large pages. I find this very strange because I use large pages with all of my chess engines (except R3) and never had any issues.
Parent - - By keoki010 (Silver) [us] Date 2010-07-31 23:56
The large pages delay is... have you tried going to 2 PV and then back to 1? When I try that, it usually starts analyzing again. Of course I'm sure you have!
Parent - - By Banned for Life (Gold) Date 2010-08-01 00:03
When I use MPV, I generally increase the number of PVs until I have all the lines that are within some delta, maybe 10 or 15 cp. It doesn't take long using this approach before the kn/s goes up and no more output shows up. At that point, you can go back to a single PV and sometimes this will clear things up, but sometimes you also need to clear hash. This is a separate problem from the one associated with large pages and single PV only, where the kn/s doesn't rise, but the engine may get stuck at very low depths for extended time periods (I'm not sure if it's really stuck, or if it is trying to calculate for a large depth while not providing output for depths greater than, say, 13 to 17).
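The "all lines within some delta" rule described above can be sketched in a few lines. This is only an illustration: the function names, the `(move, score_cp)` tuple format, and the sample moves and scores are assumptions made for the example, not anything Rybka or a chess GUI exposes.

```python
def lines_within_delta(lines, delta_cp=15):
    """Return the (move, score_cp) pairs scoring within delta_cp of the best line.

    `lines` is a hypothetical list of (move, score in centipawns) pairs,
    scored from the side to move's point of view.
    """
    if not lines:
        return []
    best = max(score for _, score in lines)
    return [(move, score) for move, score in lines if best - score <= delta_cp]


def pvs_needed(lines, delta_cp=15):
    """Number of PVs to request so that every line within delta_cp is shown."""
    return len(lines_within_delta(lines, delta_cp))


# Made-up analysis output, for illustration only.
sample = [("Nf3", 32), ("d4", 25), ("c4", 20), ("e4", 10)]
print(pvs_needed(sample, 15))  # 3
```

With a 15 cp window the first three lines qualify, so three PVs would be enough; narrowing the window to 10 cp leaves two.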
Parent - By keoki010 (Silver) [us] Date 2010-08-01 00:40
Ok, like I said, I usually just go to 2 PV and then, when output starts showing up, back to 1 and let it go from there. Of course if you have a fail high/low then everything is different.
Parent - By Labyrinth (*****) [us] Date 2010-08-01 00:16
I really think Vas could benefit from having a 2nd programmer on his team even if maybe just to write small, simple bits of code. I can understand how he is worried about security though.
Parent - By dr_zied_haddad (**) [fr] Date 2010-08-01 01:56 Edited 2010-08-01 01:59
I'm astonished that only 6 hours of beta testing is sufficient.
As Larry said in previous posts, using all the computing power of the cluster's huge hardware lets the tester play thousands and thousands of games in a limited time, which will indeed ensure an Elo increase. But quality of play, playing style, and bugs related to misevaluation of certain positions need an eye check of the games.
My technique for developing engine settings starts with changes to piece or evaluation parameters, based on analysis of positions from games played by the default or the developed setting. Then I analyse the games again to see whether my changes are good or not (with respect to what I'm looking for: positional play, tactical play...). Naturally, if the setting loses a lot of rating, it's clear that my changes are not good, though maybe not all of them (just a small part may need fixing). And so on. So I believe that independent checking of the games (especially lost games, then short draws, then very long draws...) and on-the-board analysis under different circumstances (infinite pondering, incremental time control, etc.) are necessary to release a good product with fewer hidden bugs. This evaluation could also help the programmer focus on one direction or another, based on suggestions from the beta testers involved in the process. The role of such people is, in my opinion, at least as important as that of the hardware guy and the book master...
Parent - By JohnL (***) Date 2010-08-03 18:04
I'd guess that some major reasons for spending a lot of time on the cloud online Rybka and the cluster instead of other stuff are that it is simply more fun and that it makes for a much better CV. Spending a lot of time on idiosyncratic chess parameter optimizations may be neither much fun after a couple of years nor useful in a possible post-Rybka career.

After all, the Rybka SW development is practically a one man show. It is not exactly that some business analyst analyses the market and gives input to a system analyst who produces detailed specifications to Vas about what to implement :smile: Or that some project leader is making sure that he does what he should with the right prioritizations... :grin:.

I also think that he has sometimes shown some of the bad sides of the "agile programmer" attitude, which is very popular nowadays: just do what you like and don't care too much about quality or the customers, that will just automatically be fine. The (often unrealistic) idea that you don't need to do systematic testing or that you can deliver in a very short time is also typical. Or that theoretical modelling is useless and any kind of prediction/planning is not possible.

But let's give him a chance. I think "cloud Rybka" is a very interesting concept, even if a problem may be that people aren't as "rational" as you would expect. And I think one of his advantages vs. the competitors has been that he understands the non-pure programming aspects of SW development much better.
Parent - - By patrick delaurentis (**) Date 2010-08-01 17:01
I do give props to Vas on Rybka 3, it is a very good program. I just feel he could have put a bit more time into Rybka 4.
Parent - By keoki010 (Silver) [us] Date 2010-08-03 18:20
and deliver on the fix he promised :grin:
Parent - By dr_zied_haddad (**) [fr] Date 2010-08-13 21:48
Here is another position where Rybka overestimates its chances as White.

8/2R2pk1/1PN5/Pr4p1/8/8/3K4/7b w - -

I think that it'll be a draw, because white cannot move his passed pawns and promote.

What do you think?
Parent - - By dr_zied_haddad (**) [fr] Date 2010-08-13 21:51
Another interesting ending.

8/8/1p1k3p/2p2p1P/P1B5/8/3K4/8 b - - 1 59

Rybka evaluates it as winning for White. But it's hard to find the win!?
I think Black has good chances to draw this ending.
Parent - By Felix Kling (Gold) [de] Date 2010-08-25 10:34
That's an easy win. (not that interesting..)
- By Yoav Dothan (**) [il] Date 2010-08-21 12:13
Here is another example of a draw, where Rybka believes Black is winning:
8/8/2B5/8/8/4n2K/7p/6k1 b - - 0 13
- - By Yoav Dothan (**) [il] Date 2010-08-21 12:20
Another one - I know that Rybka does not evaluate B underpromotion, but here it is vital, as the Q promotion leads to stalemate:
4B3/5P2/3b4/7k/5r1B/6K1/8/8 w - - 0 6
Parent - - By Regularuser (***) [gb] Date 2010-08-21 12:30
Has anyone got any examples of where Rybka does not play the endgame well? These are yet more examples of where it misevaluates, but that does not necessarily mean that it would play the endgame badly - it could misevaluate and still find the best moves before and after the given positions. There have been precious few examples of actually misplayed endgames (the topic of the thread) in this thread. I am beginning to think that Rybka plays the endgame almost perfectly, albeit with the odd misevaluation along the way :)
Parent - - By Uly (Gold) [mx] Date 2010-08-21 23:49
I've been finding positions that Rybka evaluates and plays incorrectly, both drawing a won position by simplifying to a drawn endgame and losing drawn positions. All the cases I've seen would have been solved correctly with 6men tablebases, and sometimes with just 5men...

I guess I'll keep an eye open and post cases I find on this thread.
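Whether a position falls within 5-men or 6-men tablebase range can be checked by counting the men in its FEN string. The sketch below does only that; the function names are made up for this example, and it does not probe any actual tablebase.

```python
def count_men(fen):
    """Count all pieces and pawns (kings included) in the board field of a FEN."""
    board = fen.split()[0]  # first FEN field is the piece placement
    return sum(1 for ch in board if ch.isalpha())


def fits_tablebases(fen, max_men=6):
    """True if the position has at most max_men men on the board."""
    return count_men(fen) <= max_men


# Two positions posted earlier in this thread.
print(count_men("8/8/2B5/8/8/4n2K/7p/6k1 b - - 0 13"))      # 5
print(count_men("4B3/5P2/3b4/7k/5r1B/6K1/8/8 w - - 0 6"))   # 7
```

The first position is already a 5-men position, while the second has 7 men and would need a capture or two before 6-men tablebases could apply.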
Parent - By Regularuser (***) [gb] Date 2010-08-22 09:03
Thanks - I'll look out for them.  It is interesting to know that the ones you have come across would be solved by tablebases.
Parent - - By dr_zied_haddad (**) [fr] Date 2010-08-22 22:42
Hi my friends
I don't understand how, with all the knowledge in today's engines and all the powerful computers we have, the top engines are sometimes unable to play endgames efficiently. This is an important issue for me, because it jeopardizes the use of these engines as analysis tools etc.
Fortunately, I can also rely on my chess skills (I'm rated almost 2200) and good endgame knowledge. 5- or 6-piece endgame tablebases are certainly helpful, but I would feel more comfortable with engines that don't need these tablebases because they have enough built-in knowledge to solve these endgames on their own.
Parent - - By Uly (Gold) [mx] Date 2010-08-23 00:13

> I don't understand with all the knowledge of nowadays engines

For Rybka, the knowledge was removed to speed up the engine.
Parent - - By dr_zied_haddad (**) [fr] Date 2010-08-23 06:41
This knowledge removal is, in my opinion, not a good thing. The best thing to do, IMHO, is to keep it implemented in the engine so that we could switch it on or off. A simple function could also switch it on automatically (for example, in a long time control, when there are 10 pieces on the board, this function is switched on automatically with a warning on the screen).
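The automatic switch suggested here could look roughly like the sketch below. Everything in it is hypothetical: Rybka has no such option as far as this thread shows, and the function name, the `user_setting` override, and reading "10 pieces" as "10 or fewer men, kings included" are assumptions made for illustration.

```python
def endgame_knowledge_enabled(fen, long_time_control,
                              user_setting=None, piece_threshold=10):
    """Decide whether a hypothetical endgame-knowledge switch should be on.

    user_setting: True/False forces the switch; None means "automatic".
    In automatic mode the switch turns on only at long time controls once
    the number of men on the board is at or below piece_threshold.
    """
    if user_setting is not None:
        return user_setting
    men = sum(1 for ch in fen.split()[0] if ch.isalpha())
    return long_time_control and men <= piece_threshold


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
endgame = "8/8/2B5/8/8/4n2K/7p/6k1 b - - 0 13"

print(endgame_knowledge_enabled(start, long_time_control=True))     # False
print(endgame_knowledge_enabled(endgame, long_time_control=True))   # True
print(endgame_knowledge_enabled(endgame, long_time_control=False))  # False
```

The explicit override keeps the manual on/off switch from the post, while the automatic branch covers the "long time control, few pieces" case. The on-screen warning would belong to the GUI and is left out.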
Parent - - By Uly (Gold) [mx] Date 2010-08-23 08:33
Yeah, but then the engine wouldn't be optimized for that knowledge, so results would be unpredictable (say, in the worst case, Rybka becomes 200 Elo weaker by switching knowledge on, and then using Rybka would be pointless), and Vas wouldn't worry about it since it would just be an experiment. If we complained enough about the side effects, he'd just remove the switch.
Parent - By dr_zied_haddad (**) [fr] Date 2010-08-25 07:51
I think that this deserves splitting up the work. In fact, it seems that most of Vas's beta testers only test Rybka in engine vs. engine matches. This certainly helps find the version with the highest possible Elo increase, but that's not the most important thing, because normal users, like most of the participants in this forum, are interested in a product that analyses well, and of course they want to buy the best possible product. Rybka is incredibly strong; its biggest weakness is its misunderstanding of certain endgame positions. For the development of cloud Rybka this issue is of prime importance, as we should be able to rely on Rybka's analysis. So I think that developing an engine with more endgame knowledge may be much more important than gaining Elo. Even if this knowledge slows the engine down a little, which could be a handicap in blitz games, it would certainly be a big plus at slower time controls or for correspondence game analysis.

I still believe that such knowledge should be switchable on or off in the settings (it would be worth seeing the difference in performance between the two engines in blitz games), but even if it is set to off during the game, the engine could still use this additional knowledge in the endgame. Maybe, on a quad, Vas could think about a child process where Rybka uses 3 processors for the engine without this additional knowledge, so that the overall speed of the engine is preserved, and uses one thread with the additional knowledge (let's say starting from 16 pieces on the board) to see whether there is a significant difference in the evaluation of the position. In such a case, I would certainly rely on this additional knowledge for a more accurate analysis. But this complicated feature must be beta tested energetically.

I don't know how Vas works; if he tends to give his team a new beta engine every hour, this plan seems difficult. But I don't see an engine changing dramatically in an hour (apart from bug fixes), and this team must be focused on testing this feature thoroughly for 1-2 weeks.

A revolution in chess endgame studying !!

This feature could be only a feature of Cloud Rybka !!!
Parent - - By JohnL (***) Date 2010-08-25 16:55
I find it hard to believe that adding endgame knowledge in the right way would cost any significant performance.

I think it is just a matter of some refactoring and domain analysis, which Vas does not prioritize as long as it doesn't add significantly to the Elo.
Parent - By Uly (Gold) [mx] Date 2010-08-25 19:13

> I find it hard to believe that adding endgame knowledge in the right way would cost any significant performance.

Significant enough for Vas to remove it, otherwise he wouldn't have bothered, as removing the knowledge is more work than just leaving it there.
Parent - - By JohnL (***) Date 2010-08-25 22:22
I think one example is the Kramnik-Mamedyarov game above.

But a problem is that it is typically non-trivial to prove it.
Parent - By Regularuser (***) [gb] Date 2010-08-26 07:34 Edited 2010-08-26 07:38
Agreed, this is an example where R4 (almost certainly) went wrong.

Actually I thought that, as R+4 vs R+3 can give reasonable practical chances, with the doubled pawns White might have a theoretical win. But as someone else pointed out (possibly in a different thread), the doubled pawns can be helpful for holding 4 vs 3. As I started to analyse, I began to realise how hard it was for White to make progress.

Anyway, Kramnik is probably the best in the world at these types of positions, so I think we can say he made the right choice.  It would be interesting to know Kramnik's view on whether it is a win with best play without Nxf6 and whether the rook ending gave any real chances at all. 

But I think this is the most clear-cut case in this thread for R4 going wrong in the endgame. I am sure that misevaluation will quite often (but not always) lead to misplay, it's just that we have not seen that demonstrated very often.
- - By Felix Kling (Gold) [de] Date 2010-09-03 03:09
8/4kp2/2p3p1/p1P1P3/PpK1PP1p/8/6PP/8 b - -

big plus for white - but easy draw.
Parent - By dr_zied_haddad (**) [fr] Date 2010-09-04 13:56
Excellent example of the weaknesses of today's engines (misevaluation of positions).
Parent - - By dr_zied_haddad (**) [fr] Date 2010-09-09 15:33
What do you think about this position?

8/5p2/6k1/p2R4/r6p/4K3/6PP/8 w - -

White's plan: exchange one of the g- or f-pawns with Black, then let Black push his pawn all the way to a2. Black's king will have no safe place to hide from perpetual check, and if it does hide, it will be too far away to defend the a2 pawn. The black rook is tied to defending this pawn. It'll be a hard-fought draw, I think. Clearly, Black has a stable plus in this position, but good play should give White the possibility of holding the draw.

Try letting the top engines analyse it. They will all give +0.7 or more in Black's favour.
Parent - - By dr_zied_haddad (**) [fr] Date 2010-09-19 20:51
Need help to evaluate this position.

8/5p2/6k1/p2R4/r6p/4K3/6PP/8 w - -
Parent - - By Indrajit (***) [in] Date 2010-09-20 13:38
What kind of help do you need?
Parent - - By dr_zied_haddad (**) [fr] Date 2010-09-25 18:12
A few days away from computer.
I need help on how to evaluate this position using various engines, and naturally also using Rybka.
Parent - By Indrajit (***) [in] Date 2010-09-25 22:32
Sorry.. am not an expert on evaluating..but from what I got from my system, Black seems to have some advantage. :neutral: