Rybka Chess Community Forum
Up Topic Rybka Support & Discussion / Rybka Discussion / rybka on human standerd
- - By bluegene (*) [us] Date 2009-01-14 05:38
I have read in several forum topics that Rybka 3's level of 3200 Elo is exaggerated when compared to human play, supposedly because of the randomness of opening books. Can you please explain this a little more? And what would Rybka's true level be if it were allowed to play in FIDE tournaments like a regular player?
Parent - - By Labyrinth (****) [us] Date 2009-01-14 10:36

>What would be rybka true level if it was allowed to play of FIDE tournaments like a regular player??  


My guess is that with enough tournaments it would settle around 2915.

In my opinion the Elo system gets a little strange for a number-one rated player who is far ahead of his or her contemporaries in playing strength (at the very end of the bell curve). Playing strengths are moving targets as it is, like electrons in an electron cloud: no exact position, only probabilities.
Parent - - By Permanent Brain (*****) Date 2009-01-14 11:42 Edited 2009-01-14 11:58
Yes... also, I think most people have the strongest human players (2700+) in mind when such questions are discussed. But 2915 would, for example, mean that such a player or engine should score 95% against a 2445 IM. How many draws can we expect from an IM in ten long games against Rybka (e.g. on a quad)? Each draw is worth 5%. I would guess that a 2445 can usually get more than one draw, but I'm not sure.

This consideration can really raise doubts about very high ratings vs. humans. 3100 would mean scoring 90% against opponents rated 2734! That's hardly imaginable. That is why it makes sense to me that your estimate is below 3000. I guess you had something like 67% vs. Anand or Topalov (~2800) in mind.
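
As a rough check of the percentages above, here is a minimal Python sketch. It assumes a normal-distribution model with a standard deviation of 2000/7 rating points, which roughly reproduces the FIDE conversion table; that constant and the function name `expected_score` are assumptions made for illustration and do not come from this thread.

```python
from statistics import NormalDist

# Assumption: rating differences follow a normal model with
# standard deviation 2000/7, which roughly matches the FIDE table.
SIGMA = 2000 / 7


def expected_score(rating_diff: float) -> float:
    """Expected score (0..1) for the player who is rating_diff points higher."""
    return NormalDist(mu=0.0, sigma=SIGMA).cdf(rating_diff)


# Rating gaps mentioned in the post: 2915 vs 2445, 3100 vs 2734, 2915 vs 2800.
for stronger, weaker in [(2915, 2445), (3100, 2734), (2915, 2800)]:
    pct = 100 * expected_score(stronger - weaker)
    print(f"{stronger} vs {weaker}: {pct:.1f}%")
```

Under that assumption the three gaps come out at roughly 95%, 90% and 66%. The more common logistic formula, 1 / (1 + 10^(-d/400)), gives somewhat lower figures at large gaps (about 94% for a 470-point difference), so the exact percentages depend on which model is chosen.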

Within computer chess competition, these large Elo differences have always been observable... I don't understand much of it, but I think it must be a problem with the rating formulas. A calibration that simply reduces all numbers (as SSDF did once or twice) to make the top ratings seem "more realistic" certainly does not solve that problem, because then the engines and computers at the bottom seem clearly underrated relative to FIDE Elos. This is something that would require mathematicians and statisticians to improve. I don't know if Bayesian Elo, used by CCRL, is the cure. A calibration is missing anyway, because there are not enough man vs. machine games suitable for that. Those GM tournaments in Argentina were interesting in this regard, though. Not the very top, but "medium strong" GMs and IMs play there, and most often a normal retail version of a program was included, no top secret stuff and no supercomputer. I wonder if someone has collected these results for the purpose of human vs. computer rating calculations. I think there were at least 4 or 5 tournaments (one per year), their name being something with "Magistral...".

http://remi.coulom.free.fr/Bayesian-Elo/
Parent - - By Labyrinth (****) [us] Date 2009-01-14 13:38

>2915 for example would mean that such a player or engine should score 95% against a 2445 IM. How many draws can we expect from an IM, in ten long games against Rybka (e.g. on a quad)? Each draw would make 5%.


An excellent point! It is sort of difficult to imagine our 2445-rated player getting swept that badly. I would sure hate to be this player and have to play such a match though!
Parent - By SpiderG (***) [us] Date 2009-01-14 20:07
I believe that chess engines will get much stronger than the strength they are at now... chess engines will probably get to a 3300 FIDE rating in 10 yrs... Rybka 3 is not close to the best!
Parent - - By lkaufman (*****) Date 2009-01-15 19:56
I would be happy to score one draw in ten serious even games with Rybka 3 on a quad (I'm a 2409 GM). But I don't think this is very relevant. The ratings of top human players are based on playing their close rivals; how often do Anand or Topalov play 2400s? So the question is what percentage R3 would score against her nearest rivals, say the players above 2750. With White the score should be overwhelming, maybe 90%, while with Black Rybka would perhaps win half and draw half, for 75%. So this would be 82.5%, which against a 2775 average would be around 3040. This is just a guessing game, but I am extrapolating from the results of our many matches with GMs under various conditions. Anyway, I'm confident that the real rating would be in the 3000 to 3100 range.
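
The arithmetic in this estimate can be sketched in Python. This is only an illustration under stated assumptions: the normal model with standard deviation 2000/7 (chosen because it roughly reproduces the FIDE conversion table) and the helper names `score_fraction` and `performance_rating` are hypothetical and not from the post; the 90% and 75% inputs are the guesses given above.

```python
from statistics import NormalDist

# Assumption: normal model with standard deviation 2000/7,
# chosen because it roughly reproduces the FIDE conversion table.
SIGMA = 2000 / 7


def score_fraction(wins: float, draws: float, losses: float) -> float:
    """Fraction of points scored; a draw counts as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def performance_rating(opponent_avg: float, score: float) -> float:
    """Rating at which `score` is the expected result against opponent_avg."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return opponent_avg + NormalDist(0.0, SIGMA).inv_cdf(score)


white_score = 0.90                                       # guessed score with White
black_score = score_fraction(wins=1, draws=1, losses=0)  # half wins, half draws
overall = (white_score + black_score) / 2                # equal games per colour
print(f"overall score: {overall:.3f}")
print(f"performance vs 2775: {performance_rating(2775, overall):.0f}")
```

With those inputs the overall score is 82.5% and the performance lands close to 3040. The result is quite sensitive to the guessed percentages, so it is a way to check the arithmetic, not evidence for the rating itself.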
Parent - - By bluegene (*) [us] Date 2009-01-19 05:22
You and other chess engine developers, and maybe ChessBase, Convekta, etc., should press for a serious FIDE tournament (or tournaments) between man and machine, probably putting forward nice financial rewards as an incentive for humans to participate. Maybe make them play as part of national teams against chess engines, for example Russia vs. Rybka; it may be less embarrassing for individual players if they lose a game.
Parent - - By lkaufman (*****) Date 2009-01-19 05:44
I think that the time for this has passed. Unless some big corporate sponsor appears, it is unlikely that we would have funding for a team of the very top players; at best we might get a team in the low 2700s. Unless there is some sort of handicap involved, the only real question would be how often the humans could get draws when they had the White pieces, as draws with Black or wins with White would be extremely unlikely based on the Rybka matches and Hydra's tournament and match games. I happened to meet with Kasparov a few days ago (by an extremely unlikely coincidence) and I asked him if he might have any interest in playing a match with Rybka, but he replied in the negative.
Parent - - By Mark (****) [us] Date 2009-01-19 14:35
Must have been pretty neat meeting Kasparov!  Did he have anything to say about Rybka?  Whether he uses it, how he does against it, etc.?
Parent - By lkaufman (*****) Date 2009-01-19 22:17
He seemed quite familiar with Rybka, and indicated by gesture that he knew how strong it was. He was surprised to learn of my involvement with Rybka (although he did hear of my winning the World Senior) but he did know Rajlich's name. It was actually the third time I met him (roughly every decade!), but I don't think he remembered meeting me before.
Parent - By FWCC (***) [us] Date 2009-01-19 21:45
I would agree, the real rating would still be over 3000. There are hardly any humans who can even draw with Rybka at 40/120.
Parent - - By M ANSARI (*****) [kw] Date 2009-01-19 05:45 Edited 2009-01-19 05:49
If anything, Rybka's true rating against humans would be much higher than 3200 ELO. Unlike in computer chess, where even a weak engine can force a draw by having a very good prepared line that runs 40 moves deep, a human simply cannot store several hundred MB of book moves. Plus he has to play perfect chess without a single minor misstep ... another thing which is very hard to do. And tactically ... well, let's not even go there. Of course once in a blue moon he will get the dream position where the position is closed and tactics are not there, or even get a position where Rybka is known to have a bug, such as in the endgame or a wrong bishop ending ... but even then holding that game is not so easy, because the human will never really know if the move played by Rybka is a bug or a bluff. A draw for even the strongest GM in the world would be quite an accomplishment ... a win would be a miracle.
Parent - - By Permanent Brain (*****) Date 2009-01-19 07:41 Edited 2009-01-19 07:45

> If anything, Rybka's true rating against humans would be much higher than 3200 ELO. 


One problem is that true ratings (in the formal sense) require certain results against opponents of certain strengths. The best humans have ~2800 FIDE Elo. 3200-2800 = 400. If you look at the rating tables(*), you can see that such a performance requires a 92% score against them (and an even higher score against less strong opponents).

If the master scores eight losses and two draws in ten games, that is already 10%-90%. Against a 2650, this gives the engine a performance of 3016. And you cannot really claim that a 2650 GM can't even get two draws from ten games (in other words, two from his five White games).

It's not only about wins and losses: Good masters will get draws. - I think Larry Kaufman's estimation is more realistic.

*) Also, the tables show that against 2500s, a 3200 performance is almost impossible: A difference of +700 requires over 99%!
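
To make the draw argument concrete, here is a small Python sketch of how many draws in ten games a given rating gap leaves the human. It assumes a normal model with standard deviation 2000/7 as an approximation of the FIDE table, and it assumes the human never wins, so all of his points come from draws; the function name `expected_draws` is hypothetical.

```python
from statistics import NormalDist

SIGMA = 2000 / 7  # assumption: normal model approximating the FIDE table


def expected_draws(engine: float, human: float, games: int = 10) -> float:
    """Draws the human is expected to get, assuming he never wins.

    The human's expected points are converted to draws at half a point each.
    """
    human_share = 1.0 - NormalDist(0.0, SIGMA).cdf(engine - human)
    return games * human_share / 0.5


for engine in (3016, 3200):
    for human in (2500, 2650, 2800):
        d = expected_draws(engine, human)
        print(f"engine {engine} vs human {human}: about {d:.1f} draws in 10 games")
```

Under these assumptions a 3200 rating leaves a 2650 player less than one draw in ten games, while a rating near 3016 allows about two, which is the point being made here.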

http://www.fide.com/component/handbook/?id=75&view=article
Parent - By M ANSARI (*****) [kw] Date 2009-01-19 08:48
I guess you might be right in the sense that a draw is worth quite a few Elo points. Anyway, it is hard to tell without testing it for real, and unfortunately not many 2600+ GMs would want to go mano a mano against the beast. While I don't know exactly what the rating would be against humans, the result of a match against Rybka 3 Octa with a good book is not in doubt ... the human will get crushed severely.
Parent - By Uri Blass (*****) [il] Date 2009-01-19 07:47
I disagree

Weak engines usually do not have a very well prepared book, and if we are talking about an engine like Hiarcs, then it is clearly stronger than humans.

I believe that top humans can get at least 1 out of 10 against Rybka in a fair match.
Engines usually do not know how to play for a draw.
Humans know better how to play for a draw.
