>What would Rybka's true level be if it were allowed to play in FIDE tournaments like a regular player?
My guess is that with enough tournaments it would settle around 2915.
In my opinion the Elo system gets a little strange for a number-one-rated player who seems to be far ahead of his or her contemporaries in playing strength (at the very end of the bell curve). Playing strengths are moving targets as it is, like electrons in an electron cloud: no exact position, only probabilities.
This consideration can really raise doubts about very high ratings against humans. 3100 would mean scoring 90% against opponents rated 2734! That's hardly imaginable. That is why it makes sense to me that your estimate is below 3000. I guess you had something like 67% vs. Anand or Topalov (~2800) in mind.
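As a rough check of the percentages above, here is a minimal Python sketch that turns a rating gap into an expected score. It shows two curves: the common logistic Elo formula, and a normal curve with standard deviation 2000/7, which is only an assumption about how the FIDE conversion tables are approximated. The function names are made up for illustration.

```python
from statistics import NormalDist


def expected_score_logistic(diff: float) -> float:
    """Expected score for a rating advantage `diff` under the logistic Elo formula."""
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


def expected_score_normal(diff: float, sigma: float = 2000 / 7) -> float:
    """Expected score under a normal curve.

    sigma = 2000/7 is an assumption chosen to approximate the FIDE tables,
    not a value taken from this thread.
    """
    return NormalDist(0.0, sigma).cdf(diff)


# Rating pairs mentioned in the post: (engine estimate, human opponent).
for rating, opponent in [(3100, 2734), (2915, 2800)]:
    diff = rating - opponent
    print(
        f"{rating} vs {opponent} (gap {diff}): "
        f"logistic {expected_score_logistic(diff):.1%}, "
        f"normal {expected_score_normal(diff):.1%}"
    )
```

Under these assumptions both curves land close to the figures quoted above: about 90% for the 366-point gap and about two thirds for the 115-point gap, with the logistic curve slightly lower on the larger gap.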
In computer chess competition, such large Elo differences have always been observed... I don't understand much of it, but I think it must be a problem with the rating formulas. Recalibrating by simply lowering all numbers (as SSDF did once or twice) to make the top ratings seem "more realistic" certainly does not solve the problem, because then the engines and computers at the bottom seem clearly underrated relative to FIDE Elo. This is something that would require mathematicians and statisticians to improve. I don't know if Bayesian Elo, as used by CCRL, is the cure. A calibration is missing anyway, because there are not enough suitable man vs. machine games.
Those GM tournaments in Argentina were interesting in this regard, though. Not the very top, but "medium strong" GMs and IMs played there, and most often a normal retail version of a program was included: no top-secret stuff and no supercomputer. I wonder if someone has collected these results for the purpose of human vs. computer rating calculations. I think there were at least 4 or 5 tournaments (one per year), with a name something like "Magistral...".
>2915 for example would mean that such a player or engine should score 95% against a 2445 IM. How many draws can we expect from an IM, in ten long games against Rybka (e.g. on a quad)? Each draw would make 5%.
An excellent point! It is sort of difficult to imagine our 2445-rated player getting swept that badly. I would sure hate to be this player and have to play such a match, though!
> If anything, Rybka's true rating against humans would be much higher than 3200 ELO.
One problem is that true ratings (in the formal sense) require certain results against opponents of certain strengths. The best humans have ~2800 FIDE Elo. 3200-2800 = 400. If you look at the rating tables(*), you can see that such a performance requires a 92% score against them (and an even higher score against weaker opponents).
If the master scores eight losses and two draws in ten games, it is already 10%-90%. Against a 2650, this gives a performance of 3016. And you cannot really claim that a 2650 GM can't even get two draws from ten games (in other words, two from his five games with White).
It's not only about wins and losses: good masters will get draws. I think Larry Kaufman's estimate is more realistic.
*) Also, the tables show that against 2500s, a 3200 performance is almost impossible: A difference of +700 requires over 99%!
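The performance figures in this post can be reproduced, approximately, by inverting the expected-score curve. The sketch below is only an illustration: it assumes the rating tables are well approximated by a normal curve with standard deviation 2000/7 (an assumption, not something stated in the thread), and it shows the logistic Elo formula alongside for comparison. All function names are hypothetical.

```python
import math
from statistics import NormalDist

# Assumed spread of the normal curve behind the rating tables (not from the thread).
SIGMA = 2000 / 7


def score_fraction(wins: int, draws: int, losses: int) -> float:
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def performance_normal(opponent_rating: float, score: float) -> float:
    """Performance rating implied by `score` (strictly between 0 and 1), normal curve."""
    return opponent_rating + NormalDist(0.0, SIGMA).inv_cdf(score)


def performance_logistic(opponent_rating: float, score: float) -> float:
    """Same idea using the logistic Elo formula, shown for comparison."""
    return opponent_rating + 400.0 * math.log10(score / (1.0 - score))


# Engine's result in the example: eight wins and two draws against a 2650 GM.
engine_score = score_fraction(wins=8, draws=2, losses=0)
print("score:", engine_score)
print("normal curve:  ", round(performance_normal(2650, engine_score)))
print("logistic curve:", round(performance_logistic(2650, engine_score)))

# Even a 99% score against 2500-rated opponents stays below a 3200 performance.
print("99% vs 2500:   ", round(performance_normal(2500, 0.99)))
```

With the normal curve, the 90% score against 2650 comes out at roughly 3016, matching the figure above, while the logistic formula gives a somewhat higher number, around 3030. Either way, two draws in ten games are enough to keep the performance far below 3200.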
Weak engines usually do not have a very well-prepared book, and if we are talking about an engine like Hiarcs, then it is clearly stronger than humans.
I believe that top humans can get at least 1 out of 10 against Rybka in a fair match.
Engines usually do not know how to play for a draw.
Humans know better how to play for a draw.