One of them has only games where both players are rated over 2100 ELO, so I selected those games from the current HugeBase.
Campbell B. (ELO 2273) Vs Finlayson S. (ELO 2112) has a date of 1899.
Slightly more contemporary are games by Botvinnik M. with ratings of 2400 in 1933 and 2385 in 1934.
Does the rating system really extend back that far?
> Does the rating system really extend back that far?
Yes and no, as numbers are seriously inflated nowadays.
Here is everything you need to know: http://www.chessmetrics.com/cm/
If you could resurrect Steinitz in his prime would he be able to score well against the current top 10 players in the world?
The United States Chess Federation has one hand fighting the other on this issue.
They try to fight rating inflation by manipulating K values and bonus points, but also put floors on people's ratings to prevent sandbagging for money tournaments.
These floors guarantee rating inflation in the case of people who actually get worse at chess because of inactivity, advanced age, alcoholism, or whatever else could cause it.
If I play against someone with a floor of 1800 whose actual playing strength is 1600, then my rating gets inflated, because I gain more or lose fewer points than are actually justified.
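To put rough numbers on the floor effect, here is a minimal sketch using the standard logistic Elo expectation. My own rating of 1700 and the K value of 32 are assumptions picked for illustration only; the USCF's actual K and bonus rules are more involved.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard logistic Elo expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def rating_change(rating: float, opponent: float, score: float, k: float = 32.0) -> float:
    """Points gained (or lost) for one game; k=32 is an assumed K value."""
    return k * (score - expected_score(rating, opponent))


my_rating = 1700.0      # assumed for the example
floor_rating = 1800.0   # what the rating list says about the opponent
true_strength = 1600.0  # what the opponent actually plays at

# What I actually receive (computed against the floor) versus what would be
# justified (computed against the opponent's true strength).
win_vs_floor = rating_change(my_rating, floor_rating, 1.0)
win_vs_true = rating_change(my_rating, true_strength, 1.0)
loss_vs_floor = rating_change(my_rating, floor_rating, 0.0)
loss_vs_true = rating_change(my_rating, true_strength, 0.0)

print(f"win:  {win_vs_floor:+.2f} received, {win_vs_true:+.2f} justified")
print(f"loss: {loss_vs_floor:+.2f} received, {loss_vs_true:+.2f} justified")
```

Under these assumptions the surplus is the same whether I win or lose, since it only depends on the gap between the two expected scores.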
The problem with pitting older generations against the newer crop of players is largely fashion. Many quite sound openings go out of fashion. Because of this, it is hard to know who would have the advantage. The style of a Morphy vs the current fashion of a Carlsen or Nakamura? I am sure that it would be a wonderful sight to see... but who would win? Both generations have their own strengths. And I am sure Morphy would like to use the modern tools for match prep... so he would play differently now too.
Also, I remember, years ago, when I played at the local Chess Club... the local favorite used to play rated games (not in any event) against club members so that he could pick up a point or 2. The games were legit, but the local favorite was a master level player and the average club player was around 400 points below. It does not really seem fair to inflate one's rating that way either. In fact, I think there was a player from New Orleans that had his rating "fixed" because of such things.
>The problem with pitting older generations against the newer crop of players is largely fashion. Many quite sound openings go out of fashion. Because of this, it is hard to know who would have the advantage. The style of a Morphy vs the current fashion of a Carlsen or Nakamura? I am sure that it would be a wonderful sight to see... but who would win? Both generations have their own strengths. And I am sure Morphy would like to use the modern tools for match prep... so he would play differently now too.
If older players used their "old fashioned" openings, they would surely get bad results; if they updated their repertoire and prep tools, they'd be different players from those who played in their time. We're talking Elo ratings, not persons, so for a fair comparison, you have to let them play in exactly the same manner they would have, back in the day.
I guess you are saying, in effect, the Elo ratings from before their creation are reconstructed from the quality of the games?? If so, then I would agree with you. But then I would have to disagree with how the ratings were reconstructed (or estimated). Elo ratings have never reflected the quality of play. Rather they are based on "expected" future performance, based on past performance.
In isolation there is no such thing as 2800 level play... nor 2400 play. Such a thing does not exist in a vacuum, but only in relation to one's contemporaries.
However, my background is mathematics... mostly statistics. And Elo is based on probability theory. And I am not sure how that equates to either "quality" or "strength". And not knowing how pre-1970 Elo has been estimated, I am left guessing.
So this discussion is rather pointless. As to this thread, we have come far afield. Personally, I think that assigning low Elo to players pre-1970 is a mistake, and not truly representative of their actual performance... but that is just my opinion, and you know what they say about opinions, everyone has one.
As for "quality" and "strength", the former refers to the moves, while the latter applies to the player.
So generally I consider raw Elo numbers to be misleading - at the very least. Who a player has been playing tells the rest of the story. For example, when playing against players that have very high Elo ratings, a player is likely to see a rating increase, even without playing any better. This has recently happened to me. I have recently started playing against higher-rated opponents and doing about as well as I did before... so my rating has gone up about 60 Elo points over that time period, plus I was able to earn 2 CCM norms. All due to my opposition being better, not my play, which has remained about the same as before (at 72 things no longer improve very fast).
The accuracy of the Elo system (as implemented by FIDE) leaves much to be desired, both in terms of the formula used (see this table) and the conditions as they've come to be over the years (entry rating of 1,000 points?!). That said, it's all we have to sort games in a DB, having "quality" in mind as the sorting criterion.
This same method could be used to more accurately approximate ratings of players from the past.
There was an article in Chess Life a while back that described a method of detecting people who cheat with computers in OTB tournaments for big cash prizes.
A strong computer goes over the game and ranks possible moves according to the usual centipawn evaluations, then compares the player's moves to this list.
A GM will find the best move most of the time, but tends to make second or third best moves as the time control approaches.
Weaker players will routinely play substandard moves.
It is possible to create a graph of the player's moves compared to the computer's move ranking, and these graphs are characteristic of the player's strength.
A blunder would show up as a downward spike in the graph, and so these graphs could also be used to determine the overall quality of play in a game.
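For what it's worth, the comparison step itself is simple once the engine's ranked lists exist. Below is a rough sketch, not the method from the Chess Life article: `match_rates` is a name I made up, and the moves and rankings are invented sample data rather than real engine output.

```python
from typing import List, Sequence


def match_rates(played: Sequence[str],
                ranked: Sequence[Sequence[str]],
                top_n: int = 3) -> List[float]:
    """Fraction of played moves found among the engine's top 1..top_n choices.

    played[i] is the move the human chose in position i; ranked[i] is the
    engine's candidate list for that position, best move first.
    """
    if len(played) != len(ranked):
        raise ValueError("need one ranked candidate list per played move")
    if not played:
        raise ValueError("no moves to compare")
    rates = []
    for n in range(1, top_n + 1):
        hits = sum(1 for move, candidates in zip(played, ranked)
                   if move in candidates[:n])
        rates.append(hits / len(played))
    return rates


# Invented sample data: four positions, three engine candidates each.
played = ["Nf3", "Bb5", "O-O", "h3"]
ranked = [
    ["Nf3", "Nc3", "d4"],
    ["Bc4", "Bb5", "d4"],
    ["O-O", "d3", "c3"],
    ["d4", "Re1", "c3"],
]

print(match_rates(played, ranked))  # [0.5, 0.75, 0.75]
```

The last move matches none of the candidates, which is the kind of miss that would show up as a downward spike once centipawn losses are plotted instead of simple hits.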
If you want to know about a game's quality, the best thing would obviously be to analyse the moves, but that's not very practical when dealing with hundreds of thousands of pre-70's master games. Reconstructing Elo values is way faster (although not a trivial matter).
As for "going beyond the advantage a modern player has in opening preparation", how would you analyse opening moves? Automated backwards game analysis lets you stop at a given move, to avoid wasting time considering opening theory, but in the case of outdated openings, there might very well be a case for actually passing them through the grinder.
The 2 games I am working so hard on are from the WCCC semi-final that I am currently participating in. The games are critical... I hope to at least qualify for another semi-final... it is, of course, very hard to qualify for the Candidates, but all semi-finalists have hopes <g>.
So back to the thread, as a practical matter, many "strong" or "quality" moves simply are no such thing. So I go back to my point, Elo is based on probability theory, not on undefinable things, like "strong" or "quality". Those things, strength and quality, are ephemeral; they are extremely hard to pin down. Take Capa as an example, he was a GREAT chess mind, I think he could hold his own against any modern player... perhaps even without modern prep.
I still think the methodology of comparing moves in a game to moves ranked by a 3400+ strength computer is a sensible way to assign retroactive ELO ratings, but of course it would be time consuming with hundreds of thousands of games. Maybe it would be reasonable to use that method on a subset of the players then generate ELO ratings for the rest of the players based on their results against the subset.
I understand that it is based on probability theory, which actually makes things easier.
You could read and understand Mark Glickman's extensive analysis on the subject, but it really isn't necessary.
There is canned software that will generate performance ratings if a sufficient number of people in a tournament are already rated.
Ozzymandias won't tell us what's under the hood of his retro rating machine, but that's how I would do it.
A simpler way would be to backwards propagate through time the results of players with official ratings against contemporary unrated opponents.
Player A was rated 2500 in 1979 and had a 3:1 win ratio against unrated player B. Therefore we assign player B a 2300 rating, and then consider Player B's performance against player C.
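As a sanity check on that round figure, here is a small sketch that inverts the logistic Elo expectation to get the rating gap implied by a score. It assumes the 3:1 ratio means a 75% score with draws ignored, and `rating_gap` is just a helper name for this example.

```python
import math


def rating_gap(score_fraction: float) -> float:
    """Elo gap implied by a score fraction p (0 < p < 1).

    Inverts p = 1 / (1 + 10 ** (-gap / 400)), the logistic Elo expectation.
    """
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


rating_a = 2500.0
score_a = 3 / (3 + 1)  # 3:1 win ratio, draws ignored (an assumption)
rating_b = rating_a - rating_gap(score_a)

print(f"implied gap: {rating_gap(score_a):.0f} points, player B: {rating_b:.0f}")
```

That comes out at about 191 points, so player B lands near 2309, which is close to the 2300 above. The same function can then be applied to B's results against C.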
Well, that would be pretty time consuming also...
I got around this problem by dumping every game played before 1978 into a database named "Classic Games".
However, I assume it has got to be something that would equate to the win/lose/draw percentages of the particular player and the relative elo of the opponents. Maybe go back in the database to 1900 (or whenever) and assign reasonable ratings to all master level players and just rate the games in the database from that point. Probably easier said than done. But it would give you a result based on performance, i.e. a probability based elo.
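A performance rating along those lines can be sketched in a few lines: find the rating at which the expected total score against the known opponents equals the score actually made. This is only an illustration under the logistic Elo model, not necessarily what any canned rating software or the retro ratings in the database do, and the function names are mine.

```python
from typing import Sequence


def expected_score(rating: float, opponent: float) -> float:
    """Standard logistic Elo expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def performance_rating(opponents: Sequence[float], total_score: float,
                       lo: float = 0.0, hi: float = 4000.0,
                       tol: float = 0.01) -> float:
    """Rating whose expected total score against `opponents` equals total_score.

    Solved by bisection, since the expected total rises with the rating.
    A perfect or zero score has no finite solution, so it is rejected.
    """
    if not 0.0 < total_score < len(opponents):
        raise ValueError("score must be strictly between 0 and the game count")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        expected = sum(expected_score(mid, opp) for opp in opponents)
        if expected < total_score:
            lo = mid  # candidate rating too low to explain the result
        else:
            hi = mid
    return (lo + hi) / 2.0


# Example: 3 points out of 4 against four opponents all rated 2400.
print(round(performance_rating([2400.0] * 4, 3.0)))  # 2591
```

Seeding a handful of master level players with reasonable ratings and repeating this over the database would give the probability based numbers described above, though the result depends heavily on how the seed ratings are chosen.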
I wonder if anyone knows how Elo got started? They surely must have assigned reasonable values to players based on some sort of ranking system. Digging out the historical data required to do that might be tough.
If the tournament director suspects you of cheating they could ask you to play a few 5 minute games against a master, where your true strength would be revealed.
They don't actually need to prove anything. The big money tournaments are run by the CCA, and they can assign ratings for use in their tournaments.
They sent me the list recently and I noticed a player from Mexico who had a 1300 USCF rating and 2100 CCA rating. That list is in addition to the USCF floors.
This thread wasn't really about cheating though, and if you asked those different programs to rank moves from 1 to 10 they would likely contain most of the same moves in the top 2 or 3. The probabilistic nature of the method is the percentage chance that a particular human has to make computer strength moves, and I bet someone who scores 55% compared to Stockfish would have a very similar result compared to Komodo. You could narrow the difference further by comparing to the top 2 or 3 moves instead of only the best.