Rybka Chess Community Forum
Rybka Support & Discussion / Rybka Discussion / "Everything but a Pawn" -- Conclusions
- - By lkaufman (*****) Date 2007-07-09 16:21
     Now that the match is over, I'd like to offer my thoughts on what we learned from this match.

1. Ehlvest played well, at least until time pressure. The final score was exactly the one I voted for in the poll, but Ehlvest in general had much better positions around move 30 or so than I expected him to have.
2. Rybka's performance rating was around 3000. I get this by adding Ehlvest's rating (2629) to the rating differential for the score (191) plus a reasonable estimate for the cumulative value of the handicaps (180). This figure of 3000 is well below the 3112 rating on the CCRL list, but in my opinion such lists overstate the rating differences between computers as compared to how they would rate against humans, by around a 4-3 ratio. So if we assume that 2700 rated computers are properly rated on CCRL, then a 3100 rating there really implies about a 3000 performance against humans, which is what we saw in this match.
3. As for the individual handicaps, the White pieces make an increasing difference as the level goes up. White is worth around 40 points in grandmaster play, but here the average rating of the contestants was even higher than in a World Championship, so probably 50 points is a more accurate figure for White.
4. The three move opening book proved to be less of a handicap than many expected. By choosing lines that are only slightly inferior and not very risky, I was able to keep the opening disadvantages to a minimum, and in some cases to no disadvantage. I would now say that this handicap was only around 40 Elo (and in this particular match even less).
5. The lack of endgame tablebases should have cost Rybka one win, but didn't. Since Rybka lacks certain basic endgame knowledge because Rajlich assumes that it will be used with at least minimal tablebases, this handicap may be more costly than we thought and is probably not an appropriate one for Rybka in general. Call it 20 Elo.
6. The 2-1 time handicap is supposed to be worth 70 Elo or so, but due to pondering is worth somewhat less. Let's say that the reduced hash table size offsets the pondering. There is no way to tell how time affected this match.

     It's clear from this match that strong GMs with the White pieces and ample time have very good drawing chances against Rybka, though whether they can win more than once in a blue moon is questionable. Ehlvest told me that he thinks that strong GMs with the White pieces will always be able to make draws against computers at long time limits if that is their goal -- it's up to us to prove him wrong! I would like to see such draw-odds matches played once we find ways to make Rybka avoid draws effectively, but that goal is some way off, I'm afraid. For now it looks like pawn handicap matches are the best bet, so that the human will have both the means and incentive to play for the win and so that the matches will be genuine tossups.
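
The arithmetic in point 2 above can be checked with a minimal Python sketch. The function names are illustrative only, and the 2700 anchor and 4-3 compression are simply the assumptions stated in the post, not an official rating formula.

```python
# Sketch of the rating arithmetic from point 2 (all names are illustrative).

def performance_rating(opponent_rating, score_differential, handicap_value):
    """Opponent's rating + Elo differential for the score + estimated handicap value."""
    return opponent_rating + score_differential + handicap_value


def human_equivalent(ccrl_rating, anchor=2700, ratio=3 / 4):
    """Compress rating differences above the anchor by the 4-3 ratio
    suggested in the post (4 list points ~ 3 points against humans)."""
    return anchor + (ccrl_rating - anchor) * ratio


print(performance_rating(2629, 191, 180))  # 3000
print(human_equivalent(3100))              # 3000.0
```

Under the same assumptions, the 3112 list rating maps to about 3009, which is consistent with the "about 3000" figure quoted.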
Parent - - By George Tsavdaris (****) Date 2007-07-09 16:48
First thanks for the match and everything else generally!


>2. Rybka's performance rating was around 3000. I get this by adding Ehlvest's rating (2629) to the rating differential for the score (191) plus a reasonable estimate for the cumulative value of the handicaps (180).

Rybka's performance was 2820 and not 3000.
I agree that the handicaps cost roughly the amount of Elo you gave (I believe it is between 120 and 180 Elo).
But Rybka didn't play without handicaps so we can't really speak about its performance without handicaps.

Anyway, we can agree that Rybka's estimated (by you and me at least) performance without handicaps would be 3000.
But Rybka's actual performance with handicaps was 2820....

Nice match.
I wish the following for the future (I know your intentions about future matches, and unfortunately they are not the same as mine):
-NO handicap games at all. I prefer each side to have all its weapons.
-If you have to make a handicap, then no piece handicaps. I hate non-Chess Chess matches. :)
-If you have to make a handicap, then tablebases, hash, and book should always be normal and non-handicapped. I prefer time handicaps.
Parent - - By turbojuice1122 (Gold) Date 2007-07-09 18:03
I very strongly agree about the no piece handicaps.  Fans absolutely don't want to see such matches unless someone from the world top five or so is involved.  If the idea is piece handicaps, you will have quite a lot of trouble getting prize funding.
Parent - By lkaufman (*****) Date 2007-07-09 18:17
The problem with funding depends mainly on the rating of the opponent, not the conditions. Players with Elo 2600 will play for $1000-2000. Players with Elo 2700 will play for $25,000-50,000. Players in the top five will play for $250,000-$500,000. That's why we play pawn handicap with 2600 players rather than just White and time handicap with Anand, for example.
Parent - - By Banned for Life (Gold) Date 2007-07-09 17:22
Conclusion 4 is totally unwarranted. It was clear, and rather surprising, that Ehlvest did nothing (actually less than nothing) to prepare against your three move book. At a minimum, he should have chosen a single line, and made you adapt to it to prevent him from getting the same favorable position more than once. I'm very sure that someone like GK would not have made this mistake.

For the usual reasons, it's very difficult to measure the Elo handicap of a short book against people. It's not that difficult to measure against engines. If you take your quad and 3 move book onto the CB server, and look at your black games, you would see that you end up with a lost position in a significant number of games before your opponent exits book. I suspect that there are probably at least a dozen engine room regulars that, given white, would have easily won positions (easily won for a GM :-)) coming out of book against your 3 move book and Rybka, at least 25% of the time.

Parent - - By lkaufman (*****) Date 2007-07-09 18:22
You may be right, but I'm not sure that's too inconsistent with my estimate of 50 Elo for White and another 40 for the short book, total of 90. You do have a point that the longer the match, the greater the handicap would become. If we had planned a twenty game match, I would have insisted on a four move book. In the actual match, I made a decision not to allow any repeat openings, so I had backup lines ready if he had stuck to one line. It's true that it might have been a bit difficult for Rybka by the sixth try if he had not varied. So let's say that his varied openings were just good sportsmanship, if you like.
Parent - By Banned for Life (Gold) Date 2007-07-09 19:04
No doubt that GM Ehlvest is a class act, but I would have preferred that he do everything possible during this match to throw Rybka under the bus. :-)
Parent - - By Vasik Rajlich (Silver) Date 2007-07-10 13:01
Larry ran experiments with his 3-move book, playing against a variety of engines. The handicap seemed to be in the neighborhood of 50-100 Elo in addition to just being black. I'm not sure if you think that the punishment would be worse in the engine room.

Obviously, trying to quantify this based on just 6 games is really hard to do. Each half-point is worth ~100 Elo of performance rating.
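
To see roughly where the "~100 Elo per half-point" figure comes from, here is a small sketch using the standard logistic Elo model. This model is an assumption on my part (official FIDE tables differ slightly), and `elo_diff` is a name made up for this illustration.

```python
import math


def elo_diff(score, games):
    """Rating differential implied by a match score under the logistic Elo model."""
    p = score / games
    return 400 * math.log10(p / (1 - p))


# Performance differential for several possible results of a 6-game match.
for score in (3.5, 4.0, 4.5, 5.0):
    print(score, round(elo_diff(score, 6)))
# 3.5 58
# 4.0 120
# 4.5 191
# 5.0 280
```

In this range each extra half-point moves the performance rating by roughly 60 to 90 Elo, the same order of magnitude as the estimate above, and the step grows as the score becomes more lopsided.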

Parent - - By Banned for Life (Gold) Date 2007-07-10 20:46
I think the punishment would be much worse in the engine room. With 90+% of the engines in the engine room running Rybka, and many overclocked quads, octals, and more, the only way to get a high percentage win rate is to have book lines that will result in significant advantages when confronting a theory-lacking Rybka. In fact, the best engine room book lines are exactly the lines that use Rybka's natural move selection process to lead her into opening traps. Over 100 engines are working to find these lines 24/7. These lines would be perfect for countering Larry's 3 move book (except that the games in the engine room are generally in the 3-5 minute range, with or without a small increment, and Rybka will make somewhat different choices with a larger speed-time product). The only way these lines could be avoided would be by playing really oddball (and inferior) first moves. If one were to settle on a single opening line, even these could be well countered in this case.

My thesis can be easily tested by running a quad with Rybka and a 3 move book against some of the strong engine room nicks, and looking at the result when your opponent exits book.

Parent - - By Vasik Rajlich (Silver) Date 2007-07-12 08:14
I'm not sure that the engine room guys have tried really hard to refute Larry's lines - in fact, I am not sure that anybody has.

(For some people these statements are synonymous.)

Anyway, your experiment could be tried. I'm not sure there will be use for this book in the future. Larry - if you're reading, what do you think about it?

Parent - By lkaufman (*****) Date 2007-07-12 13:30
Although we have no plans for further matches without handicap now, I would not want to make the book public because we might come across a 2700+ player who would like to try a similar match for modest compensation (not likely but possible). There isn't a great amount of variety in the book, as it was only intended for a six game match. For example, after 1e4 (the most common move in engine room play I believe) I only had one other reply besides the Scandinavian which was played in two games. So for 1e4, those two games constitute half the book already (plus of course how to meet deviations like 3Nf3 or 2Nc3)!
Parent - - By Hetman (*****) Date 2007-07-09 17:27

I agree that GM Ehlvest has played very well and Rybka, too :-). The match was interesting and thanks for organizing it.

I do have a different point of view on some of the handicaps: what counts as a handicap, and who has it.
Let me explain my point of view; I am trying to defend the human.

- Ehlvest had neither an opening book nor endgame tablebases. Under those circumstances the computer program should not have them either, or the human should have access to the same additional data as the program. In that respect it was fine that neither side had access to endgame tablebases.
- Rybka did have an opening book, a short one, but Ehlvest did not, so this was an advantage for Rybka. Here you helped Rybka, and in that part of the game the GM was playing against you and Rybka; it was not a match of Ehlvest against Rybka. This was an important factor because you had analysed Ehlvest's games, so here we have to subtract something from Ehlvest's rating :-)

- Accepting or refusing a draw offer: here once more it was program + operator vs Ehlvest. I think the operator's role should be only to relay the moves and offers; the program should have made the decision by itself. The operator here had a rather good rating :-) and access to the program's analysis. That is another important factor, worth more Elo points ;-).

If those conditions are fulfilled, we could speak about a real match between human and program.
The format of the existing matches, not only this one, is a combination of human against program and human against centaur.
The human's advantages are removed by adding human help, and then the human has to face the program.
So the human has been beaten not by the program itself but with the help of another human.

Parent - - By lkaufman (*****) Date 2007-07-09 18:26
Regarding analysis of Ehlvest's games, he plays a wide variety of openings, so reviewing his games was of little use for me in preparing the book. Regarding draw decisions, the only time it came up I accepted the draw when the score became close to zero and there was no sign that either player would do anything active -- nothing to do with my IM chess skills. We declined it earlier because Rybka still was "happy" and still was pushing pawns.
Parent - - By M ANSARI (*****) Date 2007-07-11 08:44
I have to agree with Alan.  I think if Jaan had gone on the engine room and downloaded all the bookless games, he would have found ways to easily come out of the opening as White with a winning advantage.  Rybka is predictable with identical hardware, and there are some positions which quickly become critical and every line is lost.  It takes a Rybka to find another Rybka's weakness, and I am sure if Jaan had done that he would have had a totally different outcome.  GMs of that level are strong enough that they can convert an obvious winning position quite easily.  It would be like me taking on Kasparov with 2 rooks vs his lone King and trying to mate him (ok ok maybe that won't work so say rook and queen vs. King).  I was really surprised at how well Rybka came out from most openings and I expected the 3 move opening book to play a much bigger role ... I was expecting maybe around a 300 Elo difference for a prepared opponent with pre-established good lines based on thousands of bookless Rybka games.
Parent - - By Uri Blass (*****) Date 2007-07-11 12:03
I think that you overestimate the value of preparation.
If you think that Rybka's moves can be predicted, then you are wrong.
There are some problems:

1)Ehlvest did not know Rybka's choices for the first 3 moves, so he would have needed to prepare more than a hundred different openings to cover the line that Rybka was going to choose (against 1.e4, possible replies are 1...e5, 1...c5, 1...e6, 1...d5 and 1...d6, and even if we assume only 5 choices at every move we get 125 choices).
2)Rybka with 4 processors is not deterministic
3)The version that was playing was different than the commercial version.
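
The count in point 1 can be reproduced with a short sketch. The figure of five candidate replies at each of Black's three book moves is the post's own simplifying assumption, and the integer labels below are placeholders rather than real moves.

```python
from itertools import product

# Assumption from the post: 5 plausible choices at each of 3 book moves.
choices_per_move = 5
book_depth = 3

# Every possible 3-move book line, as a tuple of choice indices.
lines = list(product(range(choices_per_move), repeat=book_depth))

print(len(lines))  # 125
```

The total grows as choices ** depth, so a four-move book under the same assumption would already give 625 lines to prepare for.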

Parent - - By lkaufman (*****) Date 2007-07-11 14:16
There is also the fact that Ehlvest would not have known the running speed of the quad until I announced it just before the match. This is all moot because he does not own a quad. If we had faced an opponent whom I thought might do this sort of detailed preparation, I would have made far more radical changes to the eval for the event to foil this strategy. I could easily make major changes to Rybka's play without weakening her by more than 10 Elo points.
Parent - - By Hetman (*****) Date 2007-07-11 15:30
But if you are changing Rybka during the match, it is once again not a match against Rybka but against Rybka + human :-)
Parent - By lkaufman (*****) Date 2007-07-11 15:53
I don't mean during the match, I mean before the match.
Parent - - By Banned for Life (Gold) Date 2007-07-11 16:30
It would actually be fairly easy for Mr. Ansari to estimate the contribution of his opening book to his quad's Elo relative to a 3 move restricted version. He could do this by making a copy of his book and chopping off everything after the sixth ply. He could then play this book on his quad in the engine room, and see how much it affects his Elo. My guess would be that without specially tailoring the 3 move book to avoid Rybka's weaknesses, the loss would be approximately 150 Elo. By finding opening lines that are less susceptible to opening traps, this loss might be reduced to 120 Elo or so. If he used this restricted book in matches against people who have developed a good anti-Rybka book (by playing tens of thousands of games against Rybka in the engine room), I would expect this difference to be close to 200 Elo. If anyone wants to test this thesis, a short match can be arranged on the CB server with a chopped version of the RybkaII book (or any other proposed 3-move book).
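
The "chop everything after the sixth ply" step could look something like the sketch below. It assumes the book has been exported as plain lists of moves, which is purely a stand-in for illustration; real book files such as .ctg are binary and would need a dedicated tool. The function name `chop_book` and the sample lines are hypothetical.

```python
def chop_book(lines, max_plies=6):
    """Truncate each line to max_plies half-moves and drop duplicate prefixes,
    preserving the original order."""
    seen = set()
    chopped = []
    for line in lines:
        prefix = tuple(line[:max_plies])
        if prefix not in seen:
            seen.add(prefix)
            chopped.append(list(prefix))
    return chopped


# Hypothetical sample lines; a real book would hold many thousands.
full_book = [
    ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6", "Ba4", "Nf6"],
    ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6", "Bxc6", "dxc6"],
    ["d4", "d5", "c4", "e6", "Nc3", "Nf6", "Bg5"],
]

short_book = chop_book(full_book)
print(len(short_book))  # 2
```

The first two sample lines share their first six plies, so they collapse into a single three-move line after chopping.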

Parent - - By Vasik Rajlich (Silver) Date 2007-07-12 08:16
And how about "by finding playable variations which nobody is likely to have analysed"?

Parent - - By Banned for Life (Gold) Date 2007-07-12 08:49
Playing strange variations that are not well analyzed to throw your opponent out of book is a very popular pastime in the engine room. Nelson used to infuriate me by lumping me in with this 1. h3 group. Of course it's much more dangerous to try to use this strategy with the black pieces. In any event, playing black with a three move book against a white opponent always playing the same responses will result in only a few hundred playable variations. A month of engine matches using the randomizer would probably cover them fairly well. Many of the resulting lines would lead back to existing theory. The number of new lines that would have to be developed to form a good spanning set for covering reasonable variations would probably end up being pretty manageable with Larry's opening book handicap.

Parent - - By Vasik Rajlich (Silver) Date 2007-07-14 07:29
I am sure that any specific 3-move variation can be seriously damaged with a month of work. However, you have to do all of this preparation before the match for all possible 3-move variations.

Larry wants to keep his book private, but you can easily perform this experiment. Have somebody make a 3-move book with this philosophy, and then run it on Playchess for 50 games or so against a variety of opponents with strong books.

Parent - - By Banned for Life (Gold) Date 2007-07-15 07:20
Actually, this experiment gets run all the time in the engine room and someone like Nelson might even have statistics on how often a white engine staying in book for say 15-20 moves wins/draws/loses against a black engine in book for 3 moves. I can assure you the statistics would equate to a lot more than 40 Elo even though they would be biased by the fact that the one with the poor book would have a lower Elo to begin with.

I got a good laugh from Larry's desire to keep his book private. If he played it against my book, nothing meaningful would be revealed since it's hard to imagine any top GM playing 1. b3 with money on the table. The question I have is if Larry put together 10 3-move responses to 1. b3, how many would end up with white leaving book with a very large advantage (say a Rybka eval of at least 1 pawn after a long analysis). I suspect out of 10 unique openings, the answer would be 3-5.

Parent - - By Vasik Rajlich (Silver) Date 2007-07-16 10:58

ok, I'll tell you what - I will here make a 3-move book vs 1. b3. Please run a (fair) match between this book and your book, tournament settings, and tell us the results. You don't have to post the games.

Needless to say, you're not allowed to analyze my lines and add to your book before the match starts.

My lines are (against everything):

1) 1. .. Nf6 2. .. e6 3. .. Be7
2) 1. .. Nf6 2. .. b6 3. .. Bb7
3) 1. .. Nf6 2. .. Nc6 3. .. d6
4) 1. .. Nf6 2. .. Nc6 3. .. e6
5) 1. .. Nf6 2. .. g6 3. .. d6

Let's say, 5 games for each of the four "defenses". That will be enough to tell us if 75 Elo is really badly off.

Parent - - By Alkelele (****) Date 2007-07-16 11:01
Alan's book will score 78%!
Parent - - By Vasik Rajlich (Silver) Date 2007-07-16 11:21
That's your second bad prediction Dagh :)

Parent - By Alkelele (****) Date 2007-07-16 11:24
We'll see... :-)
Parent - - By turbojuice1122 (Gold) Date 2007-07-16 16:27
Oh, boy--this will be fun:

1.b3 Nf6 2.e4 XX 3.e5 XX?? 4.exf6! (the fourth move is the first move out of book, but will certainly be played).
Parent - - By turbojuice1122 (Gold) Date 2007-07-16 16:30
Or even better:

1.b3 Nf6 2.Bb2 XX 3.Bxf6 XX?? 4.Bxe7 or Bxd8 or Bb2, depending on which third move was played.  Hmmm...I think the statement "against anything" should be revised... :-)
Parent - - By Alkelele (****) Date 2007-07-16 16:31
Or the following statement should be carefully studied ;-)

> Needless to say, you're not allowed to analyze my lines and add to your book before the match starts.

Parent - By turbojuice1122 (Gold) Date 2007-07-16 16:34
Well, you see, I took that as meaning "adding" after move 3 :-)
Parent - - By Banned for Life (Gold) Date 2007-07-17 03:39

I am running each of your 5 lines against my book with the black continuation after move 3 being decided by Rybka playing after evaluating to, and including complete depth 20. Each game will be stopped when white runs out of book and the resulting position will be presented here. This is rather time consuming on my 219 kn/s machine (1GB hash, default settings except with EGTBs set to normal), but should provide a good indication of how well Rybka can perform with a simple 3-move book and lots of computation time. Preliminary results indicate that Rybka with the three move book is doing well and I am wishing I had decided on depth 19 for Rybka rather than 20 (after seeing black avoid several traps during depth 20 analysis).

Parent - By Vasik Rajlich (Silver) Date 2007-07-18 07:36

ok, excellent.

This still leaves open the question of converting your results to Elo - after all, we did agree that white would score well.

I guess we can try to come up with some formula based on the eval scores at the end of each variation and white's expected time advantage at that point.

Parent - By Banned for Life (Gold) Date 2007-07-19 08:34 Edited 2007-07-19 08:39
Position 1: 1) 1. .. Nf6 2. .. e6 3. .. Be7 occurred in my database a grand total of 15 times with very few games and no analysis after move 4. Resulting positions, with black's moves after 3-move book provided by Rybka 2.32a MP running to (and including) depth 20, were as follows:

rn1q1rk1/p1p3p1/1p2p2p/3p1p1P/1bPPb3/1P1BPN2/PB3PP1/R2Q1K1R w - - 0 13

r2q1rk1/pb1pbppp/1pn1pn2/2p5/5P2/1PN1PN2/PBPPB1PP/R2Q1RK1 w - - 0 9

rnbq1rk1/p1ppbppp/1p2p3/8/2P1nP2/1P2PN2/PB1P2PP/RN1QKB1R w KQ - 0 7

Only clock advantage from this uncommon opening.
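
One quick sanity check on the three positions above is a raw material count taken from the FEN strings. This is a minimal sketch assuming the conventional 1/3/3/5/9 piece values, which are not part of the post; equal material does not by itself rule out a positional edge.

```python
# Conventional piece values (an assumption for this sketch); the king is not counted.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def material_balance(fen):
    """White material minus black material, read from the board field of a FEN."""
    board = fen.split()[0]
    balance = 0
    for ch in board:
        if ch.isalpha():  # digits are empty squares, '/' separates ranks
            value = PIECE_VALUES[ch.lower()]
            balance += value if ch.isupper() else -value
    return balance


positions = [
    "rn1q1rk1/p1p3p1/1p2p2p/3p1p1P/1bPPb3/1P1BPN2/PB3PP1/R2Q1K1R w - - 0 13",
    "r2q1rk1/pb1pbppp/1pn1pn2/2p5/5P2/1PN1PN2/PBPPB1PP/R2Q1RK1 w - - 0 9",
    "rnbq1rk1/p1ppbppp/1p2p3/8/2P1nP2/1P2PN2/PB1P2PP/RN1QKB1R w KQ - 0 7",
]

for fen in positions:
    print(material_balance(fen))  # 0 for each position
```

All three positions come out materially level, which fits the remark that White gained only a clock advantage.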
Parent - - By Banned for Life (Gold) Date 2007-07-19 08:43
2) 1. .. Nf6 2. .. b6 3. .. Bb7

Not much doing here either:

rn3rk1/p2qppbp/1p1p1np1/8/2PN4/1PN3P1/PB2PPKP/R2Q1R2 w - - 0 12

r2q1rk1/pb2ppbp/1pnp1np1/2p5/2P5/1PN1PNP1/PB1P1PBP/R2Q1RK1 w - - 0 10

Once again, there were very few games in the database from this opening with very little analysis.
Parent - - By Vasik Rajlich (Silver) Date 2007-07-20 09:38
Thanks. This is even less of an advantage than I would expect. Probably, the best way to publish a 3-move book is with "best moves", and 1. b3 may simply not be good enough (for this task).

By the way - was your book playing all the way to these positions on the basis of just 15 unanalyzed games?

Parent - - By Banned for Life (Gold) Date 2007-07-20 22:35
The book advantage is based on reaching positions that have been played many, many times before, and taking advantage of the accumulated statistics from those positions working backwards toward the opening position. You can't do that with a rarely played opening.

The tendency of most people (and engines) when confronted with a flank opening is to grab control of the center. Thus the great majority of the lines in my database are based on black making an early grab for control of the center. Interestingly, Rybka at depth 20 never seemed to make this early attempt after its third book move.

All of the white moves came from my book. Not infrequently, a line that has only a few continuations at move 4 will merge with other lines at a later move. The multiple positions represent cases where more than one move was present in my book. If any of these openings were played frequently, they would be analyzed with one or more moves highlighted. The test took longer than I thought because I underestimated the time for black to go through depth 20 early in the game. It's actually surprising in this case that black didn't end up with better positions, given the probability that the moves came from games with analysis at a much shallower depth.

I agree that the best openings for this type of test would be ones that have been very well explored. The major surprise for me was that your openings, designed to be quiet so they could be played against anything, were so much more effective than the openings that I see every day on the CB server (or for that matter in freestyle competitions).

Parent - - By Vasik Rajlich (Silver) Date 2007-07-22 08:43
Hi Alan,

of course, I chose my lines based on frequency of occurrence (in RybkaII.ctg). Obviously, I expected your book to have more data (and it did), but I bet that your analysis is proportioned quite similarly to Jeroen's book.

If we repeated this with 1. e4, white would probably do a bit better, simply because the move is stronger (:)). However, the general trend would be the same - black would find slightly inferior moves which escape big theoretical systems without totally destroying his position, and normal chess would be played. This is exactly what Larry did in preparation against Ehlvest and as you can see from that match, it didn't cause huge problems for Rybka.

On the whole, I think an estimate of 75 Elo is still quite accurate.

Parent - - By Banned for Life (Gold) Date 2007-07-23 01:17
ok, so you chose your lines to prefer sound moves with low game counts in RybkaII assuming that my game counts would be fairly similar. That is probably an optimal strategy for a three move book. On the other hand, if I take the RybkaII book, cut it down to three moves, and play it against my book, Rybka 2.3.2a vs Rybka 2.3.2a, with equal time, the result would likely be far different. As you surmised, my analysis is heavily slanted toward:

1 ... e5     41%
1 ... d5     38%
1 ... Nf6     7%
1 ... c5      6%

So only your last line is represented by hundreds of games in the database.

One problem I have with a 75 Elo number is that in main lines, black generally starts with a significant time deficit. I'm guessing he might on average have used 1/3rd of his time to get through the opening. This wouldn't leave much Elo for lower move quality or prepared traps.

Another problem is that it doesn't square with engine room results, unless the top players' engine room books are actually hurting them rather than helping. Normally in these matches, the relative level of preparation is much more important than the strength of the move, so a weaker move with a big advantage in preparation can be more successful than a stronger move with equal prep. Anyway, a good record against engines running on hardware 3-4 times as fast minus the advantage for being white still seems to be well over 100 Elo.

Parent - - By Vasik Rajlich (Silver) Date 2007-07-24 07:54
Losing a third of your time will cost you around 40 Elo, but in the examples we used here, the actual loss of time would have been much less.
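
The 40 Elo figure lines up with the 70 Elo quoted at the top of the thread for a 2-1 time handicap if one assumes that strength grows linearly with the logarithm of thinking time. That scaling law is an assumption made for this sketch only, not something either poster states, and `elo_cost` is an illustrative name.

```python
import math

# Figure quoted earlier in the thread for a 2-1 time handicap.
ELO_PER_DOUBLING = 70


def elo_cost(time_ratio, per_doubling=ELO_PER_DOUBLING):
    """Elo lost when the opponent has time_ratio times as much thinking time,
    assuming a constant Elo gain per doubling of time."""
    return per_doubling * math.log2(time_ratio)


# Losing a third of the clock leaves 2/3 of the time, a ratio of 1.5 to 1.
print(round(elo_cost(1.5)))  # 41
print(round(elo_cost(2.0)))  # 70
```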

Re. your last paragraph, I just wonder if you're not "mentally cherry-picking". In other words, when you really nail somebody with preparation, it counts, while when you don't, it's because you just didn't prepare enough yet in that line.

Obviously, an inhumanly-good book would be incredibly useful, worth hundreds of Elo.

Parent - - By Banned for Life (Gold) Date 2007-07-24 08:46
I'm definitely cherry picking by spending preparation time in direct proportion to the number of times a line is played. Obviously this maximizes the probability that the preparation will be useful in nailing somebody. Not coincidentally, median out-of-book evaluation descends in the same order as opening percentage. This seems like a near-optimal strategy to me. What would you do differently?
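
Preparation time "in direct proportion to the number of times a line is played" can be sketched as a simple proportional split. The percentages are the ones listed earlier in the thread for replies to 1. b3; the 100-hour budget and the `allocate` helper are hypothetical values chosen for illustration.

```python
def allocate(budget_hours, frequencies):
    """Split a preparation budget across lines in proportion to how often each is played."""
    total = sum(frequencies.values())
    return {line: budget_hours * freq / total for line, freq in frequencies.items()}


# Reply frequencies (percent) as listed earlier in the thread.
reply_frequency = {"1...e5": 41, "1...d5": 38, "1...Nf6": 7, "1...c5": 6}

plan = allocate(100, reply_frequency)  # 100 hours is a made-up budget
for line, hours in plan.items():
    print(line, round(hours, 1))
```

With this split the most common reply gets nearly seven times the attention of the rarest one listed, which is why rarely played lines end up thinly prepared.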

Parent - By Vasik Rajlich (Silver) Date 2007-07-26 08:47
Of course, this is exactly what you should do.

I am talking about cherry picking when you come up with your estimate of 100+ Elo. In other words, you only count the games where your preparation was actually very good. The value of good deep preparation will be well over 75 Elo.

Parent - - By Uri Blass (*****) Date 2007-07-19 09:25
Note that without the condition forbidding me from adding moves to the book, White can win easily.

My lines against your book are the following:

1.b3 Nf6 2.Bb2 e6 3.Bxf6 Be7 4.Bxg7 Rg8 5.Bb2
1.b3 Nf6 2.Bb2 b6 3.Bxf6 Bb7 4.Bb2
1.b3 Nf6 2.Bb2 Nc6 3.Bxf6 d6 4.Bb2
1.b3 Nf6 2.Bb2 Nc6 3.Bxf6 e6 4.Bxd8
1.b3 Nf6 2.Bb2 g6 3.Bxf6 d6 4.Bxh8

Parent - By Banned for Life (Gold) Date 2007-07-19 18:20
This would be the trivial solution :-)

Parent - By Vasik Rajlich (Silver) Date 2007-07-20 09:40
You really have a book where white plays 1. b3 Nf6 2. Bb2 e6 3. Bxf6 ?!

Parent - By Dadi Jonsson (Silver) Date 2007-07-23 07:59 Edited 2007-07-23 08:13
Here is an article about the match by Andy Soltis: Computer at odds with booked-up foe.

Edit: I noticed that it is also at Susan Polgar's blog with reader comments: Rybka vs Ehlvest