Rybka Chess Community Forum
Up Topic The Rybka Lounge / Computer Chess / What is the drawrate of a spacebar?
- - By Alkelele (****) Date 2014-12-15 16:12
How hard is it to beat a spacebar?

Let's say you enter a computer in a tournament with the best correspondence players in the world. The computer uses a thoroughly worked opening book and then gets 1 hour or 24 hours per move.

We give the computer Black in all games. How many draws would it get out of 100 games? 10%? 90%? What is the guess of the correspondence experts in the forum?

I don't have any idea, and I would like to hear your thoughts.
Parent - By Dragon Mist (****) Date 2014-12-15 18:54
85-90%
Parent - - By turbojuice1122 (Gold) Date 2014-12-16 11:38
I would guess that one could get a reasonable estimate by looking at the draw rate of current top corr games.  If this rate is D_0, I would think that the draw rate D of the spacebar would be, depending on the opponent, something like
(D_0 + (1-D_0)/2) > D > (D_0 - (1-D_0)/2) )
I am basing this on the idea that the opponents who take the least risks would produce draw rates roughly halfway between those players' draw rates against each other and 100%, while the strongest opponents would shift the draw rate away from the normal corr draw rate by the same amount, but in the other direction.  Anyway, I'm sure you could probably improve on this estimate greatly. :wink:
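For anyone who wants to plug numbers into the bracket above, here is a minimal sketch in Python. The function name and the sample value of D_0 are illustrative assumptions only, not measured corr statistics, and the clamp at zero is an added safeguard rather than part of the estimate.

```python
def draw_rate_bounds(d0: float) -> tuple[float, float]:
    """Return (lower, upper) bounds on the spacebar's draw rate D.

    d0 is the baseline draw rate among top corr players, as a fraction in [0, 1].
    The half-width (1 - d0) / 2 is half the gap between d0 and 100%.
    """
    if not 0.0 <= d0 <= 1.0:
        raise ValueError("d0 must be a fraction between 0 and 1")
    half_gap = (1.0 - d0) / 2.0
    # Clamp at 0: an added safeguard for small d0, not part of the estimate.
    lower = max(0.0, d0 - half_gap)
    upper = d0 + half_gap
    return lower, upper


if __name__ == "__main__":
    # 0.80 is a hypothetical baseline chosen only to show the arithmetic.
    low, high = draw_rate_bounds(0.80)
    print(f"{low:.2f} < D < {high:.2f}")
```

Under this reading, the width of the bracket is always 1 - D_0, so the estimate tightens as the baseline draw rate climbs.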
Parent - - By Alkelele (****) Date 2014-12-16 16:15
I think a bracket is missing; at any rate, I do not easily understand it. :red:

Does it make much of a difference if the opponents know they play against a spacebar?
Parent - By turbojuice1122 (Gold) Date 2014-12-19 15:03
Naturally, I assumed no knowledge of opponents, so if the opponents know they're playing against a spacebar, this would probably change things--losses would be more rare (as the spacebar would often be "spacebarred"), wins would be more frequent, though I don't know whether the draw rate would change.

Yes, I had an extra bracket, though if Dagh is having trouble understanding, I obviously didn't do a very good job in explaining. :sad:  (Or, I'm just wrong.)
Parent - - By Carl Bicknell (*****) Date 2014-12-16 17:51

> How hard is it to beat a spacebar?


What's a spacebar?
:confused:
Parent - - By Banned for Life (Gold) Date 2014-12-16 18:49
If the thesis that the spacebar approach was near optimal was correct, the problem would reduce to having a good opening book, the latest top chess engines, and strong hardware. It definitely has not come to that yet, and it is still possible in CC to play cheesy openings that exit 'normal' theory very quickly in what top engines would consider a balanced position, and get good results, even with mediocre hardware (or at least this was true a year ago).

IMHO, the problem with the spacebar approach is that there are many cases where the engine will have a number of moves with nearly equal evaluation which result in very different games. And while the engine will decide between these moves with a similar one dimensional evaluation number in a random manner, a person can evaluate the different positions based on the characteristics of the resulting game. With engines still having issues properly evaluating in certain areas, e.g. king safety, kingside pawn storms, long term positional play, and closed positions, there is still the potential for value added when the time frame is long enough, generally four days per move for cc, to target areas where engines aren't that strong. This seems to hold even when using weaker hardware.

This is not too different than the situation where you won the PAL freestyle event with inferior hardware a number of years ago, although with stronger engines and hardware, the human factor is certainly less critical than it was back then.

But I'm pretty sure that Uri would disagree with my argument, since he has been claiming for years that the spacebar approach is nearly optimal...
Parent - - By Alkelele (****) Date 2014-12-16 20:02
OK, let's say we ask you to play White with a cheesy opening like 1. b3, pretending that we thus exit "normal" theory right away (similar to 1. e4 e5 2. h3 or whatever).

The Black spacebar will play without opening book. It is an unknown engine of strength similar to Komodo and Stockfish (with similar strengths and weaknesses as current engines) and uses strong hardware and 24 hours per move.

You get 100$ for a win, nothing for a draw or loss.

How much would you pay to play this game?
Parent - - By Banned for Life (Gold) Date 2014-12-16 20:49
Even 1.b3 has way too much associated theory. I used 1.b3 2.d4 in one round of an IECG event, without really looking at the implications. It's a really horrible opening sequence! :lol:

Let's make some assumptions:

0. The human player plays white.
1. Both sides are using equal hardware.
2. The human player has a copy of each of the top engines. In reality, this would quickly allow figuring out which engine spacebar was using, which would provide the significant advantage of allowing a priori prediction of moves in most cases. I guess you could assume use of an unknown engine developed specifically for this event, or in the past, rental of a one of a kind engine (e.g. the Rybka cluster, although this would violate the equal hardware requirement). In any event, the focus of this game would be on playing the opponent, rather than playing objectively the best moves.
3. The human player has nothing else to do for the duration of the game, other than to eat, sleep, and shit. In the interest of playing the best chess, fornication would be postponed until after the match. Attention to personal hygiene would be optional.

Under these conditions, I would guess that the expected win-draw-loss percentage would be something like 30%-65%-5%.

Betting on the game isn't as straightforward as you seem to believe because you need to take into account opportunity cost and risk aversion. Nobody in their right mind would pay anything to play a game that could go on for three months, for $100 (that's why it makes little sense to have a prize fund for cc events, and I get a good chuckle thinking about the ~$5 an hour that was earned by the winners of the last Infinity Chess freestyle event if you don't count Nelson's preparation time). This problem could be addressed by increasing the stakes to a million dollars, but risk aversion would still be a major factor in determining the size of the bet. The takeaway is that these things aren't really done based on expectations. That said, I would pony up $100K to play this match for a prize of $1M if I won. That would provide a large enough expected return to make the time required for the match a worthwhile investment, without being so high that I would have to listen to my wife complain about a possible loss of the investment for the rest of my life...
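The bare expectation behind that bet can be sketched as follows. This is a rough illustration only: it assumes the 30% win estimate from above, assumes the stake is paid up front and lost whatever the result, and deliberately ignores the opportunity cost and risk aversion just described. The function name is made up for the example.

```python
def expected_net_return(p_win: float, prize: float, stake: float) -> float:
    """Expected profit of paying `stake` to play for `prize`.

    Assumes the prize is paid only for a win and the stake is never refunded.
    """
    if not 0.0 <= p_win <= 1.0:
        raise ValueError("p_win must be a probability between 0 and 1")
    return p_win * prize - stake


if __name__ == "__main__":
    # Figures from the post: $100K stake, $1M prize, guessed 30% win rate.
    net = expected_net_return(0.30, 1_000_000, 100_000)
    print(f"expected net return: ${net:,.0f}")
    # Break-even stake is simply p_win * prize.
    print(f"break-even stake: ${0.30 * 1_000_000:,.0f}")
```

The gap between the break-even stake and the stake actually offered is one way to read how much weight risk aversion and three months of time carry in the decision.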
Parent - - By Alkelele (****) Date 2014-12-16 21:10
OK, to keep opportunity cost and risk aversion from clouding the issue, let's assume that we make you play this game, and then erase your memory. We then offer to sell you a ticket that pays 100$ if you won the game.

Are you saying that such a ticket is worth about 30$?

In other words, can we hope to win about 1 out of 3 games against a spacebar?

What about the importance of the opening book?

What would be the result against no opening book?
What would be the result against an opening book consisting of "currently known critical lines" analysed in a mini-maxed IDEA tree? (Exactly how to construct the IDEA tree can be argued about, the main idea here and now is to just "teach" the spacebar about the known danger spots in White's normal ambitious lines, forcing White to use more cheesy lines.)
Parent - - By Banned for Life (Gold) Date 2014-12-16 22:36
OK, neglecting opportunity cost and risk aversion, I do believe that expecting a 30% win rate against the spacebar without an opening book is very reasonable, based on the following advantages:

1) There are a limited number of top engines, so the play of the spacebar can be accurately predicted in most cases by testing with the same engine.
2) Many openings are very problematic for unassisted engines. I have always liked the Sicilian Dragon and am constantly amazed at how poorly current engines perform when looking at the position from the root. Performance in some KIA and KID openings is even worse. Can a person be clever enough to get an engine without a book to play into one of these types of lines? Maybe. This could lead to an early blowout. Being able to predict the engine's moves makes this a viable option.
3) Engines still make ugly, positionally indefensible moves because they don't have a rule against them, and can't see a problem in their search. These moves don't disappear at long time controls. Exploiting these moves may be difficult in a freestyle game with only a few minutes per move. With 24 hours per move, the consequences should be more severe.
4) Due to the many 'enhancements' to alpha-beta search, it's not at all uncommon to have a position that evaluates at one score at the root, but a very different score when you advance a few moves along the PV. A person will always check his plan by moving along the PV and ensuring that things are as expected. Engines don't do this for whatever reason (probably because it's not optimal with a pure alpha-beta search).

An opening book with full coverage of "currently known critical lines analysed in a mini-maxed IDEA tree" would be a tougher animal. The opening is obviously far and away the most studied part of the game, and there are a lot of openings that engines in general don't play well, and if you know the exact engine, you can test it against all of these critical lines and find out which ones it plays worst against.

If the engine has a book, it's almost a requirement to force it out of book as early as possible. Since the most likely outcome is a draw, we are treating draws as equivalent to losses, and we are happy winning a third of the time, we really want to enter into really complicated, really murky positions, with most of the pieces still on board. Ideally, these positions will be likely to end in a decisive manner and will break one way or the other in a manner difficult to predict (especially from the root).
Parent - - By Alkelele (****) Date 2014-12-16 23:26
Thanks for these replies, it's very interesting.

Let's say you wanna play 1 e4.

The spacebar has a book based on:

1 e4 e5 2 Nf3 Nc6 3. Bb5 Nf6
                         3. Bc4 Bc5

The Berlin and the Italian game, thoroughly booked. It also has normal defenses against the Scotch and King's Gambit etc.

Where do you want to take the spacebar out of book? (For instance, you could claim that there must be slow lines in the Italian game around move 10-15-20 where there are so many equally good options for both sides that it's impossible to book much further, or you could suggest some offbeat moves like 3. a3 or whatever).
Parent - - By Banned for Life (Gold) Date 2014-12-17 07:42
:lol: 1.e4 is bad for me. I can do 1.b3, 1.c4, 1.d4, 1.g3 or 1.Nf3. As an aside, even diehard 1.e4 players like Eros, the Prince of Darkness, have noted that 1.e4 is so well booked that it's getting hard to win with it. Non-forcing openings like 1.b3 or 1.g3 offer a lot more degrees of freedom for both sides, and are therefore more conducive to steering the game into a very playable position in virgin territory very early on.
Parent - - By Alkelele (****) Date 2014-12-17 16:22
OK, then let's say we take a book based on current theory, like a very strong Nelson or Eros book, and give this to the spacebar.

We ask you to play against the spacebar for your life. You get half a year of endless resources etc. to decide on your first move.

What would you play? (and how do you follow up?)

I know this is difficult to answer in less than half a year, but let's get some qualified guess and reasoning. I want to know where current spacebars are most vulnerable!?
Parent - - By Banned for Life (Gold) Date 2014-12-17 17:44
My actions would be driven by the following three goals:

1) Get out of book as early as possible, with a closed but not blocked, roughly equal, and non-drawish position (which is possible for white). I suspect the best way to do this would be with a flank opening that gives a wide margin to all playchess theory.
2) Figure out which engine is being used for spacebar to allow accurate prediction of its moves.
3) Test top engines for vulnerabilities, especially in king safety issues. For example, SF likes to play g3 (without a fianchettoed bishop), and doesn't respect an opponent's half-open h-file with a rook on it.

Of course my opening database is a million miles behind those of Nelson or Eros, but using Convekta's Opening Encyclopedia as a poor substitute, I seem to be able to get to a sparse point in the book pretty early by fianchettoing both bishops and combining this with preparations for a knight on d2 or e2. This cedes the center to black, which I'm ok with, and allows castling on either wing, with a preference for cross castling to increase the probability of a decisive game, and also to take advantage of known issues that engines have with pawn storms.

The biggest question in my mind is how feasible it is to achieve goal number 1. If it is possible to achieve this goal, with knowledge of the opposing engine (and there are only a few top-engines so this wouldn't be that difficult), I think it should be possible to achieve a 30% win rate if one doesn't care if the alternative is a draw or a loss (this is obviously important because it allows for much riskier play).
Parent - - By Nelson Hernandez (Gold) Date 2014-12-18 13:39
It ought to be easy to get out of book quickly.  Top human players very often exit my book around moves 11-14 and it wouldn't be all that hard to shave 2-3 moves off of that if you played an offbeat variation.  And that's exiting book completely.  Before you get to that point you're in the twilight zone where the statistics aren't all that convincing, especially if you have played a line dominated by human players, not engines. 

In short, with a little work you can make anybody's book irrelevant in six moves, maybe less, without your position losing viability.  My rule of thumb is that it takes something like an order of magnitude of more data to advance your coverage one move further out reliably.  Of course, I am talking here about empirical results alone.  If you combine that with evaluative data it's another story.
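To put that rule of thumb in numbers, here is a small sketch. It assumes "an order of magnitude" means a flat factor of 10 per extra move, and the starting game count is a hypothetical placeholder, not a figure from any real book.

```python
def games_needed(base_games: int, extra_moves: int, factor: int = 10) -> int:
    """Games needed to extend reliable book coverage by `extra_moves` moves.

    Assumes each additional move of depth costs `factor` times more data,
    which is one literal reading of the rule of thumb.
    """
    if base_games < 0 or extra_moves < 0:
        raise ValueError("base_games and extra_moves must be non-negative")
    return base_games * factor ** extra_moves


if __name__ == "__main__":
    # 1_000 is a hypothetical starting sample, chosen only for illustration.
    for extra in range(4):
        print(extra, games_needed(1_000, extra))
```

The exponential growth is why an offbeat move that shaves even two or three moves off the book's reach is so cheap for the side leaving theory and so expensive for the side trying to cover it.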
Parent - - By Alkelele (****) Date 2014-12-18 16:54
I think my query comes down to guessing at the cost of having to avoid already exhausted opening theory.

What is our expected winning rate in the two scenarios:

1) We play against a spacebar that has no opening book.

2) We play against a spacebar that has no opening book, however, we (or black) have to play a novelty/rare move before move 15. (If no novelty is found, we cancel the game and start over.)

A second question:

Let's say our strategy is to play a g3 + b3 system and a knight to d2 or e2, a pawn on d3 or e3, slow stuff. But now, the spacebar is told about this and its operators spend a month preparing against 1.g3.

How much would our winning rate suffer?

If our winning rate suffers a lot, do we have other good alternatives to the g3 + b3 system? How many? When will we run out?

I agree that we can practically always dodge any book with at least a viable position. But will it still be a WINNABLE position?

Some examples I think about:

1) We play 1 e4 e5 2 d3. We have a perfectly viable position, but I would not be happy to bet on winning with white after black probably replies 2... d5 and has a straightforward edge (but probably quickly analysed to 0.00).

2) The spacebar happens to like the Najdorf, and now we surprise it with the rare move 6 h3 (it used to be quite rare). Here I think we would have decent winning chances if the spacebar is on its own. How many truly "critical" semi-novelties are there left against a truly worked opening system? (I don't think there are any left against the Berlin Ruy Lopez, for instance.)

3) Something in the middle between 1 and 2: We don't get an edge, but we also don't give black an edge. We can think of slow lines of the Italian game, for instance.
Parent - - By Nelson Hernandez (Gold) Date 2014-12-18 18:10
This all seems like good stuff for a doctoral thesis.  I can't answer your questions.

I think the Carlsen strategy of playing dry positions and combining that with relentlessly accurate play applies to engines to a large extent.  Viable positions are all you can hope for in an opening; if you get more it's because your opponent didn't play optimally.  Whether you win or not depends on your opponent making mistakes which are subsequently capitalized upon.  No mistakes, no winning chances.  (At least this is the case 99.99% of the time.  There are very rare instances where you can't completely figure out where a player went wrong and you conclude the mistake must have come very early in the game and that a long series of subsequent moves could not have been improved upon.)

A possibly fruitful avenue of analysis might be a study of draw-rates, which might be a proxy for "exhausted opening theory", though maybe not.  If you were to study ECOs individually or well-traveled positions within an ECO, for instance, and somehow normalize the draw-rates for average Elo (lower average Elos produce fewer draws, as do bigger variances in opponent Elos), you might find lines that are remarkably below-average on an Elo-adjusted basis while not unduly sacrificing corresponding success-rates.
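One way such a normalization could be sketched is below. This is only an assumption about how one might do it: it pools games into Elo bands, uses each band's overall draw rate as the expected rate, and ignores the variance in opponent Elos mentioned above. The function name, the band width, and the toy records are all hypothetical.

```python
from collections import defaultdict


def elo_adjusted_draw_excess(games, band_width=100):
    """Return {eco: observed draw rate minus Elo-expected draw rate}.

    `games` is an iterable of (eco, avg_elo, is_draw) tuples. The expected
    rate for a game is the pooled draw rate of its Elo band across all ECOs.
    """
    games = list(games)

    # Pass 1: pooled draw rate per Elo band.
    band_totals = defaultdict(lambda: [0, 0])  # band -> [draws, games]
    for _eco, avg_elo, is_draw in games:
        band = int(avg_elo) // band_width
        band_totals[band][0] += int(is_draw)
        band_totals[band][1] += 1

    # Pass 2: observed versus expected draws per ECO.
    eco_totals = defaultdict(lambda: [0.0, 0.0, 0])  # eco -> [obs, exp, n]
    for eco, avg_elo, is_draw in games:
        draws, total = band_totals[int(avg_elo) // band_width]
        entry = eco_totals[eco]
        entry[0] += int(is_draw)
        entry[1] += draws / total
        entry[2] += 1

    return {eco: (obs - exp) / n for eco, (obs, exp, n) in eco_totals.items()}


if __name__ == "__main__":
    # Toy records invented for illustration; they are not real game data.
    toy = [("A01", 2450, False), ("A01", 2450, False),
           ("C65", 2450, True), ("C65", 2450, True)]
    print(elo_adjusted_draw_excess(toy))
```

A strongly negative value would flag a line that draws less often than its players' strength suggests; checking that it does not also score badly is a separate step not covered here.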
Parent - - By Banned for Life (Gold) Date 2014-12-19 05:54
I think your analysis is great, but it leaves out the fact that lines that seem bulletproof with today's engines may be shown to be deeply flawed with future engines. My history is a case in point. As you know, there was a time ten+ years ago when I would dare people to respond 1... e5 to 1.b3, and was so successful that even Eros started responding with 1...d5, rather than investing the time to navigate the minefield. Those were the good old days! Now, even second tier engines at shallow depths don't fall for my old tricks! :surprised:

Of course it's always tempting to believe that the current crop of engines are nearly perfect, but actually they are far from it, and I'm sure that future engines will make the current best available engine look as bad as Crafty playing against the current Stockfish. The lines that will work ten years from now against today's lines would work just as well today if we could find them...
Parent - By Nelson Hernandez (Gold) Date 2014-12-19 10:54
All very true, yet there is some correlation between a given position's "truth" (in the tablebase sense) and its empirical outcomes and evaluation on various engines which lack tablebase certainty.  If a position is objectively drawn, any distance from 50% success / 100% draws on the one hand, and from 0.00 on the other, is error.  Likewise with objectively won positions: any distance from a mate score is error.  What is the size of the error, though?  If an objectively won position shows 65% success and +0.80 it may be off the mark but it is at least directionally correct.  As engines improve the degree of error has been reduced bit by bit.  We're far from perfection but we get better.

The point of the analysis I proposed is to identify lines that have greater or lesser draw-rates than you would expect given the players involved.  Based on that you might identify openings where existing human/engine theory is more misaligned with evaluation scores.
Parent - - By Carl Bicknell (*****) Date 2014-12-19 11:45

> and I'm sure that future engines will make the current best available engine look as bad as Crafty playing against the current Stockfish.


Hmm...
If current engines are somewhere around 3200 or so then perfect chess is edging towards 4000 elo.
I've heard people argue it may be somewhat lower than that.
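For a sense of what an 800-point gap would mean, here is a sketch using the standard logistic Elo expectation formula. The 3200 and 4000 figures are just the guesses from this post, and whether the Elo model extrapolates sensibly to a perfect player is itself an assumption.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # 3200 vs 4000 are the rough guesses quoted above, not measured ratings.
    score = expected_score(3200, 4000)
    print(f"expected score for the 3200 engine: {score:.4f}")
```

Under the model, an 800-point deficit gives an expected score of 1/101, roughly one point per hundred games, which could be one win or two draws.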
Parent - - By Banned for Life (Gold) Date 2014-12-20 23:51
I'm usually almost suicidal watching my engine (Stockfish) playing openings that require positional knowledge, so it's hard for me to believe that perfect chess is right around the corner!

If you want to see this yourself, play the following run-of-the-mill Sicilian Dragon and let black start playing at move 8. You'll be amazed how quickly black's game falls apart!

1.e4 c5 2.Nf3 Nc6 3.d4 cxd4 4.Nxd4 g6 5.Nc3 Bg7 6.Be3 Nf6 7.Bc4 O-O 8.Bb3

Black's play from this position is much closer to 1400 Elo than to 3100...
Parent - By MarshallArts (***) Date 2014-12-21 00:02
Right on, Alan -- so many folks forget that most of the strength of these engines comes from their tremendous tactical vision.

:neutral: :neutral: :neutral:
Parent - - By Alkelele (****) Date 2014-12-21 02:15
I guess you have played the Dragon for many years in corr chess now?

How easy/hard is it for you to get a draw now?
Parent - - By Banned for Life (Gold) Date 2014-12-21 02:40
I've played it many times, but it's been three or four years since I last played it in correspondence, mainly because I fear that my opponent will go into the Maróczy Bind instead. In the past, the Maróczy Bind made it very difficult to get decent engine assistance, because black plays with a significant space disadvantage. But newer engines are sometimes OK with this. For example, in the past, engines played the black side of the Benoni very poorly for the same reason, and this no longer seems to be true. As an aside, with the Benoni, you sometimes end up with a wall of locked pawns on the queenside. Engines still aren't smart enough to realize that it's futile to put sliding pieces behind this. They also don't realize that it would make a good place for the black king to hang out.

Of course white has a much easier game with the dragon, and I have personally screwed up the move sequence in a freestyle final against Eros a few years ago, but engines frequently screw up from the white side as well. One thing that happens with great frequency is after white pushes the g and h pawns and black plays h5, white pushes the g pawn to g5 ending up with pawns on h4 and g5 locked with black's pawns on h5 and g6, effectively locking up the kingside and killing off white's attack. I've seen Komodo 8 do this in recent engine-engine games. I'm not sure that lots of pondering time would change this behavior.
Parent - - By turbojuice1122 (Gold) Date 2014-12-21 04:26
I always get very delighted when I have the white pieces and my opponent plays the Dragon. :lol:  Even Magnus Carlsen has quite a lot of trouble with the black pieces in this opening.
Parent - By Banned for Life (Gold) Date 2014-12-21 06:51
The amazing thing to me is how little help a strong engine provides black during the first 15 or so moves. On lots of moves, the engine will go to high depths and say everything is fine, then realize just a ply or two further into the line that all is lost...
Parent - - By turbojuice1122 (Gold) Date 2014-12-20 20:56

> I'm sure that future engines will make the current best available engine look as bad as Crafty playing against the current Stockfish.


But keep in mind that just as with scientific theories, improvements over time bring things closer and closer to the truth.  I think it will take a bit longer than the time between Rybka 1.0 Beta and current Stockfish to make current Stockfish look so silly, at least at long time controls.  In fact, I would be more surprised if it happened in the same time frame than if it never happened at all.  The difference between chess and scientific theories is that in chess, we know what perfection will look like (all games are draws), while in the case of scientific theories, we really don't necessarily know in some fields what perfection will look like.
Parent - - By Banned for Life (Gold) Date 2014-12-20 21:15 Edited 2014-12-20 21:18
> But keep in mind that just as with scientific theories, improvements over time bring things closer and closer to the truth.

This is not always true. Sometimes the truth has to be modified to bring it closer and closer to the scientific theory. Take for example the best funded of the sciences, climate science. There is now a 35 year record of temperatures from satellites, and a ten year record of temperatures from ocean buoys. This data is ignored in the formulation of models which do not conserve energy and not only cannot predict future events well, but amazingly can't even predict past events. This will continue as long as the funding continues.

Of course financial perversion has never been a problem for chess! :lol: In computer chess now we have two top engines, one searches really well, and the other has a much better evaluation. Since the better search is open source, it seems likely that its ideas will eventually get copied into the engine with a better evaluation to produce a clear number one (actually this has probably been occurring for some time now).

Of course as we approach 'perfect' chess, there can be no improvement with additional time per move. I'm not sure we've seen this so far, at least in cases where books weren't used. When we get to the point when ten hours per move is no better than one hour per move, the end will be near...
Parent - - By Alkelele (****) Date 2014-12-20 21:54

> When we get to the point when ten hours per move is no better than one hour per move, the end will be near...


We may only solve the opening position. There are still zillions of other positions that are more difficult ;-)
Parent - By Banned for Life (Gold) Date 2014-12-20 23:38
Very true! There are still a lot of positions that today's engines can't even begin to fathom, even near the standard opening position. That brings up another requirement for an engine that plays perfect chess, i.e. that it be able to start from any position without a set of instructions for staying out of trouble (i.e. an opening book). As you point out, being able to always draw from the opening position does not imply that the engine could always hold a win or a draw from all other positions...
Parent - - By turbojuice1122 (Gold) Date 2014-12-21 04:33
While there are certainly exceptions to my statement (in the near- and mid-term contexts), I don't think that the example of anthropogenic global warming is one of them because I don't think it counts as a proper scientific theory.  Proper scientific theories should be falsifiable, i.e. making bold predictions that, directly, could conceivably prove the theory incorrect or incomplete.  In the case of AGW, however, whenever such predictions are made, they're either (a) so far into the future that we cannot really test them, or (b) they end up being wrong and require various augmenting hypotheses that themselves prove untestable.  Of course, this by itself doesn't make AGW wrong--it just makes it not a very good scientific theory.  I would guess that the AGW components that actually would count as valid scientific theories tend to be much more restrained in their predictions than those about which we frequently hear in the media.

Anyway, it might turn out that your statement about future engines making current Stockfish look as bad as it currently makes Crafty might turn out to be true in various critical positions, e.g. those relevant to the points shared between you and Dagh above.
Parent - By Banned for Life (Gold) Date 2014-12-21 07:02
> While there are certainly exceptions to my statement (in the near- and mid-term contexts), I don't think that the example of anthropogenic global warming is one of them because I don't think it counts as a proper scientific theory.  Proper scientific theories should be falsifiable, i.e. making bold predictions that, directly, could conceivably prove the theory incorrect or incomplete.  In the case of AGW, however, whenever such predictions are made, they're either (a) so far into the future that we cannot really test them, or (b) they end up being wrong and require various augmenting hypotheses that themselves prove untestable.  Of course, this by itself doesn't make AGW wrong--it just makes it not a very good scientific theory.  I would guess that the AGW components that actually would count as valid scientific theories tend to be much more restrained in their predictions than those about which we frequently hear in the media.

I agree with your statement in its entirety, but because AGW is probably getting more funding than everything else put together, I would be surprised if it doesn't become the model for other scientific endeavors, real or imaginary. For some things it fits quite well, e.g. looking for near Earth orbit objects that could potentially hit the Earth and cause problems. In other areas it would be a real stretch, but the key is to get the media to write lots of stories for low information voters, then get politicians involved to 'save the Earth'. If there is enough money involved, the UN might even get involved with a cast of thousands, to collect a management fee. Extra bonus points if remediation requires redistribution of income from more to less developed countries. The UN is very interested in managing this type of effort, and will help with the science to keep it all going! :lol: