Yes, books can have a positive or negative effect and basically a very old book is not going to help any engine at all :-). BTW, are you ready to go for the next Freestyle with team Cato the Younger!? Good luck!
Best wishes, Jeroen
WBEC tournaments should be mentioned too IMHO.
This one could (should ?) be the place where engines compete _with_ their best general book IMHO.
Marc
On the one hand, I think it would be great to test books somehow and give book authors a concrete measuring stick for their work.
On the other hand, testing an older book vs a newer book seems like a pointless exercise for obvious reasons.
Maybe some sort of creative scheme could be invented for this.
Vas
"On the one hand, I think it would be great to test books somehow and give book authors a concrete measuring stick for their work."
The major part of the book making process is testing so I do not doubt that good book authors have good testing schemes at home.
I do not think that they need "external" evaluations.
"On the other hand, testing an older book vs a newer book seems like a pointless exercise for obvious reasons."
What I propose is something similar to Leo's WBEC: everybody knows far in advance that the tournament will begin on a given date. Each engine author is invited to enter the tournament with the book of his choice. Once the tournament has begun, no book or engine change is allowed.
With this rule, combined with the large number of participants, one could hope that one-engine-tailored or one-book-tailored books would not succeed.
Altogether, this would give engines with unbalanced qualities a chance to perform at their best. By "unbalanced" I mean engines that perform very well in some kinds of positions and much worse in others. If such engines exist, then the way engines are commonly tested, with generic books or fixed start positions, can be a severe handicap for them: in human chess we do not ask that every grandmaster be able to play every existing opening. Some play the French, some never do.
Marc
In practice, you'd probably find some guys who ignore the deadline and then claim anyway that their book is the best.
Vas
Shaun
Suppose that you have 2 engines:
1) Engine A, which understands a wider variety of lines but does not understand the specific lines that you play.
2) Engine B, which understands a smaller variety of lines but understands the specific lines that you play.
Engine B is going to be better for your correspondence games but Engine A is going to be better based on CEGT lists.
I think the rating lists are boring because they always put Rybka at number 1. It is not true that Rybka is better than all other engines in every opening, so the lists simply do not give useful information.
I think it would be better to also have a rating list whose goal is to produce a different leader through a smart choice of openings.
A possible way to do this is to take the openings in which Rybka lost in CEGT or CCRL and use these openings as the basis for a rating list in which the programs play both white and black.
Hopefully Rybka is not going to lead this list, and it would be interesting to see which engine does.
I think correspondence players might be interested in buying the leader of the list I suggest in order to have a second opinion. Today they do not know which engine would lead such a list, so they do not know which second engine is the best one to buy.
Uri
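For anyone who wants to try Uri's idea on their own game collection, here is a minimal sketch of the selection step. The game records, the tuple layout and the function names are all made up for illustration; real CEGT or CCRL data would have to be parsed from PGN first, and a serious list would compute Elo rather than plain score percentages.

```python
from collections import defaultdict

# Hypothetical game records: (white, black, eco, result).
# These are invented for illustration and are not real CEGT/CCRL games.
GAMES = [
    ("Rybka", "Hiarcs", "B90", "0-1"),
    ("Hiarcs", "Rybka", "B90", "1/2-1/2"),
    ("Rybka", "Ktulu", "C11", "1-0"),
    ("Ktulu", "Rybka", "D43", "1-0"),
    ("Hiarcs", "Ktulu", "D43", "1-0"),
    ("Ktulu", "Hiarcs", "D43", "1/2-1/2"),
    ("Hiarcs", "Ktulu", "C11", "1-0"),
]

# Points scored by White for each possible result string.
WHITE_POINTS = {"1-0": 1.0, "1/2-1/2": 0.5, "0-1": 0.0}


def openings_lost_by(games, engine):
    """Return the set of ECO codes in which `engine` lost at least one game."""
    lost = set()
    for white, black, eco, result in games:
        lost_as_white = white == engine and result == "0-1"
        lost_as_black = black == engine and result == "1-0"
        if lost_as_white or lost_as_black:
            lost.add(eco)
    return lost


def score_table(games, ecos):
    """Score fraction per engine, counting only games played in `ecos`."""
    points = defaultdict(float)
    played = defaultdict(int)
    for white, black, eco, result in games:
        if eco not in ecos:
            continue
        w = WHITE_POINTS[result]
        points[white] += w
        points[black] += 1.0 - w
        played[white] += 1
        played[black] += 1
    return {engine: points[engine] / played[engine] for engine in played}


selected = openings_lost_by(GAMES, "Rybka")
table = score_table(GAMES, selected)
for engine, score in sorted(table.items(), key=lambda kv: -kv[1]):
    print(f"{engine}: {score:.0%}")
```

Note that this only restricts the list to the chosen openings. To follow the suggestion fully, each opening would also have to be replayed with every engine taking both colours.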
Suppose that you have 2 engines:
1) Engine A, which understands a wider variety of lines but does not understand the specific lines that you play.
2) Engine B, which understands a smaller variety of lines but understands the specific lines that you play.
Engine B is going to be better for your correspondence games
That is a good point and I agree.
I think it would be better to also have a rating list whose goal is to produce a different leader through a smart choice of openings.
I very much liked the CSS rating list, which had 10 different opening positions. In a way this gives you information about which lines the engine prefers and which ones it plays relatively badly. Still, I think it takes quite some analysis and a lot of games with several engines to get a real picture here.
BTW, Erik Roggenburg also publishes a ratinglist, based on 30 predefined opening lines. IMO these can be helpful to select an engine that plays your lines well.
Kind regards, Jeroen
To make it even more complicated, I think it is very valuable in each correspondence game to have more than 1 engine verdict in every position. Perhaps in 9 out of 10 positions (just a guess, I don't know the exact number) the strongest engine gives the strongest output, but in that 1 position it could be very beneficial to follow the path of another engine.
Regards, Jeroen
> Perhaps in 9 out of 10 positions (just a guess, I don't know the exact number) the strongest engine gives the strongest output, but in that 1 position it could be very beneficial to follow the path of another engine.
It could also benefit from the surprise factor, especially if your opponent is only using Rybka and has most of Rybka's top suggestions covered very deeply, but using Ktulu you find a very playable line that the opponent is not expecting. It's also good for beating the opponent psychologically, as he may be shocked that most of his analysis is now useless and he has to begin from scratch :)
> If a given position is objectively won, lost or drawn, the best engine is the one that most closely approximates 0.00 or black/white mate. Given that, if Rybka rates a drawn position as +0.10 and another engine rates it -0.09, the other engine produced a "better" analysis. Likewise if a position is objectively won and another engine evaluates +0.01 more highly than Rybka, that engine was "better".
I have a different opinion here. If a position is objectively won, I want the engine which suggests the objectively best move to seal the win. (If lost, the move that puts up the toughest defense.) If the position is won, and another engine rates the position +0.01 more highly than Rybka, but does not suggest an equally good move, I'll stick with Rybka. The difference in numerical evaluation may be due to greater emphasis on material rather than positional concerns, or the decision that a bishop is worth 3.1 rather than 3 pawns (I won't list all the hypotheticals here). The place where I would worry about the numerical evaluation is when one engine says the position is won, and another says it is lost. Somehow in my mind this discrepancy is different from an argument over +2.40 and +2.39, but maybe I need to review special relativity and convince myself that 0.00 is not a special number as a reference point.
ty
0.00 is clearly a special number, because 0 means the same thing for all programs, whereas other numbers have different meanings for different programs.
Uri
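A small sketch of why 0.00 behaves differently from every other score. It assumes, purely for illustration, that each engine's evaluation can be turned into an expected score with a logistic curve, and that engines differ only in a scale constant. The function, the engine names and the scale values are all hypothetical; no real engine publishes such a constant.

```python
def expected_score(eval_pawns, scale):
    """Map an evaluation in pawns to an expected score between 0 and 1.

    `scale` is a hypothetical per-engine constant: the number of pawns of
    advantage that corresponds to 10-to-1 odds. It is an assumption made
    for this illustration only.
    """
    return 1.0 / (1.0 + 10.0 ** (-eval_pawns / scale))


# Made-up scale constants for two imaginary engines.
ENGINE_SCALES = {"engine_a": 4.0, "engine_b": 2.5}

for name, scale in ENGINE_SCALES.items():
    scores = [round(expected_score(e, scale), 3) for e in (0.0, 0.5, 2.4)]
    print(name, scores)
```

Whatever the scale, an evaluation of 0.00 maps to an expected score of exactly one half, while the same +2.40 means something different for each engine. Under this assumption, comparing +2.40 with +2.39 across engines says little, but a disagreement about the sign does say something.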
I tried looking at a rating list based on some of these where there were more games, e.g. B90; however, there are still too few games and the rating list was almost meaningless.
If we wanted to show ratings by opening or type of opening, are there any existing groupings we could use based on the ECO code or the opening?
I.e. has anyone gone through these and characterised them?
I would have thought anything more than about 8 groups would result in too few games...
Always looking for ways to present additional information
Shaun
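To put a rough number on the "too few games" problem, here is a sketch that converts a score into an Elo difference with the usual logistic Elo formula and shows how the uncertainty grows as the same games are split into more groups. Grouping by the ECO volume letter (A to E) is only one possible coarse grouping, and the helper names, the 55% score and the game counts are assumptions for illustration.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_interval(score, n_games, z=1.96):
    """Rough Elo interval: shift the score by z standard errors, then convert.

    Uses the binomial standard error, which ignores draws and therefore
    somewhat overstates the spread. The score is clamped to avoid log(0).
    """
    se = math.sqrt(score * (1.0 - score) / n_games)
    low = max(score - z * se, 0.01)
    high = min(score + z * se, 0.99)
    return elo_diff(low), elo_diff(high)


def eco_group(eco):
    """Coarsest possible grouping: the ECO volume letter, e.g. 'B90' -> 'B'."""
    return eco[0]


# Hypothetical: 1000 games at a 55% score, split evenly into more and more groups.
TOTAL_GAMES = 1000
for groups in (1, 5, 8, 25):
    per_group = TOTAL_GAMES // groups
    low, high = elo_interval(0.55, per_group)
    print(f"{groups:>2} groups, {per_group:>4} games each: "
          f"{low:+.0f} to {high:+.0f} Elo")
```

The interval widens quickly as the groups get smaller, which fits the suspicion that much more than a handful of groups leaves each one with too few games to rank engines reliably.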
My opinion anyhow
Wayne
PS: no one cares more for Rybka than I :)
Does that say something about the Hiarcs strength, or about the book strength? Will Hiarcs 12 be close to Rybka 2.3.2a if
Rybka plays with a more recent book? As RybkaII is about 16 months old now, it is not too difficult to make a very strong book
against it. So what are you saying? They killed RybkaII and now suddenly this is caused by 'Hiarcs being almost as strong as
Rybka'?
No. I say that there are apparently some lines that Hiarcs naturally plays quite poorly, and so it needs a book tuned toward its strengths more than most of the other engines, including Rybka. Remember that what I am talking about is with Hiarcs 11, and the Hiarcs 11 book--this was released over a year ago, so that 16 months of yours is irrelevant.
Can you perhaps post the games of the Hiarcs-Rybka match played by Mark, so we can see how many Hiarcs 12 wins were
because of the book?
I don't have those--maybe they are somewhere, but I haven't searched--not much time recently.
CEGT plays with a general book that every program has to use. This means equal conditions, no possibilities for book wins,
just measure engine strength. For all programs the same conditions. The 'extra elo due to book wins' disappears and that gives
a much better view on the engine strength than with books.
Of course, you can be happy believing this if you want, but there are LOTS and LOTS of book wins in the CEGT games. I have seen no evidence that there are fewer book wins in these games than in normal chess. Also, a "book win" for one engine will not always be a "book win" for another engine.
Given the proper conditions (i.e. best book for Hiarcs and
a worse book for the opponent) you are inclined to think that Hiarcs 12 is close to Rybka.
By your own admission above, you recognize that this is not my inclination, i.e. I am using the book released for Rybka, i.e. Rybka II, and the book released for Hiarcs 11. Of course, the Hiarcs 12 book is going to be even better, but I wasn't taking that into account. These estimates are basically for the Hiarcs 11 book, released not too long after the Rybka II book.
Rybka was playing with my latest and private openingbook.
that in Leiden Rybka outbooked Hiarcs convincingly and had a winning position shortly after the opening.
So far we had 4 encounters Hiarcs-Rybka with a Rybka at 'full strength' and the score is 3-1 in Rybka's favour.
>This 32-18 "result" is not a result of testing the programs at playing a game of chess.
And it's about what? Backgammon? :-)
I mean, please explain this in more detail:
>It is a result of testing analysis.
>Again, Hiarcs loses at least 60 elo relative to other programs when one makes the change from own books, which is what you have in match conditions, to random "general" books.
What are the data/proof for that?
> This 32-18... is a result of testing analysis.
So in your opinion Rybka is that much stronger as an analysis engine?
Once you said:
"Fritz basically spotted Kramnik perhaps a 200 elo point advantage with that opening book show and everything, since it allowed Kramnik basically to prepare much deeper than the Fritz book and come out of the openings with a decent advantage."
How much advantage do you estimate that Rybka spotted the Hiarcs team (and hence Hiarcs 12) considering that RybkaII is about 16 months old now and Rybka 2.3.2a was released in the middle of last year? How much do you estimate this may have contributed to the result of Hiarcs-Rybka match performed by the Hiarcs team? Can you estimate that without seeing the games? Do you know if they are available somewhere?
Kramnik was preparing for a specific opponent, and thus could try to leave book fairly early and have very deep memorized variations based on knowledge of precisely what Fritz would do. However, Hiarcs 12 must also play against other opponents, and consistently trying to do this could easily backfire. Also, book learning was enabled, so this wouldn't work for more than a few games.
In any case, I have no doubt that Rybka 3 will be a lot stronger than Hiarcs 12, no matter whether Rybka II or Rybka III is used as the book.
> I don't put much store in the results of the Hiarcs team
OK.
to enter this bet :-)
will like it when it comes out :-).
Don't get me wrong: I was quite skeptical before I downloaded the book, as I had always considered traditional data bases, etc. much better sources regarding opening information. But then I found some very interesting lines in your book which I had not seen anywhere else (okay, maybe this is just because I'm usually quite ignorant).
In general, there are often holes in lines which are a bit off-beat, though not necessarily unsound. Especially when the variation is not recommended as a tournament line for either side. And there are some main lines left out, apparently just because they haven't been played in top-level tournaments (or in engine games) recently.
Instead, the book contains some rather bizarre lines up to move 30 or so. Just look at the abundance of variations after 1.h3. This seems to be some crazy online engine speciality, but I certainly won't spend a second preparing for something like this in my own games.
One last remark: Some tournament lines appear quite risky to me. I played approximately ten games against Rybka, and won one, but I cannot really be proud of it: The book just happened to contain a line in the Anti-Moscow which is just lost. The upshot seems to be that you included some very tricky lines which had just recently been introduced into practice. (I'm not sure whether this last point even is a criticism.)
time available you can imagine that the first priority goes to the main lines. These have to be as OK as
possible. But that means a lot of analysis, testing and so on. Furthermore, I want to present not only the
known games, but also novelties and ideas of my own. They also cost time.... In the end I must do my best
to get this part ready and alas, that means no more time to cover the offbeat lines.
Perhaps it is an idea to make a special book just on offbeat lines, I think that might be more interesting than
incorporating it in the official book. Also the audience for such a book would be different, compared with the
normal book.
To give you an idea: for the past 3 months I have been working around 2 hours every weekday evening and full
time on almost all weekends, and still I have the feeling it is not enough for all I want to do :-)
Well, at least I did some analysis on the offbeat 1.d4 d5 2.c4 e5 line (because Moro is playing this) and the
Moscow is much much better covered now than in RybkaII. I'll take a look at the lines you mentioned.
Kind regards, Jeroen
