Rybka Chess Community Forum
Rybka Support & Discussion / Rybka Discussion / Hiarcs 12 worse than Rybka232a ..
Parent - - By Jeroen (*****) [nl] Date 2008-03-31 15:23
Hi Nelson,

Yes, books can have a positive or negative effect and basically a very old book is not going to help any engine at all :-). BTW, are you ready to go for the next Freestyle with team Cato the Younger!? Good luck!

Best wishes, Jeroen
Parent - By Nelson Hernandez (Silver) [us] Date 2008-03-31 17:56
Don't worry, we'll be ready. 
Parent - - By Nelson Hernandez (Silver) [us] Date 2008-03-31 18:00
Something just occurred to me.  If you want to squash everyone like a bug consistently you guys should issue a Rybka3a book, 3b to follow a month later, 3c a month after that, etc.  That might mess up the rating organizations but it would also frustrate pretenders to the throne.
Parent - - By Jeroen (*****) [nl] Date 2008-03-31 18:39
Yeah, that might be an interesting idea. On the other hand, the only official rating list for computer programs that allows own books, the SSDF, only accepts products 'out of the box'. So book upgrades would not count for the rating list.
Parent - - By Nelson Hernandez (Silver) [us] Date 2008-03-31 18:51
Most would agree SSDF has had its day and the ones that really count now are CCRL and CEGT on the basis of volume, credible hardware, good testing methodologies and the excellent results-transparency of the respective groups.  If a competitor cited SSDF as proof of their superiority while ignoring the others that would be shameful. 
Parent - - By Jeroen (*****) [nl] Date 2008-03-31 19:10
CEGT and CCRL do not test with the engines' own books...
Parent - - By Nelson Hernandez (Silver) [us] Date 2008-03-31 19:52
So in other words no rating sites would recognize book updates.  In that case, strike my earlier suggestion and make Rybka3.ctg as good as it can be.
Parent - By Jeroen (*****) [nl] Date 2008-04-01 19:26
I am working on it :-)
Parent - - By Marc Lacrosse (**) [be] Date 2008-03-31 20:09
I agree on CCRL and CEGT.

WBEC tournaments should be mentioned too IMHO.
This one could (should ?) be the place where engines compete _with_ their best general book IMHO.

Marc
Parent - - By Vasik Rajlich (Silver) [hu] Date 2008-04-01 13:17
Book testing is an interesting topic.

On the one hand, I think it would be great to test books somehow and give book authors a concrete measuring stick for their work.

On the other hand, testing an older book vs a newer book seems like a pointless exercise for obvious reasons.

Maybe some sort of creative scheme could be invented for this.

Vas
Parent - - By Marc Lacrosse (**) [be] Date 2008-04-01 13:56
"Book testing is an interesting topic."
"On the one hand, I think it would be great to test books somehow and give book authors a concrete measuring stick for their work."

The major part of the book making process is testing so I do not doubt that good book authors have good testing schemes at home.
I do not think that they need "external" evaluations.

"On the other hand, testing an older book vs a newer book seems like a pointless exercise for obvious reasons."

What I propose is something similar to Leo's WBEC: everybody knows far in advance that the tournament will begin on a given date. Each engine author is invited to enter the tournament with the book of his choice. Once the tournament has begun, no book or engine change is allowed anymore.

With this rule, combined with the large number of participants, one could hope that one-engine-tailored or one-book-tailored books would not succeed.

Altogether this would give a chance to engines with unbalanced qualities to perform at their best. By "unbalanced" I mean engines that perform very well in some kinds of positions and much worse in others. If such engines do exist, it's true that the way engines are commonly tested, with generic books or fixed start positions, can be a severe handicap for them: in human chess we do not ask that every grandmaster be able to play any existing opening. Some do play the French, some never do.

Marc
Parent - By Vasik Rajlich (Silver) [hu] Date 2008-04-06 15:06
This idea would work if all of the teams could be sold on it.

In practice, you'd probably find some guys who ignore the deadline and then claim anyway that their book is the best.

Vas
Parent - By Nelson Hernandez (Silver) [us] Date 2008-04-01 19:06
It has BEEN invented.  But it is proprietary.  :)
Parent - - By billyraybar (***) [us] Date 2008-04-01 22:36
I believe the ultimate goal for a book author is to create a book that stands the test of time -- i.e. the lines are good regardless of how strong the engine or player is that uses the book.  Obviously this is difficult to do.  I wish Suj would chime in on this subject.  He has some interesting views with respect to opening books.
Parent - - By turbojuice1122 (Gold) [us] Date 2008-04-01 22:39
But sometimes uses the wrong ones--recall the Freestyle before last and the book loss in the Poisoned Pawn Sicilian.
Parent - By Nelson Hernandez (Silver) [us] Date 2008-04-02 11:26
In his defense, there is often a world of difference between ideas and practical results.  I've never run across a book that never followed a bad line, if left unattended. 
Parent - By billyraybar (***) [us] Date 2008-04-03 13:16
Yes, the Poisoned pawn loss.  The sensational 24. Bc7!!
Parent - By Shaun Brewer (****) [gb] Date 2008-04-01 15:05
I for one am looking forward to the next SSDF update...

Shaun
Parent - By Werewolf (*****) [gb] Date 2008-04-02 12:27
i like this idea
Parent - - By Uri Blass (*****) [il] Date 2008-03-30 21:20
The number of lines that the engine understands is not always the important thing.

Suppose that you have 2 engines:
1) Engine A, which understands a wider variety of lines but does not understand the specific lines that you play.
2) Engine B, which understands a smaller variety of lines but understands the specific lines that you play.

Engine B is going to be better for your correspondence games but Engine A is going to be better based on CEGT lists.
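
The Engine A / Engine B point can be illustrated with a small sketch. It assumes each engine has some expected score per opening and that a rating list samples openings evenly while a correspondence player does not. The per-opening scores, the opening mix and the helper name `weighted_score` are all invented for illustration; none of them come from CEGT or any real list.

```python
# Hypothetical per-opening expected scores (0..1) for two engines; the
# numbers are invented purely to illustrate the argument.
ENGINE_A = {"Najdorf": 0.45, "French": 0.60, "Slav": 0.62, "KID": 0.61}
ENGINE_B = {"Najdorf": 0.65, "French": 0.50, "Slav": 0.48, "KID": 0.49}


def weighted_score(scores, weights):
    """Average the per-opening scores, weighted by how often each opening is played."""
    total = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total


# A rating list samples openings broadly; a correspondence player does not.
rating_list_mix = {"Najdorf": 1, "French": 1, "Slav": 1, "KID": 1}
my_repertoire = {"Najdorf": 9, "French": 1, "Slav": 0, "KID": 0}

for label, mix in (("rating list", rating_list_mix), ("my repertoire", my_repertoire)):
    a = weighted_score(ENGINE_A, mix)
    b = weighted_score(ENGINE_B, mix)
    print(f"{label}: A={a:.3f} B={b:.3f}")
```

With these made-up numbers Engine A comes out ahead on the even mix and Engine B ahead on the narrow repertoire, which is the situation described above.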

I think that the rating lists are boring because they always put Rybka at number 1, and it is not correct that Rybka is better than all engines in every opening, so the lists simply do not give useful information.

I think that it would be better to also have a rating list where the target is to have a different leader through a smart choice of openings.

A possible way to do it is to take openings where Rybka lost in CEGT or CCRL and use these openings as the basis for a rating list in which programs play both white and black.
Hopefully Rybka is not going to lead this list, and it may be interesting to see which engine does.

I think that correspondence players may be interested in buying the leader of the list that I suggest, to have another opinion. Today they do not know which engine is going to lead this list, so they do not know which second engine is best to buy.

Uri
Parent - By Jeroen (*****) [nl] Date 2008-03-31 15:28
Hi Uri,

> Suppose that you have 2 engines:
> 1) Engine A, which understands a wider variety of lines but does not understand the specific lines that you play.
> 2) Engine B, which understands a smaller variety of lines but understands the specific lines that you play.
>
> Engine B is going to be better for your correspondence games


That is a good point and I agree.

> I think that it would be better to also have a rating list where the target is to have a different leader through a smart choice of openings.

I very much liked the CSS rating list, which had 10 different opening positions. In a way this gives you information about which lines the engine prefers and which ones it plays relatively badly. Still, I think it takes quite some analysis and playing a lot of games with several engines to get a real picture here.

BTW, Erik Roggenburg also publishes a ratinglist, based on 30 predefined opening lines. IMO these can be helpful to select an engine that plays your lines well.

Kind regards, Jeroen
Parent - - By Jeroen (*****) [nl] Date 2008-03-31 15:55
Hi Uri,

To make it even more complicated, I think it is very valuable in each correspondence game to have more than 1 engine verdict in every position. Perhaps in 9 out of 10 positions (just a guess, I don't know the exact number) the strongest engine gives the strongest output, but in that 1 position it could be very beneficial to follow the path of another engine.

Regards, Jeroen
Parent - By Uly (Gold) [mx] Date 2008-04-01 01:19

> Perhaps in 9 out of 10 positions (just a guess, I don't know the exact number) the strongest engine gives the strongest output, but in that 1 position it could be very beneficial to follow the path of another engine.


It could also benefit from the surprise factor, especially if your opponent is only using Rybka and has most of Rybka's top suggestions covered very deeply, but using Ktulu you find a very playable line that the opponent is not expecting. It's also good for beating the opponent psychologically, as he may be shocked that now most of his analysis isn't useful and he has to begin from scratch :)
Parent - - By Nelson Hernandez (Silver) [us] Date 2008-04-02 15:32
That's an interesting comment, as it isn't what we've heard from others on the Rybka team before.  Personally I think 9/10 is way too optimistic.  But the real problem is how you define "best".  If a given position is objectively won, lost or drawn, the best engine is the one that most closely approximates 0.00 or black/white mate.  Given that, if Rybka rates a drawn position as +0.10 and another engine rates it -0.09, the other engine produced a "better" analysis.  Likewise if a position is objectively won and another engine evaluates +0.01 more highly than Rybka, that engine was "better".  However there's a problem with this approach: engines all use slightly different measuring sticks, though in fact on a pure basis the numbers should scale perfectly from zero to infinity (i.e. mate).
Parent - - By Ty Nance (**) [us] Date 2008-04-02 17:04 Edited 2008-04-02 17:07

> If a given position is objectively won, lost or drawn, the best engine is the one that most closely approximates 0.00 or black/white mate. Given that, if Rybka rates a drawn position as +0.10 and another engine rates it -0.09, the other engine produced a "better" analysis. Likewise if a position is objectively won and another engine evaluates +0.01 more highly than Rybka, that engine was "better".


I have a different opinion here. If a position is objectively won, I want the engine which suggests the objectively best move to seal the win. (If lost, the move that puts up the toughest defense.) If the position is won, and another engine rates the position +0.01 more highly than Rybka, but does not suggest an equally good move, I'll stick with Rybka. The difference in numerical evaluation may be due to greater emphasis on material rather than positional concerns, or the decision that a bishop is 3.1 rather than 3 pawns (I won't list all the hypotheticals here). The place where I would worry about the numerical evaluation is when one engine says the position is won, and another says it is lost. Somehow in my mind this discrepancy is different than an argument over +2.40 and +2.39, but maybe I need to review special relativity and convince myself that 0.00 is not a special number as a reference point.

ty
Parent - By Uri Blass (*****) [il] Date 2008-04-02 21:17
You are clearly right. Otherwise people could improve the analysis of programs in some of the positions simply by multiplying the evaluation by 2, and that is not logical.

0.00 is clearly a special number because 0 has the same meaning for all programs, whereas other numbers have different meanings for different programs.
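
The scaling argument can be sketched in a few lines of Python. This is only an illustration under simple assumptions: an engine picks the move with the highest evaluation, and "rescaling" means multiplying every evaluation by the same positive constant. The move names, numbers and the helpers `best_move` and `rescale` are hypothetical.

```python
def best_move(evals):
    """Return the move with the highest evaluation (in pawns, for the side to move)."""
    return max(evals, key=evals.get)


def rescale(evals, factor):
    """Multiply every evaluation by the same positive constant."""
    return {move: score * factor for move, score in evals.items()}


# Hypothetical evaluations for three candidate moves; the move names and
# numbers are invented for illustration.
evals = {"Nf3": 0.35, "d4": 0.20, "h3": 0.00}
doubled = rescale(evals, 2)

print(best_move(evals), best_move(doubled))  # same move chosen either way
print(evals["h3"], doubled["h3"])            # a 0.00 evaluation is unchanged by scaling
```

Doubling the numbers changes how far each evaluation sits from 0.00 but not which move is played, so a "closest to the true value" measure would reward or punish an engine for its choice of scale alone. Only the 0.00 evaluation survives rescaling unchanged.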

Uri
Parent - By Shaun Brewer (****) [gb] Date 2008-04-02 16:52
Hi Uri,

I tried looking at a rating list based on some of these, where there were more games e.g. B90 however there are still too few games and the rating list was almost meaningless.

If we wanted to show ratings by opening or type of opening, are there any existing groupings we could use based on the ECO code or the opening?

I.e. has anyone gone through these and characterised them?

I would have thought anything more than about 8 groups would result in too few games...
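
One ready-made grouping is the ECO volume letter: ECO codes run from A00 to E99, so taking the first character gives five groups, which stays under the limit of about 8 mentioned above. A minimal sketch follows, assuming each game is reduced to an ECO code and a score for the engine under test. The game records, the `min_games` cutoff and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical game records: (ECO code, score for the engine under test),
# where 1 = win, 0.5 = draw, 0 = loss. The data is invented.
games = [("B90", 1), ("B90", 0.5), ("B33", 0.5), ("C11", 0),
         ("D45", 1), ("E97", 0.5), ("A30", 0.5), ("C42", 0.5)]


def group_by_volume(records):
    """Group results by ECO volume letter (A-E), the first character of the code."""
    groups = defaultdict(list)
    for eco, score in records:
        groups[eco[0]].append(score)
    return groups


def summarize(groups, min_games=2):
    """Return {volume: (game count, score fraction)}, skipping groups below min_games."""
    return {vol: (len(scores), sum(scores) / len(scores))
            for vol, scores in sorted(groups.items()) if len(scores) >= min_games}


print(summarize(group_by_volume(games)))
```

The `min_games` filter is there to drop groups with too few games to say anything; a real list would need a far higher threshold than 2.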

Always looking for ways to present additional information

Shaun
Parent - By Wayne Lowrance (***) Date 2008-03-30 20:01
When you compare the strength of programs from a buyer's point of view, the book is part of the program, therefore you must not discount it.
My opinion anyhow.
Wayne
PS: no one cares for Rybka more than I :)
Parent - By turbojuice1122 (Gold) [us] Date 2008-03-30 20:36
> So basically you say: the latest Hiarcs book is very good. And it has been well tested against RybkaII, Perfect 13 and so on. Does that say something about the Hiarcs strength, or about the book strength? Will Hiarcs 12 be close to Rybka 2.3.2a if Rybka plays with a more recent book? As RybkaII is about 16 months old now, it is not too difficult to make a very strong book against it. So what are you saying? They killed RybkaII and now suddenly this is caused by 'Hiarcs being almost equally strong to Rybka'?


No.  I say that there are apparently some lines that Hiarcs naturally plays quite poorly, and so it needs a book tuned toward its strengths more than most of the other engines, including Rybka.  Remember that what I am talking about is with Hiarcs 11, and the Hiarcs 11 book--this was released over a year ago, so that 16 months of yours is irrelevant.

> Can you perhaps post the games of the Hiarcs-Rybka match played by Mark, so we can see how many Hiarcs 12 wins were because of the book?


I don't have those--maybe they are somewhere, but I haven't searched--not much time recently.

> CEGT plays with a general book that every program has to use. This means equal conditions, no possibilities for book wins, just measure engine strength. For all programs the same conditions. The 'extra elo due to book wins' disappears and that gives a much better view on the engine strength than with books.


Of course, you can be happy believing this if you want, but there are LOTS and LOTS of book wins in the CEGT games.  I have seen no evidence that there are fewer book wins in these games than in normal chess.  Also, a "book win" for one engine will not always be a "book win" for another engine.

> Given the proper conditions (i.e. best book for Hiarcs and a worse book for the opponent) you are inclined to think that Hiarcs 12 is close to Rybka.


By your own admission above, you recognize that this is not my inclination, i.e. I am using the book released for Rybka, i.e. Rybka II, and the book released for Hiarcs 11.  Of course, the Hiarcs 12 book is going to be even better, but I wasn't taking that into account.  These estimates are basically for the Hiarcs 11 book, released not too long after the Rybka II book.
Parent - - By Jeroen (*****) [nl] Date 2008-03-30 16:25
I want to add that Hiarcs has never beaten Rybka in an official tournament game when
Rybka was playing with my latest and private openingbook.
Parent - - By turbojuice1122 (Gold) [us] Date 2008-03-30 17:22
I guess I'm not sure what you mean by this--has Rybka with your "latest and private openingbook" even played in any official tournament games?  Have there even been any major tournaments since you "last updated" it?  Or do you mean that the tournaments that Rybka has entered recently have not been played with your book?
Parent - - By Jeroen (*****) [nl] Date 2008-03-30 17:47
In CCT - where Hiarcs won - Rybka didn't play with my private book. In Paderborn and Leiden it did. Remember
that in Leiden Rybka outbooked Hiarcs convincingly and had a winning position shortly after the opening.

So far we had 4 encounters Hiarcs-Rybka with a Rybka at 'full strength' and the score is 3-1 in Rybka's favour.
Parent - By turbojuice1122 (Gold) [us] Date 2008-03-30 20:38
Okay, so now we're talking about some evidence in your favor.
Parent - - By George Tsavdaris (****) Date 2008-03-30 16:48

> This 32-18 "result" is not a result of testing the programs at playing a game of chess.


And it's about what? Backgammon? :-)

I mean, please explain this in more detail:

> It is a result of testing analysis.


> Again, Hiarcs loses at least 60 elo relative to other programs when one makes the change from own books, which is what you have in match conditions, to random "general" books.


What are the data/proof for that?
Parent - - By turbojuice1122 (Gold) [us] Date 2008-03-30 17:24
Hiarcs 11 1 CPU "own book" was rated 2880 on CEGT list, with opponents also all using "own book".  Hiarcs 11 1 CPU "general book" is rated at 2805 with opponents also all using "general book".  I decided to be conservative with the 60 elo estimate.
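
For reference, the arithmetic behind these figures can be sketched as follows, using the two ratings quoted above and the standard Elo expected-score formula. The function name `expected_score` is just a label for this sketch, and since the two numbers come from separate lists with different opponent pools, the difference is only a rough indication.

```python
def expected_score(rating_diff):
    """Standard Elo expected score for a player rated rating_diff points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


own_book = 2880      # Hiarcs 11 1 CPU, "own book" figure quoted above
general_book = 2805  # Hiarcs 11 1 CPU, "general book" figure quoted above

diff = own_book - general_book
print(diff)                            # 75
print(round(expected_score(diff), 3))  # expected score with a 75-point edge
print(round(expected_score(60), 3))    # expected score with the conservative 60-point estimate
```

The raw difference between the two lists is 75 points, so the 60 elo estimate is indeed on the conservative side.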
Parent - - By Jeroen (*****) [nl] Date 2008-03-30 17:53
In which CEGT rating list are engines allowed to use their own book?
Parent - By turbojuice1122 (Gold) [us] Date 2008-03-30 20:38
They're not, at least not generally--but this was put in as part of an experiment about a year ago.  It has since been taken out.
Parent - - By Dadi Jonsson (Silver) [is] Date 2008-03-30 18:57

> This 32-18... is a result of testing analysis.


So in your opinion Rybka is that much stronger as an analysis engine?

Once you said:

"Fritz basically spotted Kramnik perhaps a 200 elo point advantage with that opening book show and everything, since it allowed Kramnik basically to prepare much deeper than the Fritz book and come out of the openings with a decent advantage."

How much advantage do you estimate that Rybka spotted the Hiarcs team (and hence Hiarcs 12) considering that RybkaII is about 16 months old now and Rybka 2.3.2a was released in the middle of last year? How much do you estimate this may have contributed to the result of Hiarcs-Rybka match performed by the Hiarcs team? Can you estimate that without seeing the games? Do you know if they are available somewhere?
Parent - - By turbojuice1122 (Gold) [us] Date 2008-03-30 20:44
I don't put much store in the results of the Hiarcs team--I just note that they're basically consistent with the idea that Hiarcs 12 is roughly as strong as Rybka 2.3.2a in "normal chess" based on the rating difference between Hiarcs 11 "own book" (where all opponents are also "own book") and Hiarcs 11, the rating difference between Hiarcs 11.2 and Hiarcs 11, and the rating difference between Hiarcs 12 and Hiarcs 11.2 as found in those tests.  Since we're talking about rating differences between different versions of Hiarcs, and the Hiarcs 11 book was used as the "base", and isn't all that much newer than Rybka II, this is a reasonably good estimate.

Kramnik was preparing for a specific opponent, and thus could try to leave book fairly early and have very deep memorized variations based on knowledge of precisely what Fritz would do.  However, Hiarcs 12 must also play against other opponents, and consistently trying to do this could easily backfire.  Also, book learning was enabled, so this wouldn't work for more than a few games.

In any case, I have no doubt that Rybka 3 will be a lot stronger than Hiarcs 12, no matter whether Rybka II or Rybka III is used as the book.
Parent - By Dadi Jonsson (Silver) [is] Date 2008-03-30 20:50

> I don't put much store in the results of the Hiarcs team


OK.
Parent - - By Jeroen (*****) [nl] Date 2008-03-30 16:28
If I could provide the book for either Naum 3 and/or Zappa Mexico II, I am very much willing
to enter this bet :-)
Parent - - By Werewolf (*****) [gb] Date 2008-03-30 16:39
And may I add that we're all looking forward to the Rybka 3 book!
Parent - - By Jeroen (*****) [nl] Date 2008-03-30 16:49
Thanks! I have been working on Rybka3.ctg for 3 months now, so I hope that you
will like it when it comes out :-).
Parent - - By Roland Rösler (****) [de] Date 2008-03-30 16:58
In the next three months you can complete your work :-)
Parent - By Werewolf (*****) [gb] Date 2008-03-30 16:59
don't be rude.
Parent - By Jeroen (*****) [nl] Date 2008-03-30 17:22
Ha ha ha, good one! :-)
Parent - - By Kappatoo (****) [de] Date 2008-03-30 17:03
Will it have a broader repertoire than Rybka2.ctg? I have found that the book is quite small, and there are many interesting lines which are missing or only covered quite superficially. I concede that this probably doesn't hurt in engine-engine matches, but personally, I'm not interested in this.
Don't get me wrong: I was quite skeptical before I downloaded the book, as I had always considered traditional data bases, etc. much better sources regarding opening information. But then I found some very interesting lines in your book which I had not seen anywhere else (okay, maybe this is just because I'm  usually quite ignorant).
Parent - - By Jeroen (*****) [nl] Date 2008-03-30 17:23
Which lines would you like to see?
Parent - - By Kappatoo (****) [de] Date 2008-03-30 18:10
Ah, a wishlist. :) Now you got me. Seriously, it would take some time to list everything. Just some examples: There are some very important lines missing in the Blumenfeld. For example, there isn't much after 5.Bg5 Qa5+. And after 5.de: fe: 6.cb: d5, 7. Bg5 is missing completely! But this is certainly one of the main moves here (by the way, 7.g4 is given here, amongst others). In the French, there is just one line in the gambit after 2.Nf3 d5 3.e5 c5 4.b4 (and possibly not the best one). There is hardly anything after 1.Nf3 f5 2.d3, although this is quite a dangerous variation.
In general, there are often holes in lines which are a bit off-beat, though not necessarily unsound. Especially when the variation is not recommended as a tournament line for either side. And there are some main lines left out, apparently just because they haven't been played in top-level tournaments (or in engine games) recently.
Instead, the book contains some rather bizarre lines up to move 30 or so. Just look at the abundance of variations after 1.h3. This seems to be some crazy online engine speciality, but I certainly won't spend a second preparing for something like this in my own games.
One last remark: Some tournament lines appear quite risky to me. I played approximately ten games against Rybka, and won one, but I cannot really be proud of it: The book just happened to contain a line in the Anti-Moscow which is just lost. The upshot seems to be that you included some very tricky lines which had just recently been introduced into practice. (I'm not sure whether this last point even is a criticism.)
Parent - - By Jeroen (*****) [nl] Date 2008-03-30 18:28
Aha, those are quite offbeat lines :-). Here the true problem of a book author appears: time. With limited
time available you can imagine that the first priority goes to the main lines. These have to be as OK as
possible. But that means a lot of analysis, testing and so on. Furthermore, I want to present not only the
known games, but also novelties and ideas of my own. They also cost time.... In the end I must do my best
to get this part ready and alas, that means no more time to cover the offbeat lines.

Perhaps it is an idea to make a special book just on offbeat lines, I think that might be more interesting than
incorporating it in the official book. Also the audience for such a book would be different, compared with the
normal book.

To give you an idea: for the past 3 months I have been working around 2 hours every weekday evening and full time on almost all weekends, and still I have the feeling it is not enough for all I want to do :-)

Well, at least I did some analysis on the offbeat 1.d4 d5 2.c4 e5 line (because Moro is playing this) and the
Moscow is much much better covered now than in RybkaII. I'll take a look at the lines you mentioned.

Kind regards, Jeroen