Rybka Chess Community Forum
Topic: The Rybka Lounge / Computer Chess / What is the best way to run engine vs engine games?
- - By Zsolt Kántor (**) Date 2019-07-12 14:54 Edited 2019-07-12 16:20
I want to find out which of two UCI engines is better. What settings should I use for this purpose (time control, book on or off, number of threads, pondering on or off, hash size...)? Does it also depend on which GUI I use?

Thanks.
Parent - By user923005 (****) Date 2019-07-12 19:33
It depends on a lot of things, starting with what hardware you plan to use and what programs.
Parent - - By Gaмßito (****) Date 2019-07-14 11:27
To find out which engine is really better, both engines FIRST need to play a LOT of games. The MORE games, the BETTER!
For the tests to be reliable, it's also important that both engines can play the SAME openings with reversed colors. This is very important.

Set the number of threads and Ponder ONLY in a way that the engines do NOT interfere with each other's ''thinking''; otherwise the final results will not be reliable at all.

There are many GUIs available and each person has their own preferences. I still prefer the old Fritz 12 GUI!

Regards,
Gaмßito.
Parent - - By Zsolt Kántor (**) Date 2019-07-17 16:48 Edited 2019-07-17 16:52
Thanks for the clarifications. Concerning the number of games, I saw that thousands of games are played, for instance, when testing the Stockfish development builds.
Unfortunately "I don't have the time" to let the engines play that much; it would probably take several weeks...

I prefer Fritz 12 also.

I understood everything except the part about the opening book. How can I make the two engines play the same openings?
I usually select the default Fritz book for both engines, and in the book option I select "Tournament book" (I also click on the optimize button).
Parent - - By Gaмßito (****) Date 2019-07-19 04:37

>I understood everything, less the part with the opening book. How can I make the two engines play the same openings?


To do that in the Fritz 12 GUI you need to use the SAME opening book for both engines, NOT different opening books. Then use the option ''Alternate Colors'' in the Engine vs Engine options.
If you use different opening books for each engine, then every game will have a different opening.

But there is some risk: even with the same opening book for both engines, it is likely that after enough games certain openings will occasionally be repeated.

So I think it's a MUCH better option to use a good test-suite where every position is different, while of course forcing the engines to play the SAME opening with reversed colors!
This way the engines play a different opening every two games, and no opening is ever repeated.
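
A minimal sketch of that pairing in Python, for anyone who scripts their matches instead of using a GUI. The names `Game` and `paired_schedule` and the sample opening lines are hypothetical, chosen only for illustration; a real GUI does this through its own settings.

```python
from typing import Iterator, NamedTuple


class Game(NamedTuple):
    opening: str
    white: str
    black: str


def paired_schedule(openings: list[str], engine_a: str, engine_b: str) -> Iterator[Game]:
    """Yield two games per opening, swapping colors for the second game."""
    seen = set()
    for opening in openings:
        if opening in seen:  # skip duplicates so no opening is ever repeated
            continue
        seen.add(opening)
        yield Game(opening, white=engine_a, black=engine_b)
        yield Game(opening, white=engine_b, black=engine_a)


if __name__ == "__main__":
    # Hypothetical three-line suite; a real suite would be much larger.
    suite = ["1.e4 e5 2.Nf3 Nc6 3.Bb5", "1.d4 d5 2.c4 e6", "1.c4 e5"]
    for game in paired_schedule(suite, "EngineA", "EngineB"):
        print(game)
```

Each opening appears exactly twice, once with each engine as White, so neither side gets an advantage from the choice of line.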

Regards,
Gaмßito.
Parent - - By Uly (Gold) Date 2019-07-19 05:12

>So I think it's a MUCH better option to use a good test-suite where every position can be different, but of course forcing engines to play the SAME opening with reversed colors!
>In this way the engines will always play a different opening every two games and the opening never will be repeated.


No, that's what most testers are doing, and they're doing it wrong. Rating lists like the CCRL and CEGT already do what you suggest, so doing more tests like those is useless: you might as well download the PGNs they produce, which are no worse than what you'd have produced at home.

The problem is that 95% of the chess lines in opening suites are irrelevant: nobody in their right mind would play them in a serious chess game when trying to win, so it makes no sense to test chess engines on them. Only critical lines should be tested; then 100% of the games played are relevant in telling you which engine is best in the chess positions that actually matter.

A new engine tester can easily surpass the relevance of TCEC or the rating lists by playing exclusively chess lines that matter. Let's not recommend that they reinvent a wheel that doesn't really work just because that's how the old guard has been doing it, because they don't know better (yes, I'm comparing this to alpha-beta engines vs neural networks: innovation comes from doing things differently, and better).
Parent - - By Gaмßito (****) Date 2019-07-19 12:13

>No, that's what most testers are doing, and they're doing it wrong. Rating lists like the CCRL and CEGT are doing what you suggest, so doing more tests like those is useless, you could as well download the PGNs they produce, and they're not worse than what you'd have produced at home.


I agree that many testers now use mediocre openings that are totally irrelevant! This is why I clearly said to use a ''GOOD'' test-suite :razz:.

Of course, if he only wants to play 50 or 100 games, he can USE several good test-suites available here on this forum. But if he wants to play many more games, he will need to combine several test-suites into ONE file or, why not, also add his own good openings.

>A new engine tester can surpass easily the relevancy of TCEC or the rating lists by playing chess lines that matter exclusively, let's not recommend them to reinvent a wheel that doesn't really work just because it has been the way the old guard has been doing it, because they don't know better (yes, I'm comparing this to Alphabeta engines vs Neural Networks, innovation comes from doing things differently, and better).


Yes, a good tester must have a GOOD test suite with all the relevant chess openings. Not ''garbage'' openings, but the most important ones.

Regards,
Gaмßito.
Parent - - By Uly (Gold) Date 2019-07-20 06:17
Try those test suites against a strong opening book and see them reach a lost position every game...
Parent - - By Gaмßito (****) Date 2019-07-20 07:16

>Try those test suites against a strong opening book and see them reach a lost position every game...


Uly: you are comparing apples with oranges.

A good test suite leaves room for the engines to calculate and show exactly how well or badly they play in those openings/positions! That is the goal.

By contrast, a strong opening book usually has extremely long lines (see ''Cerebellum'' for example). That would NOT leave room for the engines to show whether they can play those openings correctly, unless you tell the GUI NOT to exceed a certain number of book moves! Otherwise the engines would start playing practically at the end of the game. That makes no sense at all if what you really want to know is how strong both engines really are.

Two different things for different purposes!

Regards,
Gaмßito.
Parent - - By Uly (Gold) Date 2019-07-20 07:41 Edited 2019-07-20 07:43

> A good test suite leave room for the engines to calculate and show exactly how good or bad they can play on those openings/positions!


How can those openings/positions be good if nobody would play them seriously in a chess game? That's what makes them irrelevant!

>That is the goal.


No, the goal is to know what is the strongest chess entity, and that includes book. If an engine is the strongest one with 1.e4 but subpar with 1.d4, what business does it have playing d4?? How does the Testing Suite know this? A testing suite that includes this entity playing d4 is irrelevant. Starting the engine with a strong book and modifying it to make the engine the strongest is the only way to know its true strength, which should be the goal of testing it, not knowing its performance in random positions it'll never actually play.

>While by rule, a strong opening book would have extremely long lines (see ''Cerebellum'' for example). That would NOT leave room for the engines to show whether or not they can play those openings correctly, unless you tell the GUI to NOT exceed from certain number of moves! So, otherwise the engines would start playing practically at the end of the game. That makes no sense at all, if what you really want to know is how much strength both engines really have.


Yeah, so Cerebellum is terrible for testing engines. But its long lines aren't even that good: Rebel at Talkchess had no problem creating a small bin book that could neutralize Cerebellum and beat it. What you need is a book that plays well against anything the opponent throws at you and leaves the engine with an advantage on the clock and a decent position. It suffices to play a few games to check whether the book is suitable for this, and if not, try another.

But only playing the strongest chess lines the engine would play would show how much strength both engines really have, not how much strength both engines have in some positions from a random opening suite nobody cares about.

What if Houdini 6 was the strongest engine if you aim for the best lines it could play against the likes of Stockfish, Leela or Komodo? The rating lists would never know as they're stuck on stupid with their generic opening suites...
Parent - - By Gaмßito (****) Date 2019-07-20 14:25

>How can those openings/positions be good if nobody would play them seriously in a chess game? That's what makes them irrelevant!


Uly: I told you several times that this does NOT happen if you're using a GOOD test-suite. I am not talking about using irrelevant, garbage openings that nobody plays!

>No, the goal is to know what is the strongest chess entity, and that includes book.


If what the user really wants is to know which is the strongest chess entity, then yes, you can include several good books in the testing. But my guess is that he only wants to know which is the stronger chess ENGINE. He would have to specify that.

>If an engine is the strongest one with 1.e4 but subpar with 1.d4, what business does it have playing d4?? How does the Testing Suite know this? A testing suite that includes this entity playing d4 is irrelevant.


If your engine sucks with 1.d4, then revealing that is exactly what a good test suite is designed for: to show you the weaknesses and strengths the engine has in certain openings. But note that a good test suite must be well balanced. You cannot have 100 openings that start with 1.d4 and 25 openings that start with 1.e4. Favoring certain openings like that would be really wrong! Again: here I'm not speaking about those mediocre, irrelevant openings that testers are using today.

>Yeah, so Cerebellum is terrible to test engines. But its long lines aren't even good at all, Rebel at Talkchess had no problem creating a small bin book that could neutralize Cerebellum and beat it. What you need is a book that plays good against anything the opponent throws at you and leaves engine with advantage on the clock and decent position.


I'm more interested in testing chess engines and not too interested in ''opening books''; but of course I know the opening book (like the hardware) is quite crucial if we are looking for the strongest chess entity. Anyway, I would like to see how, with IDENTICAL engines, one using a small ''bin'' book can neutralize the other using a strong book like Cerebellum, or even beat it! I'll try to find that on Talkchess because it sounds very strange!

>What if Houdini 6 was the strongest engine if you aim for the best lines it could play against the likes of Stockfish, Leela or Komodo? The rating lists would never know as they're stuck on stupid with their generic opening suites...


What always tells us who is stronger is the Elo. The engine with more Elo will always play better. No matter what openings they play. The only way to avoid this is by playing specific openings where the stronger engine could lose. But that would be ridiculous to do!

As you know, every top engine (SF, Komodo or Houdini) is tested by playing many many thousands of games (with many irrelevant openings!) against the previous version. In those very long tests even those stupid and irrelevant openings are working relatively well to validate the improvements in the new engine.

So in the end, you can make the new version play some very good openings against the older version, and the new version will always beat the older engine (if the new version is strong enough, of course). Here the openings they use (relevant or irrelevant) will not matter and will not change the final result at all.

Regards,
Gaмßito.
Parent - - By Uly (Gold) Date 2019-07-21 02:42 Edited 2019-07-21 02:49

> Uly: I told you several times that it does NOT happen if you're using a GOOD test-suite. I am not talking about using irrelevant-garbage openings that nobody play!


Can you point to an actual suite so we can discuss specifics? All the ones I've seen could be refuted with stronger moves...

>But my guess is that he only wants to know what's the stronger chess ENGINE


Are you suggesting that he play without books? Because all you're doing with a test suite is creating a chess entity that is forced to play the suite's moves. If a chess entity appears no matter what, you'd better test which is the strongest one...

>If your engine sucks with 1.d4 then that is what a good test suite was designed for.


No, because if a human sucks with 1.d4, the human doesn't need to play 1.d4 at all. Chess entities have no reason whatsoever to play suboptimal moves, and d4 would be a suboptimal move for this chess entity, so you solve it by marking 1.d4 red, you don't waste your time generating irrelevant 1.d4 games (but, their opponents can play 1.d4 if they like, that's why playing same position from both sides doesn't work.)

>I'm more interested in testing chess engines, and not too much interested in ''opening books'';


Opening books just save time. If after a line on the test, the engine plays a move and wins, then it makes no sense to have the engine think about this move the next time, you just make the engine play the move instantly, and it'll play the rest of the game a bit stronger. That's how books are built.

>Anyway I would like to see how using IDENTICAL engines, one engine with a small ''bin'' book can neutralize the other identical engine using a strong book like Cerebellum or even beat it! I'll try to find that on Talkchess because that sounds very strange!



It's not strange at all, any book can be neutralized once it is public, Cerebellum is not special and its lines aren't that strong.

Here's the relevant discussion:

http://talkchess.com/forum3/viewtopic.php?p=801971#p801971

Which is the point, meaning pointless. Any book can be hacked once it is publicly available. Attached a Polyglot opening book I made about a year ago of just 22Kb (1.345 positions) that beats the a CL book of 2018. There is no fun in that kind of competition.
~Rebel (Ed Schröder)

You could ask him to make another one that beats current Cerebellum Light, or Full Cerebellum. Anyone can do it. The process has one looking at the lines Cerebellum plays and refuting them, and adding those refutations to another book. It's simple.

>What always tells us who is stronger is the Elo. The engine with more Elo will always play better.


Yeah, and what engine is going to play stronger? One that is forced to play from weak lines from an opening suite? (if we have established that Cerebellum's lines are weak then what chance do opening suites have to compete?) Or one that plays from a strong private book that is fine-tuned by the tester to play the best lines that it can?

The STRONGEST engine from an opening suite would lose badly against the STRONGEST engine using a tournament book, so why would you test the former?

>As you know, every top engine (SF, Komodo or Houdini) is tested by playing many many thousands of games (with many irrelevant openings!)


Exactly, people could test engines with relevant openings and do it with 5% of the effort, I think you just made my point.
Parent - - By Gaмßito (****) Date 2019-07-21 14:44

>Can you point to an actual Suite so we discuss it on the specific? All I've seen could be refuted with stronger moves...


The suites that are here in this forum, for example (Jeroen has made several). I also have my own test suites, but they are private.

Uly, the goal of using a test-suite is only to see how the engines develop their games in certain openings, and of course to check whether they play good games or have problems with each opening in the suite. Nothing else. Of course, it makes no sense to use openings that nobody plays, only those that are most relevant or best known.

>Are you suggesting him to play without books? Because all you're doing with a test suite is creating a chess entity that is forced to play the suite's moves. If a chess entity appears no matter what, you'd better test what is the strongest one...


He said he wants to know how to run engine vs engine tests and find out which engine is the stronger. Here I said that I prefer to use a GOOD test suite (one with a good number of different openings that are also the most relevant opening positions). There are GUIs that prevent the engines from repeating openings, but not all of them do, and the Fritz 12 GUI does not.

>No, because if a human sucks with 1.d4, the human doesn't need to play 1.d4 at all. Chess entities have no reason whatsoever to play suboptimal moves, and d4 would be a suboptimal move for this chess entity, so you solve it by marking 1.d4 red, you don't waste your time generating irrelevant 1.d4 games (but, their opponents can play 1.d4 if they like, that's why playing same position from both sides doesn't work.)


Of course the human doesn't need to play 1.d4 if he sucks with it. Humans play what they know best. But when we are testing engines to find out which is stronger across a variety of different openings, we usually make the engines play all the openings that interest us. That is the purpose of the test suite: to see how the engines handle all those different openings.

>Opening books just save time. If after a line on the test, the engine plays a move and wins, then it makes no sense to have the engine think about this move the next time, you just make the engine play the move instantly, and it'll play the rest of the game a bit stronger. That's how books are built.


You're right and I've never said otherwise.

When you play in a World Championship or a major tournament, everything is valid: great hardware, a huge opening book, and everything else you have in order to win.
But if you really want to see which engine is really stronger in your own tests, equality is an important point. And a good test-suite with a good number of relevant openings where each engine can play with reversed colors is necessary.

>It's not strange at all, any book can be neutralized once it is public, Cerebellum is not special and its lines aren't that strong.


I have heard many good opinions about the Cerebellum book over time, but I'm not sure how true they are. I cannot say much about it because I usually don't test books.
In fact, I have made only one small opening book (500kb), years ago. I like it because it has many relevant lines, but its lines are not too long. Anyway, I'll try to run some tests with Ed's tiny book against Cerebellum later. Thanks for the link!

>Yeah, and what engine is going to play stronger? One that is forced to play from weak lines from an opening suite? (if we have established that Cerebellum's lines are weak then what chance do opening suites have to compete?) Or one that plays from a strong private book that is fine-tuned by the tester to play the best lines that it can?


Making a test suite compete with a good opening book does not make sense. A good book can be designed to surprise the opponent in certain lines, or even to handle traps that the engine could not solve by itself or would take a long time to solve! A good test suite is usually designed only to show how the engines develop their games and in which openings they fail.

>The STRONGEST engine from an opening suite would lose badly against the STRONGEST engine using a tournament book, so why would you test the former?


I don't have doubts about that! But in your case the one that would be playing the openings and a large part of the middle game would be the BOOK and NOT the engine.
As I told you, in important tournaments, or in a World Championship, you must use all your weapons and the whole arsenal you have. But if you really want to test engines and see which is really the stronger, equal hardware for both (as the rating lists use), plus a good and balanced test suite (not many are right in this respect), is necessary.
With a test-suite we want to see the engines playing, and not the books or other things that can help them!

Regards,
Gaмßito.
Parent - - By Uly (Gold) Date 2019-07-23 01:07

>Uly, the goal to use a test-suite is only to see how the engines will develop his games on certain openings, and of course check if they can make good games or if they have problems with every opening the suite has


That's a good starting point. Afterwards, you'd want to test only the openings the engine is good at, to know what's the strongest engine in the best lines it can play, so at this point the suite is useless.

>He said he want to know how to make engine vs engine tests and know what engine could be the stronger. Here I said that I prefer to use a GOOD test suite (that would have a good number of different openings but also those that could be the most relevant opening positions)


Okay, but after you've figured out the openings you want to use to make the engine the strongest, you're going to need a book that plays those anyway, so you can just as well start with a book and modify it.

>But when we are testing engines in order to know how stronger by using a variety of different openings, we usually make the engines play all those openings that interest us.


The rating lists have already done that, so you don't need to test at all: just download their games and see how engines perform in different openings. Then, if you want to know which is the strongest engine, at home you test the strongest openings it plays.

>When you play in a World Championship or a major tournament, everything is valid, a great hardware, a huge opening book and everything you have to win.
>But if you really want to see which engine is really stronger in your own tests, equality is an important point.


Both results should be the same, if your test says Engine A is stronger, but Engine B would win a World Championship, then your test is wrong. You should be able to play a "World Championship" at home, to decide what is the strongest.

So, some 95% of the openings in opening suites wouldn't be played in your World Championship; you can get rid of those and play exclusively relevant games to find out which engine is the strongest.

That's what I mean by "openings nobody would play": Jeroen's test suites are full of positions he wouldn't have suggested that Rebel or Rybka play in their championships.

>I heard many good opinions about Cerebellum book in all this time, but I'm not sure how true is that. I can not talk much about it because I usually don't test books.
>In fact, I only have made one small opening book (500kb) and made years ago. I like it because it has many relevant lines but his lines are not too long. Anyway I'll try to make some tests with Ed's tiny book against Cerebellum later. Thanks for the link!


This wasn't a statement specific about Cerebellum, but about public books in general. You can refute their lines. That's why in a "World Championship" at home you'd need to figure out what are the best lines for the engines you test by yourself.

> A good book can be designed to surprise the opponent in certain lines, or even solve traps that the engine could not do by itself or would take a long time to do!


But here you're managing both engines' books, so you already cover for the trap in the book of the other engine. The point is you do your best to make them as strong as possible, and only then you know what was the strongest one.

>I don't have doubts about that! But in your case the one that would be playing the openings and a large part of the middle game would be the BOOK and NOT the engine.


So make the BOOK play the engine's moves. It's that simple. The book just aims to have the engine play the strongest positions that the engine can. If your Engine A plays the entire game in book without having to think and WINS, then it was your fault for making Engine B play suboptimal moves against it; you're supposed to make Engine B play as strong as it can as well.

The point is that if the engine Castles after a long think, or it's a book move, it makes no difference, other than the engine has less time on the clock, so you could as well have added it to the book.

>As I told you, in important tournaments, or in a World Championship, you must use all your weapons and all the arsenal you have.


You want to simulate those conditions to know who's the strongest engine, and just playing from an opening suite with reversed colors will not do that.

>With a test-suite we want to see the engines playing, and not the books or other things that can help them!


If the suite plays a move that would have made the engine lose in a World Championship, then it's a bad move, and testing it is a waste of time (that's why you don't see 1.f3 in test suites; other errors are more subtle).
Parent - By Gaмßito (****) Date 2019-07-24 15:00

>That's a good starting point. Afterwards, you'd want to test only the openings the engine is good at, to know what's the strongest engine in the best lines it can play, so at this point the suite is useless.


Uly, the test-suite perfectly fulfills its function showing you in which openings the engine fails or develops better. It makes no sense to use this in a tournament.
If your purpose is to make it play a World Championship or other important tournaments, of course you have to use a BOOK with openings the engine knows best.
But there are many who will never compete in any computer event. So using a test suite for testing engines at home is quite valid.

>Okay, but after you've figured the openings you want to use to make the engine the strongest, then you're going to need to use a book that plays those, anyway, so you can start with a book and modify it already.


Sure, creating an opening book is interesting. But how do you know which openings the engine plays better? For that you would need to play many different openings and look carefully at those games. So using a good test suite first would be a good step.

I'm not saying engine developers use this method. In fact they NEVER look at the games their engines play. They only care about checking the ''Elo'' in the ''final result'' after their engines play 20,000 irrelevant openings! This has been the improvement method in Rybka, Houdini, Komodo and Stockfish.

>The ratings lists have already done that, so you don't need to test at all, just download their games and see how engines perform in different openings. Then if you want to know what's the strongest engine, at home you test the strongest openings it plays.


Yes, that may also be a valid option. It is obvious that not everyone has the knowledge or the patience to test chess engines. These are things that can be expected in true fanatics. For example, I love watching engines playing. But I recognize that there are not many who find beauty in that!

>Both results should be the same, if your test says Engine A is stronger, but Engine B would win a World Championship, then your test is wrong. You should be able to play a "World Championship" at home, to decide what is the strongest.


Uly, there is a good chance that in a ''World Championship'' of only 9 games the strongest engine does NOT win. That number of games is very small, the margin of error is too high, and of course other factors count a lot: good hardware, a good book, etc. So those tournaments really should NOT be used to determine the best engine. It is almost like winning the lottery! A simple test at home with MANY games will always be better for learning the TRUTH.
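
To put a rough number on the 9-game point, here is a small Python sketch. It simplifies the tournament to a head-to-head match of independent games, and the per-game win and draw probabilities are purely hypothetical inputs, not measured values; `match_outcome_probs` is an illustrative name.

```python
def match_outcome_probs(games: int, p_win: float, p_draw: float) -> tuple[float, float, float]:
    """Exact probabilities that the stronger engine finishes ahead, level, or behind
    in a match of independent games with fixed per-game probabilities."""
    p_loss = 1.0 - p_win - p_draw
    # dist[margin] = probability that (wins - losses) equals margin so far
    dist = {0: 1.0}
    for _ in range(games):
        nxt: dict[int, float] = {}
        for margin, prob in dist.items():
            for step, p in ((1, p_win), (0, p_draw), (-1, p_loss)):
                nxt[margin + step] = nxt.get(margin + step, 0.0) + prob * p
        dist = nxt
    ahead = sum(p for m, p in dist.items() if m > 0)
    level = dist.get(0, 0.0)
    behind = sum(p for m, p in dist.items() if m < 0)
    return ahead, level, behind


if __name__ == "__main__":
    # Hypothetical stronger engine: 30% wins, 50% draws, 20% losses per game.
    for n in (9, 100, 1000):
        ahead, level, behind = match_outcome_probs(n, 0.30, 0.50)
        print(f"{n:5d} games: ahead {ahead:.3f}, level {level:.3f}, behind {behind:.3f}")
```

Running it with different game counts shows how the chance of the stronger engine finishing level or behind shrinks as the number of games grows.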

>That's what I mean with "openings nobody would play", Jeroen's test suites are full of positions he wouldn't have suggested to play to Rebel or Rybka in their championships.


Yes, that's right. But keep in mind that many of these openings have been IMPORTANT in the history of chess and have been played for centuries! Many people really care to know how engines play those specific openings!

>This wasn't a statement specific about Cerebellum, but about public books in general. You can refute their lines. That's why in a "World Championship" at home you'd need to figure out what are the best lines for the engines you test by yourself.


Sure. In a real World Championship you go with all your weapons. But anyway, I really don't like to call the winner of the WCCC the ''strongest'' engine, just because of the very small number of games played. Those tournaments are worth almost nothing!

>But here you're managing both engines' books, so you already cover for the trap in the book of the other engine. The point is you do your best to make them as strong as possible, and only then you know what was the strongest one.


Yes, totally agree.

>You want to simulate those conditions to know who's the strongest engine, and just playing from an opening suite with reversed colors will not do that.


Uly, the ONLY way to know who is the stronger engine is by playing THOUSANDS of games. That clears all doubts.

In a World Championship you can use your best openings, hardware, etc. But in doing so you are really accepting that an engine could also have MANY problems with a LOT of openings, relevant openings, so the engine could have NO idea how to play many of them, AND this is what the test-suite could have been SHOWING you!
So your engine may be the greatest in the openings it played in the World Championship, but the most foolish in many others. Can it really be called the ''strongest''? NOT from my point of view. Out of the whole long list of relevant openings, it has to be the one that plays BETTER in most of them, winning and demonstrating it with results!
Anything else is an illusion, a lottery!

>If the suite plays a move that would have made the engine lose in a World Championship, then it's a bad move, and testing it is a waste of time (that's what you don't see 1.f3 in test suites, other errors are more subtle.)


Sure, I agree that testing engines with doubtful and bad openings is a waste of time. Anyway, a decent suite will not have them.

Regards,
Gaмßito.
Parent - - By h.g.muller (****) Date 2019-07-25 16:00 Edited 2019-07-25 16:30

> Unfortunately "I don't have the time" to let the engines to play so much, that would take up more weeks probably . . .


Then it is completely pointless to start doing this. Just look up the engines' ratings in the CCRL list and be content with that. With 1000 games and ~30% draw ratio the statistical uncertainty of the result will still be ~1.3% or ~10 Elo. With a higher draw rate you will need even more games to get that accuracy. And then there is the problem of how you define 'stronger'. Is it the mutual result, or the result against others? Is it how they play from a few specific positions that they would only choose to go to themselves, or do you want to know how they perform on arbitrary positions?

Of course if one engine is a couple of hundred Elo stronger than the other, it becomes much easier, and you would already be able to establish it with a dozen games or so. A 10-0 result is pretty significant. A 52-48 result means nothing.
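
The 1000-game figure can be checked with a short Python sketch. It assumes two equally strong engines, independent games, and the usual logistic Elo curve; the function names are illustrative only.

```python
import math


def score_standard_error(games: int, draw_ratio: float) -> float:
    """Standard error of the match score fraction, assuming two equal engines
    (wins and losses equally likely) and independent games."""
    # A game scores 1, 0.5 or 0; with mean 0.5 the per-game variance is (1 - d) / 4.
    per_game_variance = (1.0 - draw_ratio) / 4.0
    return math.sqrt(per_game_variance / games)


def elo_per_score_near_50() -> float:
    """Slope of the logistic Elo curve at a 50% score (Elo per unit of score)."""
    # Elo = 400 * log10(p / (1 - p)); its derivative at p = 0.5 is 1600 / ln(10).
    return 4.0 * 400.0 / math.log(10.0)


if __name__ == "__main__":
    se = score_standard_error(1000, 0.30)
    print(f"score SE: {100 * se:.2f}%")                      # score SE: 1.32%
    print(f"Elo SE:   {se * elo_per_score_near_50():.1f}")   # Elo SE:   9.2
```

Under these assumptions the result is about 1.3% in score and roughly 9 Elo, in line with the ~1.3% and ~10 Elo quoted above.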
Parent - - By Zsolt Kántor (**) Date 2019-07-26 12:31
If after 186 games you get a score like +50 =76 -60, does this mean anything?
Parent - By h.g.muller (****) Date 2019-07-26 13:12
Barely. A difference like that or even larger is what you would get about once every 4 tries between completely equal engines.
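
A rough way to check a result like this yourself, sketched in Python. It uses a normal approximation on the decisive games only and assumes independent games; `equal_strength_p_value` is an illustrative name, and this is not necessarily the calculation used for the estimate above.

```python
import math


def equal_strength_p_value(wins: int, draws: int, losses: int) -> float:
    """Two-sided p-value for the observed win/loss gap under the hypothesis
    that both engines are equally strong (normal approximation).
    Draws are accepted for completeness but do not affect the gap."""
    decisive = wins + losses
    if decisive == 0:
        return 1.0
    # Under equal strength, (wins - losses) has mean 0 and variance equal to
    # the number of decisive games.
    z = abs(wins - losses) / math.sqrt(decisive)
    return math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    for result in ((50, 76, 60), (10, 0, 0), (52, 0, 48)):
        print(result, round(equal_strength_p_value(*result), 3))
```

For +50 =76 -60 this approximation gives roughly one chance in three of seeing a gap at least this large in either direction between equal engines, the same order of magnitude as the estimate above. The same function returns a very small value for 10-0 and a large one for 52-48.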
Parent - - By Uly (Gold) Date 2019-07-14 17:34

> book on or off


Book on, but make sure to have a copy of the book for each engine, and modify it accordingly depending on their results, so an engine plays the openings that give it its best performance. Most people forget to do this and end up testing the engine in irrelevant generic lines nobody plays...
Parent - - By Zsolt Kántor (**) Date 2019-07-17 16:51
I optimize the book as I wrote above. Is that OK?
But why have two copies of the same book when I can select the same book from the menus?
Parent - - By Uly (Gold) Date 2019-07-18 02:58

> But why to have two copies of the same book


Because one engine can play better with one move and another with a different move. Say one engine plays better with 1.e4 and another with 1.d4. If an engine performs significantly worse with a move, you want to mark that move red so the engine never plays it. But the other engine is good with this move, so it shouldn't have it marked red. Thus you need a copy of the book optimized for each engine you test.

Because book lines are good, or bad, depending on who's playing them.
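
The bookkeeping behind this can be sketched in Python. This is a toy stand-in, not a reader for any real book format: the class `BookStats`, the 20-game minimum and the 45% threshold are all hypothetical choices, and the actual marking of moves as red is still done in the GUI's book editor.

```python
from collections import defaultdict


class BookStats:
    """Per-engine record of results by book move (a toy stand-in for a GUI book)."""

    def __init__(self) -> None:
        # stats[engine][move] = [points scored, games played]
        self.stats = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))

    def record(self, engine: str, move: str, points: float) -> None:
        """Add one game result (1, 0.5 or 0) for this engine and move."""
        entry = self.stats[engine][move]
        entry[0] += points
        entry[1] += 1

    def moves_to_avoid(self, engine: str, min_games: int = 20,
                       threshold: float = 0.45) -> list[str]:
        """Moves this engine has played often enough and scored poorly with."""
        return sorted(
            move
            for move, (points, games) in self.stats[engine].items()
            if games >= min_games and points / games < threshold
        )
```

Because the statistics are kept per engine, a move can be flagged for one engine and stay playable for the other, which is the reason for keeping one book copy per engine.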
Parent - By Zsolt Kántor (**) Date 2019-07-18 06:33
I understand now, thanks for the clarification!
It seems a lot of work is involved in testing engines correctly.
Parent - - By Vegan (****) Date 2019-07-23 17:46
Empirical results suggest that more than 200 games per engine are needed to stabilize the ratings and overcome opening book shortcomings, etc.
Parent - By Zsolt Kántor (**) Date 2019-07-23 22:07
Thanks!