Rybka Chess Community Forum
Up Topic The Rybka Lounge / Computer Chess / Are today's modern engines really that strong?
- - By Peter Grayson (****) [us] Date 2017-09-01 23:41 Upvotes 1
Are the Elo ratings of today's modern engines inflated, or are they really that much stronger than engines from the turn of the century? I thought I would run a match to find out whether the Elo figures were realistic or whether somewhere along the line they had lost touch with reality. Could a modern-generation chess engine really live up to expectations when playing against an older engine?

For this match I decided to use the well-established Stockfish 8 to represent the modern engines against the first ChessBase engine I bought, Junior 7, available from mid-2001, to represent the old. I have earlier engines from the original Millennium package, including the Genius engines up to Genius 6.5 together with Shredder 1 through 4, but I would have had to use two machines and Autoplayer 232 to play the match. I used to run that way not so long ago, but it was not necessary for this match, where both engines would run with a single thread each.

Running on my older Q9550 machine, more than adequate for this single-thread engine test, I created the single-thread Stockfish 8 64-bit configuration designated SP. Stockfish ran with 512 MB hash, which was consistent with a 50% hash fill rate at 3 x average move time, and I used 64 MB hash for Junior 7, which seemed the best match from earlier tests with it in the absence of a displayed hash fill rate and would have been a high level of hash in its day. Time control was 40 moves in 5 minutes, repeating until the finish. No adjudication. Ponder=on.

Because performance could be influenced by a bad opening line selection, the question arose of how best to select openings. I decided not to experiment and followed my more recent principle of allowing human players to select the lines based on popularity. I had a set of 76 opening lines from a recent selection waiting to be whittled down to 50, but I decided to run with all 76 to give 152 games, with both engines playing each line with both colours. I was unsure of what to expect!
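
For anyone wanting to set up the same kind of run, the pairing is easy to sketch. This is only an illustration: the line names are placeholders (the real set held 76 human-selected lines), and the order in which the colours alternate is an assumption, not something stated above.

```python
def build_schedule(opening_lines, engine_a, engine_b):
    """Return (opening, white, black) tuples, each line played twice with colours reversed."""
    games = []
    for line in opening_lines:
        games.append((line, engine_a, engine_b))  # first game of the pair
        games.append((line, engine_b, engine_a))  # same line, colours swapped
    return games

# Placeholder names standing in for the 76 opening lines.
openings = [f"line_{i:02d}" for i in range(1, 77)]
schedule = build_schedule(openings, "Stockfish 8 64 SP", "Junior 7")
print(len(schedule), "games")
```

Playing each line from both sides means a line that favours one colour affects both engines equally.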

Selective Search magazine issue 131 from 10 years ago gave Junior 7 a rating of 2609. CEGT has Stockfish 8 using 1 thread at 3338 Elo, so a difference of 729 Elo. That differential would be some task to live up to given the potential pitfalls of an untested opening line set that may hold a number of lines giving no opportunity to win.

As it turned out, Stockfish more than matched expectation, conceding, if that is the right term, just 3 draws with Black and one draw with White, and winning 148 of the 152 games. The result confirmed that Stockfish was able to win with both colours in almost all of the lines, and raised the question of whether we should be so concerned about the top-flight engines producing so many draws when matched against each other. Perhaps the draw is the more natural result?
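
As a rough check of "matched expectation", the standard logistic Elo formula can turn the 729-point gap into an expected number of points. This is a sketch that assumes the usual 400-point logistic model; the rating lists may use slightly different formulas.

```python
def expected_score(elo_diff):
    """Expected score fraction for the higher-rated side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

games = 152
elo_diff = 3338 - 2609  # CEGT figure for Stockfish 8 minus Selective Search figure for Junior 7
expected_points = games * expected_score(elo_diff)

wins, draws = 148, 4
actual_points = wins + 0.5 * draws

print(f"expected {expected_points:.1f} points, actual {actual_points:.1f} of {games}")
```

Under that model the 729-point gap predicts roughly 149.7 points from 152 games, against the 150 actually scored, so the result sits almost exactly on the expectation.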

In conclusion, the result of just this one match suggests we can have confidence the current engine ratings are realistic enough when compared to their older counterparts.

PeterG
Attachment: SF8vsJ7.7z - The Games (137k)
Parent - - By Venator (Silver) [nl] Date 2017-09-02 07:35
Hi Peter,

Interesting experiment! I have played through a couple of games and the difference is indeed huge (as expected). Nevertheless, I notice that Junior is able to keep up a reasonable fight in the middlegame from time to time, only completely collapsing in the endgame. An example:

[Event "SF8 SP vs J7, 5m/40+5m/40+5m/40"]
[Site "DEVO4"]
[Date "2017.08.31"]
[Round "42"]
[White "Junior 7"]
[Black "Stockfish 8 64 SP"]
[Result "0-1"]
[ECO "C84"]
[Annotator "0.01;0.14"]
[PlyCount "170"]
[EventDate "2017.??.??"]
[TimeControl "40/300:40/300:40/300"]

{Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz 2833 MHz W=20.1 plies; 2,547kN/s;
642,823 TBAs B=32.9 plies; 1,500kN/s; 3,266,917 TBAs} 1. e4 e5 2. Nf3 Nc6 3.
Bb5 a6 4. Ba4 Nf6 5. O-O Be7 6. d3 d6 7. c3 O-O 8. Re1 Re8 9. Nbd2 Bf8 10. Nf1
h6 11. Ng3 b5 12. Bc2 {Both last book move} d5 {0.14/23 16} 13. a4 {0.01/16 17
(exd5)} dxe4 {0.00/22 7 (b4)} 14. Nxe4 {0.04/15 13} b4 {0.00/23 0} 15. a5 {
0.05/16 26 (h3)} Nxe4 {0.11/25 32 (Rb8)} 16. dxe4 {-0.07/14 7} Qf6 {0.21/25 0}
17. Qe2 {-0.11/15 9 (h3)} Bg4 {0.02/23 10 (Rd8)} 18. h3 {0.10/15 8 (Bd3)} Bxf3
{0.10/26 27 (Bh5)} 19. Qxf3 {0.21/16 6} Qxf3 {0.00/26 1} 20. gxf3 {0.15/17 10}
Re6 {0.13/26 6 (bxc3)} 21. Rd1 {0.38/16 6} Rb8 {0.12/26 20 (bxc3)} 22. Be3 {
0.45/16 7} Nd8 {0.16/26 36 (bxc3)} 23. Bd3 {0.46/16 24 (Rd5)} Rg6+ {0.00/25 5
(bxc3)} 24. Kf1 {0.53/17 6} bxc3 {0.00/29 2} 25. bxc3 {0.42/17 9} Rc6 {0.00/31
0} 26. Be2 {0.35/17 9 (Rab1)} Ne6 {0.00/29 10} 27. Rab1 {0.25/18 0} Ra8 {
0.00/27 2 (Rxb1)} 28. Rb3 {0.28/17 12 (c4)} Nf4 {0.00/30 23} 29. Bxf4 {0.22/19
0} exf4 {0.00/30 3} 30. Rd4 {0.23/18 4} Rc5 {0.00/29 6 (Bc5)} 31. Rd5 {0.28/19
8} Rxd5 {0.00/31 5} 32. exd5 {0.26/19 3} Bd6 {0.00/33 5} 33. Rb7 {0.27/21 3} g5
{0.00/32 1 (Be5)} 34. Bd3 {0.27/21 18} Kg7 {0.00/35 0} 35. Bc2 {0.26/21 10
(Be4)} Kf6 {-0.09/30 8} 36. Ba4 {0.32/20 2 (c4)} Ke5 {-0.12/28 11 (Kg6)} 37.
Bc6 {0.32/20 15} h5 {-0.20/30 0} 38. Ke2 {0.36/21 17 (Kg1)} f5 {-0.10/30 35
(Rc8)} 39. Kd3 {0.12/20 24 (Bd7)} g4 {-0.68/24 10} 40. hxg4 {-0.01/17 8 (Rb1)}
hxg4 {-0.89/27 17} 41. fxg4 {-0.01/19 0} fxg4 {-0.56/28 22} 42. Rb1 {-0.16/18
0 (Rxc7)} Rg8 {-1.06/24 7 (Rh8)} 43. Re1+ {0.00/17 13} Kf6 {-1.16/28 0} 44. Re2
{-0.26/17 12 (Ke4)} g3 {-1.86/25 8} 45. fxg3 {-0.44/17 1} fxg3 {-1.91/24 3} 46.
Rg2 {-0.56/18 9} Kg5 {-1.83/29 8 (Kf5)} 47. Ke3 {-0.69/18 9 (c4)} Rd8 {-2.29/
24 6 (Rf8)} 48. Kd4 {-0.89/18 16 (Ba4)} Kh4 {-2.92/30 20 (Kf4)} 49. Rg1 {
-1.57/18 10} Rg8 {-3.14/30 4} 50. Rh1+ {-1.70/18 10} Kg4 {-3.31/32 0} 51. Bd7+
{-1.83/19 22 (Kc4)} Kf3 {-4.34/30 13} 52. Bh3 {-1.83/19 0 (Bf5)} Rh8 {-6.97/27
7 (g2)} 53. Kd3 {-1.75/18 21 (c4)} Rxh3 {-10.93/27 23} 54. Rxh3 {-1.71/22 0}
Kf2 {-11.09/20 4 (Kg4)} 55. Rh7 {-1.53/21 9 (Rh6)} g2 {-13.10/22 12} 56. Rf7+ {
-1.90/21 8} Kg3 {-14.47/23 2} 57. Ke4 {-1.90/20 6 (Rg7+)} g1=Q {-128.36/30 5}
58. Rg7+ {-1.90/17 2} Kf2 {-128.40/36 0 (Kh2)} 59. Rxg1 {-1.90/18 4} Kxg1 {
-128.41/47 4} 60. Kd3 {-2.27/27 4 (Kf3)} Kf2 {-128.42/48 9 (Kg2)} 61. Ke4 {
-3.16/30 21} Bc5 {-128.43/52 0 (Ke2)} 62. Kd3 {-3.46/29 13} Be3 {-128.44/52 0}
63. Ke4 {-4.06/30 18} Ke2 {-128.45/50 0} 64. Kf5 {-4.06/27 6 (Ke5)} Kd3 {
-128.46/49 10} 65. Ke6 {-#16/28 0} Kc4 {-128.47/48 4 (Kxc3)} 66. Kd7 {-#16/27 8
} Kxd5 {-128.48/54 6} 67. Kxc7 {-#16/30 1} Kc5 {-128.49/64 1} 68. Kb7 {-#16/35
6} Kb5 {-128.50/77 2} 69. Kc7 {-#16/36 6} Kxa5 {-128.50/80 0 (Bg1)} 70. Kc6 {
-#16/1 0} Ka4 {-#16/0 0} 71. Kd5 {-#16/1 0} Kb5 {-#15/0 0} 72. c4+ {-#14/1 0}
Kb4 {-#14/1 0} 73. Ke4 {-#13/1 0} Bc5 {-#13/0 0} 74. Kd3 {-#12/1 0} a5 {
-#12/1 0} 75. Kc2 {-#11/1 0} a4 {-#11/0 0} 76. Kb2 {-#10/1 0} Kxc4 {-#10/0 0}
77. Kc1 {-#9/1 0} a3 {-#9/0 0} 78. Kc2 {-#9/1 0} Bb4 {-#8/0 0} 79. Kd1 {-#8/1 0
} a2 {-#7/1 0} 80. Ke2 {-#7/1 0} a1=Q {-#6/0 0} 81. Kf2 {-#5/1 0} Bd6 {-#5/1 0}
82. Kf3 {-#4/1 0} Qd1+ {-#4/1 0} 83. Kf2 {-#3/1 0} Kd3 {-#3/1 0} 84. Kg2 {
-#3/1 0} Qe1 {-#2/1 0} 85. Kf3 {-#1/1 0} Qg3# {-#1/0 0} 0-1

It seems that Junior's aggressive playing style - sacrificing material quite often - is not the best recipe to tackle Stockfish :-). This strategy was very successful in the days of Junior, Shredder and Fritz, but not today.

It might be interesting to find out if Fritz or Shredder from that timeframe would do a better job (i.e. scoring more than 4 draws). Their endgame is much better than Junior's, so my guess is that they might score a couple of extra draws.
Parent - - By Venator (Silver) [nl] Date 2017-09-02 07:41
Of course there are the usual short games in such matches, like the one below. Strange that Junior is still completely unaware of the danger at move 28 (Junior eval only -1.37, SF is already at +8).

[Event "SF8 SP vs J7, 5m/40+5m/40+5m/40"]
[Site "DEVO4"]
[Date "2017.08.31"]
[Round "103"]
[White "Stockfish 8 64 SP"]
[Black "Junior 7"]
[Result "1-0"]
[ECO "A65"]
[Annotator "0.47;0.19"]
[PlyCount "79"]
[EventDate "2017.??.??"]
[TimeControl "40/300:40/300:40/300"]

{Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz 2833 MHz W=34.4 plies; 1,101kN/s
B=18.8 plies; 2,566kN/s} 1. d4 Nf6 2. c4 c5 3. d5 e6 4. Nc3 exd5 5. cxd5 d6 6.
e4 g6 7. Nge2 Bg7 8. Ng3 {Both last book move} h5 {0.19/14 6} 9. Be2 {0.47/24
38 (h4)} h4 {0.01/14 6 (Nh7)} 10. Nf1 {0.75/20 4} O-O {0.03/15 10 (h3)} 11. h3
{0.66/20 8} Re8 {0.08/16 2 (Nh7)} 12. Bg5 {0.79/22 11 (Qc2)} Qa5 {0.21/16 18}
13. Qd2 {0.79/24 0} Qb4 {0.28/15 10 (c4)} 14. a3 {1.53/20 5} Qb3 {0.63/15 7}
15. f3 {1.60/23 4} Bd7 {0.80/15 11 (Nbd7)} 16. Ne3 {2.71/22 15 (Bxh4)} Nh5 {
0.90/16 30} 17. Bxh4 {3.07/26 0} Bh6 {0.94/14 8 (Be5)} 18. g4 {3.33/23 5} Nf4 {
1.06/15 23} 19. Bc4 {3.71/29 4 (g5)} Qb6 {1.06/15 7} 20. Ned1 {3.88/27 1 (g5)}
g5 {1.24/15 14} 21. Bg3 {3.65/29 0 (Bxg5)} Ng6 {1.09/16 16 (Qd8)} 22. h4 {
4.57/22 7} gxh4 {1.33/16 23} 23. Qxh6 {4.95/26 0} hxg3 {1.33/15 5} 24. Bf1 {
5.00/25 3 (Kf1)} Qd8 {1.14/15 43 (Na6)} 25. g5 {6.42/21 9} c4 {1.14/15 0} 26.
Ne2 {7.36/20 8 (Rc1)} Re5 {0.98/12 5} 27. Rh5 {7.37/22 0 (Nf4)} Qa5+ {1.41/13
12 (Rxe4)} 28. Ndc3 {8.02/22 10} Qb6 {1.37/14 5} 29. O-O-O {8.74/23 0} Qf2 {
1.63/14 10 (Qe3+)} 30. Kb1 {11.26/21 11} Qe3 {3.57/14 14 (Qxf3)} 31. Bg2 {
13.35/20 9} Na6 {7.52/14 7 (Nc6)} 32. Rdh1 {22.30/21 16} Rxg5 {7.96/15 1 (Bf5)}
33. Rxg5 {28.18/21 10} Qxg5 {9.15/16 3} 34. Qxg5 {32.27/20 0} Bc8 {9.45/12 0
(Bb5)} 35. Nf4 {#6/48 10 (Nxg3)} Bh3 {#6/12 1} 36. Nxg6 {#5/59 3} Kg7 {#5/15 0}
37. Ne5+ {#4/112 19} Kf8 {#4/56 0} 38. Rxh3 {#3/127 0} f6 {#3/62 0} 39. Qxf6+ {
#2/127 0 (Rh8+)} Ke8 {#2/62 0} 40. Rh8# {#1/127 0} 1-0
Parent - - By Peter Grayson (****) [us] Date 2017-09-02 10:19 Upvotes 2
Thanks for commenting on these games Jeroen.

Junior had a noticeably different, perhaps slightly unorthodox but effective style as I recall from the games I ran back then. Someone pointed out to me a while back that Junior was released specifically for use as an analysis engine although it played an interesting game. The ply depth was handled in a unique way with jumps of 2 or 3 ply in the early stages of move analysis before settling at the more conventional single ply increment. Perhaps this contributed to its interesting aggressive style. Stability issues with 10.0 and 10.1 when running with more than 1 thread caused me to drop the engine from matches and tournaments.

Where Junior 7 appeared to miss threats it was most likely because of search depth. Despite its accelerated handling of ply depth, it could not match the search depth of Stockfish 8, which often showed as much as 5 to 8 ply deeper in the indicated primary move depth.
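
The depth gap can be read straight out of the PGN annotations, which have the form {eval/depth seconds}. Below is a small sketch using moves 12 to 16 of the round 42 game posted above. It assumes that every move after the book exit is annotated, so the sides simply alternate; the function and variable names are made up for illustration.

```python
import re

# Matches annotations such as {0.14/23 16} or {-#16/28 0}: eval / depth seconds.
ANNOTATION = re.compile(r"\{\s*(-?#?\d+(?:\.\d+)?)\s*/\s*(\d+)\s+(\d+)")

def average_depths(movetext, first_mover="black"):
    """Average annotated depth per side, assuming annotated moves strictly alternate."""
    depths = {"white": [], "black": []}
    side = first_mover
    for _eval, depth, _secs in ANNOTATION.findall(movetext):
        depths[side].append(int(depth))
        side = "white" if side == "black" else "black"
    return {s: sum(d) / len(d) for s, d in depths.items() if d}

# Excerpt from round 42 (White: Junior 7, Black: Stockfish 8); first annotated move is Black's.
excerpt = (
    "12. Bc2 {Both last book move} d5 {0.14/23 16} 13. a4 {0.01/16 17 (exd5)} "
    "dxe4 {0.00/22 7 (b4)} 14. Nxe4 {0.04/15 13} b4 {0.00/23 0} "
    "15. a5 {0.05/16 26 (h3)} Nxe4 {0.11/25 32 (Rb8)} 16. dxe4 {-0.07/14 7}"
)
averages = average_depths(excerpt, first_mover="black")
print(averages)
```

On this short excerpt Stockfish averages about 23 ply against about 15 for Junior, in line with the gap described above.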

Before use, I ran the opening lines through a "doubles" checker so I hope there were no doubles in there but Stockfish 8 really impressed with 148 wins out of 152 games! That was some performance despite the Elo expectancy. Over the coming weeks I may well include some more engines for comparison including Komodo and probably an early Fritz, maybe Fritz 7, from around the same time as Junior 7. Shredder 7 that was released as a multithread engine supporting up to 8 threads may be the candidate for early SMP versus modern too. Unsure if there was any interest in this but having run this match I will run one or two more for comparison.

Peter
Parent - - By Banned for Life (Gold) Date 2017-09-11 03:56
I would love to see the results with Fritz 7. Just thinking back to the days when I was having great success with F7 and 1.b3 brings a tear to my eye...
Parent - - By brunjes (**) [us] Date 2017-09-12 12:54
I don't have Fritz 7. However, I downloaded Rybka 2.3.2a (available for free from Rybka website). I think many people will recall how strong this engine was in its day. It even won some matches against GMs with some odds (draw odds vs some, material odds vs some).

Well, CEGT website shows the rating of the 32-bit single core Rybka 2.3.2a to be 2776, and Stockfish 8 to be 3330. That's a difference of 554.

I ran a match between the two using the HERT book of opening positions. I've had good luck with this book when running Komodo vs Stockfish matches in the past. I ran a 2 minute per game time control with 3 seconds per move increment (on average, a game usually lasts between 7-10 minutes). Opening positions were repeated so that each engine got to play the same opening as both black and white.

I stopped the match after 183 games. The match results:

Stockfish 8   scored 175 wins and 8 draws
Rybka 2.3.2a  scored   0 wins and 8 draws

Arena 3.5.1 says this makes for a 676 rating difference (97.8% score for Stockfish 8).
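
For comparison, the rating difference can also be worked out by hand from the score. The sketch below inverts the plain logistic Elo formula; I do not know which model or table Arena uses, so a somewhat different figure from Arena's is to be expected.

```python
import math

def elo_difference(wins, draws, losses):
    """Return (score fraction, implied Elo difference) under the logistic model."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("rating difference is unbounded for a 0% or 100% score")
    return score, 400.0 * math.log10(score / (1.0 - score))

score, diff = elo_difference(wins=175, draws=8, losses=0)
print(f"score {score:.1%}, logistic Elo difference {diff:.0f}")
```

With 175 wins and 8 draws this gives the same 97.8% score and a difference of about 660, a little below Arena's 676 but still well above the 554 from the rating list.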

Given how strong Rybka 2.3.2a was in its day, it is amazing to see just how much stronger Stockfish 8 is. And GMs were hesitant to play Rybka back then due to its great playing strength...
Parent - By InspectorGadget (*****) [za] Date 2017-09-13 20:46
Scary thought :eek:

Rybka 2.3.2a was strong
Parent - By Lazy Frank (***) [gb] Date 2017-09-14 08:09
Hello!
Just for fun I ran a short bullet match: latest SF (bench 6351176) vs Rybka 4.1 SSE.
TC: 2+6, SALC v3 opening set.

SF - 20 games, 18 wins, two draws, no losses.
Amazing ... :roll:
Parent - By Venator (Silver) [nl] Date 2017-09-13 17:50
Hey, BFL, you are back!
Parent - - By Dr.X (Gold) Date 2017-09-16 23:02 Edited 2017-09-16 23:17
The question that arises for me is this. When it comes to a rather difficult  position to be  analyzed - does elo give a chess engine, commitment with advanced hardware- increased ability to find the best move through speed in calculation?! Whereas I've noted Zappa will hit on the move frequently faster without having to go through to depth 34 or 35 to make a final determination. Of course with more advanced hardware the time speed fact is exponentially whittled down!

I've found it interesting just to test older engines against certain positions in analysis to see how they come to  match up in analysis to Stockfish and Houdini! Komodo will occasionally have a partisan theory on a move against that of Stockfish! But, then I'm only running an i7-6900K. So! What do I know?!

Don't get me wrong I love candidate moves they are lovely to think about! There is just so much you can do with a meatball before you ruin  both the sauce and the spaghetti!
Parent - - By Peter Grayson (****) [gb] Date 2017-09-17 01:15

> The question that arises for me is this. When it comes to a rather difficult  position to be  analyzed - does elo give a chess engine, commitment with advanced hardware- increased ability to find the best move through speed in calculation?!


That has always been the unanswered question. The problem with today's engines is that the criteria used to obtain excellent game results are not necessarily conducive to always finding the best move for any particular position for analysis purposes. Indeed, a good move will often suffice to secure the win rather than looking for the absolute best move. However, it is more likely that the top three engines will find the best move faster, though that is not guaranteed. Running the engines through well-established test sets confirms this: there are still some positions the "old school" engines are able to solve faster, usually those of a more tactical orientation. Today's engines are better at finding small improvements through the course of a game that eventually secure the win, whereas the more thorough analysis of the traditional engines misses the deep, depth-related moves that produce those small advantages.

That is my take on it.

Peter
Parent - - By Dr.X (Gold) Date 2017-09-17 03:27 Edited 2017-09-17 03:31

> The problem with today's engines is that the criteria used to obtain excellent game results are not necessarily conducive to always finding the best move for any particular position for analysis purposes


What I'm gleaning from  your statement is that modern chess engines are developed more to function at peak performance  vis-à-vis against another chess engine in competition, than as a compartmentalized chess tool for analysis purposes! I'm not a programmer so I have no idea why when so many consumers call for better chess analysis tools programmers go for the trophies in engine comp!

I am not of the ilk that really believes that gamers are going to learn very much about their own game from studying thousands of Stockfish vs Komodo games, which I seriously doubt they will sit down and do anyhow. It is a totally different genre!
Parent - By Peter Grayson (****) [gb] Date 2017-09-17 17:55

> What I'm gleaning from  your statement is that modern chess engines are developed more to function at peak performance  vis-à-vis against another chess engine in competition, than as a compartmentalized chess tool for analysis purposes! I'm not a programmer so I have no idea why when so many consumers call for better chess analysis tools programmers go for the trophies in engine comp!


There is much prestige in an author's engine appearing at the top of the rating lists, particularly for the commercial releases when these days most people judge an engine by its long term performance as opposed to a "flash in the pan" performance in a single competition albeit there is some prestige attributed to winning the TCEC tournaments because of the publicity. It was noticeable last year the improvement achieved in Stockfish to produce the "8" engine seemed to shock its competitors as well as many of us as to just how strong it was. I had the sense the Stockfish people wanted to have something to show for their efforts and pulled out all the stops to achieve it.

Robert Houdart includes a variation within his Houdini engine by means of a "Tactical" switch that, when selected, makes the engine perform better for analysis but, as match results have shown, to the detriment of game play. There are also variants of the Stockfish engine with changes that focus more on mates or analysis, as well as on longer time controls specifically for correspondence players.

One of the more noticeable changes in position solving performance has been that from the Komodo camp with the improvements from 9.42 through to Komodo 11.2.2, the latest release.

My own view is that one of the barriers to producing the best analysis tool is the apparent heavy pruning or discarding of other moves that occurs once a good move has been found. For example, there may be multiple iterations at the same ply depth with no change in output, and then, having completed its analysis at that ply depth, the engine moves to the next ply without any indicated analysis of the remaining moves. Clearly this suits game playing better than analysis, where a more thorough approach is surely required. However, that thoroughness takes time, which is obviously limited in time-control games.

Overall, though, the performance of today's engines is significantly better than that of any of the older engines. For example, in the Arasan 18 position test suite, Zappa Mexico II x64 is well down the list and definitely in the "old engines" performance category (list attached). With the 30s per position limit, modern engines perform remarkably well compared to their older counterparts, which highlights that despite the continued focus on game results, analysis capability has also benefited.

Peter
Attachment: Arasan18teststo170912.pdf (50k)
Parent - - By Dr.X (Gold) Date 2017-09-21 00:56
What is your take on using a book in an engine match with minimal  opening moves? For example, French def. ( coo) 1.e4 1...e6  2.Nf3 Black out of book 2...d5 White out of book 3.exd5 3...exd5  4.d4 5Bd3

To a large extent the engines follow Noomen's book to a "certain extent" and up to a point - even while being out of book. 5..Ne7 6.0-0 ...

I won't go through the entire game.

I was talking with a friend about Komodo 11.2.2 and his take was lukewarm. By his standards it was missing tempo by choosing in some cases the wrong candidate move?! Not that the move was necessarily a bad move but not the best move. Honestly, I haven't looked at any of Komodo 11.2.2 games to pass any judgement either way.

So there are two separate questions here - one, about running tests with minimal moves before going out of book and then that issue concerning Komodo 11.2.2.
Parent - - By Peter Grayson (****) [gb] Date 2017-09-21 16:42

>What is your take on using a book in an engine match with minimal  opening moves? For example, French def. ( coo) 1.e4 1...e6  2.Nf3 Black out of book 2...d5 White out of book 3.exd5 3...exd5  4.d4 5Bd3


My choice has changed over the years. I have never used short line books because it seemed to me that engines were and perhaps still are not so good in the opening phase of the game and games from such short lines have little value in assessing the engine. There is always the possibility something new may be found but much of the opening theory has been sufficiently tested over a long period of time to take us where we are today in that realm. I have also placed more importance on how we play the game rather than say a pure but perhaps dull approach from the engines.

Currently I use fixed opening lines in pgn format derived from contemporary games by 2400+ players over the last 12 months or so. Engines have no influence in the chosen lines or opening move cut off points with the choice primarily based on popularity with some other criteria if there is an important thematic to consider.

Using fixed opening lines permits each engine to play each line with both colours and from my perspective gives a much higher degree of certainty in the result than the unwanted imbalance and consequential variability from the use of opening books. The Noomen lines presented for download on the Rybka site are certainly worthy and are very good reference lines that I have used from time to time but I wanted something perhaps a little more reflective of club, County and Federation standard with a couple of my own favourites thrown in for fun too.

>I was talking with a friend about Komodo 11.2.2 and his take was lukewarm. By his standards it was missing tempo by choosing in some cases the wrong candidate move?! Not that the move was necessarily a bad move but not the best move. Honestly, I haven't looked at any of Komodo 11.2.2 games to pass any judgement either way.


Missing key moves is not the sole preserve of Komodo. Stockfish has its failings too, as does Houdini, but the styles of the three engines are different. An analogy would be a GM missing a key move in a position with which they are unfamiliar.

>So there are two separate questions here - one, about running tests with minimal moves before going out of book and then that issue concerning Komodo 11.2.2


Selection of engine, opening, depth of line are all influenced by personal choice too. I have tended to consider Komodo as more of a positional engine that may not be so attractive for people looking for aggressive play but, as with the opening lines, choice will probably come down to the engine you believe better represents your style?

Belatedly just caught up with outstanding test of Komodo 11.2.2, see below

Peter
Parent - By Dr.X (Gold) Date 2017-09-21 22:02
I had to cancel out the match. What ended up happening is a number of drawn games. I went back to Perfect t  with 10 move limit. Ponder off , 2048 hash.
Parent - - By Dr.X (Gold) Date 2017-09-21 22:40 Edited 2017-09-21 22:55
I appreciate the recent Komodo vs Stockfish 8 results. But I am more interested in the recent development results in Stockfish vs Komodo! To my way of thinking Stockfish 8 is old hat!

Perhaps that is the difference between someone into game analysis and someone testing engine vs engine matches, where the most recent initial version releases are the only relevant versions to be tested.

I consider the versions presented in the development area of Stockfish more relevant for use and testing. Not all are perfect in their evolution, but just wait a day or two and the kinks and wrinkles are ironed out.
But by and large I'd bet that the most recent development Stockfish has moved ahead of Stockfish 8?!
Parent - - By Peter Grayson (****) [gb] Date 2017-09-21 22:54

> I am more interested in the recent development results in Stockfish vs Komodo! To my way of thinking Stockfish 8 is old hat!


I am having to catch up now after taking a summer break from computer chess. Generally though, the Stockfish improvements seem to have plateaued since late May with perhaps asmFish 2017-05-22 being the strongest. The last I tested was 2017-05-16

Peter
Parent - - By Dr.X (Gold) Date 2017-09-21 23:06

> I am having to catch up now after taking a summer break from computer chess.


I haven't  been doing any of this kind of testing in over a year. ( Actually, I think it is over a year, may be two years!).
Now, you have my curiosity piqued. Next test would almost have to be Stockfish 8 vs the latest Stockfish in the development area.
Parent - - By Peter Grayson (****) [gb] Date 2017-09-22 00:19

> Next test would almost have to be Stockfish 8 vs the latest Stockfish in the development area.


Just downloaded Houdini 6 Pro so will probably be my next test. Currently looking at Andscacs 0.92 and Fizbo 1.9.

The Stockfish 8 engine downloadable from the official site is a faster compile than the development engines. The single-thread timings to the same output highlight the speed differences of the engine compiles ...

Engine                          kN      time(s)  kN/s  Speed inc
SF8 Official                    142000  129      1100    0.00%
SF8 MZ with LP                  142000  119      1193    8.40%
SF8 FB                          142000  124      1145    4.03%
SF8 DEV                         142000  132      1075   -2.27%
pedantFishW_2016-11-04_base     142000   96      1479   34.38%
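
The derived columns can be reproduced with a few lines. This sketch assumes the node count is in thousands of nodes (which is what the kN/s figures imply), that kN/s is truncated rather than rounded, and that the speed increase is measured from the times against SF8 Official; the function name is made up for illustration.

```python
def speed_row(name, knodes, seconds, baseline_seconds):
    """Return (name, kN/s truncated to an integer, % speed gain over the baseline time)."""
    knps = knodes / seconds
    gain = (baseline_seconds / seconds - 1.0) * 100.0
    return name, int(knps), round(gain, 2)

timings = [
    ("SF8 Official", 129),
    ("SF8 MZ with LP", 119),
    ("SF8 FB", 124),
    ("SF8 DEV", 132),
    ("pedantFishW_2016-11-04_base", 96),
]
baseline = timings[0][1]  # SF8 Official is the reference compile
rows = [speed_row(name, 142000, secs, baseline) for name, secs in timings]
for name, knps, gain in rows:
    print(f"{name:<30} {knps:>5} kN/s {gain:>7.2f}%")
```

Since every compile searches the same number of nodes to the same output, the time ratio alone gives the speed gain.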
Parent - By Dr.X (Gold) Date 2017-09-22 00:53

> The Stockfish 8 engine downloadable from the official site is a faster compile than the development engines


I'm beginning to note that myself. Currently running a test on the latest development version against 8! Houdini 6 Pro sounds very promising. I look forward to getting it by the first week in Oct.
Parent - By Dr.X (Gold) Date 2017-09-22 01:04
I'm not overclocking - I just stopped pushing the envelope after two machines black screened on me within a week's time last year. It was a coincidence since I ran them hot for over 5 years so...it was time for them to both fry! This time around I am satisfied with getting 17 to 19,000 kN/s without breaking a sweat at low volts and 41C on all CPU cores as well as 32C on the mobo. If I do crank it to turbo, I get 26,000 kN/s, without volt increase, just wattage increase. Fans start whizzing! LOL.
Parent - - By Kappatoo (*****) [us] Date 2017-09-22 14:14
Very interesting test! Do you know what was the Junior version that played against Kasparov in 2003 and how it compares to Junior 7? (See also my post on 'famous' engines vs. modern top engines.)
Parent - - By Peter Grayson (****) [gb] Date 2017-09-22 21:01

> Do you know what was the Junior version that played against Kasparov in 2003 and how it compares to Junior 7?


I believe it was a development version preceding (Deep) Junior 8. The public release of Junior 8 was always a disappointment because when put into the position of the game, it failed to play the famous Bishop x h2+ sacrifice that made the headlines. Thus the engine used in that game was never made public as far as I am aware.

PeterG
Parent - By Kappatoo (*****) [us] Date 2017-09-23 01:54

> it failed to play the famous Bishop x h2+ sacrifice


Probably due to a bugfix ;)
Parent - - By rocket (***) [se] Date 2017-10-09 11:52

>I believe it was a development version preceding (Deep) Junior 8. The public release of Junior 8 was always a disappointment because when put into the position of the game, it failed to play the famous Bishop x h2+ sacrifice that made the headlines. Thus the engine used in that game was never made public as far as I am aware.<


Isn't it strange that Kasparov never raised that complaint? The whole point about the Junior match was the ability to reproduce the games...:grin:
Parent - - By Labyrinth (*****) [us] Date 2017-10-10 00:49

>Isn't it strange that Kasparov never raised that complaint?


Nah.

*You wouldn't cheat with a speculative sacrifice, that would be insane.

*Garry's demands on how the match was conducted were much more stringent after Deep Blue '97 such that there would be more protection from foul play.

*There wasn't a very large billion dollar company behind the match with billions on the line either, so they had less motivation to cheat and fewer resources to make it possible.

*The Deep Junior match was played 6 years after the Deep Blue match, at which point both engine and engine hardware had improved a great deal. So while the Bxh2 move was unusual, it wasn't unusual to the point as to be suspicious.

*The move very well could have been reproduced at the venue in private.
Parent - - By MarshallArts (***) [us] Date 2017-10-10 00:55
Still, it remains a mystery as to why that version of Junior was never released.

:roll:
Parent - - By rocket (***) [se] Date 2017-10-10 10:35
I think you are phrasing it incorrectly. It's very strange why the Junior commercially available, under the same version, does not consider the move.
Parent - By MarshallArts (***) [us] Date 2017-10-10 17:29
Phrased correctly or not, I would have liked to have the version that played GK and produced that move.

:yell:
Parent - By Peter Grayson (****) [us] Date 2017-10-10 22:52

> It's very strange why the Junior commercially available, under the same version, does not consider the move.


It was never released. The Junior 8 CD was supposed to have two engines on it being the engine used against Kasparov and the updated Junior 8. It never happened. See the article pages 9 and 10 ...

http://www.chesscomputeruk.com/SS_107.pdf

PeterG
Parent - - By rocket (***) [se] Date 2017-10-10 10:37

>*You wouldn't cheat with a speculative sacrifice, that would be insane.<


Not for marketing purposes it isn't

>*The move very well could have been reproduced at the venue in private.>


Why would Kasparov have access to their exact version? That would be insane. Point is that the later commercially released version does not play the move
Parent - - By Labyrinth (*****) [us] Date 2017-10-10 15:47

>Why would Kasparov have access to their exact version?


Wouldn't need to, just needed to sufficiently demonstrate that the machine produced the move by the operating team.

>Not for marketing purposes it isn't


As long as:

A. The move doesn't lose.

B. It can be reproduced sufficiently so as not to arouse suspicion.

If you can be at least certain enough of A, then it isn't that speculative. If you cannot, and it loses, it's bad. I don't think unsound sacrifices are really a selling point.

I have no idea why the commercial version didn't have this feature, that's something to ask the Junior team about. Probably it wasn't intentional, like they probably made some changes to the code after the match that shored up what they felt were glaring holes in its game, only to find that sadly it no longer played this Bxh2 in that exact position.
Parent - By MarshallArts (***) [us] Date 2017-10-10 17:36
They could have released the match version, warts and all, along with the commercial product, as a separate bonus version.

:sad:
- By rocket (***) [se] Date 2017-10-10 10:47
Having played a fair amount of games at tournament time controls (120/40), it is my opinion that Rybka 1.2 (the last Rybka 1 version) would beat Carlsen in a match at 120/40 by some margin.

1.2 is very, very, strong when given adequate time to ponder.

It appears that SSDF's 2909 is at the lower end of the spectrum.