The Rybka Lounge / Computer Chess / Comparing Pure NNUE and the Hybrid option.
- By Peter Grayson (****) Date 2020-08-31 14:28 Upvotes 2
With the introduction of the hybrid engine, the decision to remove the pure nnue evaluation option from the Stockfish end user's choices seemed strange, because all of the excitement and euphoria about the surge in playing strength surrounded the introduction of the neural network, and in many position test sets it was scoring significantly better than its classical counterpart. The classical engine appeared no match for the nnue engine, yet it was kept as one of the two binary options, the hybrid option being the default.

The purpose of this tournament was therefore to compare the performance of the last available pure Stockfish nnue engine with the hybrid. The Houdini 6.03 Pro and Komodo 11.3.1 engines were included to give an indicator of how the two would compare against other opposition. Both of the latter engines are several years old now but, as it turned out, had some influence on the final outcome of the tournament. It is unclear how using hyper-threading affected Komodo 11.3.1: the CCRL 40/15 ratings suggest that engine and Houdini are neck and neck, but clearly that was not the case here.

Delaying the tournament a few days would have permitted the use of the Chessman compiles, whose 3-way radio button evaluation selection switch may have offered a better comparison of the nnue and hybrid engines, with both engines having the same time stamp and thus the same updates. However, being unaware that the engine was in the offing, I started the tournament as presented. Another 12 day tournament so soon after this one is not practical. Irrespective, this match gave some interesting information about the two nnue versions tested here.

Match conditions:
PC: Intel i7-8700 + 64 GB RAM, hyper-threading on
O/S: Windows 10 64 Pro v2004
Engines:
abrok Stockfish 060820 +NNUE 64 BMI2 + Sergio 20200801-2257.bin network, the last of the pure nnue engines available from abrok.
abrok Stockfish 160820 +NNUE 64 BMI2 + Sergio 20200801-2257.bin network, running in hybrid default mode.
Houdini 6.03 Pro x64-pext
Komodo 11.3.1 64-bit BMI2
GUI: Fritz 14 64 bit update 46. No adjudication, games played to a finish.
Engine resources: 12 Threads + 4 GB hash per engine. Ponder off.
Time control: G/10 minutes + 3s/m
Openings: Fixed PG 260816 x50 lines openings set. Each engine plays both colours.
Fritz 14 Tournament mode. Engines are unloaded after each game ensuring clean hash for next game. No adjudication.
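The size of the event follows from the conditions above: four engines give six pairings, and each pairing plays the 50 opening lines with both colours. A minimal Python sketch of that arithmetic is below. The variable and function names are mine, and the clock estimate assumes the 3 second increment is added after every move for each side and ignores any GUI overhead, so treat it as a rough upper bound rather than a measurement.

```python
from itertools import combinations

# Engine labels are shorthand for the four entries listed above.
engines = ["Stockfish hybrid", "Stockfish pure nnue", "Houdini 6.03", "Komodo 11.3.1"]
opening_lines = 50   # fixed openings set
colours = 2          # each engine plays every line with White and Black

pairings = list(combinations(engines, 2))
games_per_pairing = opening_lines * colours
total_games = len(pairings) * games_per_pairing
games_per_engine = (len(engines) - 1) * games_per_pairing


def max_clock_seconds(moves: int, base: int = 600, increment: int = 3) -> int:
    """Upper bound on total clock time for a game of `moves` moves per side,
    assuming both sides use their whole base time plus every increment."""
    return 2 * (base + increment * moves)


print(len(pairings), games_per_pairing, total_games, games_per_engine)
print(max_clock_seconds(80) / 60, "minutes at most for an 80-move game")
```

This gives 6 pairings of 100 games each, 600 games in all and 300 per engine.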

Games available from:

http://www.mediafire.com/file/4pjpq9h51bsjh49/Stockfish_160820_vs_060820_vs_Houdini_6_03_vs_Komodo_11_3_1.zip/file

There was no doubt that one of the Stockfish engines would finish top, but which one, or would they be equal? After the initial batch of games the scores changed very little. The interesting aspect was that the two Stockfish engines were difficult to separate against each other, although their styles were different. The major influencing factor was in fact Houdini 6.03. As late as the completion of Round 94, after 564 games, the Stockfish pair's scores against each other were equal and both had the same high score against Komodo 11.3.1, but the hybrid engine had scored 6 more points against Houdini than the pure nnue engine had, so Houdini was the thorn in the pure nnue engine's side!

By tournament end the pure nnue engine had taken a 1 point advantage in its match against the hybrid engine, winning it 50.5-49.5. The hybrid engine, however, kept its superior score against Houdini 6.03 and now also bettered the pure nnue engine's score against Komodo 11.3.1 by 1.5 points. That gave the top placing to the hybrid engine, though not as undisputed champion.

Game 1. The very first game highlighted how much today's Stockfish engines have moved on, with the Hybrid engine here playing some fascinating chess. Key points were tying Black's King down to h8 and White's King manoeuvre to e6. Komodo saw it was lost about 30 moves too late. The nature of the games produced reflects the opportunities the opening line offers to the engine and, of course, to its opponent.

Game 24 registered the quickest win of the tournament, with the hybrid engine playing White in the B81 Sicilian Scheveningen, Keres Attack line. It polished off Houdini by delivering mate at move 42.

Games 30 and 31. The Stockfish engine delivered mate at move 86 in both of its games with Black against Houdini and Komodo in the B84 Sicilian Najdorf 6.Bg5..Nbd7 line, even though the games diverged early on at White's 15th move.

Game 94 produced the first win between the two Stockfish engines, in the B22 Sicilian 2.c3 line. It all looked reasonably balanced until the Hybrid engine, playing Black, found itself in the well recorded Stockfish 0.00 score syndrome. It was totally oblivious to the advantage the pure nnue engine had obtained until its score jumped from 0.00 to 5.51 against it after 41.Rc2. Despite the new network assistance, some old problems that cost results clearly remain! With both engines winning with White in the C11 and B19 lines, the Hybrid engine had to wait until game 232 for its balancing win.

Game 232. The hybrid engine finally drew level against the pure nnue engine in their own battle, with the hybrid playing White in the B06 Modern opening line. There was no absolute blunder from the pure nnue engine, but like so many of its predecessors it struggled with the Black line and was unable to find anything from its King's side fianchetto and hypermodern set-up. The hybrid engine managed a draw in the return game.

Game 284. Houdini managed a single win against Stockfish, winning with White against the pure nnue engine in the E10 Blumenfeld Gambit line. Stockfish gave up the h7 pawn and then gave up the exchange, rook for knight, to recover the pawn. That was too much of an advantage to hand Houdini, and the nnue engine was unable to bully its Queen's side pawn majority through the Houdini defence for a win. Komodo failed to register a win against the Stockfish engines.

Game 370. Now it was the hybrid engine's turn to suffer playing Black in a King's side fianchetto line. The E68 King's Indian Fianchetto opening saw the hybrid engine again make a big blunder, highlighted after it played 30..Bxe4 and the pure nnue engine's score jumped with 31.Nxd4 5.98/28. The hybrid engine remained oblivious with its reply 31..exd4 0.74/38. How the hybrid engine could miss it when searching 10 ply deeper than White will remain a mystery.

Game 391 was the quickest Black win, with the hybrid engine playing Black against Komodo in the D12 Slav Defence line. Komodo ran its clock down too quickly, leaving it reliant on the per-move increment, and it failed to reply after 48..Rb5. This was the only time loss of the tournament. The engine looked to hit an internal problem: in analysis, the short move extensions suggested it expected a draw by repetition around move 41 and went askew when Stockfish avoided it. It is unclear why Komodo did not just play Rxe6 at some point. It did not seem worth replaying the game.

Game 396 highlighted that something was missing from the hybrid engine's play, preventing it from quickly finishing off its opponent using some basic endgame strategy.

Game 581 produced a 15 move draw when Stockfish pure nnue saw merit in accepting a quick 3-move repetition against Komodo in the B03 Alekhine's Four Pawns attack line. 10.d5 looks dubious when 10.Be2 may offer better opportunities.

Game 594. Stockfish hybrid with Black also took a 20-move draw against Houdini in the B01 Scandinavian Defence line.

Game 598. Another blind miss by the hybrid engine in the B01 Scandinavian line, when the score swung swiftly in favour of the nnue engine after 23..b6. 23..Bxd2 may be better, but the engine was already struggling with its game.

Final score ...
#  Engine                     Elo   1            2            3            4            Score
1  Stockfish 160820-H2257     3562  *-*          49.5 - 50.5  77.0 - 23.0  76.0 - 24.0  202.5/300
2  Stockfish 060820-N2257     3549  50.5 - 49.5  *-*          71.0 - 29.0  74.5 - 25.5  196.0/300
3  Houdini 6.03 Pro x64-pext  3399  23.0 - 77.0  29.0 - 71.0  *-*          59.5 - 40.5  111.5/300
4  Komodo 11.3.1 64-bit BMI2  3357  24.0 - 76.0  25.5 - 74.5  40.5 - 59.5  *-*           90.0/300
Average elo: 3466 <=> Category: 49
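For anyone wanting to turn the 100-game mini-match scores in the crosstable into rating differences, here is a rough Python sketch using the standard logistic Elo formula, diff = 400 * log10(p / (1 - p)), where p is the score fraction. This is not necessarily how Fritz arrived at the ratings it displays, which is not stated here. The function and dictionary names are mine, and with only 100 games per pairing the error bars on any such figure are wide.

```python
import math


def elo_diff(points: float, games: int) -> float:
    """Elo difference implied by a score under the logistic Elo model."""
    p = points / games
    if not 0.0 < p < 1.0:
        raise ValueError("score must be strictly between 0% and 100%")
    return 400.0 * math.log10(p / (1.0 - p))


# Points scored by each engine against each opponent (100 games per pairing),
# copied from the crosstable above.
crosstable = {
    "Stockfish hybrid": {"Stockfish pure nnue": 49.5, "Houdini": 77.0, "Komodo": 76.0},
    "Stockfish pure nnue": {"Stockfish hybrid": 50.5, "Houdini": 71.0, "Komodo": 74.5},
    "Houdini": {"Stockfish hybrid": 23.0, "Stockfish pure nnue": 29.0, "Komodo": 59.5},
    "Komodo": {"Stockfish hybrid": 24.0, "Stockfish pure nnue": 25.5, "Houdini": 40.5},
}

for engine, row in crosstable.items():
    print(f"{engine}: {sum(row.values())}/300")
    for opponent, points in row.items():
        print(f"  vs {opponent}: {elo_diff(points, 100):+.0f} Elo")
```

On this model the 77.0 - 23.0 result against Houdini corresponds to a gap of roughly 210 Elo, while the 50.5 - 49.5 result between the two Stockfish engines is only a few Elo and well inside the noise for 100 games.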

Comment:
Since starting the match there have been some engine releases addressing the loss of the pure nnue option. Chessman's offerings of the Cfish engine now provide a three radio button option to select between the Classical, Pure nnue and Hybrid engines. It works perfectly and has been an unexpected bonus. ShashChess 13.1 also offered a similar option but has now gone fully nnue.

An interesting but time consuming exercise at these time controls. There was evidence that some of the Stockfish engine issues remain, particularly the hybrid engine falling foul of the dreaded 0.00 score syndrome, which cost it at least two games against its pure nnue counterpart. At the same time, however, there was clear evidence of very sophisticated play that left its opponent bewildered, as seen in the very first game.

There was also the issue in Game 396 of the hybrid engine dithering in Lc0 fashion instead of briskly finishing off its opponent as though it was missing some basic and fundamental rules for endgame play.

Deeper analysis of the games may reveal more but the good thing is if those issues can be addressed then even bigger gains must surely follow.

Compared with previous matches run against the original Stockfish classical engine, the benefit of around 50 Elo over the classical engine appears to stand up to scrutiny under these tournament conditions, using the CCRL 40/15 ratings as a reference. It will be interesting to see how the Elo scaling stands up when running under the TCEC hardware and time control conditions.

Peter