Rybka Chess Community Forum
Up Topic The Rybka Lounge / Computer Chess / Game examples of NN engines failing to convert wins
- - By rocket (****) Date 2020-06-14 10:00
It was raised that one implication of NN engines' selectivity is a failure to convert wins.

I want to see an example of how severe this is.
Parent - - By user923005 (****) Date 2020-06-14 13:00
Every engine fails to convert wins at times.  Since the NN engines are neck and neck with Stockfish at TCEC, it shows that the problem is not the end of the world.

They do not search as deep, so when the AB engines start hitting 70 plies and are seeing far ahead of the NN engines (consulting the TB files like madmen) then the AB engines do gain an advantage over the NN engines in the endgame.

But since they have often built a substantial advantage through the quiet moves where they excel, more often than not they do convert the won position into a win.

But if you are a correspondence player, it is good to consider where the engines are strong and weak and use them for their strengths and avoid their weaknesses.
Parent - By rocket (****) Date 2020-06-14 13:32
Yeah but I want to see a game of this.
Parent - - By h.g.muller (****) Date 2020-06-21 05:27
This is not an intrinsic flaw of NN engines. These just do what they are trained to do. And the people developing them are in the habit of actively training them to not be able to convert wins.

The problem is that the evaluation head of the NN is trained to reflect the pure winning probability. When that succeeds, they can no longer discriminate positions that are rapidly won from positions that are very tedious to win; they all get the same score. Basically they turn into random movers, only avoiding moves that spoil the 100% win. Take KBNK as an example. That can take up to 33 full moves, and there is no way even an AB engine is going to reach 66 ply in a fixed-depth search. All positions where you still have the B and N are 100% won, so it has no clue what to reduce and what not; it cannot be selective. Of course it sees the 50-move counter go up, but once it sees that as a serious problem, it is too late to do anything about it; you cannot waste too many moves in KBNK. And what could it do? There is nothing to capture, and forcibly sacrificing the B or N won't help. After fooling around for 20 moves, allowing the opponent to stay close to maximum DTM, it will just have to concede that it is now in a drawn end-game. You can only win KBNK through search (as opposed to EGT) when you have a sub-goal of driving the bare King to an edge, and then along an edge to the proper corner. If you don't mind whether the King is in the center or near a corner, because it is all 100% won anyway, there is no hope. And if it had extra material, like KBNNK, it would allow the excess Knight to be taken without offering any resistance, even happily offering it, because hey, KBNK is 100% won anyway, right?

Yet, with a good evaluation, you would not need much depth to win end-games like KBNK. The Interactive Diagrams you can post on TalkChess only search 2 ply deep, and they can already convert KQK, and KRK most of the time. So lack of depth is hardly an excuse.
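To illustrate what such a "good evaluation" could look like, here is a minimal sketch of a mop-up style scoring function for bare-king endings. It rewards pushing the bare King to an edge, bringing the strong King close, and (for KBNK) steering the bare King toward a corner of the bishop's colour. The function names, the `(file, rank)` square encoding, and all weights are assumptions made up for this example; they are not taken from the Interactive Diagram or from any engine.

```python
# Squares are (file, rank) tuples with 0..7 on each axis; a1 == (0, 0).
# All names and weights below are hypothetical, chosen only for illustration.

def edge_distance(sq):
    """Number of king steps from sq to the nearest board edge (0 on the edge)."""
    f, r = sq
    return min(f, 7 - f, r, 7 - r)

def king_distance(a, b):
    """Chebyshev distance: the number of king moves between two squares."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def corner_distance(sq, light_bishop):
    """King distance from sq to the nearest corner of the bishop's colour."""
    # a8 and h1 are light squares; a1 and h8 are dark squares.
    corners = [(0, 7), (7, 0)] if light_bishop else [(0, 0), (7, 7)]
    return min(king_distance(sq, c) for c in corners)

def mop_up_score(strong_king, bare_king, light_bishop=None):
    """Higher is better for the side with material.

    light_bishop: True/False for KBNK (colour of the bishop), None otherwise.
    """
    score = 10 * (3 - edge_distance(bare_king))               # bare King to an edge
    score += 4 * (7 - king_distance(strong_king, bare_king))  # own King close by
    if light_bishop is not None:
        # Only a corner of the bishop's colour allows the KBNK mate.
        score += 6 * (7 - corner_distance(bare_king, light_bishop))
    return score
```

With terms like these, even a very shallow search has a gradient to follow: positions with the bare King on the edge outscore positions with it in the center, although both are theoretically won.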

What would solve the problem is taking the game duration into account during the training process, i.e. train the evaluation head to read, say, 70% on a certain win that is slow, leaving the remaining 30% range to discriminate between certain wins that are fast and certain wins that are slow. I.e. teach it to make progress, rather than teaching it that it doesn't matter whether you make progress.
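A minimal sketch of such a duration-aware training target, under stated assumptions: the 70% floor comes from the suggestion above, but the linear interpolation, the 50-move horizon, and the names `value_target`, `slow_floor` and `horizon` are invented for this example and do not describe how Lc0 or any other engine is actually trained.

```python
def value_target(result, moves_to_finish, slow_floor=0.70, horizon=50):
    """Expected-score training target in [0, 1] from the side to move's view.

    result: +1 for a win, 0 for a draw, -1 for a loss (final game outcome).
    moves_to_finish: full moves that were still needed to end the game.
    slow_floor: target for the slowest wins (hypothetical default of 0.70).
    horizon: number of moves at which a win counts as "slow" (assumption).
    """
    if result == 0:
        return 0.5
    # 0.0 for an immediate result, 1.0 for a result at or beyond the horizon.
    slowness = min(max(moves_to_finish, 0), horizon) / horizon
    win_value = 1.0 - (1.0 - slow_floor) * slowness
    # Losses mirror wins: a fast loss reads 0.0, a slow loss reads 1 - slow_floor.
    return win_value if result > 0 else 1.0 - win_value
```

Under this target a win in 5 moves scores higher than a win in 30, so a move that makes progress looks better than one that merely keeps the win, which is the discrimination the pure win-probability target lacks.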
Parent - - By Labyrinth (*****) Date 2020-06-21 16:03
Would training the NN on the tablebase be enough to fix things like that?
Parent - - By user923005 (****) Date 2020-06-22 01:52
It would probably be helpful to add some TB training, but LC0 and the like already have access to perfect TB information during game play.  So why do these NN engines lose in the endgame?  Maybe part is lack of endgame knowledge, but I guess that most of the problem is simply time to depth.  The NN engines search much more slowly, and AB engines accelerate as the board clears.  I often see depths over 100 with AB engines in the endgame.
Parent - By h.g.muller (****) Date 2020-06-22 08:46
It is very easy for NN engines to reach depth over 100 as well. In general NN engines should reach much larger depth than AB engines, even in the middle game.
Parent - By Clementin (*) Date 2020-06-14 17:20
There are much worse cases than these, but look at the games in this post: http://talkchess.com/forum3/viewtopic.php?f=2&t=74091&start=80#p846410
Parent - By MrKris (***) Date 2020-06-22 03:49
From: http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?pid=586811
Lc0 won 55.0-45.0 ShashChess, Nooman TCEC 17 SuFin openings.
(ShashChess 52.0-48.0 Stockfish http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?pid=586796 )

Only G/90"+1" but ponder on and both have Syzygy 6. They both evaluate white as winning.

After 73...Kd7 Stockfish_20061013 at 1 min. has white +17.97 74.Kb7 (!) c5 75.bxc5 Nxc5+ 76.Kb6, if 76...Ne6 77.e5 and the white K can invade
8/3k4/K1p1n2B/8/1P2Pp1p/5P1P/6P1/8 w - - 2 74


After 74.Ka5 Kc7 75.Ka6 c5 Stockfish at 1 min. has white +11.74 76.b5 (!) c4 77.Ka7 Kd7 78.b6 c3 79.b7 c2 80.b8Q c1Q 81.Qb7+ Kd6 82.Bg7 Qc5+ 83.Kb8 Nxg7 84.Qxg7 etc. Q vs. Q white 2 pawns up
8/2k5/K3n2B/2p5/1P2Pp1p/5P1P/6P1/8 w - - 0 76
However Lc0 played 76.Kb5?? +5.66 and its opponent and Stockfish after the game immediately said 0.00 because 76...cxb4 0.00 77.Kxb4 +5.59 Kc6 0.00 keeps the white king out and the bishop trapped.

Then Lc0 takes almost 50 moves to 'pro-rate' its over-evaluation down to near zero while ShashChess says 0.00 the whole time.

[Event "G90''+1'' ponder 2GBh Sy6"]
[Site "2700X 14th | 2060 140w"]
[Date "2020.06.04"]
[Round "3"]
[White "Lc0.24.1_320x24-63681"]
[Black "ShashChess-11_Cabablanca"]
[Result "1/2-1/2"]
[ECO "E99"]
[Opening "King's Indian"]
[Variation "Mar del Plata, 10.f3 f5 11.Nd3 Nf6 12.Bd2 f4"]
[TimeControl "90+1"]
[Termination "adjudication"]
[PlyCount "245"]
[FEN "5n2/3k4/2p5/2K3pp/1P2Pp2/5P1P/6P1/4B3 w - - 1 64"]

64. Bc3 {+2.86/9 0 1.8s} Ne6+
{-3.80/26 0 0.41s} 65. Kb6 {+3.15/7 0 1.5s} Nf8 {-5.95/28 0 0.75s} 66. Bf6
{+3.60/7 0 0.65s} Ne6 {-7.24/30 0 1.9s} 67. Kb7 {+4.08/7 0 0.59s} Nc7
{-7.57/27 0 1.4s} 68. Bxg5 {+6.49/6 0 1.2s} Ne6 {-7.99/27 0 0s} 69. Bf6
{+6.82/6 0 1.8s} Nf8 {-8.04/31 0 0s} 70. Kb6 {+7.01/5 0 1.6s} Kd6 {-8.46/31
0 0.84s} 71. Bg5 {+7.58/5 0 1.1s} Ne6 {-8.74/27 0 0.50s} 72. Bh6 {+6.16/6 0
1.1s} h4 {-8.59/27 0 3.4s} 73. Ka6 {+5.49/7 0 0.96s} Kd7 {-8.95/27 0 0.83s}
74. Ka5 {+5.40/6 0 1.1s} Kc7 {-4.22/25 0 0.47s} 75. Ka6 {+5.28/6 0 1.1s} c5
{-4.83/30 0 0.75s} 76. Kb5 {+5.66/6 0 0.60s} cxb4 {0.0045 0.61s} 77. Kxb4
{+5.59/6 0 0.91s} Kc6 {0.0051 0s} 78. Kc4 {+5.50/6 0 1.3s} Kd6 {0.0055 0s}
79. Kb5 {+5.48/7 0 1.2s} Kd7 {0.0056 0.79s} 80. Kb6 {+4.81/7 0 1.2s} Kd6
{0.0059 0.25s} 81. Kb5 {+4.26/7 0 0.54s} Kd7 {0.0057 0.89s} 82. Kb4
{+3.65/8 0 0.69s} Kc6 {0.0061 1.2s} 83. Kc3 {+3.66/8 0 1.4s} Kc5 {0.0062
0.81s} 84. Kd3 {+3.63/8 0 0.58s} Kc6 {0.0061 0.23s} 85. Kc3 {+3.38/7 0
1.5s} Kc5 {0.0056 0.80s} 86. Kd3 {+3.25/8 0 0.41s} Kc6 {0.0067 2.4s} 87.
Kc4 {+3.10/9 0 1.1s} Kd6 {0.0059 0.49s} 88. Kb3 {+3.27/9 0 0.23s} Kc6
{0.0065 0.89s} 89. Kb4 {+3.36/8 0 1.2s 3-fold repetition} Kc7 {0.0065 0s}
90. Ka5 {+3.35/8 0 1.3s} Kd7 {0.0064 0s} 91. Kb5 {+3.12/8 0 1.4s 3-fold
repetition} Kc7 {0.0068 0s} 92. Ka5 {+2.97/9 0 0.54s} Kd7 {0.0064 0.89s}
93. Kb5 {+2.83/9 0 2.0s 3-fold repetition} Kc7 {0.0070 0s} 94. Ka4 {+2.86/8
0 0.35s} Kb6 {0.0061 0.93s} 95. Kb3 {+2.77/9 0 0.75s} Kb5 {0.00245 1.3s}
96. Ka3 {+1.94/12 0 2.0s} Kb6 {0.0059 0.99s} 97. Kb2 {+3.09/7 0 0.76s} Kc6
{0.0067 0.96s} 98. Kc2 {+2.72/8 0 1.3s} Kb6 {0.0067 0s} 99. Kb2 {+2.94/7 0
1.2s} Kc6 {0.0068 0.95s} 100. Kc1 {+2.05/10 0 0.72s} Kc5 {0.0053 1.0s} 101.
Kd1 {+1.97/12 0 0.66s} Kb4 {0.0053 1.0s} 102. Kd2 {+1.88/11 0 1.3s} Kc4
{0.0057 1.0s} 103. Kc2 {+1.75/13 0 0.46s} Kd4 {0.0054 1.2s} 104. Kd2
{+1.56/14 0 0.23s} Kc4 {0.0052 0.80s} 105. Ke1 {+1.32/16 0 2.5s} Kd3
{0.0061 0s} 106. Kf2 {+1.07/16 0 0.99s} Kd4 {0.0061 0.18s} 107. Ke2
{+1.55/10 0 0.56s} Kc4 {0.0062 1.8s} 108. Kd1 {+1.45/11 0 1.4s} Kc3 {0.0059
1.9s} 109. Ke2 {+1.44/10 0 0.69s} Kd4 {0.0061 1.1s} 110. e5 {+0.65/15 0
0.53s} Kc4 {0.0062 1.2s} 111. Kd2 {+0.67/14 0 1.4s} Kd4 {0.0055 0.93s} 112.
Ke2 {+0.64/16 0 0.34s} Kxe5 {0.0070 0.94s} 113. Kd3 {+0.59/16 0 0.23s} Kd5
{0.0075 0.71s} 114. Kc3 {+0.57/14 0 2.8s} Ke5 {0.0067 0.95s} 115. Kc4
{+0.63/14 0 0.66s} Kd6 {0.0063 0.95s} 116. Kb4 {+0.66/12 0 1.3s} Kd5
{0.0073 1.1s} 117. Kc3 {+0.59/13 0 0.21s} Kc5 {0.0062 1.2s} 118. Kd3
{+0.53/13 0 1.5s} Kd5 {0.0071 0s} 119. Kc2 {+0.53/15 0 0.38s} Kc4 {0.0070
1.0s} 120. Kc1 {+0.27/14 0 1.8s} Kd3 {0.0077 1.0s} 121. Kd1 {+0.21/10 0
0.32s} Ke3 {0.0084 0.70s} 122. Ke1 {+0.16/8 0 0.22s} Nd4 {0.0093 1.6s} 123.
Bf8 {+0.10/7 0 2.2s, Draw by adjudication} 1/2-1/2
- By rocket (****) Date 2020-06-15 11:20
But NNs are much weaker at blitz.