Topic: The Rybka Lounge / Computer Chess / AZ vs SF - game 99
- - By Rebel (****) Date 2017-12-23 09:27 Upvotes 3
I have been analyzing this (beautiful) game for hours with Komodo 9.42 and the latest SF version, running through the annotated moves. Neither of them plays the AZ moves. Furthermore, I see a pattern in this game that I have also seen in some of the other 9 games: for many moves both Komodo and SF only produce forced draw lines (3-fold repetitions), yet AZ never follows them; it always plays a different move (see for instance 23.Qc4, 27.Be4, 29.Qh3) while being down a knight and a pawn in material.

So what do we have? Either 1) a completely new paradigm, as the paper promotes, 2) SF is simply outsearched, or 3) a scam where every position is learned.

[Event ""]
[White "AlphaZero"]
[Black "Stockfish"]
[Site ""]
[Round "99"]
[Annotator ""]
[Result "1-0"]
[Date "2017.12.01"]
[PlyCount "111"]

1. Nf3 Nf6 2. d4 e6 3. c4 b6 4. g3 Bb7 5. Bg2 Be7 6. 0-0 0-0 7. d5 exd5 8. Nh4 c6 9. cxd5 Nxd5
10. Nf5 Nc7 11. e4 d5 12. exd5 Nxd5 13. Nc3 Nxc3 14. Qg4 g6 15. Nh6+ Kg7 16. bxc3 Bc8
17. Qf4 Qd6 18. Qa4 g5 19. Re1!? {mindblowing} Kxh6 {is the game over already by taking the knight?}
20. h4 f6 21. Be3! {no top engine plays it, not even after hours} Bf5 22. Rad1 Qa3
23. Qc4!! {AZ is down a knight and a pawn but can force a draw and yet it must have seen something better} b5
24. hxg5+ fxg5 25. Qh4+ Kg6 26. Qh1 Kg7 27. Be4! {same comment as on move 23} Bg6 28. Bxg6 hxg6
29. Qh3!! {why not take the Bc1 draw? What again has it seen more?} Bf6 30. Kg2 Qxa2 {does Rh8 make a better defence?}
31. Rh1 Qg8 32. c4!! {another mindblowing move but engines from here on are starting to evaluate Rd6 as positive for white, yet c4 seems to come from a different planet} Re8
33. Bd4!{ it's over} Bxd4 34. Rxd4 Rd8 35. Rxd8 Qxd8 36. Qe6 Nd7 37. Rd1 Nc5 38. Rxd8 Nxe6
39. Rxa8 Kf6 40. cxb5 cxb5 41. Kf3 Nd4+ 42. Ke4 Nc6 43. Rc8 Ne7 44. Rb8 Nf5 45. g4 Nh6
46. f3 Nf7 47. Ra8 Nd6+ 48. Kd5 Nc4 49. Rxa7 Ne3+ 50. Ke4 Nc4 51. Ra6+ Kg7 52. Rc6 Kf7
53. Rc5 Ke6 54. Rxg5 Kf6 55. Rc5 g5 56. Kd4 1-0
Parent - - By Kappatoo (*****) [de] Date 2017-12-23 09:41
Are AlphaGo and AlphaGo Zero also scams then?
If they aren't, what would they have to gain from spending so many resources just to pretend they managed to pull off the same in chess?
Parent - - By Rebel (****) Date 2017-12-23 10:04
AlphaGo is believable, doable.
Parent - By Kappatoo (*****) [de] Date 2017-12-23 13:15
A) In Go, they surpassed previous top players by 1500 Elo points, and previous top programs by 3000 Elo points.
B) In chess, they surpassed previous top programs by 100 Elo points (and people even quibble about that).

Why is A) believable and doable, but B) isn't?
Moreover, why would the same approach that worked so spectacularly well for Go not work at all for chess?
Parent - - By Chris Whittington (**) [fr] Date 2017-12-23 10:56 Edited 2017-12-23 10:59
Hi Ed, long time no forum!
I'm changing my mind a lot trying to get to grips with all this. This is my current Dunning-Kruger position:

It is not 2; it's 1 and 3, but not in the way you think.

First, stop thinking in the old paradigm of search and evaluation: a static thing and a finder thing.

If you had a massive 32 man egtb that would be a perfect evaluation function, right?  Theoretically you could dispense with search, because the search is now encoded in a single lookup.
Likewise, an opening book is an encoded search: you look up what to play next.
Search is encodeable with a lookup.

We can't encode perfectly the entire search tree, as you know.
But AZ has learnt enough from its 44 million games to imperfectly encode it. The generalization of the ANN means it can give you a (not perfect, we don't know how inaccurate) win rate for any position. High win rate doesn't mean it found a forced line, it just says it found, averaged, more wins than losses. If you want to get philosophical you could say it tries to play to regions where there are lots of wins, hence exciting regions. But nothing necessarily forced. Only AB minimax finds forced.
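
To make the lookup analogy concrete, here is a toy sketch in Python (all names and numbers are made up, not from the paper): a perfect tablebase is an exact position-to-result lookup, while the trained net is a generalising stand-in that returns an averaged win rate instead of a proof.

    def perfect_egtb_eval(position, table):
        # Conceptually the search was done offline; the answer is one lookup.
        return table[position]                 # exact: 1.0 win, 0.5 draw, 0.0 loss

    def ann_eval(position, network):
        # AZ-style substitute: a generalising approximation of that table.
        # It returns an averaged win rate, not a proof of a forced line.
        return network(position)               # float in [0, 1], with unknown error

    table = {"pos_a": 1.0, "pos_b": 0.5}       # fake table entries
    network = lambda pos: 0.73                 # pretend the net outputs 0.73
    print(perfect_egtb_eval("pos_a", table))   # 1.0, exact, known positions only
    print(ann_eval("pos_c", network))          # 0.73, approximate, generalises to unseen positions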

AZ's evaluation is likely a bit unstable and inaccurate in some positions; connected to an AB search it would possibly even be quite weak, as the inaccuracies would get picked out. But connected to a highly guided MCTS rollout search, the inaccuracies get averaged out, and it is very strong.

Yes, so basically it is your 3: a giant inaccurate lookup, an encoded tree of all games, made accurate/effective at game-play time by running a bunch of guided depth-limited rollouts à la MCTS. 3, but not a scam.

This is neither search nor evaluation in terms of old paradigm. IMO. It's whatever you want to call an imperfect 32 man egtb with some added sanity checking.
Parent - - By Rebel (****) Date 2017-12-23 17:33
Hey Chris, good to talk again.

I have been thinking along the same lines as your 32 man TB metaphor: a sort of oracle that, based on history values, returns a number of reductions or even a complete cut-off, a sort of NN TB hit :grin: It would explain the out-searching. The problem with these kinds of hypotheses is that they are based on exact positions and not on pattern recognition.

I understand how (for instance) an NN can recognize handwriting by pattern recognition and make a digital copy of a letter, and I can even understand that an NN is suitable for the game of Go, but pattern recognition in chess? Even a queen down there are often hidden escapes, or more simply, a white pawn on a2 versus a3 can make all the difference.

And they did not do themselves a favor by playing all these 100 games from the start position; MCTS + reinforcement learning (which the paper is full of) is exactly the way to learn a given position, in this case the start position. You get the drift.
Parent - - By Chris Whittington (**) [fr] Date 2017-12-23 18:08
Well, if you don't think chess is amenable to pattern recognition then you won't accept AZ as viable, and you'll need to generate "other" reasons.

MCTS is vulnerable to the existence of a single long-pathway refutation, perhaps missed because of the effect of some inconsequential pawn way down the path. I read that Lee Sedol's one Go win was due to this effect. But it seems that in the AZ vs Stockfish battle, the latter found no such winning refutations.

What do you have against AZ self-play training from the start position? What else would you propose?
Parent - - By Rebel (****) Date 2017-12-23 20:56
Well, if you don't think chess is amenable to pattern recognition then you won't accept AZ as viable, and you'll need to generate "other" reasons.

I have often thought about pattern recognition; failure every time. Many others have too. Any theory on your end? The document does not say anything about it, just that it learned. What did it learn, and how?

I suppose you understand my (conspiracy) theory of how easy it is to out-learn an(y) opponent in a given position by repeating that position with reinforcement learning until you win in all variations, say 500 times in a row. That's why I said that playing all those 100 games from the start position opens the door for this line of thinking. And how many doubles are there? We don't know. Don't these guys know how to properly play a match? Why no opening book for SF? Why learn the most common openings (page 6) and leave SF in the dark without an opening book, starting from an advantage in every game?

Regarding the game of Go: it's about white and black stones only, plus formations; doable IMO with enough computing power.

I will become a believer if they replay that match with an opening book for SF chosen by an independent person.
Parent - By Chris Whittington (**) [fr] Date 2017-12-23 22:54
It learnt to give some sort of win rate value for any position presented to its inputs, by a process of adjusting its weights a very small amount at a time to bring its output closer to the desired output value, across about 500 million chess moves. This is a known and proven method for training a neural net. It learnt to approximate a 32 man egtb.
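
A minimal sketch of that weight-adjustment step, reduced to a single linear neuron so it stays self-contained (the learning rate and data are made up; the real net is deep and trained by backpropagation):

    LEARNING_RATE = 0.001

    def train_step(weights, inputs, target):
        out = sum(w * x for w, x in zip(weights, inputs))   # forward pass
        error = out - target                                # how far from the desired output?
        # nudge every weight a very small amount toward the target
        return [w - LEARNING_RATE * error * x for w, x in zip(weights, inputs)]

    weights = [0.0, 0.0, 0.0]
    for _ in range(100000):        # stand-in for the ~500 million move targets
        weights = train_step(weights, [1.0, 0.5, -1.0], target=0.8)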

AZ played TEST games against Stockfish, it did not learn anything from those games. They were not used to train the AZ ANN, only to test it. AZ was trained on itself, not on any other entity.

And even if those games were trained on (they weren't, but for the sake of your argument): you realise the effect of one game on the ANN weights is microscopically small? And very unlikely to get AZ to change its mind on a move? The ANN is the collective wisdom of 44 million games. Good luck changing anything much with 100 other disparate games.
Parent - - By Chris Whittington (**) [fr] Date 2017-12-23 11:57
The 3-fold repetition thing has a straightforward reason. The ANN inputs carry game history info, so its evaluation also depends on what happened in the past. If it avoids the 3-fold repetition, it is presumably because the alternative move scores greater than 0.5.
Parent - By Rebel (****) Date 2017-12-23 16:01
Exactly, can't think of any other reason, hence learned.
Parent - - By Sesse (****) [gb] Date 2017-12-23 19:17
The ANN inputs don't contain full history information, but they do contain position repetition counts.
Parent - - By Chris Whittington (**) [fr] Date 2017-12-23 19:29
Table S1 of the paper says "repeats in 8 move history".
Parent - - By Sesse (****) [gb] Date 2017-12-23 19:32
Ah, you're right. That's still not full move history, though :-) Just eight half-moves back.
Parent - - By Chris Whittington (**) [fr] Date 2017-12-23 19:35
I am wondering hard about why they would so massively increase the inputs that way, however, if a simple repetition count for draw detection would be enough.
Parent - - By Sesse (****) [gb] Date 2017-12-23 22:12
I suppose it was used as a sort of attention mechanism in Go (where on the board is the action?), and they wanted to use the same thing for chess. In any case, if it's not useful, the network will learn to ignore it.
Parent - - By Chris Whittington (**) [gb] Date 2017-12-23 23:13
Very rough back-of-envelope tells me an ANN with N inputs and N neurons in the next layer up has N-squared work to do. Increasing the inputs from 2x6 planes of piece data to 2x6x8 planes with history seems remarkable; it's very expensive, and what is the payoff? OK, OK, lots of stuff runs in parallel, but even so ....
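
As a sanity check on that back-of-envelope, the arithmetic under Chris's own dense-layer assumption of N inputs feeding N neurons (the real AZ net is convolutional, so this is only the rough rule of thumb he states):

    SQUARES = 64
    for planes in (2 * 6, 2 * 6 * 8):       # piece planes without / with 8 steps of history
        inputs = planes * SQUARES
        neurons = inputs                    # the N-inputs, N-neurons assumption
        print(planes, "planes:", inputs, "inputs,", inputs * neurons, "first-layer weights")
    # 12 planes:  768 inputs,    589824 first-layer weights
    # 96 planes: 6144 inputs,  37748736 first-layer weights  (8x inputs -> 64x work)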
Parent - - By Sesse (****) [gb] Date 2017-12-23 23:57
There's nothing saying there have to be as many (or as few) neurons in the first layer as there are inputs.
Parent - By Chris Whittington (**) [gb] Date 2017-12-24 12:05
This is true; nevertheless it seems odd to use eight times the apparently needed inputs as history, when a repetition count ought to do, IF it was only for repetition proof. I wondered if they were somehow sneaking in legal moves from the current position (which would be in keeping with FIDE knowledge only) and the previous position, because then they would have provided almost-accurate attack maps. I saw somewhere something about moves being part of the inputs, but that goes against the paper's table S1, and now I've forgotten where I saw or imagined it.
Parent - By Chris Whittington (**) [fr] Date 2017-12-23 12:11
It could be argued that 2) outsearched is partly true, in the sense that an egtb has already outsearched its opponent before even looking up the move; imperfectly outsearched, in the case of the AZ 32 man ANN. After the added MCTS rollout search, the win rate will still be "imperfect" in that there may be an alpha-beta-findable pathway that refutes it. Stockfish doesn't seem to have been able to find any such pathways, though.
AZ and Stockfish really are in the deep dark forest (AZ takes Stockfish there), where 2+2=5 and the way out is only wide enough for one. Stockfish's AB never once found the way out, even though a perfect 32 man egtb might say there is one, or two, or more.
Parent - - By Peter Grayson (****) [gb] Date 2017-12-23 17:33
Why didn't you run the analysis with Stockfish 8, Ed? What you would have seen was, for the best part, Stockfish 8 playing against itself.

Running some analysis with 32 threads and 1 GB hash, I see Stockfish 8 predicted and would have played 21.Be3, "the move no other engine saw in hours of analysis"; in fact, playing as White it would not have considered anything else under the game conditions.

To ensure there was no cross-reference of hash, I ran Stockfish 8 in infinite analysis mode for two passes, initially as Black and then as White, unloading the engine and GUI between the two runs. I stopped the engine after between 1 and 2 minutes and recorded its last output, which I pasted into the game. Up to 2 minutes seemed reasonable, given that the game machine's hardware had 64 cores and was probably more up to date and, core for core, faster than mine.

I also see exploitation of Stockfish 8's big weakness of being unable to see outside the draw box when presented with a drawn position. It is well recorded that the engine has blundered when giving a chain of 0.00 scores, and that seems to be the case here.

I have analysed a number of the published AZ vs SF8 games, and I see little more than a variant of Stockfish 8 playing against Stockfish 8. Contrary to your comment, analysing with Stockfish 8, for the best part Stockfish 8 predicted AZ's moves and indeed would have played them with colours reversed. The hardware may also influence what is seen in analysis, and it cannot be certain whether the match hardware gave Stockfish 8 an even deeper ply depth than I achieved here. Try using Houdini 6.03, for example, and the game moves change very quickly. My impression is that AZ was heavily influenced by Stockfish 8. The question then arises: would it have achieved such a good result against Houdini or Komodo? We will probably never know.

I include your posted game with notes updated with Stockfish 8 analysis from my machine. Copying the game into a GUI should allow the machine score and ply depth to be seen.

PeterG

[Event "?"]
[Site "?"]
[Date "2017.12.01"]
[Round "99"]
[White "AlphaZero"]
[Black "Stockfish"]
[Result "1-0"]
[ECO "E17"]
[Annotator "Grayson,Peter"]
[PlyCount "111"]
[EventDate "2017.??.??"]

1. Nf3 Nf6 2. d4 e6 3. c4 b6 4. g3 Bb7 5. Bg2 Be7 6. O-O O-O 7. d5 exd5 8. Nh4
c6 9. cxd5 Nxd5 10. Nf5 Nc7 11. e4 d5 12. exd5 Nxd5 13. Nc3 Nxc3 14. Qg4 g6 15.
Nh6+ Kg7 16. bxc3 Bc8 ({Stockfish 8 64 POPCNT:} 16... Bc8 17. Qf4 {[%eval -6,
36]}) 17. Qf4 Qd6 ({Stockfish 8 64 POPCNT:} 17... Qd6 18. Qa4 g5 19. Ng4 f5 20.
Ne3 b5 21. Qd1 f4 22. Nc2 Qxd1 23. Rxd1 a5 24. Nd4 b4 25. Bb2 Bf6 26. cxb4 axb4
27. a3 fxg3 28. hxg3 Ra4 29. axb4 Rxb4 30. Bc3 Rc4 31. Bb2 Bg4 32. Rd2 Rb4 33.
Bc3 Rc4 34. Bb2 {[%eval 0,37]}) 18. Qa4 g5 ({Stockfish 8 64 POPCNT:} 18... g5
19. Ng4 f5 20. Ne3 b5 21. Qd1 f4 22. Nc2 Qxd1 23. Rxd1 a5 24. Nd4 b4 25. Bb2
Bf6 26. cxb4 axb4 27. a3 Ra4 28. axb4 Rxb4 29. Bc3 Rc4 30. Bb2 Rb4 {[%eval 0,
37]}) 19. Re1 $5 {mindblowing} Kxh6 {is the game over already by taking the
knight?} ({Stockfish 8 64 POPCNT:} 19... Kxh6 20. h4 f6 21. Be3 Kg7 22. Rad1
Qe6 23. hxg5 Qf7 24. gxf6+ Bxf6 25. Rd6 Kg8 26. Bh6 Bg7 27. Bxg7 Qxg7 28. Qh4
Qg4 29. Qh6 Qg7 30. Qh4 {[%eval 0,36]}) 20. h4 {Note Stockfish 8 predicts 21.
Be3!} ({Stockfish 8 64 POPCNT:} 20. h4 f6 21. Be3 Kg7 22. Rad1 Qe6 23. hxg5 Qf7
24. gxf6+ Bxf6 25. Rd6 Kg8 26. Bh6 Bg7 27. Bxg7 Qxg7 28. Qh4 Qg4 29. Qh6 Qg7
30. Qh4 {[%eval 0,37]}) 20... f6 ({Stockfish 8 64 POPCNT:} 20... f6 21. Be3 Kg7
22. Rad1 Qe6 23. hxg5 Qf7 24. gxf6+ Bxf6 25. Rd6 Kg8 26. Bh6 Bg7 27. Bxg7 Qxg7
28. Qh4 Qg4 29. Qh6 Qg7 30. Qh4 {[%eval 0,40]}) 21. Be3 $1 {no top engine plays it, not even after hours ... except Stockfish 8! It also predicted 21.Be3 in its previous move sequence.
Analysis by Stockfish 8 64 POPCNT:
Depth 6/6 00:00:00 8kN: 21.Be3 Kg7 22.Rad1 Qe6 23.hxg5 Qf7 = (0.00)
Depth 7/8 00:00:00 14kN: 21.Be3 Kg7 22.Rad1 Qe6 23.hxg5 Qf7 24.gxf6+ Bxf6 = (0.00)
Depth 8/9 00:00:00 17kN: 21.Be3 Kg7 22.Rad1 Qe6 23.hxg5 Qf7 24.gxf6+ Bxf6 25.Rd6 = (0.00)
Depth 9/10 00:00:00 20kN: 21.Be3 Kg7 22.Rad1 Qe6 23.hxg5 Qf7 24.gxf6+ Bxf6 25.Rd6 Kg8 = (0.00)
Depth 10/12 00:00:00 26kN: 21.Be3 Kg7 22.Rad1 Qe6 23.hxg5 Qf7 24.gxf6+ Bxf6 25.Rd6 Kg8 26.Bh6 Bg7 = (0.00)
Depth 11/16 00:00:00 35kN: 21.Be3 Kg7 22.Rad1 Qe6 23.hxg5 Qf7 24.gxf6+ Bxf6 25.Rd6 Kg8 26.Bh6 Bg7 27.Bxg7 Qxg7 28.Qh4 Qg4 = (0.00)
Depth 12/19 00:00:00 46kN: 21.Be3 Kg7 22.Rad1 Qe6 23.hxg5 Qf7 24.gxf6+ Bxf6 25.Rd6 Kg8 26.Bh6 Bg7 27.Bxg7 Qxg7 28.Qh4 Qg4 29.Qh6 Qg7 30.Qh4 = (0.00)
Depths 13/20 through 36/20: same PV and score, = (0.00); node counts 62kN, 79kN, 98kN, 117kN, 137kN, 159kN, 182kN, 205kN, 227kN, 252kN, 274kN, 300kN, 322kN, 347kN, 371kN, 390kN, 408kN, 428kN, 447kN, 555kN, 1295kN, 1379kN, 5145kN, 75906kN (00:00:02 at depth 36/20)
Depth 37/45 00:00:54 1791mN: 21.Be3 Bf5 22.Rad1 Qa3 23.hxg5+ fxg5 24.Qh4+ Kg6 25.Qh1 Kg7 26.Bc1 Qc5 27.Be3 Qa3 = (0.00)
Depth 38/45 00:01:04 2118mN: same PV, = (0.00)}
({Stockfish 8 64 POPCNT:} 21. Be3 Bf5 22. Rad1 Qa3 23. hxg5+ fxg5 24. Qh4+ Kg6 25. Qh1 Kg7 26. Bc1 Qc5 27. Be3 Qa3 {[%eval 0,39]}) 21... Bf5
{now Stockfish 8 chooses a different move to the game move} ({Stockfish 8 64
POPCNT:} 21... Kg7 22. Rad1 Qe6 23. hxg5 Qf7 24. gxf6+ Bxf6 25. Rd6 Kg8 26. Bh6
Bg7 27. Bxg7 Qxg7 28. Qh4 Qg4 29. Qh6 Qg7 30. Qh4 {[%eval 0,40]}) 22. Rad1 Qa3
({Stockfish 8 64 POPCNT:} 22... Qa3 23. hxg5+ fxg5 24. Qh4+ Kg6 25. Qh1 Kg7 26.
Bc1 Qc5 27. Be3 Qa3 {[%eval 0,40]}) 23. Qc4 $3 {AZ is down a knight and a pawn
but can force a draw and yet it must have seen something better} b5 ({
Stockfish 8 64 POPCNT:} 23... b5 24. hxg5+ fxg5 25. Qh4+ Kg6 26. Qh1 Kg7 27.
Bc1 Qc5 28. Be3 Qa3 {[%eval 0,39]}) 24. hxg5+ fxg5 ({Stockfish 8 64 POPCNT:}
24... fxg5 25. Qh4+ Kg6 26. Qh1 Kg7 27. Bc1 Qc5 28. Be3 Qa3 {[%eval 0,42]}) 25.
Qh4+ Kg6 ({Stockfish 8 64 POPCNT:} 25... Kg6 26. Qh1 Kg7 27. Bc1 Qc5 28. Be3
Qa3 {[%eval 0,44]}) 26. Qh1 Kg7 ({Stockfish 8 64 POPCNT:} 26... Kg7 27. Bc1 Qc5
28. Be3 Qa3 {[%eval 0,44]}) 27. Be4 $1 {[%emt 0:00:23] same comment as on move 23}
Bg6 ({Stockfish 8 64 POPCNT:} 27... Bg6 28. Bxg6 hxg6 29. Bc1 Qc5 30. Be3 Qa3 {
[%eval 0,43]}) 28. Bxg6 hxg6 ({Stockfish 8 64 POPCNT:} 28... hxg6 29. Bc1 Qc5
30. Be3 Qa3 {[%eval 0,49]}) 29. Qh3 $3 {why not take the Bc1 draw? What again
has it seen more?} Bf6 ({Stockfish 8 64 POPCNT:} 29... Bf6 30. Kg2 Qxa2 31. Rh1
Qg8 32. Rd6 Re8 33. Qh6+ Kf7 34. Rxf6+ Kxf6 35. Qxg5+ Ke6 36. Qg4+ Kd6 37. Bf4+
Kc5 38. Be3+ Kd6 {[%eval 0,42]}) 30. Kg2 Qxa2 {does Rh8 make a better defence?}
({Stockfish 8 64 POPCNT:} 30... Qxa2 31. Rh1 Qg8 32. Rd6 Re8 33. Qh6+ Kf7 34.
Rxf6+ Kxf6 35. Qxg5+ Ke6 36. Qg4+ Kd6 37. Bf4+ Kc5 38. Be3+ Kd6 {[%eval 0,45]})
31. Rh1 Qg8 ({Stockfish 8 64 POPCNT:} 31... Qg8 32. Rd6 Re8 33. Qh6+ Kf7 34.
Rxf6+ Kxf6 35. Qxg5+ Ke6 36. Qg4+ Kd6 37. Bf4+ Kc5 38. Be3+ Kd6 {[%eval 0,47]})
32. c4 $3 {another mindblowing move but engines from here on are starting to
evaluate Rd6 as positive for white, yet c4 seems to come from a different
planet} Re8 ({Stockfish 8 64 POPCNT:} 32... Re8 33. Rd6 Be5 34. Rd2 Bf6 35. Rd6
{[%eval 0,39]}) 33. Bd4 $1 {it's over} Bxd4 ({Stockfish 8 64 POPCNT:} 33...
Bxd4 34. Rxd4 Rd8 35. Rxd8 Qxd8 36. Qe6 Nd7 37. Rd1 Nf8 38. Rxd8 Nxe6 39. Rxa8
Kf6 40. Rxa7 Ke5 41. cxb5 cxb5 42. Ra6 Nd4 43. Kf1 Ke4 44. Rxg6 b4 45. Ke1 Kd3
46. Kd1 b3 47. Rb6 Kc3 48. Kc1 Nf3 49. Rc6+ Kd4 50. Kb2 Ne5 51. Rd6+ Kc5 52.
Re6 Kd5 53. Re7 Nd3+ 54. Kxb3 Nxf2 55. Rf7 Ne4 56. g4 Kd4 57. Kc2 Kc4 58. Rf1
Kd4 59. Rd1+ Ke3 {[%eval 144,34]}) 34. Rxd4 Rd8 35. Rxd8 Qxd8 36. Qe6 Nd7 37.
Rd1 Nc5 38. Rxd8 Nxe6 39. Rxa8 Kf6 40. cxb5 cxb5 41. Kf3 Nd4+ 42. Ke4 Nc6 43.
Rc8 Ne7 44. Rb8 Nf5 45. g4 Nh6 46. f3 Nf7 47. Ra8 Nd6+ 48. Kd5 Nc4 49. Rxa7
Ne3+ 50. Ke4 Nc4 51. Ra6+ Kg7 52. Rc6 Kf7 53. Rc5 Ke6 54. Rxg5 Kf6 55. Rc5 g5
56. Kd4 1-0
Parent - - By Rebel (****) Date 2017-12-23 20:59
My impression is that AZ was heavily influenced by Stockfish 8.

That's an interesting remark. Care to elaborate?
Parent - - By Chris Whittington (**) [gb] Date 2017-12-23 23:24
Ridiculous. Absolutely no evidence, and in direct contradiction to the written paper details from Deepmind. AZ learnt from playing itself; Stockfish was not involved.
You both seem to be confusing the TEST games used to prove the network with the TRAINING games used to train the network.
Parent - - By Rebel (****) Date 2017-12-24 12:12
Ridiculous. Absolutely no evidence, and in direct contradiction to the written paper details from Deepmind. AZ learnt from playing itself; Stockfish was not involved.

Were they (Deepmind) obliged to mention that?

Maybe the paragraph about that mysteriously fell out of the document :grin:

It's not an academic paper.

You both seem to be confusing the TEST games used to prove the network with the TRAINING games used to train the network.

But you are ignoring the comments and questions in my previous post and just repeating your POV.

How do 44 million games of (say) 80 moves each, only 3.5 billion positions (of which the vast majority are garbage), magically transform into realistic patterns?

To read - https://en.wikipedia.org/wiki/Shannon_number
Parent - - By Chris Whittington (**) [gb] Date 2017-12-24 12:41
Bonjour Ed,

OK, let's cut away all the stuff about the paper and who said what, and get to the basic core of your position. You just stated it:

"It's beyond belief that a neural network can untangle enough information out of a few billion positions and generalise across the conceivably attainable chess search space". (C) Ed.

Well, OK, there has to be a possible NN that encodes (imperfectly) the 32 man EGTB and its lesser versions.
Doing the perfect egtb via lookup table is not feasible, we know, because every piece is a 64-size multiplier (OK, compression bla bla, but it doesn't alter the picture). A 32 man egtb requires something like a 64^32 address space; we can do a perfect 7 man egtb with current technology.

An NN does not have this problem. At most, every piece adds another "plane", but in fact the AZ architecture needs a max of 12 input planes. As a very rough rule of thumb, the NN work is proportional to the inputs squared. So the problem of going from an imperfect NN 3 man egtb to an imperfect 32 man NN egtb is nothing like the perfect egtb exponentiation.
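
The rough numbers behind this paragraph, as a quick check (the 12-plane figure is Chris's; the rest is plain arithmetic):

    print(64 ** 32)      # ~6.3e57: naive address space of a perfect 32 man lookup table
    print(2 * 6 * 64)    # 768: 12 piece planes over 64 squares, the NN's whole "address"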

You know NNs work. You accept an NN works for ZeroGo, so what is your scaling problem with ZeroChess?

BTW, the imperfections in the ANN output are dealt with and averaged out by the MCTS search at gameplay time; see my earlier post.
Parent - - By Rebel (****) Date 2017-12-24 22:24
Chris, perhaps you remember that before I started to participate in this debate I told you that I was going on a mission, and so I did, starting in CTF. What amazed me (and triggered me to do so) was the willingness of the CC community to accept the biggest breakthrough in CC with hardly any scepticism, and I thought: you guys need an opposition, a rebel so to say. And so I operate in the role of DA and/or in follow_the_money mode. BTW, the first time I heard the latter expression was from you :lol:

What has been claimed by Deepmind is monstrous: they come along and wipe out 50 years of CC development by thousands of people in 4 hours of computer time, without (as they claim) any domain knowledge, not even piece values. Common sense says: no way. Not without a proper re-match. Out-learning a(ny) engine from the start position can be done by any engine programmer. They (Deepmind) made a big mistake.

Secondly, an example of the giant black hole we are standing in front of: the Shannon number. A good starting point would be the thesis of Matthew Lai, author of Giraffe [ https://arxiv.org/pdf/1509.01549.pdf ] (deep NN + RL); we know from his own words that he was recruited by Deepmind because of his thesis and is involved in the AZ project.

Giraffe (2016 version) has learned the STS positions of Swaminathan; let's take a position.

[Event "?"]
[Site "?"]
[Date "2017.12.24"]
[White "?"]
[Black "?"]
[Result "*"]
[FEN "1r1b2k1/2r2ppp/p1qp4/3R1NPP/1pn1PQB1/8/PPP3R1/1K6 w - -"]
Giraffe instantly plays the right move, 1.g6 with a convincing +5.xx score.

But now we move the white king from b1 to c1, and Giraffe is totally lost in the desert, while 1.g6 is still the move to play.

The NN was not able to solve it; it did not recognize the pattern.

There are 1500 STS positions, and I think most of them will suffer from this kind of shortcoming. That's one reason why I doubt your 32 man theory: it's based on exact positions, not on patterns.
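
A sketch of Ed's experiment for anyone who wants to repeat it, using the python-chess library (the engine path is a placeholder, and move counters are appended to the FEN):

    import chess
    import chess.engine

    FEN = "1r1b2k1/2r2ppp/p1qp4/3R1NPP/1pn1PQB1/8/PPP3R1/1K6 w - - 0 1"

    original = chess.Board(FEN)
    shifted = chess.Board(FEN)
    shifted.remove_piece_at(chess.B1)    # move the white king from b1 ...
    shifted.set_piece_at(chess.C1, chess.Piece(chess.KING, chess.WHITE))   # ... to c1

    with chess.engine.SimpleEngine.popen_uci("/path/to/engine") as engine:
        for board in (original, shifted):
            info = engine.analyse(board, chess.engine.Limit(time=10))
            # does 1.g6 survive the one-square king shift?
            print(info["score"], info.get("pv", [None])[0])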

-----

OTOH, I heard that some programmers are already busy implementing the Deepmind model; we will see.
Parent - By Chris Whittington (**) [gb] Date 2017-12-24 23:29
I fast-read the thesis (which means I need to read it again), but can you explain what you mean in the STS case?

The thesis says Giraffe didn't train on STS; STS was a kind of verification set.

Is there a version of Giraffe that this has been tested on, where it plays "correct" in the set position but not in some minorly altered position? I assume the "minor" alteration has been tested for "minorness"?

I assume, if this is so, that some suspicious person will have tested the entire set on Giraffe, and so on .......?

Or am I imagining an implication into your example that you didn't mean?

On the Shannon number: I already posted why chess tree exponentiation is a way, way lesser problem for an ANN32 than for an EGTB32. We trade imperfection for size. Then we average out the imperfections with MCTS. No comment from you .....
Parent - - By Chris Whittington (**) [gb] Date 2017-12-25 00:49
"doubt 32 man theory .... because based on exact positions, not patterns".

Ed, I'm finding this comment of yours incomprehensible ;-)

I said, we can conceptually treat the AZ ANN as a 32 man egtb with an imperfect output. I said this because:

1. It forces you to accept that the entire search tree is encodable into a perfect lookup. In practice we can't do it because of size.
2. That in turn forces you to accept that an ANN with lots of network connections and lots of weights could do the same task, the trade-off being imperfect outputs.
3. We know that making a 3 man egtb is trivial, because somebody coded it and got it working in a few days; see the CCC tech pages from the last few days.

You can't rationally argue with any of those points above.

It's also reported that several NN chess programs, not very strong and using domain knowledge, have been created in the last few years. That indicates the problem is not exactly intractable, even using current PC hardware.

Your argument is based on what is well known in many fields: the scaling issue. A system can generally be scaled up to a certain point, where the complexities of size defeat the scaling and the system fails. Businesses that get too big too fast. Chess engines that try for one more ply. Whatever.

You're claiming that a neural-net chess position analyser won't scale from 3 men to 32 men: the complexities are too large, no network can cope.

Deepmind are claiming that an ANN at 32 men is possible with outputs good enough for purpose; that generalisation, good enough for purpose, over the necessary sub-tree, is possible.

Chess players will tell you that a neural net for 32 men with outputs good enough for purpose is possible: they each have one in their heads.

ANNs already do incredible things in many disparate fields; the list is long and incredible. Did you read about GANs? Nobody believed the guy who thought that one up either.

I already explained that the ANN input count is not problematic in the way that the address space for a perfect 32 man egtb would be; a few hundred inputs appear to be the sort of "address space" required.

It's not a Shannon problem; it's a generalisation problem: can we produce something "good enough"? It won't be long before people are calling chess ANNs "trivial". Will you still be kicking and screaming? No, I think you already gave up ;-)
Parent - - By Rebel (****) Date 2017-12-25 09:36
Ed, I'm finding this comment of yours incomprehensible ;-)

Story of my life :roll:
 
I said, we can conceptually treat the AZ ANN as a 32 man egtb with an imperfect output. I said this because:

It's not that; I do understand your model, but we are starting to run in circles, so let's resolve the incomprehensible first.

Are we in agreement that whatever AZ learned from these 44 million self-play games is stored as patterns?

After all, that is what NNs are good at and used for: pattern recognition.

Then we can go from there.
Parent - By Chris Whittington (**) [gb] Date 2017-12-25 11:13
I'm not sure at all how I want to describe it. Obviously an ANN stores a table of weights, tuned on the outputs the ANN was trained on. I wouldn't want to say the net stores "patterns"; it recognises patterns in a fuzzy sort of way, less fuzzy the more it learns. For example, my brain recognises "elephant"; one day it came across a furry elephant and said elephant, but then was told "woolly mammoth", and now it discriminates a bit better in the class of elephant things. I suppose a few neurons got changed a bit.

I'm not a neural net designer, certainly not remotely of the expertise of the Deepmind PhDs. I wrote an NN program, including back-propagation, in a very inefficient, basic way out of a textbook description, back in 2002. It played, after training, a pretty good game of Backgammon. It impressed me that, I think, 22 inputs and 250,000 training games could work. But I never really thought of it as much more than this: in the two-layered network I used, there was some weight pattern hiding in there that would be a Backgammon AI; what I had to do was jiggle the weights around until it found itself. The previous version I wrote did just that, but randomly: I had two versions, current and current-1. I made current by making a small random change to all the weights, then played one game; if the new current beat the old current, I kept the random weight changes, else I junked them. After huge numbers of cycles, this played backgammon too.
Then I tried an ANN to predict the stock market, based on historical prices and an idea I found in somebody's paper. Couldn't get it to work. I stopped all software in about 2007 or so. Then I came back just now, read stuff, and learnt about a lot of new ideas from my youngest son, a neurology researcher. I decided to write an ANN noughts-and-crosses MCTS with a simple network. Got that to work in a few days: the ANN encodes the entire tree, imperfectly, but well enough to play perfect tic-tac-toe. I have since been discovering that ANNs have all manner of internal architectures; it's complex, and I'm not going to try and understand it with "patterns". I just think "jiggled weights" and hills and valleys. Use smart experience (which I don't have) to design the network architecture, size and so on, and set it training on lots of data. Test; if it is not performing, use smarts to modify the architecture and start again. And so on. If the network is organised the "right" way, and you can train it the "right" way for long enough, then those weights will jiggle gradually into ever better values, until the beasty is "good enough". In chess: a very compressed way to represent a 32 man egtb, trading noisy outputs (imperfect scores) against the compression.
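
For concreteness, the random "weight jiggling" trainer described above looks roughly like this (a sketch with made-up sizes; play_game is a placeholder you would supply):

    import random

    N_WEIGHTS = 100      # size of the hypothetical network's weight vector
    STEP = 0.01          # magnitude of each random jiggle

    def play_game(weights_a, weights_b):
        raise NotImplementedError   # plug in the game plus the network here

    def train(cycles):
        current = [random.uniform(-1, 1) for _ in range(N_WEIGHTS)]
        for _ in range(cycles):
            candidate = [w + random.uniform(-STEP, STEP) for w in current]
            if play_game(candidate, current):   # keep the jiggle only if it beats the old weights
                current = candidate
        return current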

So, because of what I've done myself, and from reading, and from talking to smart people, I am quite OK with 44 million training games and a big, smartly designed network using powerful hardware to do enough weight jiggling. I can't do it, because of insufficient experience and lack of hardware, but I have no problem believing that DeepMind, with its 100 PhDs and a shedload of TPUs, can.

Maybe you can settle for "recognises fuzzy patterns, with increasing discrimination on increasing learning".

Your turn ....
Parent - - By Chris Whittington (**) [gb] Date 2017-12-26 12:03
Good morning Ed,

You say "ANN stores patterns"
I say "ANN recognises patterns in a fuzzy sort of way"

Depending on the development of your argument position this nuance ought to either not be very important (in which case you can continue to develop) or important (in which case your argument depends on the "store" concept).

Silence implies the latter. I guess you were going to try and argue that the Shannon rule applies to "patterns"? A Shannon count of patterns, therefore the ANN can't "store" enough? Going back to the 32EGTB(imperfect): the way the trade-off between size and imperfection works is via fuzziness; generalisation is a fuzzy thing.

Your turn or null move ....
Parent - - By Rebel (****) Date 2017-12-27 11:35
I am way behind in answering, sorry about that; Christmas time with its traditions gave my only CPU little time for chess-related stuff. There is also the realization that the paper isn't about the 100 games without any loss, but that before that (final) match at least 1200 other games against SF were played.

I checked the fuzzy numbers algorithm on the CPW; it's about domain-specific knowledge. And the paper states: no domain knowledge, only the rules of chess.

I am currently into Giraffe; Lai makes an amazing claim in his thesis, running his learned NN on the STS test:

Page 24
Figure 4 shows the result of running the test periodically as training progresses. With the material only bootstrap, it achieves a score of approximately 6000/15000. As training progresses, it gradually improved to approximately 9500/15000, with peaks above 9700/15000, proving that it has managed to gain a tremendous amount of positional understanding.

Page 25
It is clear that Giraffe's evaluation function now has at least comparable positional understanding compared to evaluation functions of top engines in the world

Page 25
Since Giraffe discovered all the evaluation features through self-play, it is likely that it knows about patterns that have not yet been studied by humans, and hence not included in the test suite.

And there is the word --> patterns.

So I would say understanding AZ starts with understanding Giraffe.

Basically, all that Lai had to do was connect the NN to a good search (or add it to SF) and voilà, a new top engine.
Parent - - By Chris Whittington (**) [gb] Date 2017-12-27 11:59
Yeah, well, the Giraffe paper is a Master's thesis, which is subject to nowhere near the same level of peer critique as a PhD. Comments like "proving tremendous amounts of positional understanding"?! Really? What is proof? How big is tremendous? "It is clear that ....."!? "Likely that ....". Hmmmm.

All we can really say is that Giraffe has had its weights jiggled, and that it plays moves the reasoning for which, if it were human, would include our human model, which is "seeing patterns, or understanding". We're in danger of getting into false projection, as you'll remember from some ancient post of mine made famous by Bruce Moreland.

AZ isn't Giraffe. I would guess that once the learning has got going (which is probably more difficult than if some domain knowledge were added as a kickstarter), the result will be better, and not hamstrung by initial human input. That's only a hunch, though.
Parent - - By Rebel (****) Date 2017-12-28 05:46
Daniel S - Texel actually replaced its evaluation function with Giraffe's NN and showed that the eval is actually better, but it would need time odds to be competitive on the same hardware.

Statements like these could make me a believer.
Parent - - By Chris Whittington (**) [fr] Date 2017-12-28 11:32
I am sure you have already accepted that a system of connected weighted neurons can untangle chess.
Parent - - By Rebel (****) Date 2018-01-06 08:26
Well, I have by now done enough tests with Giraffe to conclude that it adds considerable Elo, depending on the time it is given to learn. So I can agree that the model you proposed could work. In my own words: the 44 million self-play games create a database of patterns with a probability win percentage. Then, when playing a game against another opponent, patterns discovered during search are played out (rolled out) with MCTS to verify whether the probability win percentage needs a correction. Kind of like what we call QS in regular CC.

And so I am stepping out of my DA role.
Parent - By Chris Whittington (**) [fr] Date 2018-01-06 20:50
Good role. It's not easy, this paradigm switching; we carry too much of the old way of thinking into the new.

The AZ ANN is two ANNs melded into one. One ANN indeed outputs a win probability for the move; that's its evaluation output. The other ANN outputs (at the same time, as part of the same calculation) move probabilities for all of the children of the move (i.e. the opponent's replies). Note the first is a win rate (wins/playouts from this move); the second is a child move probability rate (child plays/total plays of all children from this node).

In gameplay mode AZ doesn't select the move with the maximum win rate; it selects the one with the maximum playouts. That is often the same move, but not always. It's dangerous to select on win rate because the max-win-rate move may be based on only a relatively small number of playouts and therefore unreliable.
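
A minimal sketch of the two selection rules just described, with hypothetical field names and an assumed exploration constant (the exact AZ formula is in the paper):

    import math

    C_PUCT = 1.5   # exploration constant, assumed value

    def select_child(children, parent_visits):
        # During search: mix win rate, the policy head's prior, and exploration.
        def score(c):
            q = c["wins"] / c["visits"] if c["visits"] else 0.0
            u = C_PUCT * c["prior"] * math.sqrt(parent_visits) / (1 + c["visits"])
            return q + u
        return max(children, key=score)

    def choose_move(root_children):
        # At the root: play the move with the MOST playouts, not the best raw
        # win rate, since a high win rate on few playouts is unreliable.
        return max(root_children, key=lambda c: c["visits"])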
Parent - - By Chris Whittington (**) [fr] Date 2018-01-07 19:38
This is funny. You were more right than you ever realised. Return to DA mode. I just had one of my eureka moments. This AZ is not what anyone thinks it is; or at least, I've not read anything anywhere, especially not in computer chess forums, that suggests anyone has worked it out. Maybe someone on the AZ team, but when I read or saw what was in hindsight the key, a few weeks ago, it was from a Deepminder, and it is not clear from what he said that they realise it themselves. Not sure; maybe the paper will reveal it, or maybe not. AZ works fantastically well, not denying that, but not as we had imagined, or whatever Spock said. Sorry to remain obtuse, but I like my breakthrough moments. It will all become obvious and very, very annoying, especially for everybody. Two paradigm shifts to absorb. Hahaha!!
Parent - - By Rebel (****) Date 2018-01-08 10:26
it was from a Deepminder

I am watching the AlphaGo Netflix documentary right now, something one of these guys said?
Parent - - By Chris Whittington (**) [fr] Date 2018-01-08 14:21
Not sure. I'm not being deliberately obtuse here: if I read, hear or see some useful thing, I remember the thing, and not where/how I read/saw/heard it. Oh, the thing and the source quality, of course.
Parent - - By Rebel (****) Date 2018-01-08 15:54
Well, there was an interesting remark at 0:38: AlphaGo typically looks 50-60 moves ahead. I'm not sure if moves are plies in Go, but even if they are plies, that's incredible. Translating to chess, my initial theory in CTF (a database that returns a number of reductions or a total cut-off) is starting to make sense, and SF was simply out-searched.
Parent - - By Chris Whittington (**) [fr] Date 2018-01-08 17:14 Edited 2018-01-08 17:41
Results today:
Connect4 MCTS ANN rolling out to 8 ply vs Connect4 AB at fixed depth N (a tactical trap finder at depth N, and avoider: it can't be caught out at less than N); alternating-colour games, checked for doubles:

MCTS 8 vs AB10 w278 l61 d21 winrate 82%
MCTS 8 vs AB11 w20 l11 d0  winrate 65%
MCTS 8 vs AB12 w15 l6 d0 winrate 71%
MCTS 8 vs AB13 w23 l7 d2 winrate 77%

Yeah, yeah, I know there are not a lot of games in the last three batches, but it seems not to make a lot of difference how far AB can look ahead; it looks ahead deeper in all the testing games. I would not say my MCTS ANN was "outsearching" AB when its depth was 8 and AB had at least 10, then 11, then 12 and then 13.

What I mean to say is that AZ may well be "outsearching" Stockfish, but I don't think "outsearching" is a necessary condition for winning.

How do I get hold of the AlphaGo movie, I don't have Netflix?
Parent - - By Rebel (****) Date 2018-01-08 19:49
How do I get hold of the AlphaGo movie, I don't have Netflix?

Here is the relevant part: [embedded video clip]
If they can look ahead 50-60 moves in Go, it will be a lot more in chess.
Parent - - By Chris Whittington (**) [fr] Date 2018-01-08 20:36
Thanks. Is the whole thing available somewhere? YouTube?

you mean the comments about depth?
those are MCTS rollout lookaheads, ie branching factor==1. each child in the rollout is chosen by a policy network, so ideally the playout follows a "best line", but that is by no means a certainty. if the rollout looks good, then the tendency is for more rollouts into the same region to check and then average out the results. if more looks into the same region start to appear less good, then the algorithm will switch and try looking elsewhere where it may find promising regions (on average). its obviously vital how well the policy network can choose each child, else the MCTS will end up stabbing wildly and ineffectively into the search space.

Yes, of course depth is good, although not if it sacrifices width AND the search policy is not good enough. Obviously, with perfect policy, MCTS would only need one rollout.

Their initial paper goes so far out of its way to hardly mention what has to be the absolute key to AZ that either a) the paper writer doesn't understand it, or b) it's a meta-issue and they don't need to mention it, or c) he/she is deliberately avoiding it, maybe having fun at everybody barking up the wrong tree for a few months. Hahaha!! This is so funny. Maybe they tell us in the second paper. Although, with hindsight, they already did tell us and nobody noticed. Did they even notice? Not sure; not if the first paper is anything to go by. It's how they made, or stumbled upon making, the impossible possible. So simple, but anyone who proposed it in the first place would be laughed out of the room. Or maybe I am wrong. Always possible.
Parent - - By gsgs (**) [de] Date 2018-01-09 01:07
Could this maybe be implemented, added to normal chess engines somehow? A bit of Monte Carlo occasionally mixed into the alpha-beta?

I mean, is it really better to take the detour through neural nets?

Another program could create the database of whatever was "learned".
Parent - - By Rebel (****) Date 2018-01-09 04:37
But the paper says: self-play games without any domain knowledge, only the rules of chess (legal moves, checkmate, draw rules). Thus no (domain) knowledge of mobility, pawn structures, king safety, passed pawns etc. Not even the value of pieces.
Parent - By gsgs (**) [de] Date 2018-01-09 10:01
mobility, pawn structures, king safety, passed pawns etc. Not even the value of pieces.
This is all being recognized and learned. How could it play chess without the value of pieces? Every human who starts with just the rules and learns chess by self-play only will eventually come up with some piece values.
And also the other things.

A0 will have to compare two positions: how close they are, or just which one is better. And here it will assign some value to pieces, even if it is just by looking up the outcome of positions with similar material distribution (pawn structures, king safety, ...).

You will "create" all sorts of function to assign some number to a position
and these functions will be checked for usefulness = correlation with game-result.

assign numbers to positions and to moves.

Hmm, as a first approximation I would evaluate the moves in very long notation, like 23.Nf3xQg5; there are only some millions of such moves. This would "include" piece values (captures); king safety ~ number of possible checks.

Two positions are similar when their sets of possible moves are similar, with pawn moves weighted highly here.
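
A minimal sketch of that usefulness check, with hypothetical inputs (results coded +1/0/-1 from White's point of view): rank candidate feature functions by how strongly their values correlate with game results.

    from statistics import mean

    def pearson(xs, ys):
        mx, my = mean(xs), mean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs) ** 0.5
        vy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (vx * vy) if vx and vy else 0.0

    def rank_features(features, positions, results):
        # features: {name: function(position) -> number}
        return sorted(((abs(pearson([f(p) for p in positions], results)), name)
                       for name, f in features.items()), reverse=True)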
Parent - - By Rebel (****) Date 2018-01-09 04:48
Yeah, we are only trying to understand what goes on in the deep mind of the beast. The AlphaGo depth (of 50-60 moves) could be those rollouts, but rollouts were skipped in AlphaGo Zero; see here [ https://www.youtube.com/watch?v=XuzIqE2IshY&t=1082s ] approx. 4:44.
Parent - - By Chris Whittington (**) [fr] Date 2018-01-09 09:08
It doesn't know or care about any single one of our (well, not mine anymore) "things that you must know to understand chess": not attack tables, not material, not pawn structure, not any kind of position pattern. It's such a beautiful idea. It might even be possible to hand-code it. In fact, I think we could hand-code it. Not sure if it should be in MCTS or AB format, though. Will think about it.
Parent - - By Rebel (****) Date 2018-01-09 13:12
CSTAL ZERO - I would love it!