Topic Rybka Support & Discussion / Rybka Discussion / Fixed-Depth Rybka 4.1 Match, Multi-PV versus Single-PV
I ran a fixed-depth 13 engine match of Rybka 4.1 in 5-PV mode versus itself in Single-PV mode.
Both were Win32 versions, each 1 cpu, ponder off, and the 100-game match took three days.
The 5-PV engine won 61.5-38.5, with 27 victories and 69 draws against only 4 losses.
This surprised me because in this ChessBomb blog post it says:
Engines are much stronger in SinglePV mode than in MultiPV mode. This is because they don't waste time exploring suboptimal positions, and also because some powerful optimizations are only enabled in SinglePV mode (e.g. aspiration windows). And when we say "much stronger", we really do mean "much". According to our internal testing, Stockfish 1.9.1 in SinglePV mode is about 230 Elo points stronger than in MultiPV=4 mode.
They don't clarify whether their tests are fixed-depth or fixed-time. Has anyone else done such fixed-depth tests, and are my results consistent with them?
I am aware of issues such as in this November thread here.
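For comparison with the Elo figure quoted from ChessBomb, the match result above can be turned into an approximate Elo gap. The sketch below assumes the standard logistic Elo model; the function names are illustrative only, and a 100-game sample leaves a wide error margin around whatever number comes out.

```python
import math


def match_score(wins, draws, losses):
    """Return (points, games), counting each draw as half a point."""
    games = wins + draws + losses
    return wins + 0.5 * draws, games


def elo_difference(wins, draws, losses):
    """Elo gap implied by a match result under the logistic Elo model."""
    points, games = match_score(wins, draws, losses)
    p = points / games  # fraction of available points scored
    if not 0.0 < p < 1.0:
        raise ValueError("Elo difference is undefined for a 0% or 100% score")
    return 400.0 * math.log10(p / (1.0 - p))


if __name__ == "__main__":
    # 5-PV versus Single-PV at fixed depth 13: 27 wins, 69 draws, 4 losses
    print(match_score(27, 69, 4))
    print(round(elo_difference(27, 69, 4), 1))
```

Note that this measures strength at equal depth, not at equal time, so it is not directly comparable to a figure obtained under a time control.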
I guess the 5-PV engine took about five times as much time.
much more than 5x
Run with a time/move time control and the results will be much different.
Right, of course. The ChessBomb people are working "in real time", so time is an issue for them.
But it is not an issue for me, as all my work uses a fixed depth. The ChessBomb item seemed to me to be saying that, even when you factor out the time issue, Single-PV is still stronger because it uses more-powerful heuristics, and my comment to that item said 130 Elo stronger when you factor out time. But for Rybka 4.1 my results are the opposite, and I wish to know if others can corroborate them.
No tests are needed; all these results are expected. Multi-PV has a much wider search and will obviously get better results at the same depth, but the improvement is not worth the extra time it takes per move.
Rybka could reach a much higher depth, and play much better moves, if you let her analyze in SinglePV for as long as she needs to reach that depth in MultiPV, and that's the point.
... and some engines don't use some features (aspiration window comes to mind) when in multi-PV mode. This makes them a lot slower.
OK, thanks to you and Uly in particular, I'm getting the picture that the heuristics shelved for Multi-PV matter to speed but not to quality of fixed-depth results. In fact, perhaps you are saying that turning off the heuristics (such as null-move pruning too?) actually makes the search better at fixed depth.
Way back when I began my project in August 2008, Vas Rajlich himself suggested that the best way to analyze multiple moves m to depth d in a position P is to play each move m to go to the position P_m, then analyze P_m to depth d-1 in Single-PV mode. However, I've never figured out how to write and run such a script, e.g. in the Arena Debug window, and the multipv_cp feature of Rybka's Multi-PV mode (which I set to 4.00 pawns to leave some slack before the "5.09" eval problem) is a huge practical time (and stall) saver.
So thanks. This seems to confirm the wisdom of what I have been doing in the published papers on my professional website, which may be of interest to you.
Update on my engine matches: with ponder on, after 27 of 100 games, Rybka 3 depth-13 has opened up a 10 wins, 2 losses, 15 draws lead on Rybka 4.1, both engines in Single-PV mode. They were basically tied after 100 games with ponder off: 23 wins for Rybka 4.1, 22 for Rybka 3, 55 draws. Of course, Rybka 4.1 gets to depth 13 much faster, and is killing Rybka 3 at equal-time (2' + 6"/move) blitz.
Is there any reason "ponder" should matter to the results of a fixed-depth match? I'm running the Rybka 4.1 vs. Rybka 3 match in Aquarium because Arena versions are often off-by-2 in fixed-depth engine matches with Rybka versions, while I haven't figured out how to adjudicate games as won at (say) a +4.00 advantage in Fritz 13, which seems needed to keep Rybka 3 from stalling. Maybe "ponder" allows one engine to go past depth 13 while the other is thinking? That should favor the fleeter engine, i.e. Rybka 4.1. But maybe if Rybka 3 makes more (inferior) moves that Rybka 4.1 does not predict, Rybka 4.1 finds itself using hashed search results less, so it re-does all depths while Rybka 3 gets more time to ponder?
Thanks again, ---Ken Regan
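The per-move procedure attributed to Vas Rajlich in the post above (play each move m, then search P_m to depth d-1 in Single-PV mode) could be scripted roughly as follows. This is only a sketch that builds the text of standard UCI commands; it assumes the engine exposes the usual "MultiPV" option, that the candidate moves are supplied by the caller in UCI coordinate notation, and the function names are hypothetical. It does not talk to an engine itself.

```python
def per_move_script(fen, candidate_moves, depth):
    """Build UCI commands that search each position P_m (the root position
    after candidate move m) to depth - 1 with MultiPV forced to 1."""
    if depth < 2:
        raise ValueError("depth must be at least 2 so that depth - 1 >= 1")
    commands = ["uci", "setoption name MultiPV value 1", "isready"]
    for move in candidate_moves:
        # "position fen <fen> moves <m>" sets up P_m without needing its FEN
        commands.append(f"position fen {fen} moves {move}")
        commands.append(f"go depth {depth - 1}")
    return commands


def score_for_mover(cp_after_move):
    """UCI scores are from the side to move, which in P_m is the opponent,
    so negate to get the value of move m for the player who made it."""
    return -cp_after_move


if __name__ == "__main__":
    start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
    for line in per_move_script(start, ["e2e4", "d2d4"], 13):
        print(line)
```

Whatever sends these lines to the engine has to wait for the "bestmove" reply after each "go" before sending the next "position" command. Unlike the multipv_cp cutoff, this approach searches every supplied move in full, so it does nothing to save time on clearly inferior moves.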
I'd expect a fixed depth 14 Rybka to beat those two, while using less time than the 5-PV one.
So it's a matter of strength within a time frame.
