Rybka Chess Community Forum
Rybka Support & Discussion / Rybka Discussion / Quad Rybka 2.3.2a - underperforming ?
- - By Ray (****) Date 2007-06-23 08:46 Edited 2007-06-23 08:49
In CCRL 40/40, Quad Rybka 2.3.2a is not doing as well as expected, although with just 138 games to date you can't be conclusive about anything.

But I ran some quick tests. They may mean absolutely nothing, and be complete rubbish. If so then please tell me :-)

I started with the opening board and measured the kn/s of the engine after 1 minute on 1, 2 and 4 CPUs, exiting the GUI between tests.

1-2-4 CPUs

For Rybka 2.2 x64
kn/s - 141 - 270 - 496
Ratio - 1.00 - 1.91 - 3.51

For Rybka 2.3.2a x64
kn/s - 159 - 262 - 417
Ratio - 1.00 - 1.65 - 2.62

I should add that these tests aren't exactly repeatable, the figures do vary a bit if I re-run. But the overall picture is the same

Now, as I said, maybe these tests mean nothing and are no indication of performance. But if I'm looking for a reason why Quad Rybka 2.3.2a seems to be under-performing in CCRL testing, could this be a factor?

(Figures on an Intel QX6700)
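
A minimal Python sketch of the arithmetic behind the ratios above, for anyone who wants to redo it with their own readings. The function name `scaling` is made up for illustration; the only inputs are the kn/s figures quoted in this post.

```python
# kn/s readings quoted above, keyed by number of CPUs
KNPS = {
    "Rybka 2.2 x64": {1: 141, 2: 270, 4: 496},
    "Rybka 2.3.2a x64": {1: 159, 2: 262, 4: 417},
}

def scaling(readings):
    """Return {cpus: (speedup, efficiency)} relative to the 1-CPU reading."""
    base = readings[1]
    return {n: (knps / base, knps / base / n) for n, knps in sorted(readings.items())}

for engine, readings in KNPS.items():
    print(engine)
    for cpus, (speedup, eff) in scaling(readings).items():
        print(f"  {cpus} CPU: speedup {speedup:.2f}, efficiency {eff:.0%}")
```

Speedup is the kn/s on n CPUs divided by the kn/s on 1 CPU, and efficiency divides that by n. This only describes how the displayed kn/s scales, not playing strength. (With rounding rather than truncation, the 4-CPU ratio for 2.2 comes out as 3.52.)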
Parent - - By Lukas Cimiotti (Bronze) Date 2007-06-23 08:57
kn/s don't mean too much. To compare performance, it is best to run engine matches with hundreds or even thousands of games. If you do so, you will find that 2.3.2a is much better than 2.2.
Parent - - By Uri Blass (*****) Date 2007-06-23 09:19 Edited 2007-06-23 09:25
As far as I know, CCRL got a disappointing performance for 2.3.2a: while it is better than 2.2 with 2 processors, it is not better than 2.2 with 4 processors.

kn/s certainly has meaning if you compare 4 CPUs with 1 CPU, and you can see from the comparison that Rybka 2.3.2a is less efficient than Rybka 2.2 at using the 4 processors.

CCRL tests with ponder off, and maybe the problem only exists in ponder-off games, but the results suggest that there may be a problem.

Edit: You can see that after 138 CCRL games, 2.3.2 is underperforming:
http://64.68.157.89/forum/viewtopic.php?topic_view=threads&p=126232&t=14599

CEGT did not test rybka with 4 processors so I corrected an error that I wrote earlier in this edit
but I think 138 games are enough to suspect that there is a problem.

It is better not to waste time on hundreds of games because if there is a problem there are easier ways to prove the problem.

Uri
Parent - - By Lukas Cimiotti (Bronze) Date 2007-06-23 13:31
You should not compare kn/s of one version of Rybka with another one. If you do so, Rybka 2.0 Beta 2 x64 is the best version.
Rybka 2.2 has a problem with multi-core performance. I ran a whole lot of tests for Vas on my quad and my oct. The problem is now solved in 2.3.2 and 2.3.2a. I also tested 2.3.2 against 2.3.2a and couldn't find any significant change in performance.
Btw, I am the guy on whose computer Rybka won the WCCC.

Lukas
Parent - - By Ray (****) Date 2007-06-23 13:37
I think you've totally missed the point.

We are not comparing raw kn/s between versions. We are looking at the kn/s of a SINGLE version and seeing how that scales with more CPUs

Rybka 2.2
Ratio - 1.00 - 1.91 - 3.51

Rybka 2.3.2a
Ratio - 1.00 - 1.65 - 2.62

Get it ???

Sure, Rybka won WCCC, that is not in dispute. A well deserved victory.
Parent - By Fulcrum2000 (****) Date 2007-06-23 13:42
As far as I know, the kn/s for multiple cores is hardcoded in Rybka: 2 cores will always show a factor of 1.7 improvement over 1 core, etc. They are just numbers that cannot be compared (not even the relative change in kn/s between the number of cores used).
Parent - - By Lukas Cimiotti (Bronze) Date 2007-06-23 14:50
Sorry, Ray, I did indeed miss the point at first.

but:

Vas made special benchmark versions to measure multi core performance. Their output looks like this for an oct:

Starting experiment: 1 process, full algorithm
Current process ct: 8 Max CPU ct: 1 existing CPU ct: 8
Current process ct: 1

ct: 250 avg: 16.984 nodes: 769464038 nps: 181211 raw nodes: 769464038 raw nps: 181211 raw clock nps: 181210

Starting experiment: 2 processes, full algorithm
Current process ct: 1 Max CPU ct: 2 existing CPU ct: 8
Current process ct: 2

ct: 250 avg: 10.384 nodes: 785955973 nps: 302745 raw nodes: 462326999 raw nps: 178085 raw clock nps: 178084

Starting experiment: 4 processes, full algorithm
Current process ct: 2 Max CPU ct: 4 existing CPU ct: 8
Current process ct: 4

ct: 250 avg: 7.338 nodes: 855261768 nps: 466181 raw nodes: 305450568 raw nps: 166493 raw clock nps: 166489

Starting experiment: 8 processes, full algorithm
Current process ct: 4 Max CPU ct: 8 existing CPU ct: 8
Current process ct: 8

ct: 250 avg: 5.480 nodes: 908928648 nps: 663392 raw nodes: 206574611 raw nps: 150770 raw clock nps: 150760

and like this for a quad:

Starting experiment: 1 process, full algorithm
Current process ct: 4 Max CPU ct: 1 existing CPU ct: 4
Current process ct: 1

ct: 250 avg: 11.583 nodes: 769464038 nps: 265709 raw nodes: 769464038 raw nps: 265709 raw clock nps: 265707

Starting experiment: 2 processes, full algorithm
Current process ct: 1 Max CPU ct: 2 existing CPU ct: 4
Current process ct: 2

ct: 250 avg: 6.921 nodes: 754182569 nps: 435825 raw nodes: 443636772 raw nps: 256368 raw clock nps: 256367

Starting experiment: 4 processes, full algorithm
Current process ct: 2 Max CPU ct: 4 existing CPU ct: 4
Current process ct: 4

ct: 250 avg: 4.904 nodes: 843326255 nps: 687789 raw nodes: 301187880 raw nps: 245639 raw clock nps: 245635

The results of these tests are that 2.3.2 is better than 2.2 in multi-core efficiency.
The algorithm for displaying kn/s has changed a lot from the first MP versions to now.
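
A rough sketch of how the benchmark figures above can be reduced to two numbers per run. It assumes, without confirmation from the output itself, that `avg` is the average time per test position over the `ct: 250` positions, so that a smaller `avg` means a faster search. The function name `summarize` is hypothetical.

```python
# (processes, avg, nps, raw nps) copied from the benchmark output above
OCT = [(1, 16.984, 181211, 181211), (2, 10.384, 302745, 178085),
       (4, 7.338, 466181, 166493), (8, 5.480, 663392, 150770)]
QUAD = [(1, 11.583, 265709, 265709), (2, 6.921, 435825, 256368),
        (4, 4.904, 687789, 245639)]

def summarize(rows):
    """Return (processes, time_speedup, nps_over_raw_nps) for each run."""
    base_avg = rows[0][1]  # avg of the 1-process run
    return [(p, base_avg / avg, nps / raw) for p, avg, nps, raw in rows]

for name, rows in (("oct", OCT), ("quad", QUAD)):
    print(name)
    for p, speedup, factor in summarize(rows):
        print(f"  {p} processes: time speedup {speedup:.2f}, nps/raw nps {factor:.2f}")
```

On both machines the nps/raw nps column works out to 1.70, 2.80 and 4.40 for 2, 4 and 8 processes, so in this output the displayed nps looks like raw nps multiplied by a fixed factor per process count. The time speedup column is the one that depends on the assumption about `avg`.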
Parent - - By turbojuice1122 (Gold) Date 2007-06-23 23:10
Then the next question is why Rybka 2.3.2a on 4 cores is either not as good as, or definitely not better than, previous Rybka versions at longer time controls, both in the rating lists and in people's individual tests. The extra Elo that you've mentioned, which occurs in Rybka vs. Rybka matches at exceedingly short time controls, seems to disappear at long time controls in matches against other engines. However, I still agree that Rybka 2.3.2a is by far the best engine ever made for a single CPU.
Parent - By McHugh (*) Date 2007-06-23 23:37
I remember a post from Vas several weeks ago (prior to 2.3.2's release) that he was not directly working to improve the multi processor. I think he said that will be done in the version 3 releases.
Parent - By Lukas Cimiotti (Bronze) Date 2007-06-24 07:01
Yes, it is true, I have only tested 1+0 and 3+1 so far, and I only tested 2.3.1 - 2.3.2 and 2.2 - 2.3.2. Both tests showed a much better result for 2.3.2.
For 2.3.2 - 2.3.2a I only made a short test of 200 games at 2+1; it ended at exactly 50%.
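
To put a 200-game result in perspective, here is a hedged sketch of the usual normal-approximation error margin for a match score, converted to Elo with the logistic formula. The function names and the 40% draw rate are assumptions for illustration only; the draw rate of the actual test is not given.

```python
import math

def elo_from_score(p):
    """Elo difference implied by an expected score p (0 < p < 1), logistic model."""
    return 400 * math.log10(p / (1 - p))

def elo_margin(games, score=0.5, draw_rate=0.0, z=1.96):
    """Approximate 95% Elo margin around `score` for a match of `games` games.

    Per-game variance with win=1, draw=0.5, loss=0 is
    score*(1-score) - draw_rate/4.
    """
    var = score * (1 - score) - draw_rate / 4
    half_width = z * math.sqrt(var / games)
    hi = elo_from_score(score + half_width)
    lo = elo_from_score(score - half_width)
    return (hi - lo) / 2

for n in (200, 1000, 5000):
    print(n, "games:", round(elo_margin(n)), "Elo with no draws,",
          round(elo_margin(n, draw_rate=0.4)), "Elo with 40% draws")
```

Under these assumptions a 200-game match that ends at 50% still leaves a margin of several tens of Elo either way, so it cannot rule out a modest difference between the two versions.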
Parent - - By turbojuice1122 (Gold) Date 2007-06-23 13:40
Uri is correct, and he's not comparing the kn/s of one Rybka version with another. He's comparing how one version of Rybka scales with how another version scales, each running on different numbers of CPUs, and his result is the same as that arrived at by many, many other people: Rybka 2.3.2a is the worst-scaling Rybka of the 2.xx series. The reason that the tests show it to be better is that Rybka 2.3.2a on a single CPU is an incredibly improved engine in terms of evaluation accuracy, endgame play, etc.; its lack of scaling is generally made up for by its improved evaluation and overall game play. Nonetheless, it scales very poorly, enough that by 4 CPUs the performance of 2.3.2a is beginning to be comparable with that of some previous Rybka versions. It is still true that Rybka 2.1d3 is easily the best Rybka in terms of scaling, and its performance on 16 CPUs or more is by far the best; I would guess that Rybka 2.1d3 on 16 CPUs would probably be 80-100 Elo points stronger than Rybka 2.3.2a on 16 CPUs, in spite of the fact that Rybka 2.3.2a on 1 CPU is probably on the order of 60-80 Elo points stronger than Rybka 2.1d3 on 1 CPU.
Parent - - By Rybka_King88 (*) Date 2007-06-24 03:31
For me,

2.3.2a is a lot worse than Rybka 2.2.

My processor Pentium HT 4CPU

What engine should be strongest for me?
Parent - By Gaмßito (****) Date 2007-06-24 04:06 Edited 2007-06-24 04:09
Hi,

On what are you basing the claim that 2.2 is better?

Regards,
Gambito.
Parent - By Fulcrum2000 (****) Date 2007-06-24 12:43
What is a "Pentium HT 4CPU"???
Parent - - By turbojuice1122 (Gold) Date 2007-06-24 13:08
I'm afraid that if you try splitting Rybka among "4 CPU" on a hyperthreading machine, you're going to have very random results, and I wouldn't be at all surprised if Fritz is better than Rybka.  The quality of the games will be quite low in any case.
Parent - - By Fulcrum2000 (****) Date 2007-06-24 13:19
Indeed, Vas already stated that using HT will give Rybka a performance DROP of about 15%, so if you have an HT machine, let Rybka use only the available physical cores. So for a Pentium 4 HT --> 1 thread only; for a dual core with HT (if they exist), not 4 but 2 threads!
Parent - By Rybka_King88 (*) Date 2007-06-24 22:36
Pentium 4 ht 4 CPU

;-)
Parent - - By Vasik Rajlich (Silver) Date 2007-06-25 14:49
This is a cosmetic issue. At some point (I believe starting with 2.2n2), I scaled down the nps figures for multi-processor Rybka versions so that the nps reflected overall search ability as opposed to raw nodes. This makes it easier to compare for example the performance of a 4-core machine with the performance of an 8-core machine.

Strength-wise, Rybka 2.3.2 is just as improved relative to earlier versions on quads as on weaker machines, at least in blitz self-play.

Vas
Parent - - By Ray (****) Date 2007-06-25 16:12
Thanks for the clarification !

Quad Rybka 2.3.2a rating will have improved by the next update. It is currently running against Quad Junior 10, and performing very much better than 2.2
As with any engine update, you always hope that it will do better against all opposition. But mostly it will be a mixture of better against some, about the same against others, and maybe slightly worse against others.
Parent - - By billyraybar (***) Date 2007-06-25 16:50
Underperforming on Quad? That may be the case. I would go one step further and say that we are possibly seeing diminishing returns with 2.3.2: the longer the time control, the smaller the improvement. This hunch is supported by the current CEGT rating lists.

40/4
1 Rybka 2.3.2 mp x64 2CPU 3074 21 21 800 75.2 % 2882 31.4 %
2 Rybka 2.2 mp x64 2CPU w/Fix 3046 27 27 500 72.9 % 2874 31.0 %

40/40
1 Rybka 2.3.2a x64 2CPU WM-2007 3041 16 16 1212 73.9 % 2860 36.7 %
2 Rybka 2.3 LK x64 2CPU 3016 24 24 566 71.6 % 2855 34.5 %

40/120
1 Rybka 2.3.2a x64 2CPU WM-2007 3006 48 47 125 66.0 % 2891 40.8 %
2 Rybka 2.3 x64 2CPU 2994 20 20 750 72.1 % 2829 38.1 %
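
A rough way to read the three lists side by side is to compare each Elo gap with the error bounds printed next to the ratings. The sketch below assumes the columns after the engine name are Elo, +, -, games, score, average opponent and draw rate, and it combines the two bounds in quadrature, which treats the two ratings as independent (only an approximation for entries on the same list). The name `gap_and_margin` is hypothetical.

```python
import math

# time control -> ((newer Elo, +, -), (older Elo, +, -)), from the lists above
CEGT = {
    "40/4": ((3074, 21, 21), (3046, 27, 27)),
    "40/40": ((3041, 16, 16), (3016, 24, 24)),
    "40/120": ((3006, 48, 47), (2994, 20, 20)),
}

def gap_and_margin(newer, older):
    """Elo gap and a rough combined margin (quadrature of the larger bounds)."""
    gap = newer[0] - older[0]
    margin = math.hypot(max(newer[1:]), max(older[1:]))
    return gap, margin

for tc, (newer, older) in CEGT.items():
    gap, margin = gap_and_margin(newer, older)
    print(f"{tc}: gap {gap} Elo, combined margin about {margin:.0f} Elo")
```

Note that the older version being compared is different on each list (2.2 w/Fix, 2.3 LK, 2.3), and that the 40/120 entry for 2.3.2a rests on only 125 games, so the gaps are not strictly like for like.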
Parent - By Felix Kling (Gold) Date 2007-06-25 16:57
I guess those differences are also due to the Elo calculation algorithm they use and the lower number of games at long time controls. Also, maybe the top engines don't profit as much from the extra time as the weaker engines...
Parent - - By Lukas Cimiotti (Bronze) Date 2007-06-25 17:07
There is one possible explanation: Rybka has very good time control. I have never seen a loss on time since version 1.2 came out. Even in bullet this never happens, whilst most other engines tend to lose on time in very short games. In longer games they seem to work better. I once used Zap in a bullet tournament and had ~10% losses on time.
But this is only a theory; I don't know if it is really the reason.
It might also be due to the different number of games.
Parent - - By Uri Blass (*****) Date 2007-06-26 20:04
This is clearly not the explanation because the time control is x minutes/40 moves so no losses on time.

Uri
Parent - - By Lukas Cimiotti (Bronze) Date 2007-06-26 20:12
If you assure me that there was no loss on time, I'll believe it. Well, that was only a theory.
Have you got access to the games?
Parent - - By Uri Blass (*****) Date 2007-06-27 03:15
I believe that in case of losses on time the testers are going to mention it.
There is no chance that they are going to be blind to something like that.

It is also possible to download games here

http://www.computerchess.org.uk/ccrl/404/games.html
Parent - By Ray (****) Date 2007-06-27 05:43
Confirmed that there are no losses on time in CCRL Quad Rybka 2.3.2a testing
We currently have 250+ games, on our way to 400 (8 opponents at 50 games each)
Parent - By Ray (****) Date 2007-06-26 19:39
Now with 258 games, Quad Rybka 2.3.2a has overtaken Quad Rybka 2.2 on the ratings list, thanks to a stunning performance against Quad Junior 10 :-)