Rybka Chess Community Forum
Topic: Rybka Support & Discussion / Rybka Discussion / Difference between 2 and 4 cores
- - By Linus (***) [at] Date 2007-08-18 09:06
When I look at the new CCRL rating list (complete list, with engine versions) the difference between Rybka 2.3.2a 64-bit 4CPU and Rybka 2.3.2a 64-bit 2CPU is merely 15 ELO. Is it worth the money?

Rybka 2.3.2a 64-bit 4CPU    3118
Rybka 2.2 64-bit 4CPU         3107
Rybka 2.3.2a 64-bit 2CPU    3103
Parent - - By Uri Blass (*****) [il] Date 2007-08-18 09:18
Unfortunately CCRL does not play Rybka-Rybka games, and we cannot know the rating improvement based only on games against opponents that cannot compete with Rybka.

I think that the CCRL should give non-Rybka engines more time if we want to learn about the 4 CPU against 2 CPU improvement for Rybka.

Giving the opponents 400 minutes/40 moves against 40 minutes/40 moves may make the list more interesting.

It may be interesting to know if Deep Junior 4 CPU can beat Rybka 4 CPU when Rybka gets 40 minutes/40 moves against 400 minutes/40 moves for Junior.

The gap in rating is 206 Elo, and assuming 50 Elo per doubling of speed (10 times more time is about 3.3 doublings, or roughly 166 Elo), Junior cannot do it.

Uri
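The arithmetic behind that estimate can be sketched as follows. This is only an illustration under the post's own assumption of a constant 50 Elo per doubling; the function name `elo_from_time_odds` is made up for the example.

```python
import math

def elo_from_time_odds(time_ratio: float, elo_per_doubling: float) -> float:
    """Elo gained from a time advantage, assuming every doubling is worth the same."""
    doublings = math.log2(time_ratio)
    return doublings * elo_per_doubling

# 400 min/40 moves against 40 min/40 moves is a 10x time advantage.
junior_gain = elo_from_time_odds(10, 50)
rating_gap = 206  # rating gap quoted in the post

# Under this assumption the time odds are worth less than the gap.
junior_catches_up = junior_gain >= rating_gap
print(f"gain {junior_gain:.0f} Elo vs gap {rating_gap} Elo, catches up: {junior_catches_up}")
```

If the real value per doubling is higher than 50 Elo, the conclusion changes, which is what the rest of the thread tries to measure.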
Parent - By Linus (***) [at] Date 2007-08-18 16:34

>Unfortunately CCRL does not play Rybka-Rybka games ...


I did not know that. So the real test would be to let Rybka 2CPU play hundreds of games against Rybka 4CPU.
Parent - - By Vasik Rajlich (Silver) [hu] Date 2007-08-18 19:55
Indeed, in Rybka vs Rybka games the difference is much more.

Vas
Parent - - By Quapsel (****) [de] Date 2007-08-18 22:06
I believe that in Rybka vs. Rybka games the difference is larger.
But isn't it more interesting to see how effective this advantage is in games against other opponents?
I think so.

Quap
Parent - - By Uri Blass (*****) [il] Date 2007-08-19 05:40
In this case you should give other opponents some time advantage, otherwise you may get misleading results for the future.
Parent - - By Vasik Rajlich (Silver) [hu] Date 2007-08-20 20:10
I'm pretty sure that if opponents are given a time advantage, then again the importance of the extra cores will increase.

Vas
Parent - By Shaun Brewer (****) [gb] Date 2007-08-21 22:05
Hi Vas,

it is very inconsiderate of you to have produced an engine so much better than everything else :)

It does make our testing more difficult....

Keep up the great work!!!

Shaun
Parent - - By Jim Walker (***) [us] Date 2007-08-22 23:05
I'm sorry guys but this logic escapes me. First of all, the only thing that is important is Rybka's performance against other programs/humans. Rybka vs Rybka is practically useless.
Second point:  If an opponent of Rybka is rated at say 2950 and Rybka w/2cpu scores +150 then Rybka with 4 cpus should show more than a couple of Elo higher or something is rotten in Denmark. 
People have been saying for years that a doubling of speed will flatten out.  I see that as only an excuse for not using the extra search power effectively.  Of course if you let your search tree explode then the extra search becomes almost useless. 

Disclaimer:  I am not an expert in chess.  I do not know how to program engines or anything else for that matter.  Anything I say cannot be used against me later if I'm proven wrong by empirical data.  Nothing I say here is fixed in stone.  Therefore I reserve the right to change my mind next week.
Jim
Parent - - By Vasik Rajlich (Silver) [hu] Date 2007-08-24 09:03
I suspect that the more closely matched two opponents are, the more important basic search and evaluation are; while the more mismatched they are, the more important are 'little things' like bugs, weird positions and the likelihood of entering them, etc.

Vas
Parent - - By Banned for Life (Gold) Date 2007-08-24 15:55
Yes!!! The corollary here is that when you have a significant mismatch, it may make sense to concentrate on trying to increase the probability of the 'little things' rather than go head-to-head in the traditional style. At least this way when you lose, it may be interesting or at least amusing. :-)

Alan
Parent - By Vasik Rajlich (Silver) [hu] Date 2007-08-26 07:51
Sure, but these things are by definition relatively unimportant.

Vas
Parent - - By Shaun Brewer (****) [gb] Date 2007-08-21 22:17
Hi Uri,

A while back I tried Rybka at 40/4 against a few opponents at 40/40 - I only ran about a hundred games, but the rating advantage seemed to be in the 300-350 range - however I suspect 40/4 to 40/40 would give a much larger difference than 40/40 v 40/400 - although I have no proof.

Shaun

Off topic

P.S. I will be running more Movei games but the machine I was using for the Movei gauntlets is not happy :( and I have not had time to investigate...
Parent - - By Uri Blass (*****) [il] Date 2007-08-22 01:27
I think that this is interesting information.

A 300-350 Elo difference for a 10 times difference in time suggests
a 90-105 Elo difference for being twice as fast (10 times is about 3.3 doublings).

If this is correct, it suggests that the difference between Rybka and other programs is not so big, and programmers only need to optimize their programs to be faster to become better than Rybka.

Rating list gives

Rybka 2.3.2a 64-bit 3081
Toga II 1.3.1 2868 (note that this is not the best version of Toga at 40/4)

difference is only 213 elo

Based on your statistics,
if you make Toga only 5 times faster, then it is not clear whether Rybka is better than Toga.

I think that, based on these statistics, it may be interesting to play matches between Rybka and Toga II 1.3.1 where Rybka 64-bit gets only 20% speed (people who use Rybka 32-bit may decide on a 3:1 time advantage for Rybka).

The main question is whether Rybka performs better against that Toga at longer time controls, or whether it is the opposite and Rybka's advantage relative to Toga, apart from supporting parallel search, is only speed.

Uri
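The conversion used in this post can be checked with a few lines. This is a rough sketch that assumes Elo gain is proportional to the number of doublings; both helper names are invented for the example, and the 300-350 Elo range and the list ratings are the figures quoted above.

```python
import math

def elo_per_doubling(elo_gain: float, time_ratio: float) -> float:
    """Spread an observed Elo gain evenly over the doublings in a time ratio."""
    return elo_gain / math.log2(time_ratio)

def elo_for_speedup(speedup: float, per_doubling: float) -> float:
    """Elo expected from a speedup, at a fixed value per doubling."""
    return math.log2(speedup) * per_doubling

# 40/4 against 40/40 is a 10x ratio; the observed advantage was 300-350 Elo.
low = elo_per_doubling(300, 10)
high = elo_per_doubling(350, 10)

# Rating gap between Rybka 2.3.2a 64-bit and Toga II 1.3.1 on the list.
gap = 3081 - 2868

# What a 5x faster Toga would gain at each end of the range.
toga_5x_low = elo_for_speedup(5, low)
toga_5x_high = elo_for_speedup(5, high)
print(f"{low:.0f}-{high:.0f} Elo per doubling; 5x is worth {toga_5x_low:.0f}-{toga_5x_high:.0f} Elo vs a gap of {gap}")
```

The 213 Elo gap falls inside the range that a 5x speedup would be worth, which is why the outcome is unclear on these numbers alone.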
Parent - By Vasik Rajlich (Silver) [hu] Date 2007-08-22 09:48
I'm pretty sure that the X-elo-per-doubling figure gradually flattens out.

For example, if you took a 1000-Elo weaker engine, and gave it (let's say) 14 doublings in speed, I don't think it would be enough.

Vas
Parent - - By Shaun Brewer (****) [gb] Date 2007-08-22 22:48 Edited 2007-08-22 23:28
Hi Uri,

I don't think that we will see a linear improvement in strength with increased time; however, I have started a test to find out.

The 50 Noomen positions, 100 games

Rybka 2.3.2a 64-bit v Toga 1.3.1

40/4 v 40/4
40/4 v 40/8
40/4 v 40/12
40/4 v 40/16
40/4 v 40/20
40/4 v 40/24
40/4 v 40/28
40/4 v 40/32
40/4 v 40/36
40/4 v 40/40

Times are CCRL equivalent. I might not run the games in this order, but I will run them all eventually to see if performance is linear.

It will be interesting to see when Toga beats Rybka; my guess is 40/28 or 40/32.

Once we know, I will double the time for both Rybka and Toga and see if Toga still wins. I suspect not???

Shaun
Parent - - By Uri Blass (*****) [il] Date 2007-08-23 10:16 Edited 2007-08-23 10:48
Thanks
100 games may not be enough to find a single time control from which Toga beats Rybka consistently.
It may be possible that Toga beats Rybka at 40/20 but loses at 40/24.

Based on your data I got the impression that the difference between 40/40 and 40/4 is 300-350 Elo, so the difference between 40/20 and 40/4 should be more than 200 Elo, and it means that 40/20 may be enough for Toga to beat Rybka.

Edit: Note also that linear improvement does not mean the same difference between every 2 consecutive matches. You can expect a bigger difference between 40/4 and 40/8 than between 40/8 and 40/12, not because of diminishing returns but because 8/4=2 and 12/8<2.

If you want to check diminishing returns you can multiply by 1.5 and find the rating of toga in the following time controls.

1)40/4
2)40/6
3)40/9
4)40/13.5
5)40/20.25
6)40/30.375
7)40/45.5625

Uri
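The point about equal ratios versus equal increments can be made concrete with a short sketch. The helper names are hypothetical; the ladder simply multiplies the time by 1.5 at each step, as in the list above.

```python
import math

def time_control_ladder(start_minutes: float, factor: float, steps: int) -> list:
    """Time controls (minutes per 40 moves) spaced by a constant ratio."""
    return [start_minutes * factor ** k for k in range(steps)]

def doublings_between(slow_minutes: float, fast_minutes: float) -> float:
    """How many doublings of time separate two time controls."""
    return math.log2(slow_minutes / fast_minutes)

ladder = time_control_ladder(4, 1.5, 7)

# Equal increments of 4 minutes are not equal steps in doublings.
step_4_to_8 = doublings_between(8, 4)
step_8_to_12 = doublings_between(12, 8)

# On the 1.5x ladder every step is the same fraction of a doubling.
ladder_steps = [doublings_between(b, a) for a, b in zip(ladder, ladder[1:])]
print(ladder)
print(step_4_to_8, step_8_to_12, ladder_steps[0])
```

With equal ratios, any drop in Elo gained per step can be read as diminishing returns rather than as a side effect of the spacing.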
Parent - - By Shaun Brewer (****) [gb] Date 2007-08-23 12:08
Hi Uri,

Yes, 100 games is not enough; however, 50 positions played as both white and black should give reasonable indicative results.

In terms of n Elo per doubling being linear, we can look at the 40/4, 40/8, 40/16, 40/32 results, and if necessary I could run 40/64.

I want to see at what point Toga overtakes Rybka and this may not be at a doubling point.

Shaun
Parent - - By Uri Blass (*****) [il] Date 2007-09-03 10:07 Edited 2007-09-03 10:12
I think that the difference between 40/4 and 40/40 is smaller than 300-350 Elo.
I played a match between Movei at 40/40 and Rybka at 40/4 on a Pentium IV 3.19 GHz (64 MB hash for the engines).

I think that this is a faster time control than CCRL's 40/40 and 40/4, since CCRL uses better hardware.

from
http://computerchess.org.uk/ccrl/4040.live/rating_list_all.html

Rybka 2.3.2a 32-bit 3018 +24 −24 75.9% −178.2 36.0% 644
Movei 0.08.438 2755 +62 −61 56.8% −47.4 31.8% 88

I guess that Movei is probably weaker than 2750 CCRL, but it is not more than 300-350 Elo weaker than 2.3.2a, so I expected Movei to score at least close to 50%.

It seems that Rybka is going to win that match (the match is not over, but the result after 63 games is 23-11 for Rybka with 29 draws, and it seems that game 64 is going to finish in a draw in a position where only Rybka can win).

I wonder if your 300-350 Elo difference was based on games at 40/40, or whether you simply had Rybka play 40/4 with 10% strength.

Note that I think that 40/4 with 10% strength may not be reliable, because there are interfaces that simply steal time from the engine when the engine prints output, so the engine effectively gets less than 10%.

Edit: based on your post it is clear that you gave Rybka 40/4 against 40/40 for the opponents, so the question is whether you had Rybka play against opponents rated 2700-2800 or only against stronger opponents.

Uri
Parent - - By Uri Blass (*****) [il] Date 2007-09-04 10:16
Update on this match:
After 80 games the result is 30-16 for Rybka with 34 draws, or 47-33 for Rybka.

Note that tablebases are not used in this match, but I doubt that tablebases would change much.

Uri
Parent - - By Vasik Rajlich (Silver) [hu] Date 2007-09-05 07:42
Uri,

IIRC, Movei has a CEGT rating. Can you calculate the value of each doubling of speed in this match based on the Elos and the performance?

My prediction is that the value of each doubling in this case will be lower than for a Rybka vs Rybka match at the same TC.

Vas
Parent - - By Uri Blass (*****) [il] Date 2007-09-05 08:26 Edited 2007-09-05 08:34
CEGT rating

http://www.husvankempen.de/nunn/40_40%20Rating%20List/40_40%20All%20Versions/rangliste.html

11 Rybka 2.3.2a w32 1CPU 2975 14 14 1613 70.6 % 2822 36.3 %
159 Movei 0.08.438 2648 24 24 564 51.1 % 2641 32.6 %

Rybka effectively played 40/4, but her rating on the 40/4 list is similar:

11 Rybka 2.3.2a w32 1CPU 2977 22 22 729 74.1 % 2794 31.4 %

Losing 47-33 against Rybka 2.3.2a is a performance about 60 Elo below Rybka, which means a performance slightly above 2900.

In other words, Movei gained nearly 260 Elo from multiplying the speed by 10, which is about 80 Elo per doubling of speed (for this discussion I assume Movei's blitz rating is the same as its 40/40 rating).

Movei has no blitz rating and I may be wrong, but usually there is no big difference between blitz rating and long time control rating, so a different blitz rating will probably not change the result by more than 10 Elo.

Note that it is possible that Movei suffered from a smaller hash relative to CEGT, but I do not think that it is very important for the rating list.
Maybe Movei suffered more from the longer time control relative to CEGT, and I guess that I need to play 40/30 against 40/3 if I want an equivalent time control.

Note that I plan to continue the match to 100 games, but at the moment the match is paused and the computer is doing other things.

Uri
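For readers who want to reproduce the calculation, here is a minimal sketch using the standard logistic Elo formula. It takes the CEGT 40/40 ratings quoted above as given and assumes, as the post does, that Movei's rating is the same at both time controls; the variable names are chosen for the example.

```python
import math

def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a match score (fraction of points, between 0 and 1)."""
    return 400 * math.log10(score / (1 - score))

# Rybka at 40/4 beat Movei at 40/40 by 47-33 over 80 games.
rybka_score = 47 / 80
movei_deficit = elo_diff_from_score(rybka_score)

rybka_cegt = 2975   # Rybka 2.3.2a w32 1CPU, CEGT list quoted above
movei_cegt = 2648   # Movei 0.08.438, CEGT list quoted above

movei_performance = rybka_cegt - movei_deficit
gain_from_10x = movei_performance - movei_cegt
gain_per_doubling = gain_from_10x / math.log2(10)
print(f"deficit {movei_deficit:.0f}, performance {movei_performance:.0f}, "
      f"gain {gain_from_10x:.0f}, per doubling {gain_per_doubling:.0f}")
```

The same function applied to a 57-43 or 51-49 result gives the performance gap for any other match score.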
Parent - - By Vasik Rajlich (Silver) [hu] Date 2007-09-05 08:36
Uri,

thanks, that's interesting. The speed is really helping Movei here. Maybe your eval is better than your search. Or maybe my theory is not correct, at least to a significant degree.

Vas
Parent - - By Uri Blass (*****) [il] Date 2007-09-05 10:27
Some questions:

1)Did you expect smaller improvement from movei?
2)What is the rating improvement that you got from rybka?

Note that there are many factors that can explain different results:

1)different starting positions relative to CEGT(I used the old noomen positions)
2)different time control relative to CEGT
3)different conditions relative to CEGT(I used 64 mbytes for hash and no tablebases)

I think that it may be interesting to define opening positions that are less sensitive to speed improvement (but not because of a small variety of results).

I wonder if FRC starting positions are more sensitive to speed improvement than the chess positions that I used in the Noomen match, because
Rybka earned 101 Elo just from using 64 bits instead of 32 bits, and that is more than 100 Elo per doubling of speed, considering that 64-bit is less than twice as fast as 32-bit.

http://computerchess.org.uk/ccrl/404FRC/cgi/compare_engines.cgi?family=Rybka&print=Rating+list&print=Results+table&print=LOS+table&print=Ponder+hit+table&print=Eval+difference+table&print=Score+with+common+opponents&match_length=30

Uri
Parent - - By lkaufman (*****) Date 2007-09-05 13:19
     That is a very interesting observation. The number of games is quite large, and the 64 bit speedup is much less than 2 to 1, so it certainly appears that your theory about FRC is correct. This makes perfect sense, because there are more moves to be played by the engines rather than by the book, so more chance for superiority to count. I have the impression from human FRC events that the best players dominate even more than they do in normal chess. Kasparov once said that if all tournaments were FRC he would win with 90% scores instead of 70% scores. I'm sure he was exaggerating, but basically I think he is right that FRC will magnify the advantage of the better player, whether human or computer. The opening is a hugely important part of chess, and with opening books removing that element it's much harder for the better player to win in normal chess unless he also has better opening knowledge than his opponent.
     Are there any available statistics on the percentage of draws in FRC vs. normal chess, given equal time limits, either for computers or for humans? I would expect the draw percentage to be less in FRC.
Parent - By Vasik Rajlich (Silver) [hu] Date 2007-09-08 07:32
Actually, the 32-bit FRC Rybka just did very badly in Ray's test. It's really not clear why. One theory is that in FRC, search is somehow relatively more important.

Vas
Parent - - By Vasik Rajlich (Silver) [hu] Date 2007-09-08 07:30
Let me put it like this.

There is an old question in computer chess of which is more important, the search or the eval. Generally, I think that wherever you are weaker is what is more important.

Movei with 10x more search time should be relatively better in searching than in evaluating, compared to Rybka. So, I would expect the value of additional search time for Movei to be relatively less. In other words, going from 5x to 10x more time should give a lower improvement.

I guess, looking at this data (although it's limited and I know nothing about Movei), the above trend probably exists but is relatively weak.

Vas
Parent - - By Uri Blass (*****) [il] Date 2007-09-08 08:06
I think that you mean the question of whether it is more important to work on search or on evaluation.

I think that there are 2 types of work on the search and 2 types of work on the evaluation.

For the search:
Making the program faster is one type of search improvement and making the search more efficient is another type of search improvement.
The first type of improvement is more important for fast time controls, while the second type may be more important for long time controls.

For the evaluation:
Teaching the program knowledge that is relevant to all stages of the game is one direction of improvement and teaching the program about specific endgames is another direction of improvement.

I think that correct knowledge that is relevant for all stages of the game is more important at longer time controls, but this theory is untested by me.

Talking about Movei, I think that I can probably gain more from a more effective search than from improving the evaluation.
The main direction in which I may try to improve the evaluation is deleting counterproductive knowledge (which may also help Movei to be faster in nodes per second), because I guess that Movei has some counterproductive knowledge in the code even if we talk about games at fixed nodes per second.

Another direction for improving the evaluation should be knowledge about specific endgames, because Movei does not know how to evaluate simple endgames like KRN vs KRP.

Uri
Parent - By Vasik Rajlich (Silver) [hu] Date 2007-09-10 18:51
Actually, search heuristics themselves can be put into two categories: those which can be applied once along any path from the root to a tip (for example, futility), and those that can be applied multiple times along any path (for example, LMR).

The first category should have roughly the same effect as speed improvements, and the second category should help more in longer time controls. There are many other issues, though.

Re. evaluation, I don't understand why this theory should be correct. Anyway, I bet the evaluation becomes more important with longer time controls relative to the first category of search improvements. (Not sure about the second category.)

Vas
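A very crude toy model may help picture the two categories. It is not how Rybka or any real engine counts nodes: it assumes a uniform tree, an arbitrary branching factor of 3, and a heuristic that keeps 80% of the moves it is applied to. All names and numbers are invented for illustration.

```python
import math

def nodes_full(branching: float, depth: int) -> float:
    """Node count of a uniform tree with no pruning."""
    return branching ** depth

def nodes_once_per_path(branching: float, depth: int, kept: float) -> float:
    """Heuristic applied at a single ply on each path (futility-like): a constant factor."""
    return kept * branching ** depth

def nodes_every_ply(branching: float, depth: int, kept: float) -> float:
    """Heuristic applied at every ply (LMR-like): a smaller effective branching factor."""
    return (kept * branching) ** depth

BRANCHING = 3.0   # assumed effective branching factor
KEPT = 0.8        # assumed fraction of work kept where the heuristic applies
depths = [8, 12, 16]

once_savings = [nodes_full(BRANCHING, d) / nodes_once_per_path(BRANCHING, d, KEPT) for d in depths]
repeated_savings = [nodes_full(BRANCHING, d) / nodes_every_ply(BRANCHING, d, KEPT) for d in depths]

for d, once, rep in zip(depths, once_savings, repeated_savings):
    print(f"depth {d}: once-per-path saves {once:.2f}x, every-ply saves {rep:.1f}x")
```

In this model the first category saves the same factor at every depth, like a plain speedup, while the saving from the second category grows with depth, which matches the claim that it should help more at longer time controls.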
Parent - - By Uri Blass (*****) [il] Date 2007-09-06 11:03 Edited 2007-09-06 11:08
Movei seems to perform slightly better after more games.

After 95 games the result is 34-21 for Rybka with 40 draws; game 96 is probably a draw, but only Movei can win (KRB vs KRP for Movei, which does not know how to evaluate it correctly).

This means that movei is something like 50 elo weaker than rybka 32 bit/10 at 40/40.

I may repeat the match later at 40/10 when I expect movei to win.

Uri
Parent - - By Uri Blass (*****) [il] Date 2007-09-07 12:22
Final result is 35-21 with 44 draws, which means 57-43 for Rybka.
Parent - - By Vasik Rajlich (Silver) [hu] Date 2007-09-08 07:26
So this is 85 Elo or so per doubling, right?

Vas
Parent - - By Uri Blass (*****) [il] Date 2007-09-08 08:21 Edited 2007-09-08 08:30
Movei's CEGT rating also improved slightly, so the final result based on my calculation is 81 Elo per doubling.
To be more correct we need to use the same positions, the same hardware and the same conditions, so I also need to test Movei*10 against other programs like Toga 1.3.1.

I may do it later, after finishing a similar match with the same hardware at 40/10. (I do not plan a faster time control than 40/10 because I am afraid that Arena is slowing down Rybka through actions like printing the PV, and I generally do not trust Arena not to steal significant time from both programs at fast time controls, so I am even afraid that this has a significant effect against Rybka at 40/10.)

I hope that the Rybka GUI will enable matches between engines at fast time controls without stealing time from them.

Uri
Parent - - By Uri Blass (*****) [il] Date 2007-09-09 10:27
I have already finished the match at 40/10.

Final result:

Rybka 2.3.2a - Movei 0.08.438: 51-49 (36-34 for Rybka with 30 draws)

Rybka improved her performance by 41 Elo relative to Movei when the time control was 4 times slower (7 Elo advantage for Rybka at the faster time control against 48 Elo advantage for Rybka at the slower time control).

Uri
Parent - By Vasik Rajlich (Silver) [hu] Date 2007-09-10 18:55
Thanks, that's interesting.

There are three obvious candidates for the improvement:

1) Luck
2) The GUI robbing the engines of time
3) Rybka scales better to longer TCs under these conditions - perhaps, she has more of an eval advantage than a search advantage compared to Movei.

Vas
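To get a feel for how much of this could be luck, here is a rough sketch of the statistical margin on a 100-game match. It uses a normal approximation with the per-game variance estimated from the win/draw/loss counts, and it treats games as independent, which is only an assumption; the function names are made up for the example.

```python
import math

def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a match score (fraction of points, between 0 and 1)."""
    return 400 * math.log10(score / (1 - score))

def score_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Match score with an approximate 95% interval (normal approximation)."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (wins * (1 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0 - score) ** 2) / n
    margin = z * math.sqrt(variance / n)
    return score, score - margin, score + margin

# The 40/10 match: 36 wins, 30 draws, 34 losses from Rybka's side.
score, low, high = score_interval(36, 30, 34)
elo_low = elo_diff_from_score(low)
elo_high = elo_diff_from_score(high)
print(f"score {score:.2f}, roughly {elo_low:.0f} to {elo_high:.0f} Elo")
```

Under these assumptions the interval for a single 100-game match is wider than the 41 Elo shift between the two matches, so luck cannot be ruled out from these games alone.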
Parent - - By Shaun Brewer (****) [gb] Date 2007-09-13 11:14
Some initial results - delay due to holiday - plus I am using the same machine for all tests to remove as many variables as possible.

Handicapped Rybka test.

50 positions (reversed 100 games)
Elo calculated with Bayeselo

Rybka 2.3.2a 64-bit v Toga 1.3.1

40/4 base (CCRL equivalent)

Rybka 40/4 - Toga 40/4

+ 202 elo (78.5 :  21.5) draws 27%

Rybka 40/4 - Toga 40/8

+ 128 elo (70.0 :  30.0) draws 38%

Gain from doubling 74 elo

Rybka 40/4 - Toga 40/16

+ 60 elo (58.5 :  41.5) draws 29%

Gain from doubling 68 elo

Rybka 40/4 - Toga 40/32

- 8 elo (48.5 :  51.5) draws 43%

Gain from doubling 68 elo

The Elo benefit from doubling seems very flat (I am surprised). I will run 40/64; however, I may run other tests first.

Shaun
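The gains per doubling can be recomputed directly from the Bayeselo figures above. This is just the subtraction written out; the list layout and variable names are chosen for the example.

```python
# (Toga minutes per 40 moves, Rybka's Elo lead at 40/4) from the results above.
results = [(4, 202), (8, 128), (16, 60), (32, -8)]

# Each step doubles Toga's time, so the drop in Rybka's lead is Toga's gain per doubling.
gains = [prev_lead - lead for (_, prev_lead), (_, lead) in zip(results, results[1:])]
average_gain = sum(gains) / len(gains)

print(gains, average_gain)
```

With only 100 games per time control, the individual steps carry a wide error margin, so the flatness should be read as indicative only.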
Parent - By Banned for Life (Gold) Date 2007-09-13 15:43
This also seems to confirm Vas's assertion that there is 70 Elo gain for a doubling of computation time.

Regards,
Alan
Parent - - By lkaufman (*****) Date 2007-08-23 19:35
I don't know anything about Toga, but I will make the general observation that if you give any other program enough time handicap to equalize the score against Rybka, and then you double the times for both programs, the result will depend very much on which version of Rybka is being tested. The earliest version of Rybka had very limited chess knowledge, and so should gain less than other programs with increased time. Each successive version had more knowledge, and my guess is that by 2.3.2 the chess knowledge was better than most other programs (though not all), so the benefit of more time should be above average for Rybka, but not the best. By the time the next Rybka version comes out, I'll predict that she will be as good or better than all others in chess knowledge, and hence also in the benefits of more time.
Parent - - By Uri Blass (*****) [il] Date 2007-08-23 20:40
The question of how much you gain from doubling the time is not only a question of evaluation knowledge.

If 2 programs have the same evaluation knowledge but one has better move ordering, you can expect the program with better move ordering to gain more from doubling the time.

Uri
Parent - - By Uri Blass (*****) [il] Date 2007-08-23 20:48
I can add that I disagree that early versions of Rybka will gain less than other programs from doubling the speed; it seems that they will gain nearly the same.

from the ccrl 40/40 list

Fritz 10 2883 +18 −18 54.1% −28.6 36.7% 1030
49.3%
Rybka 1.0 32-bit 2883 +16 −16 60.8% −72.4 40.4% 1269
57.9%
Hiarcs 11 2876 +28 −28 55.1% −31.4 42.4% 389
51.2%
 

from the ccrl 40/4
Hiarcs 11 2891 +11 −11 56.0% −41.7 32.9% 3114
Rybka 1.0 Beta 32-bit 2891 +14 −13 64.6% −104.7 32.0% 2010
66.8%
Fritz 10 2882 +22 −22 53.1% −22.9 29.8% 715
51.8%
10
Parent - - By lkaufman (*****) Date 2007-08-23 20:59
Thanks for that information. It tells me that most likely certain aspects of Rybka's search (probably not just move ordering that you mention) benefit more from increased depth. In that case, newer versions of Rybka should benefit more than most or all other programs from increased time, if they are brought to the same level by a time handicap, since they retain the same basic type of search, but have much more chess knowledge.
Parent - - By Uri Blass (*****) [il] Date 2007-08-23 21:27
Interesting that you assume that Rybka beta has less chess knowledge than other programs.

I remember that at the time of releasing Rybka beta, Vasik wrote that he believes that the main advantage of Rybka relative to opponents is knowledge in the evaluation.

I remember that he claimed that the target is to have a knowledge-based program and, to make things clear, that knowledge is what wins chess games.

Uri
Parent - - By lkaufman (*****) Date 2007-08-24 05:16
Certainly that is our target, to have more knowledge than other programs. We surely believe this is very valuable. Perhaps we are already ahead of most other programs in knowledge, but probably not all. But I believe that the early versions, at least the first, were pretty weak in that area. If Vas thought that the first version of Rybka was already ahead of her main rivals in knowledge, then in my opinion he was underestimating how much knowledge they had. Some very basic things were missing from Rybka's eval until very recently.
Parent - - By Vasik Rajlich (Silver) [hu] Date 2007-08-24 09:13
Well, I don't really want to criticize other programs here. They are all fine and interesting.

Generally, the work of improving a chess program consists of adding, refining, reformulating and tuning chess knowledge. It's a kind of tautology - there is really nothing else you can do.

In terms of # of hours worked, Rybka 1.0 is probably about half-way between my starting point and the current Rybka. So, you could say that Rybka 1.0 has half of the chess knowledge of the current Rybka. This is just one way to quantify it and there are others.

Vas
Parent - By joshua2 (*) [us] Date 2007-09-04 01:44
After Rybka, what engine do you think has the most amount of chess knowledge?
Parent - - By PSalomon [de] Date 2007-08-23 20:41
Hi Shaun,

which GUI are you using for your test?
Under Fritz 8, for example, it's not possible to give the engines different time controls.

Thank you!

(I have missed this feature for many years)!

Peter
Parent - - By Shaun Brewer (****) [gb] Date 2007-08-23 21:44
Hi Peter - Arena, as it allows greater flexibility in time handicap.

Shaun

P.S. In F9/10 you can play handicapped Engine Matches, just not Tournaments.
Parent - - By PSalomon [de] Date 2007-08-24 08:06
Hi Shaun,

thank you for your answer. I looked at Arena (1.99 beta 3) but I also don't find any possibility to set different time controls.
Can you help me, please?

Thank you!

Peter
Parent - By Shaun Brewer (****) [gb] Date 2007-08-24 11:13
I am at work so don't have Arena here but...

Look at manage engines - I think it is on one of the engine tabs, where you set % of time or something.

So what I have done is set both Rybka and Toga to 10% initially, and I am running 40/40 matches, so both engines get 40/4. I will increase the Toga % from 10 to 20 etc...

Shaun

P.S. the displayed time looks odd but the games look okay.
Parent - - By Uri Blass (*****) [il] Date 2007-08-24 11:46
I think that it is better to use Arena 1.1, which also supports unequal time controls.
Beta Arena has bugs (I remember a case where the beta adjudicated a game wrongly because it believed that there was a mate when there was an en passant capture; the bug was corrected later, but I think that it was corrected only in beta 4, which may have other bugs).

Uri