Rybka Chess Community Forum
Rybka Support & Discussion / Rybka Discussion / Rybka 3 -- a hundred elo!
- - By lkaufman (*****) Date 2008-07-08 14:51
     Since yesterday I've been testing a version of Rybka that is very close to Rybka 3, with the improved scaling and all my latest eval terms added. I'm running it against 2.3.2a mp. It appears that on a direct match basis, we will reach the goal of a 100 Elo gain, at least on quads. As of now, after 900 games total, the lead is 110 Elo (105 Elo on quads, 120 on my octal). This is with both programs using the same short generic book, each taking White once in every opening. To achieve this result Rybka 3 has to win about 4 games for each win by 2.3.2a on the quads and about 5 for 1 on the octal, due to draws. How this will translate to gains on the rating lists remains to be seen.
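For readers who want to check the arithmetic behind "about 4 wins for each loss" at a 105 Elo lead, here is a minimal sketch. It assumes the standard logistic Elo formula; the function names are made up for illustration and are not part of any Rybka tooling.

```python
import math

def expected_score(elo_diff):
    """Expected score (0..1) for the side that is elo_diff points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_from_score(score):
    """Elo difference implied by a match score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def outcome_split(elo_diff, win_loss_ratio):
    """Win/draw/loss fractions consistent with an Elo lead and a wins-per-loss ratio.

    From score = W + D/2, W = r * L and W + D + L = 1 it follows that
    L = (score - 0.5) * 2 / (r - 1).
    """
    score = expected_score(elo_diff)
    loss = (score - 0.5) * 2.0 / (win_loss_ratio - 1.0)
    win = win_loss_ratio * loss
    draw = 1.0 - win - loss
    return win, draw, loss

if __name__ == "__main__":
    # Figures quoted in the post: 105 Elo at 4:1 (quads), 120 Elo at 5:1 (octal).
    for elo, ratio in [(105, 4), (120, 5)]:
        win, draw, loss = outcome_split(elo, ratio)
        print(f"{elo} Elo at {ratio}:1 -> win {win:.3f}, draw {draw:.3f}, loss {loss:.3f}")
```

Under these assumptions both cases come out with roughly half the games drawn, which is why the win-to-loss ratio has to be so lopsided to produce a lead of around 100 Elo.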
Parent - - By jpqy (**) Date 2008-07-08 15:08
One word "Amazing"!!!
Parent - By richbell (**) Date 2008-07-08 15:13 Edited 2008-07-08 15:15
Wow, wow, wow, I have been waiting for this result for a long time. Thanks a million, LK. Lovely to see the beast unleashed. I also hope Vas's final magic touches are still to come and will make this version even more immortal.
Parent - - By SillyFunction (**) Date 2008-07-08 15:45

> How this will translate to gains on the rating lists remains to be seen.


let's talk straight.
I believe that you've seen enough games against the others.
Please tell us what you guess about the new rating of Rybka.
Parent - By lkaufman (*****) Date 2008-07-08 16:04
I don't know how much of the 100+ Elo gain is due to better scaling, probably on the quads not too much. I have never run any games between the recent versions of Rybka in MP mode against other programs, because the results would be too one-sided to tell us much. I did verify that Rybka sp (about a month ago) against Shredder 11 on 4 cores gained roughly as much over 2.3.2a as in direct tests. I'll know much more about this pretty soon. If you want me to guess, I would expect that the gain on the rating lists will be in the 100 ballpark on the quad, maybe a bit less on 1 or 2 cores.
Parent - - By duncan (**) Date 2008-07-08 17:08
congratulations.

if the government gave you a billion dollar grant for (a) research into chess or (b) improving rybka as much as possible, how would you spend it?
Parent - - By lkaufman (*****) Date 2008-07-08 20:09
I don't think I could make good use of a billion dollars for this purpose, but with maybe ten or twenty million I could buy maybe a hundred octal computers and hire enough qualified people to operate them (with a team of programmers) to run countless experiments designed to learn what does and doesn't work in chess. We would also hire a few top GMs to play the programs under various conditions and try to improve on our results based on what we learn. Basically, it's what we do now but we could progress maybe fifty times as fast.
Parent - By Miroslav Kvíčala (**) Date 2008-07-08 20:18
It looks very impressive... +100 Elo points is simply great! :)
Parent - - By gmnotyet (*) Date 2008-07-08 20:30
Congratulations.

Could you please post PGN for some of the games involving the late beta Rybka 3,
so we can see the new engine in action?

If this is still proprietary information, then please forgive my request.
Parent - By lkaufman (*****) Date 2008-07-08 21:12
I don't know if Vas wants me to do this, and anyway all I have now is these bullet games. The match on Sunday with FM Meyer was played with a version that was only a week old, so that at least gives you four examples. That version didn't have the better scaling, but that just means that those games could have been played by the new version with somewhat less time on the clock.
Parent - - By Kapaun (****) Date 2008-07-09 00:19
Lol! If you ever do this, please accept my application for being one of the experimenters... :-)
Parent - By FWCC (***) Date 2008-07-09 06:11
I'm running and hiding from anything that is 100+ Elo above Rybka 2.3.2a, that is simply frightening. Good work, Rybka team! With this, Rybka may be number 1 for another 100 years. BRAVO!!
Parent - - By Shaun (****) Date 2008-07-09 00:14
I am looking forward to testing Rybka 3!!!

Given the expected improvement it will be interesting to see if any other engine running with 4 threads on a quad can even equal R3 with 1 thread - the only possibility will probably be R2.3.2a, and that is looking unlikely.

I think Vas and you are going to have to share some secrets to level the playing field a bit ;).

Shaun
Parent - - By lkaufman (*****) Date 2008-07-09 01:50
We already did share some secrets, some unintentionally. I do expect R3 on 1 thread to outrate R2.3.2a on 4, but to lose to it in a direct match. So please avoid such a pairing!
Parent - - By Italian81 (****) Date 2008-07-09 02:21
Any tests on how much better Rybka 3 is than 2.3.2a MP on a 64-bit OS, on a dual?
Parent - - By lkaufman (*****) Date 2008-07-09 05:52
All our testing is on 64 bit. I haven't tested on dual, but I think it's safe to say that the gain on a dual will be just a few Elo points less than on a quad. The difference in the amount of gain would be too small to detect without playing thousands of games, but speed measurements will determine it fairly accurately.
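To put a rough number on "thousands of games", here is a minimal sketch of the usual error-margin estimate for a match result. The 25/50/25 win/draw/loss split and the function names are assumptions chosen for illustration, not measured data.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a match score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin(games, win, draw, loss, z=1.96):
    """Approximate 95% error margin in Elo for a match of `games` games.

    win, draw and loss are per-game probabilities that sum to 1.
    """
    score = win + 0.5 * draw
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (win * (1.0 - score) ** 2
                + draw * (0.5 - score) ** 2
                + loss * (0.0 - score) ** 2)
    std_error = math.sqrt(variance / games)
    upper = elo_from_score(score + z * std_error)
    lower = elo_from_score(score - z * std_error)
    return (upper - lower) / 2.0

def games_needed(target_margin, win, draw, loss, step=100):
    """Smallest multiple of `step` games whose margin is within target_margin Elo."""
    games = step
    while elo_margin(games, win, draw, loss) > target_margin:
        games += step
    return games

if __name__ == "__main__":
    # Assumed split for two closely matched versions: 25% wins, 50% draws, 25% losses.
    print(round(elo_margin(900, 0.25, 0.50, 0.25), 1))
    print(games_needed(5.0, 0.25, 0.50, 0.25))
```

With that assumed split, 900 games leave a margin of roughly 16 Elo either way, and getting the margin down to 5 Elo takes on the order of nine thousand games, so a difference of a few Elo between dual and quad gains is indeed out of reach for ordinary match testing.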
Parent - - By Sesse (****) Date 2008-07-10 14:15
So Rybka 3 improves almost nothing from two to four cores? :-)

/* Steinar */
Parent - By Lukas Cimiotti (Bronze) Date 2008-07-10 15:45
No. Going from 2 to 4 cores gives you a speed increase of roughly 70%
Parent - By lkaufman (*****) Date 2008-07-10 16:14
Rybka 3 should add a few Elo points on top of whatever gain was achieved by Rybka 2.3.2a going from 2 cores to 4. If Rybka 2.3.2a gained 32 Elo on the rating list (roughly the case on average) going from 2 to 4 cores, maybe Rybka 3 will gain close to 40 Elo. The increase in the gain should be more going to 8 cores.
Parent - - By Uri Blass (*****) Date 2008-07-09 09:28
I think that it is going to be interesting to also have a direct match.
If your theory is right then it can be proved, because I assume that the big majority of the games are not going to be rybka-rybka games.

It is also possible that rybka3 is simply better against weak opponents, and hopefully it will be possible to test this theory in 2010 or 2011 when we have some non rybka programs that are better than rybka2.3.2a.

Uri
Parent - By lkaufman (*****) Date 2008-07-09 15:45
But if Rybka 3 beats Rybka 2.3.2a by a hundred elo or so (as my testing shows) and then loses badly to Rybka 2.3.2a giving 4 cores to 1 handicap (as an earlier version did, I haven't tried it with Rybka 3 beta yet), while the rating difference between Rybka 2.3.2a on 4 cores vs. 1 is only in the sixties, this would not fit with any notion of "weak" opponents being a factor, unless by "weak" opponents you mean good programs on lesser hardware. 
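The argument above can be sketched numerically. This is a minimal illustration assuming ratings were perfectly additive (transitive); the value 65 stands in for "in the sixties" and, like the names, is an assumption made for the example.

```python
def expected_score(elo_diff):
    """Expected score (0..1) for the side that is elo_diff points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# Figures taken from the post; 65 is an assumed stand-in for "in the sixties".
R3_GAIN_SAME_HARDWARE = 100   # Rybka 3 over Rybka 2.3.2a in direct testing
CORES_4_VS_1_GAIN = 65        # Rybka 2.3.2a on 4 cores over itself on 1 core

def additive_prediction(version_gain, hardware_handicap):
    """Elo lead of the newer version on 1 core against the older one on 4 cores,
    if rating differences simply added up."""
    return version_gain - hardware_handicap

if __name__ == "__main__":
    lead = additive_prediction(R3_GAIN_SAME_HARDWARE, CORES_4_VS_1_GAIN)
    print(f"predicted lead: {lead:+d} Elo, expected score {expected_score(lead):.3f}")
```

Additivity would predict Rybka 3 on 1 core scoring about 55% against 2.3.2a on 4 cores. A bad loss in the direct match would contradict that prediction, which is the point being made: the discrepancy comes from the pairing itself, not from "weak" opponents.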
Parent - - By Nelson Hernandez (Gold) Date 2008-07-09 16:46
Uri, if Rybka 3 is as good as advertised I have a feeling some of the commercial competition may get demoralized and call it a day.  What would you do if you were 200-300 points behind and trying to sell your product?
Parent - - By Banned for Life (Gold) Date 2008-07-09 19:11
The interesting thing for me will be seeing if this large advantage is really across the board, or is something that occurs in most positions, but still shows significant weakness in other positions.  I'm still of the belief that Zappa plays a lot of positions better than R2.3.2a, and suspect it may be due to Zappa's wider but shallower search. I am waiting to see how this changes with R3 (which presumably will feature an even lower branching factor than R2.3.2a).

As an aside, continuing in a long tradition, it turns out that Anthony was much better at writing engine code than at tuning the few parameters he offers in the UCI window...

Regards,
Alan
Parent - By onursurme (***) Date 2008-07-09 21:14
Yes, wider but shallower search.
This is what I asked for many times.
An analytical tool for those who have enough time and want to be sure of not missing some close shots.
A correspondence player may use one of his computers for this type of search.
Parent - - By NATIONAL12 (Gold) Date 2008-07-09 23:31
when hamsters match is over i will do extensive testing of R3 against Zappa at long time controls, even if i have to build another Skulltrail or whatever else is available in 2009. this will be of value to all top correspondence players
Parent - By turbojuice1122 (Gold) Date 2008-07-10 01:11
Well, at roughly 24 hours per game, Rybka 2.3.2a still beats Zappa Mexico II in CEGT testing, so I'm not sure what you have in mind unless somehow you have been able to hack into Erdo's computer and get his book. :-)
Parent - By Uly (Gold) Date 2008-07-09 23:05

> What would you do if you were 200-300 points behind and trying to sell your product?


SmarThink has been some 200 ratings points behind and still selling.
Parent - - By InspectorGadget (*****) Date 2008-07-09 05:39
So you and Vas are doing a very good job behind the scene. The waiting is worth it. Amazing, very amazing! :)
Parent - - By M ANSARI (*****) Date 2008-07-09 07:49
Good news!  I always thought that Rybka had a lot of lag when it came to MP efficiency.  This is very clear when you play Zappa against Rybka on very powerful hardware ... Zappa gains much more than Rybka .... so much so that it seems there is a linear point where it basically equals Rybka 2.3.2a and might even surpass it.  From your post it  seems that this disadvantage has been removed and Rybka's MP scaling has maybe reached the level of Zappa or even bettered it.  Look forward to testing it.
Parent - - By 8lrr8 (***) Date 2008-07-09 10:20
"Zappa gains much more than Rybka .... so much so that it seems there is a linear point where it basically equals Rybka 2.3.2a and might even surpass it."

i keep hearing this, but dont the results of the rybka/zappa marathon match provide evidence to the contrary?  rybka scored 60/100 over 2 matches (of 50 games).  i think both were using either their default or generic opening books, so of course mating the programs w/ home-cooked books will have different results.

am i missing something?
Parent - - By NATIONAL12 (Gold) Date 2008-07-09 10:23
the hardware was not very big by present standards, i.e. Quad 2.4 i believe.
Parent - - By 8lrr8 (***) Date 2008-07-09 23:51
but the time controls were ~3.3x as long as classical, which is a larger equivalent "speed increase" than going from quad to an octal, no?
Parent - - By NATIONAL12 (Gold) Date 2008-07-10 00:07
those time controls are mickey mouse as far as i am concerned.
Parent - - By 8lrr8 (***) Date 2008-07-10 00:40
care to elaborate?
Parent - By Uly (Gold) Date 2008-07-10 02:15

> care to elaborate?


He's saying that those time controls are:

[image]

:-D
Parent - - By George Tsavdaris (****) Date 2008-07-10 08:15
I don't think you get the point.

It must be true that Zappa Mexico gains from more cores more than Rybka 2.3.2a.
For example if Rybka gains X ELO by going from Quad to Octal, then Zappa gains Y>X ELO.

But that doesn't mean that the same will happen when we go from a time control of A seconds per move to a time control of B>A seconds per move.
Why? Because what is true is that Zappa uses additional processors/cores more effectively than Rybka, and NOT that Zappa is better than Rybka at playing longer time controls.

If you are on a Quad and play a Rybka-Mexico game at a time control of A s/move, each of the 2 engines will have some effectiveness in how it uses the Quad to calculate its moves. If you stay on the Quad and just increase the time control to 6·A s/move, the effectiveness of the 2 programs will remain the same as far as the hardware is concerned.
So comparing the results of the 2 matches would be a wrong method of testing how effectively they use more cores/processors. It would just be a method of testing how their strength varies when going to a 6 times slower time control.
Parent - - By 8lrr8 (***) Date 2008-07-10 08:25
yes, i realized this.  however we need to consider the exponential increase in time it takes to search 1-ply deeper esp. at the higher ply-depths.  we also need to consider the probability that at these high ply-depths the program will like a different move at ply X than at ply X-1.  it just doesnt happen all that often.  and even when it does happen, u need to look at just how often the outcome of the game actually changes (i.e. from loss to draw, or from draw to a win).

i am claiming that at classical time controls, zappa on an octal is NOT as strong as rybka 2.3.2 on a quad.  do u disagree?

zappa might be stronger on a 16-core than rybka on a quad, but not octal vs. quad.
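As a back-of-the-envelope illustration of the "exponential increase" point, here is a minimal sketch of how much extra depth a given time or speed multiplier buys. The effective branching factor of 2.0 is purely an assumed placeholder (no engine's real figure is given in this thread), and the 1.7x value reuses the "roughly 70%" speedup quoted earlier for a core doubling, assuming it also applies from quad to octal.

```python
import math

def extra_plies(time_factor, branching_factor):
    """Additional search depth gained from `time_factor` times more search effort,
    assuming the time to reach depth d grows like branching_factor ** d."""
    return math.log(time_factor) / math.log(branching_factor)

if __name__ == "__main__":
    ASSUMED_EBF = 2.0  # hypothetical effective branching factor, for illustration only
    # 3.3x longer time control (figure from the thread).
    print(round(extra_plies(3.3, ASSUMED_EBF), 2))
    # One core doubling at an assumed 1.7x effective speedup.
    print(round(extra_plies(1.7, ASSUMED_EBF), 2))
```

Under these assumptions the 3.3x time control is worth somewhat less than two extra plies and a core doubling somewhat less than one, so the longer time control is the bigger "speed increase" of the two, though this says nothing about how each engine converts the extra depth into results.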
Parent - - By turbojuice1122 (Gold) Date 2008-07-10 13:48
In a match on octal vs. quad with the two using their own books, Zappa would be very clearly stronger.
Parent - - By 8lrr8 (***) Date 2008-07-11 00:37
do u have actual match data/results for this?  if so, can u share w/ us?
Parent - - By turbojuice1122 (Gold) Date 2008-07-11 02:29
Zappa won in Mexico on equal hardware in these conditions, and the Rybka version that played there was stronger than Rybka 2.3.2a.  Zappa would demolish Rybka in that case if you double its hardware and keep Rybka's the same.
Parent - - By 8lrr8 (***) Date 2008-07-11 02:35
1. it wasnt at classical time controls.
2. it was a short match w/ inconclusive results.
3. wasnt this match more of a book vs book battle rather than engine vs. engine?
Parent - - By turbojuice1122 (Gold) Date 2008-07-11 02:38
(1)  Classical time controls depend on who you ask.  It was at tournament time controls (i.e. not rapid, blitz, or bullet).
(2)  Short matches are much more conclusive when you have people tuning the engines and books during the match; the fact that this doesn't occur in engine testing is the primary thing that gives so much variance.  In a match on my computer, it turned out that Zappa got better results from manually tuning its book than Rybka got from manually tuning its book.
(3)  It was both to some extent, and that's what is always the case in chess.
Parent - - By 8lrr8 (***) Date 2008-07-11 02:45
1. how many people consider 60 min/game as "classical?"  like under 50%.
2. well fwiw, wasnt it vas (or someone else on the rybka team) that said everything that could've gone wrong (in that match), did.
3. not so if 2 engines use the same generic book, no?
Parent - - By turbojuice1122 (Gold) Date 2008-07-11 02:56
I think that the difference between "classical" and "tournament, non-classical" is negligible in relative quality of the games.  As for the same generic book, this defeats the purpose of having a match, since some programs play better in some lines than in others.  For example, I did some testing with Hiarcs 11 and found that it gains something in the range of 70-80 elo points over its rivals if you have everyone using their own books compared with everyone using those generic books that generate somewhat random openings.
Parent - - By 8lrr8 (***) Date 2008-07-11 03:01
"For example, I did some testing with Hiarcs 11 and found that it gains something in the range of 70-80 elo points over its rivals if you have everyone using their own books compared with everyone using those generic books that generate somewhat random openings."

so basically hiarcs comes w/ a kick-ass default opening book, right?
Parent - - By turbojuice1122 (Gold) Date 2008-07-11 03:08
Some have said that this is the reason for this phenomenon, but I really don't know.  I know that it's a quite aggressive book.
Parent - - By InspectorGadget (*****) Date 2008-07-11 06:16
Turbo, where is that can of Turbo Juice that I saw in the Hiarcs forum? :)
Parent - - By turbojuice1122 (Gold) Date 2008-07-11 06:18
Ask Dylan Sharp. :-)
Parent - By InspectorGadget (*****) Date 2008-07-11 11:07
Maybe Felix should create one for you, and ONLY you :)
Parent - - By Ty Nance (**) Date 2008-07-11 06:10

> As for the same generic book, this defeats the purpose of having a match, since some programs play better in some lines than in others.


A program that wins matches with "same generic book" may impress us as being able to correctly analyze a broad variety of positions, while programs that only excel with their own (possibly narrow) books might not be so trustworthy for analysis of general chess positions.

We play comp-comp games, and it is fun, but I hope the perspective is that there is a correlation between the winning program and the best program for analysis, in the eyes of the serious chess player.

ty