Rybka Chess Community Forum
Up Topic The Rybka Lounge / Computer Chess / CEGT - rating lists February 12th 2012
- - By Werner (***) [de] Date 2012-02-12 12:49
Hi all, :D

our current rating lists are online and can be found under the links below. We have adjusted our lists: the new reference engine is now Deep Shredder 12 x64 1CPU with 2800 points. The difference in start Elo was -181 points here in our 40/20 list.
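To illustrate what such an adjustment does, here is a minimal Python sketch. It assumes the change is a uniform shift that pins the reference engine to 2800; the function name `reanchor` and every rating except the 2800 target and the -181 offset are hypothetical, chosen only so the numbers line up.

```python
def reanchor(ratings, reference, target):
    """Shift every rating by one common offset so `reference` lands on `target`."""
    offset = target - ratings[reference]
    shifted = {name: elo + offset for name, elo in ratings.items()}
    return shifted, offset


# Hypothetical pre-adjustment values (not taken from the CEGT lists).
old = {
    "Deep Shredder 12 x64 1CPU": 2981,
    "Engine A": 3100,
    "Engine B": 2573,
}

new, offset = reanchor(old, "Deep Shredder 12 x64 1CPU", 2800)

print(offset)  # the common shift applied to every engine
print(new)     # rating gaps between engines are unchanged
```

Under this assumption only the absolute level of the list moves; the differences between engines, which are what engine-versus-engine testing actually measures, stay the same.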

40 / 20:
New games: 1862 ; 52 different engines
Total:  573.179

NEW Engines

785 DanaSah 4.88: 2379 - 600 games (-1 to version 4.66 here - and +26 at the moment in our blitz-tests)

UPDATES
2 Houdini 2.0c x64 4CPU: 3097 - 2642 games (+2)
88 Deep Junior 13 x64 4CPU: 2867 - 252 games (+10 and +5 to version 12.5)
191 Deep Junior 13 x64 1CPU: 2767 - 1048 games (-14 and +5 to version 12.5)

40 / 4:
New games: 8600
All games now: 978.710
New startelo here is 2588 (-204). New reference engine with 2800 points is Deep Shredder 12 x64 1CPU!

New Engines
202 Deep Junior 13 x64 4CPU : 2744 - 1200 games (+9 to version 12.5)
263 Deep Junior 13 w32 1CPU : 2693 - 800 games (+-0 to version 12.5)
542 Cheng 3 v1.07 x64: 2496 - 1000 games (+23 to v. 1.06)
766 DanaSah 4.88: 2388 - 1000 games (+26 to v. 4.66)
807 GreKo 9.0 x64: 2362 - 1000 games (+5 to v. 8.2 here)
1036 EveAnn 1.67: 2154 - 800 games (+45 to v. 1.66)
1074 Waxmann 2011: 2086 - 800 games (-13 to v. 2010)

Updates
3 Critter 1.4 x64 4CPU : 3063 - 2100 games (+-0)
746 Arasan 13.4 w32 1CPU : 2397 - 1100 games (+15)
750 Tornado 4.25 w32 1CPU: 2395 - 1200 games (+1)
857 Murka 2.0 x64 : 2330 - 1400 games (-2)
1029 ECE 12.01: 2165 - 900 games (+11)

40/120
Our new single list is here:
http://www.husvankempen.de/nunn//40120new/40_120_ratinglist/40_120_AllVersion/rangliste.html
 
A big "Thank you" to all testers, as usual!

Links

40/20: http://www.husvankempen.de/nunn/rating.htm
Blitz: http://www.husvankempen.de/nunn/blitz.htm
40/120: http://www.husvankempen.de/nunn/rating120.htm
Tester: http://www.husvankempen.de/nunn/testers/testers.htm
Elo-comparison: http://www.husvankempen.de/nunn/Replay/ELOcomparison.htm
Games of the week: http://www.husvankempen.de/nunn/40_40%20Rating%20List/Coordination/gow.jpg

Werner Schuele
CEGT-Team
Parent - - By turbojuice1122 (Gold) [us] Date 2012-02-12 16:34

> New reference engine is now Deep Shredder 12 x64 1CPU with 2800 points. The difference in startelo was (-)181 points here in our 40/20 list.


Why such a drop?  There is no way that this is realistic--if you compare with human matches over the past decade, the previous basis with Shredder 9 1-CPU at around 2750 was fairly realistic, and even that was perhaps slightly low.
Parent - By Werner (***) [de] Date 2012-02-12 16:59
We think this is high enough, and we are now in good company with the other German lists.
Parent - - By Werewolf (*****) [gb] Date 2012-02-12 17:14
Your adjustment means Fritz 6 is rated 2392! In 2001 I organised a 10-game match against an IM rated around 2350. Fritz 6 beat him 9-1 at 40 moves in 2 hours. It is DEFINITELY above 2500 at the least.
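For a rough sense of the arithmetic behind this argument, here is a small Python sketch using the standard logistic Elo formula. This is an assumption for illustration only: it is not necessarily the model CEGT uses to compute its lists, and the function names are hypothetical. The inputs (2392, 2350, 9 points from 10 games) are the figures quoted above.

```python
import math


def expected_score(rating_a, rating_b):
    """Expected score per game for A against B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def performance_gap(score_fraction):
    """Rating gap implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)


engine, human, games = 2392, 2350, 10

per_game = expected_score(engine, human)
print(round(per_game * games, 1))  # points the model predicts over 10 games

gap = performance_gap(9 / games)
print(round(human + gap))          # performance rating implied by a 9-1 result
```

With these inputs the model predicts roughly 5.6 points out of 10 for an engine rated 2392, while a 9-1 score corresponds to a gap of about 380 points. Ten games is a small sample, so the implied performance rating carries a wide margin of error.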
Parent - - By leavenfish (***) [us] Date 2012-02-12 19:14
I really don't think this shows anything, does it? It's 10 games against a person who might miss tactics a bit more one day than the next, and this particular IM may not play well against computers, etc. 10 games... I just don't think this is relevant or reliable, really.
Parent - - By Werewolf (*****) [gb] Date 2012-02-12 19:57 Edited 2012-02-12 20:00
Then you miss that this result is typical against humans of his grade. The point is that Fritz 6 is worth a lot more than its new grade. And so are most programs, cf. DF10 for example, or the Junior programs. You can't argue that EVERY human who has played a machine, including the world's greatest like Kasparov and Kramnik, is bad at playing machines.
Parent - By turbojuice1122 (Gold) [us] Date 2012-02-13 01:45
I agree--the matches Kramnik vs. Deep Fritz 7, Kasparov vs. Deep Junior and vs. Deep Fritz, and Kramnik vs. Deep Fritz 10 all have results that closely support the previous ratings basis.  The previous rating of Shredder 8 4-CPU also agrees with this, based on the results of a dumbed-down version against a dumbed-down version of Hydra, whose rating we can estimate well from its matches and games with humans.

I think that the motivation might be to get the numbers comparable with IPON, which was never intended to be on the same scale as human ratings in the first place.
Parent - By Werewolf (*****) [gb] Date 2012-02-12 20:13
Sorry Werner,
What you're doing is great, I love your list and the testing is fantastic. It's good to see a tester underselling when the typical history over the years has been to exaggerate the elo of chess programs. I just think the pendulum has swung too far in the other direction now.