Rybka Chess Community Forum
Up Topic Rybka Support & Discussion / Rybka Discussion / Rybka and Houdini at 40/120 Timecontrol?
Parent - - By turbojuice1122 (Gold) [us] Date 2012-02-12 16:32
Something strange has happened on the CEGT list lately--they have dropped all of the ratings by about 200 points, which is certainly not realistic.
Parent - - By lkaufman (*****) Date 2012-02-12 20:16
They just decided to peg the list to Deep Shredder 12 single core being 2800. This is to make the top engine ratings realistic in human terms, and also to coincide with the standard used by IPON. Of course 2800 is unrealistically low in human terms for DS 12, but the engine vs engine ratings exaggerate the rating differences by about 4 to 3 so top ratings like 3300 were unrealistically high in human terms. I approve of their decision. Now the ratings of the top engines will be about right in human terms.
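A minimal sketch of what pegging a list to one engine amounts to: every rating is shifted by the same offset so that the anchor lands on the chosen value, and all rating differences stay unchanged. The engine names and pre-shift numbers below are hypothetical and only illustrate the arithmetic; they are not CEGT data.

```python
def peg_to_anchor(ratings, anchor_name, anchor_value):
    """Shift every rating by one constant so the anchor engine gets anchor_value."""
    offset = anchor_value - ratings[anchor_name]
    return {name: rating + offset for name, rating in ratings.items()}


# Hypothetical pre-shift list (illustrative numbers, not real CEGT ratings)
old_list = {
    "Deep Shredder 12 1CPU": 3000,
    "Top engine": 3300,
    "Older engine": 2750,
}

pegged = peg_to_anchor(old_list, "Deep Shredder 12 1CPU", 2800)
for name, rating in pegged.items():
    print(f"{name}: {rating}")
```

Because the shift is a pure offset, it moves the whole list down without changing the spread between engines, which is the point the rest of this post turns on.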
Parent - - By turbojuice1122 (Gold) [us] Date 2012-02-13 01:49
I think that even the very top ones are still way too low.  Recall that Hydra was in the 3050 to 3100 range based on games against humans.  Does anyone really think that Houdini on a single-cpu wouldn't demolish Hydra?
Parent - By lkaufman (*****) Date 2012-02-13 03:06
I would think it would be a fair match, but we have little data on the strength of Hydra against engines.
Parent - By Uri Blass (*****) [il] Date 2012-02-13 11:23
I do not think that we have enough games for a correct estimate of Hydra against humans, and I also think that results against humans are not a good predictor of results against chess programs.
Parent - - By Sedat Canbaz (****) [tr] Date 2012-02-14 12:21 Edited 2012-02-14 14:48

> They just decided to peg the list to Deep Shredder 12 single core being 2800. This is to make the top engine ratings realistic in human terms


Is there any proof that Shredder 12 x64 on 1 core should be 2800 Elo (equal to human Elo) on an AMD 4600 at 2.40 GHz?

In my estimation, Chess Tiger 2007.1's Elo performance should be around 2750 human Elo points (on a QX9650 @ 3.80 GHz).

The SCCT Auto232 Elo calculations will probably be fixed to Chess Tiger 2007.1 = 2750 Elo:
http://www.sedatcanbaz.com/chess/ratings/scct-auto232/

BTW, here is some useful Elo data from games between engines and humans:
http://www.rebel.nl/resu.htm

Note: Not many games were played, but I think Rebel's site is quite a good indicator of how engine ratings relate to human Elo points.

Best,
Sedat
Parent - - By lkaufman (*****) Date 2012-02-16 06:01
I am quite sure that Shredder 12 1 core should be more than 2800 in human terms, probably at least 2900. But I think that using 2800 makes the rating of the very top engines about right in human terms, because engine rating differences are exaggerated when they play each other. Since more people care about the rating of the top engine than the rating of Shredder 12, I think they made the right decision.
Parent - - By Sedat Canbaz (****) [tr] Date 2012-02-16 08:53 Edited 2012-02-16 10:51
Personally, I do not agree that all available engine ratings should be fixed to Shredder 12 1 core = 2800 Elo.

In my opinion, an engine's Elo performance should be based on human Elo points and on hardware processor speed.

As we know, any engine's Elo rating depends on hardware speed, opening book, ponder off/on, time control...

One more thing: in reality (on the latest decent processors), Chess Tiger 2007.1 is not 2550 Elo.
I expect Chess Tiger 2007.1's real Elo rating to be at least 2700 (in human terms).
Note: I mean that the calculations that rate Chess Tiger 2007.1 at 2550 Elo, which are based on Shredder 12 1c = 2800 Elo, are wrong.

However, CEGT, SWCR, and Clemens (which are all based on Shredder 12 1c = 2800 Elo) are doing a great job. BIG thanks for their work and effort!

Greetings,
Sedat
Parent - - By lkaufman (*****) Date 2012-02-16 17:36
I'm sure you are right about Chess Tiger 2007 being 2700 in human terms. With the new rating scale, the ratings at the very top should be about right in human terms, but the further down you go, the more underrated the programs will be. The only fix for this is to multiply all the ratings by 3/4 and then add a constant, but I doubt anyone will do this, as it is based on rather limited evidence from a study of SSDF ratings.
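The fix described above can be sketched as a linear map: compress every rating toward a chosen fixed point by a factor of 3/4. This is the same as multiplying by 3/4 and adding the constant `fixed_point / 4`. The fixed point of 3200 and the engine labels are assumptions for illustration only; the post does not say which constant would be used.

```python
def rescale(ratings, fixed_point, factor=0.75):
    """Compress rating differences by `factor` around `fixed_point`.

    Equivalent to new = factor * old + (1 - factor) * fixed_point.
    """
    return {
        name: fixed_point + factor * (rating - fixed_point)
        for name, rating in ratings.items()
    }


# Hypothetical engine-vs-engine ratings; 3200 as the fixed point is an assumption
engine_list = {"Top engine": 3200, "Chess Tiger 2007.1": 2550}

human_scale = rescale(engine_list, fixed_point=3200)
for name, rating in human_scale.items():
    print(f"{name}: {rating:.1f}")
```

Under these assumed numbers the top rating stays put while an engine listed at 2550 moves to about 2712, which is in line with the 2700 estimate for Chess Tiger discussed here.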
Parent - - By Sedat Canbaz (****) [tr] Date 2012-02-16 19:04
For more accurate Elo calculations:
- All engine ratings should be based on human Elo points

Otherwise there will be misunderstandings...

I think it is meaningless for engines that play at the level of a 2700-2800 Elo GM to be published approx. 150-200 Elo points lower (weaker) than in reality.

Btw, another useful comparison table, Man vs Machine:


For more details:
http://www.chessbase.com/newsdetail.asp?newsid=1229
Parent - - By lkaufman (*****) Date 2012-02-17 02:36
I don't know what you are advocating when you say all engine ratings should be based on Human Elo points. If you mean that all engines should play top GMs, that is just a dream. Can you explain just what you are proposing here? If you mean that on average ratings of machines should match performance against humans for those engines which have played serious human matches, this ignores the spread problem; that policy overrates the top ones and underrates the bottom ones unless a scaling factor is applied to all. Incidentally, the table you give is hopelessly out of date, missing the final Kramnik-Fritz match, the Hydra-Adams match, and the Rybka-Milov handicap match, which included two normal games (albeit conceding White in both games).
Parent - By Sedat Canbaz (****) [tr] Date 2012-02-17 07:31 Edited 2012-02-17 10:55
I mean that (depending on hardware speed, opening book, ponder off/on...) the engine rating calculations should be aligned more closely with human Elo ratings.

Even the current Playchess engine Elo calculation is too low; as we can see, many top engines (Houdini, Rybka...) are rated around 2400-2800 Elo.

But in reality (I mean for the Playchess engine ratings) the top engines' real Elo performance should be rated at least 400-600 Elo higher.

That's why I use a more accurate starting Elo for the calculation: 3200
http://www.sedatcanbaz.com/chess/july-december-2011/

For example, the engine ratings below are closer to human Elo points:
Rank Name                                    Elo    +    – games score oppo. draws
315 Amd64bit, Shredder 7                   2662   48   54   640    4%  3230    3%
316 David-Steiert, Junior 9                2632   75   86   232    3%  3215    2%


Best,
Sedat
Parent - - By Sedat Canbaz (****) [tr] Date 2012-02-17 10:55 Edited 2012-02-17 11:18
One more thing: in my estimation, the top MP engines' real Elo performance should be rated (published) at around 3350-3400 Elo points.
I mean that the current top MP engines (e.g. Rybka, Houdini, Critter... on the latest decent 6-core i7 machines) play approx. 500-600 Elo stronger than a top GM of 2700-2800 human Elo points.

Btw, it seems the Playchess Elo calculation program is outdated (it needs an update).
The Playchess Elo calculation is probably based on the situation around the year 2000.
In those days the calculations were fine; as we remember, the top chess programs were at a level of around 2600 Elo.

But nowadays the speed of the processors has changed, and the strength of the engines has changed too.
And I hope/think the Playchess engine Elo calculation method will be changed (updated) too.
Parent - - By lkaufman (*****) Date 2012-02-17 16:48
Obviously the Playchess ratings are not meant to be human-equivalent. But what is your evidence that any engine would rate 3350 -3400 against humans? That means that Carlsen, Aronian, Kramnik, and Anand would score about 4% against them at 40/2 hours. I don't believe that. In the games where they had White, if they played with a draw as the goal, I believe they would achieve that goal reasonably often, maybe 40% of the time (Joel Benjamin did it 25% of the time against Rybka, but he is a long way from 2800 level). That would give the computer a rating around 3200, not 3350+.
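The figures in this post can be checked with the standard logistic Elo expectation formula. The sketch below assumes a 2800-rated human. For the second estimate it also assumes, purely for illustration, that the human draws 40% of the White games and loses every other game; the post gives only the White-side figure, so the Black-side result is a guess.

```python
import math


def expected_score(own_rating, opponent_rating):
    """Expected score (0..1) under the standard logistic Elo formula."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - own_rating) / 400.0))


def rating_gap(score):
    """Elo gap implied by a score fraction strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


# A 2800 human against an engine rated 3350
human_vs_3350 = expected_score(2800, 3350)
print(f"Expected human score vs 3350: {human_vs_3350:.1%}")

# Assumption: half the games as White with 40% draws, all other games lost
assumed_score = 0.5 * (0.40 * 0.5) + 0.5 * 0.0
implied_engine = 2800 - rating_gap(assumed_score)
print(f"Engine rating implied by a {assumed_score:.0%} human score: {implied_engine:.0f}")
```

With these assumptions a 3350 engine corresponds to a human score of roughly 4%, while a 10% human score puts the engine a little under 3200.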
Parent - - By Sedat Canbaz (****) [tr] Date 2012-02-17 19:03 Edited 2012-02-17 19:20

>But what is your evidence that any engine would rate 3350 -3400 Elo against humans?


Sure, I do not mean every engine, but I expect that (especially at blitz) some of the top MP engines (e.g. Houdini, Rybka, Critter) would rate 3350-3400 against humans.

I'd just like to mention again that there are two main factors required for high engine Elo performance:
- a very strong opening book and very fast hardware

>Carlsen, Aronian, Kramnik, and Anand would score about 4% against them at 40/2 hours


We know that Kramnik already lost to Deep Fritz in 2006; the result was 4 draws and 2 wins (in favor of Fritz).
I really wonder what the Elo performance of Houdini, Rybka, or Critter would be against Carlsen!
Personally, I expect to see at least a 500 Elo difference (in favor of the machines).
I also wonder why there has been no serious Man vs Machine match in the last 5-6 years.
Could the reason be that the point difference between human and computer would be very large?

>Joel Benjamin did it 25% of the time against Rybka


Yes... Joel Benjamin is a strong anti-program master.
And as far as I remember, Joel played against an older Rybka version (I think it was Rybka 2.3.2).
Note also that the current Rybka 4.1 MP is approx. 150-200 Elo points stronger than Rybka 2.3.2 MP.
A Rybka Cluster plus a superior opening book would probably perform at least 800-1000 Elo better.

Btw, I know very well that many games are needed for a more accurate rating and a better conclusion,
but if you look carefully at the crosstable below and my notes (based on 36 games), you will see exactly what I mean.

Best,
Sedat
Parent - - By lkaufman (*****) Date 2012-02-18 17:27
We just have no data that would shed any light on whether today's top engines would perform 400 elo above the top humans (as I believe) or 500 + (as you believe). I'm talking about 40/2 hour games; of course in blitz engines would rate even higher than 3400 against humans. Your data indicates that against humans averaging around 2700 (I think) the top engines performed around 2900 several years ago. Many of these games were by Hydra whose strength is unknown relative to today's software. My claim is that an advance of say 500 elo based on engine vs engine games does not translate to a gain of 500 against humans; maybe about 3/4 of that amount.
Parent - - By Harvey Williamson (*****) Date 2012-02-18 17:45 Edited 2012-02-18 17:48
I have played 1000's of games with Hiarcs in the main playing hall on Playchess. 95% blitz games. Included in these are 100's v human GM's. At blitz I do not think the rating of Hiarcs would be any different if I used my 12 core or my iPhone. There was a draw or 2 but mainly by people playing anti-chess and closing the position. One loss to GM Jobava. He played about 100 games v Hiarcs and I set the opening book to wild - he won 1 game after a bad opening but still amazing to do that at blitz!

At longer time controls the iPhone/PPC version has gained 2 GM norms in Human tournaments and a 2931 rating, 2 years ago. The iPhone version of Hiarcs is still the best on this device.

I really doubt we will ever get a reliable rating of the top engines v Humans. The top engines will probably all rate the same or the opening book will decide the result.

The most interesting test would be a top GM using Computer help v a Cluster engine.
Parent - - By Uly (Gold) [mx] Date 2012-02-18 18:03

> The most interesting test would be a top GM using Computer help v a Cluster engine.


I disagree, GMs are worse at using computer assistance than other expert Freestylers that aren't GMs. As the computer strength increases, the importance of your OTB strength diminishes and other elements like understanding what positions some engines excel at or are weak in (to avoid them) increases.
Parent - - By Harvey Williamson (*****) Date 2012-02-18 18:06
This is something you have 0 knowledge of and I have a lot of. I said a top GM 2700+ they are now using computers in ways you could not imagine. I know this as I am helping 1 of them and have discussed it with other members of the 2700+ club including 2 of the 2800+ club.
Parent - - By Uly (Gold) [mx] Date 2012-02-18 18:12
If one of those 2700+ GMs is going to have a match against someone like Highendman or Alkelele, I think they'd be in for some big trouble. Their 2700+ Elo only means they're good against other humans, but says nothing about their Advanced Chess capabilities, and that's a reason you never saw GMs getting the top standings in Freestyle tournaments.

Have you seen the Freestyle tournament being run at Infinity Chess?

http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=24326

Any top GM that you can name that is doing well at it?
Parent - - By Harvey Williamson (*****) Date 2012-02-18 18:15 Edited 2012-02-18 18:29
Any 2700+ GMs playing there?

To get a GM of that level to get out of bed will need several $1000. To play a match will probably mean 10's of 1000's. Find a sponsor and I will get you a GM. To get a 2800 GM you are talking 100's of 1000's.

My idea was 2700+ GM + Top commercially available hardware v a Cluster.
Parent - - By Uly (Gold) [mx] Date 2012-02-18 18:51
And we come back full circle, several 1000$ are overpriced as the GMs are expected to perform worse than people that would play for free (such as in the IC tournament.) Why would a sponsor do that?
Parent - - By Harvey Williamson (*****) Date 2012-02-18 19:01
you really need to grow up and take some lessons about the real world.
Parent - - By Banned for Life (Gold) Date 2012-02-18 19:16
Really??? Because he questions whether top GMs will make better centaurs than people with skills in other areas required by advanced chess? Based on the advanced chess played by top GMs in the past, this doesn't seem like such a radical position...
Parent - - By Harvey Williamson (*****) Date 2012-02-18 19:21
How many 2700+ GM's have played? How many that have played have had access to good hardware? I remember chatting to one who was close to 2700 who was using F8 at home on a laptop and just missed getting to the finals. Since those early tournaments they have learned a lot. The very top GM's would be amazing Centaurs. I discussed at the London Classic with all the GM's there how they use computers. They have moved on a long way. Sadly you won't see them in a tournament with only a few $1000 as prize money.
Parent - - By Banned for Life (Gold) Date 2012-02-18 19:44
In the beginning, we had:

http://www.chessbase.com/newsdetail.asp?newsid=989
http://www.chessbase.com/events/events.asp?pid=140

Later we had quotes from Kasparov indicating that chess skills might not be of paramount importance in Advanced Chess. We also had a few top GMs at Freestyle events. Nakamura was pretty terrible as I recall, and he's known for his computer savviness.

As far as money is concerned, of course you wouldn't expect top GMs to compete for a few thousand dollars, but this doesn't mean that top GMs would dominate if the prize funds were a few orders of magnitude higher.
Parent - - By Harvey Williamson (*****) Date 2012-02-18 19:47
Yes, I think the very young Naka learned his lesson.
Parent - - By Banned for Life (Gold) Date 2012-02-18 19:54
OK, but the lesson he learned was never to show up for another freestyle! :smile:
Parent - By Harvey Williamson (*****) Date 2012-02-18 19:59
Sadly so, but who can blame him when he realised how much money he can make elsewhere.
Parent - - By Kappatoo (****) [de] Date 2012-02-18 19:26
Which top GM's played in these tournaments?
Two obvious reasons why GM's didn't do very well in past freestyle tournaments: 1) bad hardware; 2) bad opening book
If hardware and opening book are equivalent, I just don't believe that anyone here on the forum will be better than a good GM in freestyle chess.
Parent - - By tano-urayoan (****) [pr] Date 2012-02-18 19:28

> Which top GM's played in these tournaments?


If my memory serves me, Jobava was the highest ranked player to participate in Freestyle tournaments.
Parent - By Kappatoo (****) [de] Date 2012-02-18 19:29
Yes, I recall that, too. That might be the one Harvey mentioned in the post above, though I might be wrong.
Parent - - By Kappatoo (****) [de] Date 2012-02-18 19:28
I recall that some time ago, you announced an upcoming match involving a top GM (Jobava?). Did anything come out of this?
Parent - - By Harvey Williamson (*****) Date 2012-02-18 19:31
Sadly not as it was impossible to fit a date that worked, for the few $1000 available, into the GM's schedule.
Parent - By Kappatoo (****) [de] Date 2012-02-18 19:34
Okay, a pity. Was it Jobava, by the way?
Parent - - By Homayoun_Sohrabi_M.D. (***) [us] Date 2012-02-19 02:47

> I said a top GM 2700+ they are now using computers in ways you could not imagine.


Really?   Example please.
Parent - - By Harvey Williamson (*****) Date 2012-02-19 02:49
I am sure one of them will write about it 1 day.
Parent - - By Homayoun_Sohrabi_M.D. (***) [us] Date 2012-02-19 03:39
Harvey,

have you noticed you make a lot of claims w/o backing them up with one shred of evidence or info?    You just got through insulting Uly saying how little he knows and how much you know.   When we ask you about your knowledge though, you can never discuss it because it's always top secret.   

My mistake for asking you a question again.   Won't happen again I hope.
Parent - - By Harvey Williamson (*****) Date 2012-02-19 03:41
Not Top Secret but not mine to share.
Parent - By TheHug (Bronze) [us] Date 2012-02-19 04:14

> I said a top GM 2700+ they are now using computers in ways you could not imagine.


If I may ask, are you talking about ways they can travel with their analysis (I have heard a few interesting ideas in this respect)? Or just new, different ways to analyze their chess games? And if it's the latter, is there any benefit to the common corr chess player?
Parent - - By Highendman (****) Date 2012-02-18 19:46
Thanks for the compliment Uly. Incidentally a couple of days ago I reviewed some of my wins vs. the Rybka Cluster and had fun recalling those games.

As to how a 2700+ using strong h/w would fare against me using equal h/w: there's no info to have a factual discussion. Only assumptions.

These guys are super competitive, super smart. If the purse was enough to get such a GM playing (and putting his ego on the line) - they'd probably seriously prepare.

I'd wager a 10 game match (me + strong h/w vs. 2700+ using same h/w) would end in all draws or at most +1, can't say to which side...

Of course, I can't see, even with a $20K purse, a 2700+ GM having any interest in crossing swords with an unrated patzer... they have nothing to gain.
Parent - - By Watchman (***) Date 2012-02-19 06:32

>Thanks for the compliment Uly.


Might as well have been complimented by Beavis.

>I'd wager a 10 game match (me + strong h/w vs. 2700+ using same h/w) would end in all draws or at most +1, can't say to which side...


Of all the stupid and equally arrogant statements I have ever heard... this has to rank right up there in the top 10.

Harvey, why are you trying to reason with tweedledee and tweedledum?  Just to get a good laugh? :lol:
Parent - By Highendman (****) Date 2012-02-19 06:36
Thanks for the compliment, Watchman :) long time.
Parent - - By turbojuice1122 (Gold) [us] Date 2012-02-19 13:26
Actually, his statement is probably quite accurate for many people on the forum with strong hardware, which almost eliminates Elo gaps.  The hardware is strong enough that even with some decent human prodding, it would be nearly impossible to come up with many clearly stronger moves.  Don't forget that Shahar came very close to drawing an OTB blitz game with Kasparov, and only lost due to time trouble in the endgame, so he can make this statement with much higher confidence than most other people on the forum.  But even with strong hardware by itself playing 3300+ level chess, there isn't all that much that a strong GM is going to be able to add that someone like Shahar won't also be able to add.  The main areas of strength difference are already eliminated by the machine.

> Of all the stupid and equally arrogant statements I have ever heard... this has to rank right up there in the top 10.


This statement doesn't bode well for an analysis of your intelligence.
Parent - - By Banned for Life (Gold) Date 2012-02-19 15:25
It can be argued that advanced chess (as defined by GK) is more like correspondence chess (with engine assistance of course) than OTB. There is certainly some correlation between strength in these two domains, but I agree that Harvey is probably overstating this, and it isn't clear how much of an advantage a 2800+ GM would have over a lesser GM in this venue. Of course with Anand using his system, Harvey has a self interest in playing up the importance of the GM's strength (and you can always bet on Harvey aligning with his self interest :-).

As far as HEM is concerned, it is clear that he is a master of the advanced chess domain. So in a no holds barred match between advanced chess teams, I think you'd be perfectly justified in betting on him, remembering that OTB opinion is only one ingredient in the mix, and not one that can't be purchased...
Parent - By Uly (Gold) [mx] Date 2012-02-19 15:36
As for the corr chess and advanced chess connection, here's Ozymandias's report of the first round:

"some groups showed a very high level, arguably, the highest ever seen in any chess tournament. Which prompted the early elimination of known players such as: Mark Eldridge, Ralf Greweling, Alvin Alcalá (current FICGS freestyle champion), Erdogan Günes (long time computer chess expert and author of the Rondo book for the 2010 WCCC in Kanazawa), Clay Hofmeister (winner of the Mundial Chess tournament in 2010), Patrik Schoupal (EtaoinShrdlu), Mark Noble (New Zealand correspondence chess champion on several occasions), Roger Zibell (Houdini book author) and Herbert Kruse (Kreuzfahrtshiff). And this isn't the only proof, the fact that the first and second placed players in the last Freestyle have, so far, drawn all of their games against centaurs, is something to note."

Personally, I think the GMs would want to avoid such tournaments for fear of being humiliated.
Parent - - By Kappatoo (****) [de] Date 2012-02-19 15:40
I think a 2800+ GM would not have a big advantage against an average GM, simply because at this time control, it is very difficult to contribute much of substance to the engine's output. I know this is a very unpopular opinion in this forum, but I am really unconvinced that the 'masters of the advanced chess domain' are not simply those with the best book and the best hardware.
Parent - - By Banned for Life (Gold) Date 2012-02-19 16:17
> but I am really unconvinced that the 'masters of the advanced chess domain' are not simply those with the best book and the best hardware.

If this were the case, one would expect an unassisted engine to perform at least as well as a centaur, unless you are arguing that people do a better job at managing the clock. Putting that aside though,  there are always going to be positions where engines rate two or more moves that result in different games equally. I still believe that in these not infrequent occurrences, the engine method of resolution (random selection) will be inferior to the preference of a strong chess player.
Parent - - By Kappatoo (****) [de] Date 2012-02-19 17:10

> If this were the case, one would expect an unassisted engine to perform at least as well as a centaur


Not necessarily; I wouldn't take it that far. But I would expect that centaurs are not significantly stronger than unassisted engines.

> Putting that aside though,  there are always going to be positions where engines rate two or more moves that result in different games equally. I still believe that in these not infrequent occurrences, the engine method of resolution (random selection) will be inferior to the preference of a strong chess player.


Generally yes, although I am not sure how significant the effects of this superiority really are. In freestyle chess, the problem is that, unless you have two equally good computers, it will take time to even determine that the engine evaluates two moves (close to) equally. The centaur thus risks ending up playing worse than the unassisted engine which simply has more time to calculate.
Parent - - By Banned for Life (Gold) Date 2012-02-19 17:32
> But I would expect that centaurs are not significantly stronger than unassisted engines.

The consensus a few years ago was 150 Elo. This was expected to decline over time. I suspect that it's still 100 Elo or so, which would qualify as significant in my book! :smile:

> In freestyle chess, the problem is that, unless you have two equally good computers, it will take time to even determine that the engine evaluates two moves (close to) equally. The centaur thus risks ending up playing worse than the unassisted engine which simply has more time to calculate.

Many of the top teams have pretty impressive hardware resources, so concluding that several moves are seen as equal (or roughly equal) by the engines is certainly an already familiar situation. Each team will have a method of dealing with this. I would expect that if the team had a high quality GM team member, he/she would make the call.

Please note that in most cases (at least non-cluster cases), the centaur teams will be using a lot more hardware than the pure engine. For this reason, I don't think your last comment is generally valid.