* World Chess Software Championship (not solving championship)
I also don't get why this should be a software championship - obviously cluster capability is not rewarded there :)
> mmmh ok. but here http://www.grappa.univ-lille3.fr/icga/tournament.php?id=210 it's not called like that :) I'm confused :)
But here it is: http://www.grappa.univ-lille3.fr/icga/event_info.php?id=41
> I also don't get why this should be a software championship - obviously cluster capability is not rewarded there :)
I would think it is to determine the best chess software. Too bad Rybka was not one of the participants.
It's easy to find out which engine is really the strongest; that's what rating lists are for. But that's a completely different thing.
It is perfectly legitimate for the winner of the equal hardware tournament to trumpet his victory in this format. This is the most relevant tournament for the 99.9999% of players who don't have a cluster or cluster software. I am guessing that Team Rybka did not compete in this because they no longer care about non-cluster Rybka.
I don't think the equal hardware tournament has any relevance, just like the main tournament has no relevance for a normal chess player. Ask the rating list guys: they will confirm that in 9 games anything can happen, and that drawing conclusions about an engine's strength from so few games is totally wrong.
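To put a rough number on "in 9 games anything can happen", here is a minimal sketch. It assumes the usual logistic Elo formula, independent games, and a simple normal approximation with draws ignored (which overstates the spread somewhat); the function names are made up for illustration and are not taken from any rating list's tooling.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction under the logistic Elo model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin(games, z=1.96):
    """Approximate 95% half-width, in Elo, around an even (50%) result.

    Assumption: each game is an independent win/loss trial, so the
    standard error of the score fraction is sqrt(0.25 / games).
    """
    se = math.sqrt(0.25 / games)
    upper = min(0.5 + z * se, 0.999)  # clamp so the logarithm stays finite
    return elo_diff(upper)

for games in (9, 100, 1000):
    print(f"{games} games: roughly +/- {elo_margin(games):.0f} Elo")
```

Under these assumptions the margin after 9 games is in the hundreds of Elo points, while after 1000 games it shrinks to a few tens, which is the gap between a short tournament and a rating list run.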
I consider the use of custom books to be an advantage of these types of tournaments when compared to the rating lists, as they allow the engine developer to generate an opening book that capitalizes on an engine's strengths and avoids its weaknesses, rather than being judged on what is in some generic opening book.
Ultimately, the best way to test would be to run something like the current hardware restricted tournament, but in a round robin format with a much larger number of games. This would be impractical for a human tournament, but could be easily done for an engine tournament by leasing server time and running many games in parallel.
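The round robin idea above could be sketched roughly like this. It is only an illustration: the engine names are placeholders, and how the games would actually be distributed over leased servers is left open.

```python
from itertools import combinations

def round_robin_games(engines, games_per_pairing):
    """List every game of a round robin as (white, black) tuples.

    Colors alternate within each pairing so that, with an even
    games_per_pairing, both engines get White equally often.
    """
    games = []
    for a, b in combinations(engines, 2):
        for i in range(games_per_pairing):
            games.append((a, b) if i % 2 == 0 else (b, a))
    return games

# Placeholder engine names, not the actual tournament entries.
schedule = round_robin_games(["EngineA", "EngineB", "EngineC"], 100)
print(len(schedule), "games to play")
```

Since no game depends on the result of another, the list can be cut into chunks and handed to as many identical machines as are available.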
Conversely, the ultimate hardware test, which I personally find very interesting, is the least educational when it comes to relative engine strength, as nothing that is used has any solid relationship with what is commercially available. I liken this to being a fan of the Can Am series races, where one could be entertained by Jim Hall's outrageously innovative cars which had absolutely no bearing on the reality of ordinary people. Sorry to use an example from decades before you were born...
As far as it not being overly accurate (as pointed out by other posts), I think bragging rights in a tournament + engine rating lists should both be used by people wanting to purchase chess software. Even if a particular software doesn't have the highest engine rating, it would still be considered a worthy engine if it can win tournaments.
I think the World Chess Software Championship has more relevance for many people (over the open hardware "World Computer Chess Championship"), if only because the software tournament shows which engines can do very well on a home computer.
I'm not particularly impressed by Rybka's performance in the World Computer Chess Championship, because very few (are there any?) people want to put together a 200 core cluster computer to play chess.
(Ya, I know, Rybka Cluster isn't available for purchase anyway. It would be almost impossible to sell it, since it is so highly tuned for a specific hardware system, among other things.)
Don't get me wrong, I think having cluster chess computers is a great thing, but more in an analytical setting, or in an engine room where the cluster plays many games against a large variety of other engines, rather than in a head to head shootout competition with X number of rounds to decide the winner.
1) Which engine is strongest on equal hardware, scientifically speaking.
2) Which automated chess entity is the current king of the hill in a sporting event (like, formula 1 for racing cars).
My point of view is that a world championship or a chess match is a sporting event. You are not seeking any scientific answers, but you are rather organizing an event to allow someone to call himself the current "king of the hill". You attach specific significance to a designated event. Just like Spain is king of the hill in soccer because they won the WC (and not some series of obscure training matches). As such, you are interested in PEAK performances, the best of the best on a given day (or week, or month).
While on the other hand, you have the scientific curiosity where you are interested in finding out which entity is the best one in principle, that is, you play enough games to rule out randomness, and you enforce equal hardware and openings (or in the case of soccer, you enforce neutral fields and you play over so long a period so that injured players are cancelled out of the equation).
In my view, you have two excellent options for answering these two questions. One is a WC where everything is allowed, so that you get the PEAK performances. The other one is the tests performed by the rating groups where they play countless games on equal hardware and openings.
From this point of view, an equal hardware WC adds nothing significant to the whole equation. It is practically redundant, since it only goes towards answering a question that the rating groups are answering anyway within the first week of their extensive testing procedure lasting many months.
On a more aggressive note, you could say that an equal hardware WC actually undermines the "real" WC (where PEAK performance is on the table) and as such is counterproductive. It takes away value from the "real" WC.
So that could be an argument against participating in an "equal hardware" WC.
Please note that I have absolutely no insight into the background of Rybka not participating in the equal hardware WC.
I just find that, in general, the arguments put forward for "being interested" in an equal hardware WC are not particularly good, since the answers sought from this thinking are answered 100 times better by the rating groups anyway.
> I just find that, in general, the arguments put forward for "being interested" in an equal hardware WC are not particularly good, since the answers sought from this thinking are answered 100 times better by the rating groups anyway.
But for me they are not. I find the sporting element to be more exciting than rating tests. And I find the sporting element of an equal hardware event to be even more exciting and interesting to follow, because in the back of my mind I am not wondering whether hardware had anything to do with the result, so I can follow it as an even sporting event.
Second, the engines in the tournament are not the same engines that are tested and listed by the various rating lists.
Those engines that played aren't listed in rating lists, sure. If you wanted to know their "real" strength, you would have to run a 1000 game test or something like that. Otherwise the rating lists could stop their engine tests after 10 games :)
What paper would that be?
EDIT: The original list was missing a few countries.
November 2009 - February 2010
April 2010 - June 2010
In any event, the ratings you've provided seem to be very well correlated with WC results, and after seeing these lists, I am much less impressed by Paul the Octopus.
Norway at 14th place and in 1995 we even were 2nd. As much as I'd love it to be true... we're nowhere near the top.
- Small population
- Weather isn't conducive to practicing 365 days a year
- People are too polite
There's also the possibility the country is too prosperous. I suspect that in Brazil, you get a fair number of great players who grew up very poor with football being their only chance for a good life (same as here in the US with basketball). Maybe in Spain and Portugal and even Italy as well.
> Maybe in Spain and Portugal and even Italy as well
As for Spain, that was 60 years ago
b) afraid to lose
c) irrelevant tournament
d) let's give the runners up a shot at a prize as well
Rybka can only lose status in these events. Even when it wins, there is always the question about the hardware factor.
The positive news from this event was Rondo. I'm looking forward to playing with this engine in the future.
I guess this was a significantly improved version over ds12, and I wonder when it will be released.
> maybe vas didn't want to let the private Rybka version run on those computers :)
Was he afraid all the new and improved bugs in the private Rybka would crash those computers??
> My money was on c but this apparently is the official reason, Vas afraid of someone hacking the extraneous computer, or otherwise getting the Rybka executable, which would doom his remote model, and even Rybka 5, etc. Too many risks, reward too small.
I agree, wasn't worth the risk for him
Right there, nothing to risk, Rybka would have won easily, so my money again is on c.
So we have "the tourney was irrelevant" as one option, "very little confidence in the performance of the commercial version on small hardware" as another, perhaps even "we never thought about the possibility of using commercial Rybka instead of the latest development version".
Paul commented that "Rybka lost by not joining". Are we again at "the risk was too high if Rybka joined and lost"? I ask because I feel the Cluster's reputation is being damaged more by the games it has been drawing against supposedly weaker opponents, which makes me doubt that commercial Rybka 4 on a Quad would have done worse than the Cluster.