Problem here: the transmitting program crashed.
If Critter keeps its score it will be the #2 engine on IPON.
But you should not look at the 100 game matches. That is statistical nonsense!
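To put a rough number on why a single 100-game match says so little, here is a minimal sketch. It assumes the usual logistic Elo formula and a normal approximation on the per-game score. The match result in it (+30 =45 -25) is hypothetical and not taken from the list, and the function names are my own.

```python
import math

def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic Elo model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, low, high) for one head-to-head match.

    Normal approximation on the mean score; only valid while
    score +/- z * se stays strictly between 0 and 1.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the outcome (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    low = elo_from_score(score - z * se)
    high = elo_from_score(score + z * se)
    return elo_from_score(score), low, high

# Hypothetical 100-game result (assumption, not from the rating list).
elo, low, high = match_interval(30, 45, 25)
print(f"Elo {elo:+.0f}, 95% interval {low:+.0f} .. {high:+.0f}")
```

With a result like this the interval spans both sides of zero, so the match alone cannot even tell which engine is stronger.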
Do you recalculate the ratings of existing engines when you add a new engine to the list? For example, in the current situation, Critter 1.4 is playing very well against Komodo, but less well against Naum 4.2. This implies that if Critter 1.4 had been around earlier, Naum 4.2 would be rated slightly higher and Komodo slightly lower...
>Do you recalculate the ratings of existing engines when you add a new engine to the list?
After a match, yes. The current real list is always correct.
>For example, in the current situation, Critter 1.4 is playing very well against Komodo, but less well against Naum 4.2. This implies that if Critter 1.4 had been around earlier, Naum 4.2 would be rated slightly higher and Komodo slightly lower...
But it wasn't there! The running tourney is always an "estimation" (which usually turns out to be very good)! The correct rating can only be found after a run with all engines in the rating list.
I guess you need 500,000 games at 10 hour time controls on top of the line rack mount server for anybody to give a hoot :-/
For me though I say awesome! Good job, Houdini isn't easy to beat!!
Well, there are still some more games left between H and Critter, so the tables could turn.
Rating list still not updated with K4.. and maybe Critter by now.
Followed by K4 and Critter 1.4 (Critter 1.4 is about 2 Elo behind K4 on this list).
Have a look here: http://www.inwoba.de
Remark: As with the last two runs, there were a lot of discussions about the individual results (which engine beats which). Those are just 100 games with only 50 openings. [u]To draw conclusions from so few games, which additionally hardly represent "Chess", is statistical nonsense![/u]
However, if you are interested in the individual results go to the archive section. You can download the individual results there. For those who are interested, the Elostat rating is in that file as well - and it looks a bit different for Komodo and Critter!
The results of a 100 game match are statistically absolutely useless, as I wrote above, and should be ignored (but I expected that they would not be). The only thing that matters is the Elo at the end. The error margins of the first and the second do not overlap; on the contrary, there is an obvious gap. You can argue about the second and the third, but not about the first!
PS: You have all the needed data in my download. You can make a list by removing any engine you like to get the result you want ...
Of course anyone can play a round-robin between what they think are just the top 5 engines, I'll probably do it myself when I get some spare CPU resource.
>It was a very valid question, and actually I'd like to see the answer from what I posted above. I'm hoping that Ingo provides the download I mentioned, but at the moment I can't find it on his site.
It is there as described in this thread above under Archive and then scroll down to 'individual statistics'. It is easy to find ...
The only problem with your proposal is, that your error bar is going up because you have very few games ... (And of course you can do that with any other rating list as well)
> It is there as described in this thread above under Archive and then scroll down to 'individual statistics'. It is easy to find ...
> The only problem with your proposal is, that your error bar is going up because you have very few games ... (And of course you can do that with any other rating list as well)
Thanks, I'll look more carefully !
A valid point of course about the error bars, you can't have your cake and eat it too.
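As a rough illustration of that trade-off, the sketch below estimates how the 95% Elo margin shrinks with the number of games. It assumes an evenly matched pairing (50% score), the logistic Elo curve linearised around 50%, and a draw rate of 45%, which is my assumption and only roughly in line with the lists here. It is not how Bayeselo or Elostat compute their margins.

```python
import math

# Slope of the logistic Elo curve at a 50% score, in Elo per unit of score.
ELO_PER_SCORE = 1600.0 / math.log(10)

def margin_95(games: int, draw_rate: float) -> float:
    """Approximate 95% Elo margin for an evenly matched pairing."""
    # At a 50% score wins and losses are equally likely, so the
    # per-game variance of the outcome is (1 - draw_rate) * 0.25.
    se = math.sqrt((1.0 - draw_rate) * 0.25 / games)
    return 1.96 * se * ELO_PER_SCORE

for games in (100, 400, 1600):
    print(games, round(margin_95(games, draw_rate=0.45), 1))
```

The margin falls with the square root of the game count, so four times as many games only halves the error bar.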
Rank Name Elo + - games score oppo. draws
1 Critter 1.4 SSE42 3031 30 29 208 56% 2997 54%
2 Houdini 2.0 STD 3030 24 24 351 56% 2993 42%
3 Komodo 4 SSE42 3000 23 23 352 50% 3002 45%
4 Deep Rybka 4.1 SSE42 2991 27 27 252 46% 3018 47%
5 Stockfish 2.1.1 JA 2969 27 28 253 42% 3018 48%
Rank Name Elo + - games score oppo. draws
1 Critter 1.4 SSE42 3017 25 25 200 53% 3002 49%
2 Houdini 2.0 STD 3004 26 26 200 49% 3009 44%
3 Komodo 4 SSE42 3000 26 26 200 48% 3011 39%
I'm not sure how to do this myself easily.
Something was/is wrong with your list. You have 208, 351, 352 ... games, but it always has to be x00, as I only run games in hundreds ...
Anyway, I have finished the full run now. You can do that with the top 10, 5, 3, 2, 1 (or whatever cherry is to be picked :-) ) and should have 400 games for each engine (Top 5).
This would be a ranking just out of the TOP 5 engines:
Rank Name Elo + - games score oppo. draws
1 Houdini 2.0 STD 59 22 22 400 55% 28 42%
2 Critter 1.4 SSE42 52 21 21 400 54% 30 53%
3 Komodo 4 SSE42 34 22 22 400 50% 34 44%
4 Deep Rybka 4.1 SSE42 27 22 22 400 49% 36 50%
5 Stockfish 2.1.1 JA 0 22 22 400 43% 43 52%
They get closer, but nothing changes in ranking.
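For anyone who wants to check the error margins on the Top 5 table above, here is a small sketch. It copies the Elo, + and - columns from that table and tests whether the intervals of neighbouring engines overlap. Treating the intervals as independent is a simplification on my part, and the function name is my own.

```python
# (name, relative Elo, + margin, - margin) copied from the Top 5 table above.
top5 = [
    ("Houdini 2.0 STD", 59, 22, 22),
    ("Critter 1.4 SSE42", 52, 21, 21),
    ("Komodo 4 SSE42", 34, 22, 22),
    ("Deep Rybka 4.1 SSE42", 27, 22, 22),
    ("Stockfish 2.1.1 JA", 0, 22, 22),
]

def intervals_overlap(upper, lower) -> bool:
    """True if the error intervals overlap; `upper` is the higher-rated entry."""
    _, elo_u, _, minus_u = upper
    _, elo_l, plus_l, _ = lower
    # Lower end of the stronger engine vs. upper end of the weaker one.
    return elo_u - minus_u <= elo_l + plus_l

for upper, lower in zip(top5, top5[1:]):
    verdict = "overlap" if intervals_overlap(upper, lower) else "clear gap"
    print(f"{upper[0]} vs {lower[0]}: {verdict}")
```

On these numbers every neighbouring pair overlaps, so the order within this Top 5 subset is much less certain than in the full list.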
I just realized that yesterday was the day with the most visits of my website. Critter 1.4 beats Houdini 2.0!
If you want to know what countries are interested in the test:
There are some strange "errors" in the picture because of some special German characters, but you can see the flags and the percentages.
Bye and thanks for your interest
Here at the Rybka forum we have risen to a higher level of cognitive development. Our Guru leader and Master Guide Dadi- has lifted us out of the Pleroma and primal chaos. We, here no longer feed upon such egoistic, self-indulgent trivialities. Beyond this forum ponder is off.