Rybka Chess Community Forum
Up Topic The Rybka Lounge / Computer Chess / Critter 1.4 running for the IPON
- - By Ingo (***) [de] Date 2011-12-27 20:32
If interested peek here:

http://www.inwoba.de

Have fun
Ingo
Parent - - By Arrière Pensée (Gold) Date 2011-12-28 01:28
Have you been able to get a listing on the Critter ratings - I had it once and then it went blank. I haven't been able to call it back since.
Parent - - By Ingo (***) [de] Date 2011-12-28 07:56
FIXED

Problem here: the transmitting program crashed.

Bye
Ingo
Parent - By Arrière Pensée (Gold) Date 2011-12-28 07:57
Great! And thank you for all that you are doing! :smile:
Parent - - By Brish (**) [ca] Date 2011-12-28 12:07
After 1182 games, Critter 1.4 is rated 2984 (Rybka is 2955 on the IPON rating list).

If Critter keeps its score it will be the #2 engine on IPON.
Parent - - By Labyrinth (****) [us] Date 2011-12-28 13:43
I wish we could see what the list was like for Houdini :-/
Parent - - By Ingo (***) [de] Date 2011-12-28 14:52
You can! All individual results are in the download.

But you should not look at the 100 game matches. That is statistical nonsense!

Bye
Ingo
Parent - - By Banned for Life (Gold) Date 2011-12-28 15:51 Edited 2011-12-28 15:53
Hi,

Quick question:

Do you recalculate the ratings of existing engines when you add a new engine to the list? For example, in the current situation, Critter 1.4 is playing very well against Komodo, but less well against Naum 4.2. This implies that if Critter 1.4 had been around earlier, Naum 4.2 would be rated slightly higher and Komodo slightly lower...
Parent - By Ingo (***) [de] Date 2011-12-29 15:17

>Do you recalculate the ratings of existing engines when you add a new engine to the list?


After a match, yes. The current real list is always correct.

>For example, in the current situation, Critter 1.4 is playing very well against Komodo, but less well against Naum 4.2. This implies that if Critter 1.4 had been around earlier, Naum 4.2 would be rated slightly higher and Komodo slightly lower...


But it wasn't there! The running tourney is always an "estimation" (which usually turns out to be very good)! The correct rating can be found after a run with all engines in the rating list.

Bye
Ingo
Parent - By Labyrinth (****) [us] Date 2011-12-29 00:36
Nobody cares that the new Critter and Komodo made a positive score against Houdini :-(

I guess you need 500,000 games at 10 hour time controls on top of the line rack mount server for anybody to give a hoot :-/

For me though I say awesome! Good job, Houdini isn't easy to beat!!

Well, there are still some more games left between Houdini and Critter, so the tables could turn.
Parent - - By Labyrinth (****) [us] Date 2011-12-29 10:23
Er, the IPON match list for Critter appears to have crashed.

Rating list still not updated with K4.. and maybe Critter by now.
Parent - - By Wrath of the Titans (****) [sa] Date 2011-12-29 14:33
It seems we have a new number 2! :cool:
Parent - By Labyrinth (****) [us] Date 2011-12-29 19:20
I think Houdini 1.5a is still #2 :-/

Followed by K4 and Critter 1.4 (Critter 1.4 like 2 Elo behind K4 on this list).
Parent - By Razor (****) [gb] Date 2011-12-29 15:49
Looks like we have a new #2 single core engine - Critter 1.4 - well done Richard!  :smile:
Parent - - By Ingo (***) [de] Date 2011-12-29 16:28
Testrun finished!

Have a look here: http://www.inwoba.de

Remark: As with the last two runs, there was a lot of discussion about the individual results (which engine beats which). Those are just 100 games with only 50 openings. [u]To draw conclusions from so few games, which additionally hardly represent "Chess", is statistical nonsense![/u]

However, if you are interested in the individual results go to the archive section. You can download the individual results there. For those who are interested, the Elostat rating is in that file as well - and it looks a bit different for Komodo and Critter!

Bye
Ingo
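
To put a number on the "100 games" remark, here is a minimal sketch of the Elo difference and an approximate 95% margin for a single match. It assumes the standard logistic Elo formula and treats the games as independent, which is only an approximation when 50 openings are each played with both colours. The function names and the 30/40/30 result are made up for illustration; they are not IPON data.

```python
import math

def elo_from_score(p):
    """Elo difference implied by an expected score p (logistic Elo model)."""
    p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep away from 0 and 1
    return -400.0 * math.log10(1.0 / p - 1.0)

def match_elo_interval(wins, draws, losses, z=1.96):
    """Return (estimate, low, high) in Elo for one match.

    Uses the per-game variance of the score and a normal approximation,
    so it is a rough guide rather than what a rating tool reports.
    """
    n = wins + draws + losses
    p = (wins + 0.5 * draws) / n
    var = (wins * (1.0 - p) ** 2
           + draws * (0.5 - p) ** 2
           + losses * p ** 2) / n
    se = math.sqrt(var / n)  # standard error of the match score
    return elo_from_score(p), elo_from_score(p - z * se), elo_from_score(p + z * se)

# Hypothetical 100-game match that ends dead level.
est, low, high = match_elo_interval(wins=30, draws=40, losses=30)
print(f"Elo diff {est:+.0f}, 95% range {low:+.0f} .. {high:+.0f}")
```

For this made-up even match the range is about ±53 Elo, which is wider than the gaps being argued about between the top engines.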
Parent - - By Wrath of the Titans (****) [sa] Date 2011-12-29 17:10
Ingo, would you run a battle of champions among, say, the top 5 engines around? This will remove doubts about the real top 3 engines.
Parent - - By Ingo (***) [de] Date 2011-12-29 17:31
What doubts?

The results of a 100-game match are statistically absolutely useless, as I wrote above, only to be ignored (but I expected that). The only thing that matters is the Elo at the end. The error margins of the first and the second do not overlap; on the contrary, there is an obvious gap. You can argue about the second and the third, but not about the first!

Bye
Ingo

PS: You have all the needed data in my download. You can make a list by removing any engine you like to get the result you want ...
Parent - By DGB (**) [pt] Date 2011-12-29 17:45
+1
Parent - By Banned for Life (Gold) Date 2011-12-29 17:48
Thanks Ingo. Your site is extremely useful.
Parent - - By Ray (****) Date 2011-12-29 19:16
What he is saying is: how would the ratings list look if you did not play engines against each other that are 300-400 Elo apart? E.g. if Komodo had not played Crafty, Tornado, Jonny, Loop, Umko, Stelka, Toga, Onno? Same for other engines. If, on your entire database, you stripped out all games from all engines where the Elo diff was greater than +/- 300 Elo, or even 250 Elo, what would the ratings list look like then? Would the rankings be the same? I think somewhere you provide a pgn containing bare game headers and the result, don't you, so anyone is free to do this themselves if they are interested?
Parent - - By Wrath of the Titans (****) [sa] Date 2011-12-29 19:28
That is exactly what I meant! It's like a "super tournament" just like what we have in human games..Linares, London Chess Classics, etc...
Parent - - By Ray (****) Date 2011-12-29 19:35
It was a very valid question, and actually I'd like to see the answer from what I posted above. I'm hoping that Ingo provides the download I mentioned, but at the moment I can't find it on his site.

Of course anyone can play a round-robin between what they think are just the top 5 engines, I'll probably do it myself when I get some spare CPU resource.
Parent - - By Ingo (***) [de] Date 2011-12-29 20:13

>It was a very valid question, and actually I'd like to see the answer from what I posted above. I'm hoping that Ingo provides the download I mentioned, but at the moment I can't find it on his site.


It is there, as described above in this thread: go to Archive and then scroll down to 'individual statistics'. It is easy to find ...

The only problem with your proposal is that your error bar goes up because you have very few games ... (And of course you can do that with any other rating list as well)

Bye
Ingo
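
The error-bar point can be illustrated with a back-of-the-envelope sketch: near a 50% score the margin shrinks only with the square root of the number of games. The 45% draw rate is an assumption picked to resemble the tables in this thread, and the function name is made up; a rating program will report somewhat different figures.

```python
import math

# Slope of the logistic Elo curve at a 50% score, in Elo per unit of score.
ELO_PER_SCORE_AT_50 = 400.0 / (math.log(10) * 0.25)

def approx_elo_margin(games, draw_rate=0.45, z=1.96):
    """Approximate 95% Elo margin for an engine scoring about 50%.

    draw_rate is an assumed value; games are treated as independent.
    """
    per_game_var = (1.0 - draw_rate) * 0.25  # variance of one game's score
    se = math.sqrt(per_game_var / games)     # standard error of the mean score
    return z * se * ELO_PER_SCORE_AT_50

for games in (2000, 400, 200, 100):
    print(f"{games:5d} games: about +/- {approx_elo_margin(games):.0f} Elo")
```

Cutting a pool from 400 games to 100 doubles the margin, which is the trade-off of stripping out pairings.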
Parent - By Ray (****) Date 2011-12-29 21:34

> It is there as described in this thread above under Archive and then scroll down to  'individual statistics'. It is easy to find ...
>
> The only problem with your proposal is, that your error bar is going up because you have very few games ... (And of course you can do that with any other rating list as well)
>
> Bye
> Ingo


Thanks, I'll look more carefully !

A valid point of course about the error bars, you can't have your cake and eat it too.
Parent - By DGB (**) [pt] Date 2011-12-29 20:23
If you want to do that, play around 500 games per match.
Parent - - By Ernst (***) [nl] Date 2011-12-30 11:13 Edited 2011-12-30 11:18
In an attempt to answer your question, I made the calculations with the top 5 against each other, and with the top 3. Neither is very meaningful because of the small number of games played. Furthermore, the top 5 didn't play the same number of games against each other.

Top5:

Rank Name                   Elo    +    - games score oppo. draws
   1 Critter 1.4 SSE42     3031   30   29   208   56%  2997   54%
   2 Houdini 2.0 STD       3030   24   24   351   56%  2993   42%
   3 Komodo 4 SSE42        3000   23   23   352   50%  3002   45%
   4 Deep Rybka 4.1 SSE42  2991   27   27   252   46%  3018   47%
   5 Stockfish 2.1.1 JA    2969   27   28   253   42%  3018   48%


Top 3:

Rank Name                Elo    +    - games score oppo. draws
   1 Critter 1.4 SSE42  3017   25   25   200   53%  3002   49%
   2 Houdini 2.0 STD    3004   26   26   200   49%  3009   44%
   3 Komodo 4 SSE42     3000   26   26   200   48%  3011   39%
Parent - By Ray (****) Date 2011-12-30 11:40
Indeed, not enough games. How I would do it is strip out all engine match pairings where the Elo difference was greater than +/- 200 Elo and do a new list. I know that can be circular, but I would use the ratings as they are in the existing published list, +/- 200. Hopefully you would get a large enough number of games.

I'm not sure how to do this myself easily.
Parent - - By Ingo (***) [de] Date 2011-12-31 10:00
Hello Ernst,

Something was/is wrong with your list. You have 208, 351, 352 ... games, but it always has to be x00, as I only run games in hundreds ...
Besides that, I have finished the full run now. You can do this with the top 10, 5, 3, 2, 1 (or whatever cherry is to be picked :-) ) and should have 400 games for each engine (top 5).

Bye
Ingo
Parent - By Ingo (***) [de] Date 2011-12-31 10:07
I did it myself.

This would be a ranking just out of the TOP 5 engines:


Rank Name                   Elo    +    - games score oppo. draws
   1 Houdini 2.0 STD         59   22   22   400   55%    28   42%
   2 Critter 1.4 SSE42       52   21   21   400   54%    30   53%
   3 Komodo 4 SSE42          34   22   22   400   50%    34   44%
   4 Deep Rybka 4.1 SSE42    27   22   22   400   49%    36   50%
   5 Stockfish 2.1.1 JA       0   22   22   400   43%    43   52%


They get closer, but nothing changes in ranking.

Bye
Ingo
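
As a rough reading of the top 5 listing above, the sketch below checks which pairs of engines have error margins that do not overlap. It assumes the three numbers after each name are Elo, plus margin and minus margin, as in Ernst's tables. Non-overlapping intervals are only a conservative rule of thumb, not a proper significance test.

```python
from itertools import combinations

# (name, Elo, plus, minus) copied from the top 5 listing above.
top5 = [
    ("Houdini 2.0 STD", 59, 22, 22),
    ("Critter 1.4 SSE42", 52, 21, 21),
    ("Komodo 4 SSE42", 34, 22, 22),
    ("Deep Rybka 4.1 SSE42", 27, 22, 22),
    ("Stockfish 2.1.1 JA", 0, 22, 22),
]

def interval(elo, plus, minus):
    """Lower and upper end of an engine's rating range."""
    return elo - minus, elo + plus

def separated(a, b):
    """True if the rating ranges of a and b do not overlap at all."""
    low_a, high_a = interval(*a[1:])
    low_b, high_b = interval(*b[1:])
    return high_a < low_b or high_b < low_a

clear_pairs = [(a[0], b[0]) for a, b in combinations(top5, 2) if separated(a, b)]
for first, second in clear_pairs:
    print(f"{first} is clearly ahead of {second}")
```

On these numbers only Houdini and Critter are clearly separated from Stockfish; every neighbouring pair overlaps, so the order within this 400-game pool is within the margins.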
Parent - - By Ingo (***) [de] Date 2011-12-29 18:26
Hi,

I just realized that yesterday was the day with the most visits to my website. Critter 1.4 beats Houdini 2.0!

If you want to know what countries are interested in the test:



There are some strange "errors" in the picture because of some special German characters, but you can see the flags and the percentages.

Bye and thanks for your interest
Ingo
Parent - By Arrière Pensée (Gold) Date 2011-12-29 19:22
Aah! Yes! Thread counting! My dear child.

Here at the Rybka forum we have risen to a higher level of cognitive development. Our Guru leader and Master Guide Dadi- has lifted us out of the Pleroma and primal chaos. We, here no longer feed upon such egoistic, self-indulgent trivialities. Beyond this forum ponder is off.
Parent - By Ingo (***) [de] Date 2011-12-30 20:36
Sorry to pull this thread up again, but I finished the last missing games of Komodo and Critter (vs Fritz 13 and Chiron) with a final minor rating change.

All details on http://www.inwoba.de

Full details in Archives->individual.7z as usual.

Bye
Ingo