Up Topic The Rybka Lounge / Computer Chess / Houidini 3 at slower time controls
- - By Gaмßito (****) Date 2012-09-24 23:47
OK Robert, first of all thank you very much for agreeing to run some tests at slower time controls. It will be truly interesting indeed.

If you let me choose the time control I would take the CCRL time control: 40/40. Use 'ponder ON' please. The opponents can be Houdini 2.0c and Komodo 5.
Regarding Komodo 5, I think it's a "tough nut to crack" at these time controls, so it would be especially interesting to see some tests against it as well.

As for the opening suite, I'll let you choose whatever public opening suite you want. Of course it must be "public" because we want to see the games.

Best Regards,
Gaмßito.
Parent - - By Stonehenge (***) Date 2012-09-25 00:13
I'd rather not play with ponder on, cutechess-cli doesn't support it and it would also reduce the number of games or the TC by a factor of 2.

Using 40/40, and assuming 80 moves average, the average game would take about 160 minutes. That means we can play about 400 to 450 games.
If we play a match against Houdini 2.0c and Komodo 5 there can be 200 games per opponent. This would require an opening suite with 100 positions.
Alternatively we can play 60/40 to compensate for the relatively slow server (1 MN/sec) and play 120 or 150 games per match (60 or 75 opening positions).
Or keep 40/40 and throw in Stockfish 2.3 to play 3 matches of 120 or 150 games.
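The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch under the post's own assumptions: both sides use their full time allotment, the average game lasts 80 moves, and each opening position is played once with each colour. The function names are made up for illustration.

```python
def game_minutes(moves_per_period, minutes_per_period, avg_moves):
    """Upper-bound game length when both sides use all their clock time."""
    periods_per_side = avg_moves / moves_per_period
    return 2 * periods_per_side * minutes_per_period


def positions_needed(games_per_opponent):
    """Each opening position is played twice, once with each colour."""
    return games_per_opponent // 2


print(game_minutes(40, 40, 80))   # 40 moves in 40 minutes, 80-move games
for games in (200, 150, 120):
    print(games, "games ->", positions_needed(games), "positions")
```

With 40/40 and 80 moves this gives the 160 minutes quoted, and 200, 150 and 120 games per opponent map to 100, 75 and 60 opening positions.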

I really prefer that you select the opening positions (Silver, Noomen, ... or any combination you want) to avoid any suggestions of bias.
Parent - - By Gaмßito (****) Date 2012-09-25 02:32
Well, I think the 'Noomen' test will be fine. It contains 60 positions, so we can get 120 games per match.

Also, maybe you can add Stockfish 2.3.1 as third opponent. What do you think about playing two matches of 120 games each one at '40/40' time control, one against Houdini 2.0c and the other against Stockfish 2.3.1, but in the next match against Komodo 5 use a slower time control like: 120'+30"?
If that's too slow, we can reduce the time a bit and play with the previous FIDE time control, 90+30; that could be very good too.

Here is my opening suite: http://www.mediafire.com/?5a396plaff6l1t0 - It has 240 positions, but the first 60 correspond to the Noomen suite (years 2006 & 2008).

Regards,
Gaмßito.
Parent - - By Stonehenge (***) Date 2012-09-25 09:22 Edited 2012-09-25 09:29
For future reference it's better that the 3 matches are played at the same TC - it makes results more easily comparable.
I suggest using 90'+30" for all 3 matches. This will be slightly longer than initially planned, but why not...

So here's the summary of the Houdini 3 long TC test:
- 120 games against Houdini 2.0c, Stockfish 2.3.1, Komodo 5
- 90+30 TC, 512 MB hash, 60 positions from Noomen 2006/2008 test suites played from both sides.
- No table bases, resign after 3 successive scores larger than 5.00, draw after move 120 if both evaluations below 0.10.
- Hardware: AMD Opteron 6274 Server (SSE4) producing approx. 1 MN/sec per core with Houdini.
- The Houdini 3 version that is played can be considered a Houdini 3 "Beta": close to, but not exactly the final release.
- The error margin on the individual 120-game match results will be about ± 45 Elo.

Do you agree?
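A rough way to sanity-check the ± 45 Elo figure in the summary above. The post does not say how the margin was derived, so everything here is an assumption: an evenly matched pair, a 50% draw rate, a 95% confidence level (z = 1.96) and the logistic Elo formula. The function names are illustrative only.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference implied by a score fraction strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_margin(games, draw_rate, score=0.5, z=1.96):
    """Approximate half-width of the Elo confidence interval for one match."""
    win = score - draw_rate / 2.0
    loss = 1.0 - win - draw_rate
    # Per-game variance of the result (1, 0.5 or 0) around the expected score.
    variance = (win * (1.0 - score) ** 2
                + draw_rate * (0.5 - score) ** 2
                + loss * (0.0 - score) ** 2)
    std_error = math.sqrt(variance / games)
    low = elo_from_score(score - z * std_error)
    high = elo_from_score(score + z * std_error)
    return (high - low) / 2.0


print(round(elo_margin(games=120, draw_rate=0.5), 1))
```

Under these assumptions the margin for 120 games comes out in the mid-40s, in line with the figure quoted. A lower assumed draw rate widens it, a higher one narrows it.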
Parent - By Regularuser (***) Date 2012-09-25 10:08
This will be a most interesting match.

How many cores will Houdini and Stockfish run on?
Parent - - By Gaмßito (****) Date 2012-09-25 10:30
Yes, I agree with all the points. I am impatient to see this. :smile:

Regards,
Gaмßito.
Parent - - By Stonehenge (***) Date 2012-09-25 12:49
I've now started the first match Houdini 3 - Houdini 2.0c.

To reply to the other questions:

- It's a single-core match, the goal is not to test SMP performance (would be rather difficult with Komodo :wink:), but to get the best chess games within the time limits. That means single-core matches.

- No live broadcast: the matches are run by cutechess-cli on a server, so you wouldn't see anything other than a black DOS box with the current match score. Note that at any time 31 simultaneous single-core games are being played. After starting the match nothing happens for about 2 or 3 hours. Then slowly game after game will finish, at a rate of 8 to 10 per hour. The full 120-game run should take about 18 hours; the match PGN is gradually written to disk as games finish.
Parent - - By NicVico (*) Date 2012-09-25 12:52
interesting, may i ask what's your hardware?
Parent - - By Stonehenge (***) Date 2012-09-25 13:33
Quad AMD Opteron 6274 box running Windows 2008 R2.
It has 32 Bulldozer modules with 64 cores, but I consider this a 32-core server with AMD hyper-threading and only use 1 core of each Bulldozer module.
This has been my main server for the Houdini 3 development, producing about 100,000 fast games per day.
Parent - - By Gaмßito (****) Date 2012-09-25 17:19

>Quad AMD Opteron 6274 box running Windows 2008 R2.
>It has 32 Bulldozer modules with 64 cores, but I consider this a 32-core server with AMD hyper-threading and only use 1 core of each Bulldozer module.
>This has been my main server for the Houdini 3 development, producing about 100,000 fast games per day.


Great computer. It has certainly been an excellent buy for the development of Houdini. BTW, assuming you are using all the cores to run Houdini 3, how many kN/s do you get? Also, is it possible to know whether Houdini 3 will use 32 cores maximum, or will this be changed to 64 this time?

Gaмßito.
Parent - - By Stonehenge (***) Date 2012-09-25 19:31

> Great computer. It has certainly been an excellent buy for the development of Houdini. BTW, assuming you are using all the cores to run Houdini 3, how many kN/s do you get? Also, is it possible to know whether Houdini 3 will use 32 cores maximum, or will this be changed to 64 this time?


It's great for running many single-core games, but the hardware doesn't scale very well beyond 16 cores. Even compared to my older Opteron 6128 the Bulldozer architecture is a regression.
Houdini 3 will have the same CPU limits as Houdini 2: 6-core for the Standard and 32-core for the Pro.
Parent - - By Hurnavich (Bronze) Date 2012-09-25 20:48
Hi,

Will there be an upgrade price for 2.0c Standard customers moving to Houdini 3 Pro?
Parent - - By Stonehenge (***) Date 2012-09-25 21:13
Yes.
Parent - - By Bouddha (****) Date 2012-09-25 21:40
By the way, any rough estimate of the release window for v3?

Rgds
Parent - - By Stonehenge (***) Date 2012-09-25 21:59
The engine development is finished, but there's still a lot of work in packaging and documentation (e.g. write the User's Guide).
Hopefully around October 10.
Parent - By Bouddha (****) Date 2012-09-26 10:11
Sounds good !

Or at least better than the still-awaited Komodo MP  :grin:
Parent - By oudheusa (*****) Date 2012-09-26 19:21

> Hopefully around October 10.


Any pre-order discounts? :wink:
Parent - - By shrapnel (***) Date 2012-09-27 04:11
Hi
   I hope you are doing something concrete to prevent the hacking/pirating of Houdini 3. Chess websites are full of people using pirated versions of Houdini 2.0 c, which seems to be as good as the legal version.
Genuine buyers like me feel like fools and feel cheated when taunted by such people.
Hope you are doing something really effective to prevent the hackers.
Parent - - By Stonehenge (***) Date 2012-09-27 10:15

> I hope you are doing something concrete to prevent the hacking/pirating of Houdini 3. Chess websites are full of people using pirated versions of Houdini 2.0 c, which seems to be as good as the legal version.
> Genuine buyers like me feel like fools and feel cheated when taunted by such people.
> Hope you are doing something really effective to prevent the hackers.


The only effective solution to piracy is NOT releasing the software (aka Rybka Cluster model).
It's still not too late for me to make this decision for Houdini 3... :roll:

The fools are not the genuine buyers but the people that use the pirated versions, they kill computer chess.
If you want chess engines to make progress, you should support engine authors - it's the smart decision.
Parent - - By Bouddha (****) Date 2012-09-27 10:22
And this is what I am doing.
But I will not follow a cloud/cluster model as was done for Rybka.

I was very disappointed when Vas/Lukas decided to go into the cluster/cloud model as a priority, and I didn't follow that.
Also, they must have their own "pocket" reasons... but it seems the Rybka purchase/download is a dead project.

rgds
Parent - - By AWRIST (****) Date 2012-09-27 12:54
You were disappointed. Interesting. But you don't reflect on the following, which Robert also left out of the debate:

It is not for the reasons you are pretending or assuming (like "users who don't buy engines but play with pirated copies"); the real problem for a number one like Vas is that his product is dismantled and then the secrets are stolen by his competitors, who all lack the success of Vas. Since the Houdini author is present and discussing, I could initiate a debate about the historical past of Houdini. Most of you will remember the match between Houdini and Rybka, the latter not on its designed hardware of course, and people were enthusiastic about the strong performance of Houdini, not realising that this would happen to any number one that hasn't been updated for a long time.

I would call the naive presentation of such a top clash match (Thoresen) dishonest. (1) Rybka not being updated (2) Rybka against something that was tuned against her (3) Rybka not on its designed hardware (4) not even mentioning the hereditary factors in Houdini, e.g. if a special weak parameter in Rybka is repaired in Houdini, that could already make a difference for better results against each other - to sum this up, we all know that Rybka in its actually best code on the cluster is a different animal. BTW this is why most await R5 so dearly, in particular all the other authors who want to partake of the new innovations Vas has found.

The actual reports about H3 are disappointing because on CSS Robert clarified that a focus on an increase of say 40 Elo is vague if the error margin is say +- 25 between H2 and H3. I am a little bit irritated because at one time Robert says that 1000 games are better, but in the actual long TC tests he played just 120 games between each pair of players.

The problem for Robert's H3 lies in testing without existing super strong HW. H3 has the same problem as a future R5. The different results on little HW are irrelevant for the strength with top HW. If only rational decisions counted, then Houdini would have to cooperate with cluster Rybka. Then efforts like Komodo or Stockfish would become putty.
Parent - - By gotogo (**) Date 2012-09-27 13:12
> I would call the naive presentation of such a top clash match (Thoresen) dishonest. (1) Rybka not being updated (2) Rybka against something that was tuned against her (3) Rybka not on its designed hardware (4) not even mentioning the hereditary factors in Houdini, e.g. if a special weak parameter in Rybka is repaired in Houdini, that could already make a difference for better results against each other

Not to mention the hereditary factors in Rybka? If a weak parameter in Fruit was repaired in Rybka, that could have made Rybka what it is.

Elaborate Houdini bashing. Sounds like you have a very brown nose and if the butt should fart while your nose is in, then you would have freckles as well.
Parent - By Akbarfan (***) Date 2012-09-27 14:28

> Sounds like you have a very brown nose and if the butt should fart while your nose is in, then you would have freckles as well.


:lol::lol::lol::lol::lol:
Parent - By AWRIST (****) Date 2012-09-27 15:33
Together with my scrapie it must be very dangerous indeed.:smile:
Parent - - By gotogo (**) Date 2012-09-25 21:01
So do you think the older opteron would be better to build?
Parent - - By Stonehenge (***) Date 2012-09-25 21:17
If you really want high performance Intel is probably an even better choice.
Parent - By gotogo (**) Date 2012-09-25 21:19
Thanks but that really stinks I like amd.
Parent - - By darmar (**) Date 2012-09-25 13:51
We want to see the games live! You can use this tool http://www.screenleap.com/ and let us keep an eye on them. Thanks :wink:
Parent - - By Stonehenge (***) Date 2012-09-25 16:50
I won't do a live cast, but below 3 screen shots of the situation on the server:

1) The Task Manager with the CPU load of the 64 cores - only one core per module is used.



2) The Task Manager with 31 Houdini 3 and 31 Houdini 2.0c processes running.



3) The cutechess-cli window with the standing after 22 finished games (+12 -1 =9)

Parent - By jpqy (**) Date 2012-09-25 18:42
Is already a crazy start :eek:

JP.
Parent - - By Bouddha (****) Date 2012-09-25 20:47
Are u going to post the games ?

regards
Parent - By Stonehenge (***) Date 2012-09-25 21:01

> Are u going to post the games ?


Yes, at the end of each 120-game match I'll upload the PGN.
Parent - - By Nelson Hernandez (Gold) Date 2012-09-26 04:21
You may take it for granted, and you may think it's more of a delivery truck than a sports car, but for those of us lacking such hardware these screenshots are a thing of beauty!

Just to clarify, you didn't use the HT cores because that would give H3 too much of an advantage against engines that didn't take advantage of them?

What's the scaling multiplier from 16 to 32 on your AMD?  Is the scaling any different when you use the HT cores?  What's the node count multiplier running flat-out versus a single core with no HT?
Parent - - By Stonehenge (***) Date 2012-09-26 06:38

> Just to clarify, you didn't use the HT cores because that would give H3 too much of an advantage against engines that didn't take advantage of them?


I don't use them because they generate unpredictability. If two cores of a Bulldozer module are used, they run at about 65% of the speed when only a single core is used. That complicates the testing, as there is no guarantee that all threads run at the same speed all the time.

> What's the scaling multiplier from 16 to 32 on your AMD?  Is the scaling any different when you use the HT cores?  What's the node count multiplier running flat-out versus a single core with no HT?


The scaling from 16 to 32 is poor (I don't recall the exact numbers), and from 32 to 64 is non-existent.
Parent - - By Ray (****) Date 2012-09-26 09:53

> I don't use them because they generate unpredictability. If two cores of a Bulldozer module are used, they run at about 65% of the speed when only a single core is used. That complicates the testing, as there is no guarantee that all threads run at the same speed all the time.


You must have some way of assigning the individual threads to the modules ?
Parent - - By Stonehenge (***) Date 2012-09-26 14:24

> You must have some way of assigning the individual threads to the modules ?


I use "start /affinity" in Windows to run cutechess-cli, which automatically gets inherited in the engine processes it spawns.
Parent - - By Ray (****) Date 2012-09-26 14:42 Edited 2012-09-26 14:45
Thanks. So it is

CPU 0  start /affinity 1
CPU 1  start /affinity 2
CPU 2  start /affinity 4
CPU 3  start /affinity 8
CPU 4  start /affinity 16
CPU 5  start /affinity 32

etc ? On a normal CPU e.g. Phenom II X6

On yours, for 1 thread per module, would it be CPUs 0,2,4,6,8 etc
Parent - - By Stonehenge (***) Date 2012-09-26 14:57
"start /affinity" expects a hexadecimal value.
On the AMD server I run cutechess-cli with "start /affinity 5555555555555555" to skip every second core.
Parent - - By Stonehenge (***) Date 2012-09-26 16:08
You can check the command's documentation by typing "start /?" to see that AFFINITY expects a hex value.
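The value 5555555555555555 mentioned above is simply the bit mask with every even-numbered logical CPU set. A small sketch of how such a mask can be built (the helper name is made up for illustration, and the assumption that the two cores of one Bulldozer module are numbered consecutively is taken from the "skip every second core" remark):

```python
def affinity_mask(cpus):
    """Return the hexadecimal affinity mask with one bit set per CPU index."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "X")


# Every second logical CPU out of 64: CPUs 0, 2, 4, ..., 62.
print(affinity_mask(range(0, 64, 2)))  # 5555555555555555
```

Each hex digit covers four CPUs, and 5 is binary 0101, so sixteen 5s select CPU 0 and CPU 2 within every group of four.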
Parent - By Ray (****) Date 2012-09-26 17:21 Edited 2012-09-26 18:01
Yes it does say that. But I tried the values relating to binary, and all the above examples worked exactly as the article suggested they would. e.g.  start /affinity 16 runs a process on CPU4 (5th of 6) perfectly. Very odd.

So for cutechess-cli - does one instance run all the matches concurrently? I must study that tool.

EDIT: I saw another article about a quad, talking about a hex mask,  which gives the same values as in my post.

CPU3  CPU2  CPU1  CPU0     Bin    Hex
=====================================
OFF   OFF   OFF   ON    =  0001 = 1
OFF   OFF   ON    OFF   =  0010 = 2
OFF   OFF   ON    ON    =  0011 = 3
OFF   ON    OFF   OFF   =  0100 = 4
OFF   ON    OFF   ON    =  0101 = 5  (confirmed via task manager)
OFF   ON    ON    OFF   =  0110 = 6
OFF   ON    ON    ON    =  0111 = 7
ON    OFF   OFF   OFF   =  1000 = 8
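Taking the earlier reply at its word that "start /affinity" reads its argument as hexadecimal, the mask can be decoded back into CPU numbers to see where decimal and hex readings part ways. This is a sketch; the helper name is hypothetical.

```python
def cpus_in_mask(hex_text):
    """List the CPU indices whose bit is set in a hexadecimal affinity mask."""
    mask = int(hex_text, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


for arg in ("1", "2", "4", "8", "10", "16", "20", "32"):
    print(f"start /affinity {arg:>2} -> CPUs {cpus_in_mask(arg)}")
```

The values 1, 2, 4 and 8 mean the same thing in both bases, which is why the table above agrees with the earlier list. Read as hex, 16 is binary 10110 and allows CPUs 1, 2 and 4, while the single-CPU masks for CPU 4 and CPU 5 are 10 and 20. That 16 still includes CPU 4 may be why it appeared to work.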
Parent - By chess_pr0 (**) Date 2012-09-27 22:42
im sorry to say...

but if you think the equal match is deep rybka cluster 200+cores vs. H3 on 12 cores or say 32 cores.. then NO..

If Robert Houdart makes a cluster version of Houdini 3 x64 possible, then fans of Houdini, or shall I say members/users, will provide a cluster of 200+ cores for Houdini to smash Rybka 4.

SIck and tired to point of VOMIT of all this crap of cluster rental blah BLAH blah....

Rybka 4 IS WEAK unless its in a CLUSTER /CLOUD PROGRAM... HAH!!!JOKE ON YOU VAS..

NEWSFLASH....HOUDINI NEW ENGINE TO BE BEATEN... DONT NEED 200+ CORES TO BE BEATEN.... THAT PROVES WHO HAS A DEEPER POCKET BOOK AND GROUPIE FANBASE OF VOLUNTEER'S.

HOUDINI 3 X64 WILL CRUSH RYBKA 4, CLUSTER OR NO CLUSTER.. BRING OUT RYBKA 5 WITHOUT THE NEED OF 200+ CORES TO BEAT HOUDINI, THEN YOU CAN SELL YOUR ENGINE COMMERCIALLY AGAIN..

IF NOT..

I RATHER USE FRITZ 13 THAN RYBKA 4 OR 4.1 FOR ANALYSIS......
Parent - - By Stonehenge (***) Date 2012-09-25 19:35
The situation at the halfway mark: +24 -6 =30.

Parent - - By Christian Packi (****) Date 2012-09-25 20:03
I guess you optimized it against Houdini 2. Rybka also always had better results against its predecessors.
Parent - - By Stonehenge (***) Date 2012-09-26 14:20

> I guess you optimized it against Houdini 2. Rybka also always had better results against its predecessors.


The Houdini development framework uses 9 engines, including 2 previous Houdini versions (1.5 and 1.03a) but not Houdini 2.
Parent - By Christian Packi (****) Date 2012-09-26 14:33
Interesting, thanks.
Parent - By Gaмßito (****) Date 2012-09-26 19:00
Robert, the results so far are really quite impressive. At first glance, I honestly think this new version is much stronger than I expected. Perhaps (hopefully) the gain will be above 60 Elo points.

I'm watching the games now against Houdini 2.0c and I really like what I have seen. What a pleasure to see this battle!

Regards,
Gaмßito.
Parent - - By Stonehenge (***) Date 2012-09-26 06:30
Final result of the Houdini 3 - Houdini 2.0c match: +48 -15 =57
76.5-43.5 (+94 Elo ± 42 Elo).
Download Games
I haven't seen any of the games yet, hopefully everything is fine.

Houdini 3 - Stockfish 2.3.1 match has started - no games finished yet.
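For readers who want to check the numbers, here is a minimal sketch that turns the +48 -15 =57 result into points, a score percentage and an Elo difference. It uses the standard logistic Elo formula, which is an assumption: the post does not say which tool or model produced the +94 figure, and the logistic formula gives a slightly different value. The function name is illustrative.

```python
import math


def match_summary(wins, losses, draws):
    """Return (points, score fraction, logistic Elo difference) for one match."""
    games = wins + losses + draws
    points = wins + 0.5 * draws
    fraction = points / games
    # Undefined for a 0% or 100% score; fine for this result.
    elo = 400.0 * math.log10(fraction / (1.0 - fraction))
    return points, fraction, elo


points, fraction, elo = match_summary(wins=48, losses=15, draws=57)
print(f"{points} points, {fraction:.2%}, about {elo:+.0f} Elo")
```

The points total and percentage match the 76.5-43.5 score line; the Elo value lands a few points away from the posted one, well inside the stated error margin.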
Parent - By Regularuser (***) Date 2012-09-26 06:49
Wow.  That was carnage.  

I currently use H 1.5.    If H3 does half as well against Komodo 5 I will be buying it without a second thought.
Parent - By Dr.Wael Deeb (***) Date 2012-09-26 07:28
That's 63.75% for Houdini 3 against Houdini 2.0c....

Good improvement at this very high level....

But will it do as well against the other top players in the field?

We'll have to live & see....
Dr.D