Up Topic Rybka Support & Discussion / Rybka Discussion / Rybka and Houdini at 40/120 Timecontrol?
Parent - - By lkaufman (*****) Date 2012-02-17 16:40
Adjusting play based on the opponent's past moves is a rather radical departure from the norm. Maybe someday we can do this, but currently our program (and all others to my knowledge) assumes that we are playing against ourself. Other than opponent's history, how else do you propose to adjust the style of play dynamically?
Parent - - By Razor (****) [gb] Date 2012-02-17 19:58
My answer below doesn't take into account your knowledge of the parameters you have at your disposal to adjust statically. Suppose we agree for a moment that each parameter currently has a static value, set to achieve the best-balanced tuning across the many tests you carry out. Instead, we would have a range of values that can be applied dynamically to each of those parameters, based on a combination of input values, effectively changing Komodo's behaviour by whatever amount you decide is sufficient for the particular effect each parameter has. Make sense so far?

So, to name a few 'Inputs' we have:
(1) Opponent history (where Komodo knows this and already has a 'tuned' style against this opponent)
(2) Opponent move choices (error rates, i.e., how do moves played compare to Komodo's 'best move' table - how large are these errors)
(3) Opening played
(4) Position assessment (complexity, open/closed, etc)

My thinking on this is that by taking into account a number of inputs that can in turn affect Komodo's playing style, you end up tailoring the performance to what is in front of you, rather than relying on a 'best fit' approach that may not work well in some cases. For example, imagine that Komodo assesses the position as closed and then adjusts one or more parameters to play less tactically, with more emphasis on positional play, such as finding nice squares for its pieces.

NB: I guess (2) above is my attempt to come up with a way for Komodo to check what I (and, I imagine, you) do OTB, i.e., look at what my opponent has played, primarily to identify and exploit any drawbacks I can see in the move played.

Any use?
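
A rough sketch of the idea above, purely for illustration: one input (how closed the position is) selects a value from a range for each parameter, instead of a single static value. The parameter names, the ranges, and the `closedness` score are all hypothetical; nothing here reflects Komodo's actual parameters.

```python
# Hypothetical ranges, in percent of the static default (100).
# Each entry is (value in a fully open position, value in a fully closed position).
PARAM_RANGES = {
    "tactical_weight": (120, 80),
    "piece_placement_weight": (90, 130),
}


def clamp01(x):
    """Keep the input inside [0, 1]."""
    return max(0.0, min(1.0, x))


def dynamic_params(closedness):
    """Linearly interpolate every parameter between its open and closed value.

    closedness: assumed position assessment, 0.0 = fully open, 1.0 = fully closed.
    """
    t = clamp01(closedness)
    return {
        name: open_value + (closed_value - open_value) * t
        for name, (open_value, closed_value) in PARAM_RANGES.items()
    }


if __name__ == "__main__":
    for c in (0.0, 0.5, 1.0):
        print(c, dynamic_params(c))
```

Other inputs from the list (opponent history, error rates, opening) could feed the same kind of mapping, but how to combine them is an open question.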
Parent - - By lkaufman (*****) Date 2012-02-18 17:42
Item 4 on your list is the most useful. We do a bit of it now, but much more should be done. Considering the opening is pretty dubious, because a particular structure can come from many different openings, and one pawn break can completely change the nature of the position. Considering the opponent's play is something for the far future; there is too much to do without getting into that can of worms. I rarely pay much attention in my own play (after the opening choice) to my opponent, and many strong players just try to find the best moves in general. I'm not saying it is without value, but certainly having a program do this is not desirable for those who use engines for analysis. They want "objective truth".
Parent - By Razor (****) [gb] Date 2012-02-18 18:21
My list was not meant to be exhaustive, Larry; it was just a means to an end, really, to help me get a viewpoint across.
Parent - By Banned for Life (Gold) Date 2012-02-17 21:34
Probably the easiest advantage to exploit would be having a significant clock advantage over the opponent. In this case, one could attempt to reduce opponent's ponder hit rate (when nearly equal move choices were available) and give preference to lines that have a significant score shift between the opponent's expected time per move and own time per move.
Parent - - By Werewolf (*****) [gb] Date 2012-02-21 08:27

> But we need a good, rigorous definition of "aggressive".


Do you distinguish between the classical concept of 'aggression' being King orientated and 'the initiative' being less specifically focused but much more concerned with quick action?
Parent - - By lkaufman (*****) Date 2012-02-21 17:32
They are two different concepts. I think we already do enough (or almost enough) about the king, but the initiative is not really a concept that is in any program to my knowledge. Maybe it should be.
Parent - - By Werewolf (*****) [gb] Date 2012-02-21 17:39

> the initiative is not really a concept that is in any program to my knowledge


Wouldn't you say that 'Null Move' is something which represents play for the initiative?
Parent - By lkaufman (*****) Date 2012-02-21 21:13
No, null move just amounts to saying that if your position appears to be good enough to allow the opponent to move twice in a row without success, you needn't examine it so deeply.
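
For anyone who has not seen it in code, here is a minimal sketch of null-move pruning inside a negamax alpha-beta search. It is an illustration under stated assumptions, not the code of any engine discussed here: the `Toy` game (each move adds 1 or 2 points to the mover's total), the reduction `R = 2`, and all function names are made up for the example. Real engines add safeguards, for instance against zugzwang positions, which this toy game does not have.

```python
from dataclasses import dataclass

INF = 10**9
R = 2  # assumed depth reduction for the null-move search


@dataclass(frozen=True)
class Toy:
    """Toy game: scores are stored from the side to move's point of view."""
    mine: int = 0
    theirs: int = 0

    def evaluate(self):
        return self.mine - self.theirs

    def moves(self):
        return (1, 2)

    def play(self, points):
        # After moving, the opponent becomes the side to move.
        return Toy(self.theirs, self.mine + points)

    def pass_turn(self):
        # The "null move": hand the move over without doing anything.
        return Toy(self.theirs, self.mine)


def search(pos, depth, alpha, beta, use_null=True, can_null=True):
    if depth <= 0:
        return pos.evaluate()

    # Null move: let the opponent move twice in a row, searched at reduced depth.
    # If we still reach beta, assume the position is good enough and stop here.
    if use_null and can_null and depth > R:
        score = -search(pos.pass_turn(), depth - 1 - R, -beta, -beta + 1,
                        use_null, can_null=False)
        if score >= beta:
            return beta

    best = -INF
    for move in pos.moves():
        score = -search(pos.play(move), depth - 1, -beta, -alpha, use_null)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # ordinary alpha-beta cutoff
    return best


if __name__ == "__main__":
    print(search(Toy(), 6, -INF, INF, use_null=True))
    print(search(Toy(), 6, -INF, INF, use_null=False))
```

In this toy game passing is never better than moving, so the pruned and unpruned searches agree at the root; the point is only to show where the "opponent moves twice" test sits in the search.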
Parent - By Uly (Gold) [mx] Date 2012-02-21 19:10 Edited 2012-02-21 19:12
I think there are three aspects to aggressiveness:

1- King Aggressiveness: Aggressiveness towards the enemy king: which pieces attack its surrounding squares, which pieces can check it or force its moves (e.g. "play the king from c8 to a8 or die"), how close they are to it, etc. I think this setting should be asymmetric for white and black: just because you want to attack the opponent's king doesn't mean you're obsessed with defending yours. Zappa Mexico has this well covered, but plays poorly on the defensive, so this parameter is only useful when you are on the attack.

2- Activity Aggressiveness: About having the initiative, mobility, and restricting the mobility of the opponent. Getting passed pawns may be included in this, as without having attacks against the king another way to win is to end with a passed pawn with your initiative and convert to a win. I think Rybka 4 Mindbreaker's setting is excellent for this, without drawbacks as you can set one for the black or the white side, though of course you have to know when being more passive or patient may be better.

3- Complex Aggressiveness: Where the goal is complicating the position and defeating an opponent in a position that you play better. This includes pawn sacrifices, piece sacrifices against pawns, sacrificing the exchange or forcing the opponent to sacrifice theirs, Queen for 2 rooks or 3 pieces exchanges, etc. Material imbalances get huge bonuses for the sake of imbalance, so if you see a variation where you end up with three pawns against a piece that is 0.20 worse than something else, you go for it anyway. The catch is that if the opponent avoids going for that position, he ends up worse than if you hadn't gone for it. It's "aggressive" because it decreases the chances of a drawn game significantly. Rybka 3 Dynamic at high contempt is a great tool for seeking these kinds of positions, but against a stronger opponent this kind of play is bad because you may end up being the loser.

Kind 2 and Kind 3 may run against each other: Kind 2 keeps its own pieces on the board (there can't be high mobility if the pieces are exchanged) and probably seeks closed positions that maximize the power of individual pieces, while Kind 3 looks for open positions that lead to material imbalances.
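
To make Kind 3 concrete, here is a small sketch of "going for it anyway": an imbalance bonus is added before lines are compared, so an imbalanced line that is 0.20 worse on raw eval can still be chosen. The bonus size (30 centipawns), the candidate lines, and the function name are assumptions for illustration only, not how Rybka 3 Dynamic or any other engine actually does it.

```python
# Hypothetical bonus, in centipawns, for any line that ends in a material imbalance.
IMBALANCE_BONUS_CP = 30


def pick_line(candidates, bonus_cp=IMBALANCE_BONUS_CP):
    """Return the name of the line with the best bonus-adjusted eval.

    candidates: list of (name, eval_cp, is_imbalanced) tuples.
    """
    def adjusted(candidate):
        _name, eval_cp, is_imbalanced = candidate
        return eval_cp + (bonus_cp if is_imbalanced else 0)

    return max(candidates, key=adjusted)[0]


if __name__ == "__main__":
    lines = [
        ("quiet line", 25, False),
        ("three pawns for a piece", 5, True),  # 0.20 worse on raw eval
    ]
    print(pick_line(lines))              # bonus applied
    print(pick_line(lines, bonus_cp=0))  # no bonus: plain best eval
```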
Parent - - By Werewolf (*****) [gb] Date 2012-02-27 17:14

> But we need a good, rigorous definition of "aggressive".


OK, in the last month I've done some work on this.
I (amazingly) forgot but I actually did an A-Level maths project on this very subject 20 years ago - "How do you measure aggression in chess using maths?"

I came up with a simple mathematical formula which was based on statistics, and I used it to test the aggression of human world champions. It seemed to work, however, that was then and this is now.

Please could you help me by proposing an engine you regard as fairly strong (>2700 elo) but also fairly passive in its playing style. I will carry out some tests on this engine and I'll see if the idea works.

Thanks.
Parent - - By Uly (Gold) [mx] Date 2012-02-27 19:34

> Please could you help me by proposing an engine you regard as fairly strong (>2700 elo) but also fairly passive in its playing style. I will carry out some tests on this engine and I'll see if the idea works.


(Personal suggestions: Rybka 2.3.2a and Naum 3 - those were regarded as at the limit of highest strength and highest passiveness.)
Parent - - By Werewolf (*****) [gb] Date 2012-02-27 19:39

> (Personal suggestions: Rybka 2.3.2a and Naum 3 - those were regarded as at the limit of highest strength and highest passiveness.)


I haven't got Naum 3, but I'll add Rybka 2.3.2a to the "passive" list to see if my idea is valid. Thanks.
Parent - By Uly (Gold) [mx] Date 2012-02-27 20:08
I suggest you add Rybka 2.3.2a and Rybka 3 Dynamic and compare their scores; Dynamic should be a lot more aggressive.
Parent - - By lkaufman (*****) Date 2012-02-29 01:26
If aggression includes willingness to sacrifice material for less tangible factors like attacking chances against the king, you could not get a cleaner example than by comparing Komodo 2.xx (say 2.03) to Komodo 3.0. The change in this aspect was quite dramatic.
Parent - By Werewolf (*****) [gb] Date 2012-02-29 07:39

> If aggression includes willingness to sacrifice material for less tangible factors like attacking chances against the king, you could not get a cleaner example than by comparing Komodo 2.xx (say 2.03) to Komodo 3.0. The change in this aspect was quite dramatic.


It's not sacrifice specific, but it does concern 'action taken towards the enemy king' which will obviously include sacrifices that lead to attacking chances. I hope to get some results back in a month and I'll certainly include Komodo 2 & 3.
Parent - - By tomgdrums (****) Date 2012-02-12 23:28

> Houdini 2 was not tested by CEGT at 40/2 hours and won't be because they only test versions at that level that were rated above previous ones on their 40/20' list. Houdini 2 (on 1 core) was rated below 1.5 on that list. This policy is quite correct: it would be unfair to give Houdini two chances to be top if the newer engine is not provably stronger at the next closest time limit. Furthermore CCRL also found that Houdini 2 on 1 core was below version 1.5. So I think it is quite correct to say that Komodo is the number one single-core engine based on 40/2 hour CEGT testing.
>      I expect our next release will remove any doubt about which is strongest at longer time controls.


Komodo 4 and Houdini 2 have become my go-to analysis engines, so I don't have a horse in this race.

However, as usual Larry, your posts come off as disingenuous.

If you spend so much time promoting Komodo's strength at long time controls and how it is better to test with a certain increment setting (can't remember details) why shouldn't Houdini 2 be tested at all time controls as well?
Parent - - By lkaufman (*****) Date 2012-02-12 23:50
For analysis, most people allow some number of minutes per move. Therefore the best measure of analysis strength is long time limit tests, and preferably ones where the time per move is fairly constant, removing the importance of time control optimization. This should be clear to everyone. Unfortunately no one tests at anything like 1 minute plus 3 minutes per move increment. The closest we have is 40/2 hour results, with 40/40' being next best. For those who want quick answers or to run their own blitz matches, a large increment relative to the base time will better simulate analysis. For those who want to run their engines against others on Playchess or elsewhere at blitz, the IPON list is probably the most reliable rating.
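
A quick back-of-the-envelope check of the time controls mentioned above, as average seconds per move. The 40-move horizon used for the increment control is an assumption made only so the numbers can be compared.

```python
def avg_seconds_per_move(base_seconds, increment_seconds, moves):
    """Average thinking time per move over the given number of moves."""
    return (base_seconds + increment_seconds * moves) / moves


if __name__ == "__main__":
    controls = [
        ("40/2 hours", 2 * 60 * 60, 0),
        ("40/40'", 40 * 60, 0),
        ("40/20'", 20 * 60, 0),
        ("1' + 3' per move", 60, 3 * 60),
    ]
    for label, base, increment in controls:
        print(label, avg_seconds_per_move(base, increment, moves=40))
```

On these assumptions 40/2 hours averages 3 minutes per move, close to the 1 minute plus 3 minutes increment case, while 40/40' averages 1 minute and 40/20' averages 30 seconds.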
Parent - - By Stonehenge (***) Date 2012-02-13 00:16
You truly are a grand-master at cherry-picking rating list results to suit your marketing story.

For analysis, the list with the longest effective time control is CCRL 40/40 with 4 CPU.
See http://computerchess.org.uk/ccrl/4040.live/ .

Current standings:
1  Houdini 2.0c 64-bit 4CPU     3311  +31  −30
2  Rybka 4.1 64-bit 4CPU        3260  +20  −20
3  Stockfish 2.2.2 64-bit 4CPU  3259  +31  −30
4  Critter 1.2 64-bit 4CPU      3253  +27  −26

Surely Critter 1.4 will move to position #2 (I suppose it's being tested now).
It is noteworthy that the ratings shown above are not very different from the IPON results played at about 10 times faster TC, see http://www.inwoba.de/ .
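
For reading the standings quoted above, here is a small sketch using the standard Elo expected-score formula plus a crude check of whether two entries' error intervals overlap. The overlap test is only a rough reading aid, not a proper significance test, and the function names are made up for this example.

```python
def expected_score(rating_diff):
    """Standard Elo expected score for the higher-rated side."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def intervals_overlap(a, b):
    """a and b are (rating, plus, minus) tuples as printed in the list."""
    a_low, a_high = a[0] - a[2], a[0] + a[1]
    b_low, b_high = b[0] - b[2], b[0] + b[1]
    return a_low <= b_high and b_low <= a_high


# Figures copied from the CCRL 40/40 4CPU standings quoted in this post.
HOUDINI = (3311, 31, 30)
RYBKA = (3260, 20, 20)
STOCKFISH = (3259, 31, 30)
CRITTER = (3253, 27, 26)

if __name__ == "__main__":
    print(round(expected_score(HOUDINI[0] - RYBKA[0]), 3))
    for name, entry in [("Rybka", RYBKA), ("Stockfish", STOCKFISH), ("Critter", CRITTER)]:
        print(name, "overlaps Houdini:", intervals_overlap(HOUDINI, entry))
```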
Parent - By lkaufman (*****) Date 2012-02-13 02:50
What does this list have to do with Komodo? My claim is just that Komodo is the best analytical engine now when single core is used, either due to hardware limitations or to software such as IDeA that is best used one core per position. If we are the best single core engine for analysis, one might expect us to be the best MP engine for analysis when Komodo MP is out, but of course this is just speculation. The data you cite says nothing about my claim that Komodo scales better than at least Houdini and Critter.
Parent - - By Kappatoo (****) [de] Date 2012-02-13 00:30

> For analysis, most people allow some number of minutes per move.


I often hear that, but I somehow doubt it. What do you think most people's analysis looks like? You enter a move, let the engine ponder the position for a couple of minutes while staring at the eval, then you enter the engine's first choice and repeat that process? This sounds like an awful way to spend your time. Or look at it this way: Suppose I analyze a game which I played OTB. If I allow a couple of minutes per move, then the analysis will last as long as the game itself. But wait, I do not only want to know that some move would have been better than the one I played, I would also like to know why. For this reason, I follow the engine's suggestion and check a couple of lines and responses to other possible moves. If I allow a couple of minutes for each position I analyze, my analysis will last a couple of days. Sounds terrible, I think.
Maybe you were only talking about IDeA analysis or some other (semi-)automated process, in which case things may look different. But for most people's ordinary interactive analysis, I think it is preferable to have an engine which is strong at comparatively short time controls.
Parent - By Uly (Gold) [mx] Date 2012-02-13 00:48
+1. With a warm hash, interacting with the moves leads to the engine playing the move instantly (instead of after a few minutes) in some positions. For seeing the plans of the engines, "a few minutes" is way too slow.

Though Houdini has proven poor at showing me useful plans, or at choosing a plan where it's hard to find something better. I think Komodo is already better than Houdini for this, regardless of blitz results.

Komodo is still behind other MP engines like Zappa Mexico II and Naum, but I think it has probably more to do with very different move choices, plans, and evaluations than other top engines, so Zappa and Naum are a great complement when the similarity of the top engines makes all of them be wrong in some position, a problem that may not be present in Zappa or Naum's "old approach".
Parent - By lkaufman (*****) Date 2012-02-13 02:59
I was mainly referring to people who have their games analyzed by the engines automatically while they sleep or work, either with IDeA or with appropriate features on ChessBase or other interfaces. For live analysis I agree that 40/2 hours is not the relevant time control; I would suggest that the CEGT 40/20 minutes is close to optimum, or even the IPON list.
Parent - - By tomgdrums (****) Date 2012-02-13 01:56

> For analysis, most people allow some number of minutes per move. Therefore the best measure of analysis strength is long time limit tests, and preferably ones where the time per move is fairly constant, removing the importance of time control optimization. This should be clear to everyone. Unfortunately no one tests at anything like 1 minute plus 3 minutes per move increment. The closest we have is 40/2 hour results, with 40/40' being next best. For those who want quick answers or to run their own blitz matches, a large increment relative to the base time will better simulate analysis. For those who want to run their engines against others on Playchess or elsewhere at blitz, the IPON list is probably the most reliable rating.


Yeah I get that.   But then why shouldn't Houdini  2 be tested at all the same time controls??

That is what I meant by disingenuous.

Komodo 4 is great by the way!  I just think your posts always sound like the kind of spin you hear on the Sunday morning talk shows.
Parent - By lkaufman (*****) Date 2012-02-13 03:04
Houdini (1.5 or 2.0 doesn't matter, there being no evidence that 2.0 is superior beyond blitz) is tested on the 40/2 hr. list, and the increment matches have primarily been direct ones between Houdini and Komodo. It's true I'm not a neutral party, but I cite evidence from neutral parties for my claims. Obviously, I'm joking about claiming that a one elo lead proves Komodo is superior to Houdini at 40/2 hours. For that we would need at least 2 elo points :grin:.
Parent - - By Bouddha (****) [gb] Date 2012-02-13 12:37
Hi Larry,

MP is very important!
The time you spend today searching a position on a single CPU (e.g. 2 min) could be drastically reduced with an MP version, letting you analyse more positions in the same amount of time.

Looking forward to seeing it released soon.

rgds
Parent - - By lkaufman (*****) Date 2012-02-13 16:20
Of course MP is important, except for those who mostly use IDeA or who have only one core. I hope it will be ready very soon.
Parent - - By Razor (****) [gb] Date 2012-02-18 08:44
Not important to me, Larry; what I'm waiting for is the revised SP version of K4 - when do you envisage this being made available?
Parent - By lkaufman (*****) Date 2012-02-18 17:50
Our plan is to release that and MP at the same time, hopefully quite soon.
Parent - By Razor (****) [gb] Date 2012-02-18 18:23
Sounds good - I assume I don't need to buy the MP version if I only want the SP version - true?
Parent - - By Quapsel (****) [de] Date 2012-02-22 06:56

> and where is Houdini 2


It is expensive to test a version, so not all versions will get tested.

Let's assume that H2 is 10 Elo ahead of H1.5.
And let's assume that R4.1 is 20 Elo ahead of R4.

Then H2 is leader of the gang again, with R4.1 very narrowly behind. And K in third place.

Quap
Parent - - By Barnard (Bronze) Date 2012-02-23 00:12
From what I hear from Larry, H2 is weaker at long time controls (on that list) than H1.5a.
Parent - - By Arrière Pensée (Gold) Date 2012-02-23 03:48
Pure unsubstantiated nonsense!
Parent - - By Barnard (Bronze) Date 2012-02-23 04:09
Yes, I must agree that it makes no sense for Houdini 2 to be weaker than Houdini 1.5a, but that was what Larry said.
Parent - - By Arrière Pensée (Gold) Date 2012-02-23 04:14
Yes, I know! Larry seems to be given to making hypothetical assertions. Like the famous MP version that will follow within a week or two of the release of the Komodo SP version. It's all hype.
Parent - - By Barnard (Bronze) Date 2012-02-24 02:14
Well, I think that implementing the code for the MP version is causing problems for Larry and Don... Since Komodo 3 they have been saying that there will be an MP version in 1 or 2 weeks, but we are now at Komodo 4 and we still don't have the MP... and I can't understand why.
Parent - - By Arrière Pensée (Gold) Date 2012-02-24 03:08
Maybe they set their bar too high...
Parent - - By Barnard (Bronze) Date 2012-02-24 22:03
Well, I don't think implementing MP can be that hard, so I can't understand why they can't make the MP version of Komodo...
Parent - - By Arrière Pensée (Gold) Date 2012-02-24 22:08 Edited 2012-02-25 17:54
I'm not sure it is that easy to move to an SMP version. But I might be wrong. Then again one that is supposed to capture the top spot might be making things more complicated.
Parent - - By Barnard (Bronze) Date 2012-02-24 22:10
Well, I think they have been trying to make the MP version since Komodo 2, but they can't do it... Other programmers develop MP versions much faster than them, so I can't understand why they can't do it.
Parent - - By DamirD81 (***) [dk] Date 2012-02-25 08:05
If you are so eager to see an MP version of Komodo, maybe you should consult Don and Larry about joining their team. I am quite sure that with your participation the Komodo MP project will go much quicker than it already does. :grin::yell:
Parent - By Barnard (Bronze) Date 2012-02-25 16:58
Well, I will study your offer... meanwhile, bring me a coffee and come under my desk pretty sizar :wink:
Parent - - By turbojuice1122 (Gold) [us] Date 2012-02-23 18:27
Why is this nonsense?  It's not even clear that Houdini 2 is stronger than Houdini 1.5 on normal time controls (some rating lists have them about equal), let alone correspondence time controls.

Remember that Shredder 9, for example, wasn't really an improvement over Shredder 8 and Shredder 7.04.
Parent - - By Arrière Pensée (Gold) Date 2012-02-23 19:21
I think it is stronger by a very slight margin. My impression is that that slight margin is being ignored out of bias. Is it significant? Not by any means, but enough to maintain an edge.
Parent - - By rocket (***) [se] Date 2012-02-23 19:36 Edited 2012-02-23 22:06
It's two elo points stronger currently on 4 cpu than 1.5 in 40/40.  In general it goes between 4 and 10 points.

On the big positive side, though, it does not appear to be weaker than 1.5, but I was surprised Houdart was so ultra-conservative in his work.

There is no apparent change in the evaluation at all in any position yet seen, which is rather boring for a new version.

He appears to be afraid of changing the winning concept of 1.5, but I still think he introduced some apparent flaws among his many improvements to the Houdini 1.5 engine. One such thing was the material value, which he tuned slightly off (although close to perfect). I think he should have lowered the queen value a bit more; then it would probably be optimal, and even better than the Rybka counterpart.
Parent - - By Arrière Pensée (Gold) Date 2012-02-23 21:30
I think the consensus is that Houdini 2.0 for all intents and purposes should be regarded as Houdini 1.5b gone commercial.

Now, can this marginal increase in Elo be considered a sleight-of-hand entrepreneurial strategy? Quite frankly, I'm not interested in sitting in judgement over it. Why? Because if one disagrees with the difference, where's the beef? Houdini 1.5a is still free.

The real test is yet to come with the next version of Houdini. People aren't going to be so quick to jump to buy into it without first seeing the testing results and peer reviews.

But there is nothing unusual in these tactics. It's becoming routine among nascent commercial chess engine developers to hype their chess engines in development.
Parent - - By Stonehenge (***) Date 2012-02-23 23:36

> I think the consensus is that Houdini 2.0 for all intents and purposes should be regarded as Houdini 1.5b gone commercial.


I don't think so. Why does nearly everyone on PlayChess use Houdini 2.0?

Of course I did expect a larger improvement in the rating lists; my own tests showed a 20 to 25 Elo improvement, but on average the lists have shown less than that.

On top of the strength improvement there are some significantly enhanced analysis features in Houdini 2.0.
Noteworthy new game play and analysis features are Strength Limit, Fischer Random Chess support, Nalimov EGTB and advanced analysis options like Persistent Hash, Learning and FiftyMoveDistance.
Scaling beyond 6 cores is significantly improved - up to the point that some people on this forum even accuse Houdini of not showing its true node speeds with 12 cores... :cool:

Note also that in the current environment I see little point in Houdini being more than 50 points stronger than the competition. All improvements I make are thoroughly RE'd by the likes of Richard Vida and Yuri Osipov, then discussed on various forums (mostly by Richard), implemented in Critter or Strelka, or passed on to the Komodo or Stockfish team through the Critter source code Richard so generously distributes. It's a strange world :).
Parent - By Arrière Pensée (Gold) Date 2012-02-24 00:28
Strength cannot be argued. That is a given. What was being argued is the difference in strength between Houdini 1.5a and Houdini 2.0. I don't believe it is an issue, regardless of the variety of differences presented in each of the rating forums.

I would not contest that you have made some advances in analysis and added interesting parameters that are not present in Houdini 1.5a. I am sorry that you had to once again point those changes out to make your point. You are absolutely correct in suggesting that that is, and should be, enough to constitute a version change and a leap into the commercial market. Sometimes arguments can too easily get myopic in context, forgetting to take into consideration the larger picture.
Parent - - By Werewolf (*****) [gb] Date 2012-02-24 08:27

> Note also that in the current environment I see little point in Houdini being more than 50 points stronger than the competition. All improvements I make are thoroughly RE'd by the likes of Richard Vida and Yuri Osipov, then discussed on various forums (mostly by Richard), implemented in Critter or Strelka, or passed on to the Komodo or Stockfish team through the Critter source code Richard so generously distributes. It's a strange world :).


If Vas has "Rental Rybka" maybe you should consider "Hire Houdini" to stop this...
Parent - By Stonehenge (***) Date 2012-02-24 16:20

> If Vas has "Rental Rybka" maybe you should consider "Hire Houdini" to stop this...


LOL, I like the name :).