2 years of waiting and we get only 30-40 Elo of improvement? And we are expected to pay the full amount for such a release? Hiarcs, Naum, etc. at least give older customers discounts to upgrade to a newer version. We didn't even get the promised Rybka 3+. Hell, the Rybka 3+ alone should have been about 30-40 Elo stronger. So this Rybka 4 should really have been the free Rybka 3+ for customers of Rybka 3.
I was one of the most patient waiters here. There were times I was burning with desire for the new Rybka 4 to come out. I kept telling myself, "I like the long wait, because that means the program is going to be really strong." But such a disappointment IMO. It is almost tempting to use the clone engines like the big H, the one now above everyone (even Rybka) by at least 80 Elo. I feel like the loyal sucker. Vas, why do this to your customers? Hell, if you are short on money, I'm pretty sure many would donate to support you if you asked for donations. I know I would. More acceptable would have been to either give us our Rybka 3+, give us a discount on this Rybka 4, or make Rybka 4 stronger (at least 70 points stronger).
I and others have been begging for a Rybka release for the iPhone or Pocket PC. Many years of this and we get nothing. I understand Vas is only one man, but there is no good excuse not to take a couple of days to a couple of weeks, somewhere within several years, to do this.
If you are angry that you paid and the product wasn't what you expected (your magical thinking), well, then you should just have waited until others reported.
And don't make yourself out to be some kind of suffering Christian; I am really not going to pity you. Before crying here you could have studied chess programming to realise how difficult it is nowadays, and that chess has a certain ceiling which we are now hitting. The next step is only tablebasing more and more pieces and improving technology.
Every Elo point gained now is a very precious commodity.
> In fact the problem is that we are hitting the ceiling of the chess. The program already knows almost everything about chess.
I think we're very far away from this. For example, over at CCC a couple weeks ago, there was a 7-man position being discussed and despite quite a bit of computer analysis, they could not prove whether it was a win or a draw. And that was only with 7 pieces!
> I think we're very far away from this. For example, over at CCC a couple weeks ago, there was a 7-man position being discussed and despite quite a bit of computer analysis, they could not prove whether it was a win or a draw. And that was only with 7 pieces!
Terrible example. Computer analysis can only take you so far in a position. 7 pieces is in fact a LOT, and you would need to dig very, very deep to truly see the win/draw, and that requires obscene amounts of time and computational power.
No matter how well you tune an engine's evaluation to judge a position, it will never know the "true result" of the position until it can see all future moves. And we will never reach a point where we have that amount of processing power available, even at 7 pieces. There will always be extraordinary positions in chess that no engine can punch through, even with huge amounts of processing power.
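To get a feel for the scale involved, here is a rough back-of-envelope count (my own estimate, not an official tablebase figure) of raw 7-man board arrangements, ignoring piece-type choices, position legality, and the symmetries real tablebase generators exploit:

```python
from math import perm

# Ordered placements of 7 distinct pieces on 64 squares, times side to move.
# Deliberately ignores which piece types are involved, illegal positions,
# and symmetry reductions -- it's only meant to show the order of magnitude.
placements = perm(64, 7) * 2
print(placements)   # 6261859215360, i.e. roughly 6.3 * 10^12
```

And that is per combination of piece types; summed over all type combinations the raw state space is larger still, which is why complete 7-man tablebases are such a daunting project.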
> Terrible example.
Actually then, it's a _good_ example of how far away we are from perfect chess!
People are already experimenting like crazy with Rybka 4's new contempt settings, trying to improve the engine's playing strength compared to the default values, and while I'm sure that there is a setting out there that is superior to the defaults (not to mention the changes Vas can make to the program code itself to make it evaluate differently), I'm very skeptical of the Elo increase you can gain from such changes.

Theoretically, every chess engine in the world that understands the rules of the game (including the concepts of mate/stalemate/the 50-move rule/repetition etc.) can already play "perfect chess" if you give it enough time to calculate, even if that time extends beyond the lifetime of the universe. Chess engines these days have very tightly tuned evaluations, but the ironic thing is that if we actually HAD that computational power available to analyze a chess game to the end, then engines wouldn't need evaluations at all beyond the most basic chess rules of mate and stalemate.
What I'm trying to get at is that in the world of computer chess, the word "evaluation" translates to "educated guess based on analysis of a limited number of positions", and no matter how much you tune the criteria, a guess is always going to be a guess; the only way to drastically improve the reliability of a guess is to analyze the game further, i.e. to make sure the "limited number of positions" isn't so limited. We are slowly reaching the top point of how well chess engines can evaluate positions.
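The "rules alone suffice, given unlimited time" point can be made concrete with a toy game (my own illustration, not from this thread): one-pile Nim, where each player removes 1-3 stones and taking the last stone wins. The program below has no evaluation function whatsoever; it plays perfectly purely by searching to the end of the game:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def to_move_wins(stones):
    """True if the side to move can force a win, by pure minimax over the rules."""
    if stones == 0:
        return False          # the opponent just took the last stone
    # We win if ANY move leaves the opponent in a lost position.
    return any(not to_move_wins(stones - take)
               for take in (1, 2, 3) if take <= stones)

def perfect_move(stones):
    """A winning move if one exists, else any legal move."""
    for take in (1, 2, 3):
        if take <= stones and not to_move_wins(stones - take):
            return take
    return 1                  # lost anyway; just take one stone
```

Chess is exactly the same in principle; the only reason an evaluation function exists at all is that we cannot search chess to the end.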
As an experiment I'm currently running a tournament between Rybka 3 and Rybka 4 at default settings (Rybka 3 contempt set to 0 though; the default is 15), where there is no time control but both engines play at a fixed depth (currently 12; I will be doing another at 14 later). So far, Rybka 3 is winning solidly over Rybka 4 at the same depth. What does that tell you about the evolution of Rybka's evaluation criteria from Rybka 3 to Rybka 4, considering Rybka 4 is normally 30-40 Elo stronger at normal time controls? ;) Remember that one of the biggest improvements of Rybka 4 over Rybka 3 is that it evaluates much faster. At the same time controls, this means that Rybka 4 beats Rybka 3. In my experiment, not so much.
>Theoretically, every chess engine in the world that understands the rules of the game (including the concepts of mate/stalemate/the 50-move rule/repetition etc.) can already play "perfect chess" if you give it enough time to calculate, even if that time extends beyond the lifetime of the universe.
True. I think there are a lot of people who don't realize this, though.
>We are slowly reaching the top point of how well chess engines can evaluate positions.
Ahh, now I see what you meant. You're referring to the static eval at the leaf nodes, and aren't including the search at all. You can't completely separate the search and eval, though, since the eval heavily guides the search (as far as move ordering, etc.).
Thanks for your thoughts. I always find discussions on solving chess and perfect play very interesting!
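The point about the eval guiding the search can be sketched with a toy negamax (a made-up "race" game of my own, purely illustrative: each move adds 1-3 points to your score, and the leaf eval is just the score difference). Sorting moves by static eval makes beta cutoffs happen earlier, so fewer nodes are searched for the same result:

```python
import math

def evaluate(pos):
    a, b = pos                     # (side-to-move score, opponent score)
    return a - b

def moves(pos):
    a, b = pos
    return [(b, a + k) for k in (1, 2, 3)]   # add k points, hand over the move

def negamax(pos, depth, alpha, beta, order, counter):
    counter[0] += 1                # count every node visited
    if depth == 0:
        return evaluate(pos)
    children = moves(pos)
    if order:
        # children are scored from the opponent's point of view, so
        # "best for us first" means ascending static eval
        children.sort(key=evaluate)
    best = -math.inf
    for child in children:
        best = max(best, -negamax(child, depth - 1, -beta, -alpha, order, counter))
        alpha = max(alpha, best)
        if alpha >= beta:
            break                  # beta cutoff: happens earlier with good ordering
    return best

plain, guided = [0], [0]
negamax((0, 0), 4, -math.inf, math.inf, False, plain)
negamax((0, 0), 4, -math.inf, math.inf, True, guided)
# same search result either way, but eval-guided ordering visits fewer nodes
```

So even before you get to the leaf nodes, a better eval means a cheaper search; the two really can't be separated.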
> has anyone confirmed that R4 realdepth is still reported depth + 3?
Yes, though when you force bugs for unsupported features (like Bishop underpromotion) you will notice she subtracts 4 from depth.
It tells me that it's kind of silly to do fixed depth matches between different engines. This is especially true with Rybka, where Vas makes arbitrary changes in depth and node count which may be different between R3 and R4.
>It tells me that it's kind of silly to do fixed depth matches between different engines.
I couldn't have put it better
I checked the time each engine used in each matchup. Rybka 4 used approximately half the time of Rybka 3 per game, sometimes even less. I also tracked how many positions each engine evaluated to reach depth 12.
Now, given that Rybka 4 evaluates at approximately twice the speed of Rybka 3 (in most middlegame positions Rybka 4 evaluates at 155-165 kN/s where Rybka 3 evaluates at 75-80 kN/s on my hardware), it using half of Rybka 3's time to reach depth 12 is expected. But when Rybka 4 starts losing to Rybka 3 in that scenario, I start to question the strength of the evaluation.
Like I said, I also tracked how many positions each engine evaluated to reach depth 12. It bounced back and forth a bit, but overall they analyzed around the same amount. In some positions Rybka 4 would analyze more positions before making a move than R3; in others it was the other way around. So if both engines analyze approximately the same number of positions in a game, and one is superior in overall playing strength, then that points to that engine possibly having the superior search/evaluation (disregarding the fact that R4 evaluates much faster, of course). In earlier tests I even had R3 beat R4 in tournaments where R4 was allowed to search one depth further.
If the test was between two drastically different engines, then I would agree it would be silly, but given the conditions above, I don't see the silliness in doing such a test between R3/R4, although it obviously needs to be more thorough, i.e. needs to be done at other settings, especially considering D12 isn't very sophisticated for endgame play without tablebases. My hardware sadly isn't very fast, but I'll try a depth 16 tournament at some point, although it will probably take some time. Still, the test makes it perfectly valid to question the strength of R4's evaluations.
> If the test was between two drastically different engines,
It is! (I'm serious.)
> Except that it isn't if your goal is to check the strength of the evaluation (and the search).
this is not true. different programs define depth differently. rybka 4 may have a different definition of depth than rybka 3; it definitely seems to reach depth faster, as you say. rybka 3 at depth 12 would absolutely murder stockfish 1.7 at the same depth, but no one evaluates or plays games at fixed depth (unless it's the same engine vs itself, or to cripple the engine against humans), so this just unfairly penalizes stockfish, which might reach d20 in the same time rybka 3 reports d12 (even if you account for 'real depth' (reported depth + 3), i'm quite sure rybka would still demolish stockfish). but is stockfish 1.7 actually 'faster' in a chess sense than rybka 3? afaik, it is not in any meaningful way.
> this is not true. different programs define depth differently. rybka 4 may have a different definition of depth than rybka 3;
Yes, but like I said, depth 12 for both Rybka 3 and Rybka 4 seems to involve evaluating the same number of positions. Setting depth 12 for both engines yields a similar number of searched positions, sometimes favoring one engine, sometimes the other. This is not comparable to an entirely different engine like Stockfish, which literally evaluates 10 times faster than R4, but at the same time has a much weaker evaluation because it has less chess knowledge.
Example positions from a computer match:

Position 1: Rybka 3 reached D12 at 1223k nodes; Rybka 4 reached D12 at 1374k nodes.
Position 2: Rybka 3 reached D12 at 610k nodes; Rybka 4 reached D12 at 480k nodes.
Bottom line is that R3 and R4 evaluate "approximately" the same number of positions for a given depth, indicating that there is not much visible difference in the definition of depth between the two engines (compared to, say, Stockfish's definition of depth). If two engines are given a limited number of positions they are allowed to evaluate, and one comes out on top, then that indicates that this engine has the more knowledgeable evaluation.
And that's the very concept I'm trying to discuss here: chess knowledge. We have to consider that - besides the obvious programming refinements - Rybka 4's faster evaluation could potentially stem from the fact that its evaluation criteria have been dumbed down somewhat, making it an overall weaker evaluator (but also an overall stronger engine through increased speed, which shows when you play normal engine matches).
It would be interesting if it were possible to constrain engines to evaluating a limited number of positions instead of to a given depth (which is defined individually by each engine). That would give a much clearer picture of which engine has the most chess knowledge if you could, say, instruct each engine to evaluate a maximum of 1 million positions per move.
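The proposal above is easy to sketch: search limited by a node budget rather than by depth, so each engine gets the same number of evaluated positions regardless of how it defines "depth". A minimal toy negamax over a hypothetical hand-built game tree (all names and leaf values are made up for illustration):

```python
TREE = {                    # node -> children; leaves are absent from the dict
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
}
LEAF_EVAL = {"a1": 3, "a2": -1, "b1": 0, "b2": 5}   # static evals, side to move

class BudgetExceeded(Exception):
    """Raised when the shared node budget runs out."""

def negamax(node, budget):
    """budget is a one-element list so the count is shared across all calls."""
    budget[0] -= 1
    if budget[0] < 0:
        raise BudgetExceeded
    children = TREE.get(node)
    if not children:                  # leaf: one "position evaluated"
        return LEAF_EVAL[node]
    return max(-negamax(c, budget) for c in children)

# The tree has 7 nodes: a budget of 7 completes, a budget of 5 aborts.
print(negamax("root", [7]))
```

A real implementation would of course return the best move found so far instead of aborting, but the accounting idea is the same; UCI engines already accept something like this via a node limit (`go nodes N`).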
first of all, & this i'm sure of, rybka does its own dance as far as node counts are concerned...to put it cheaply, it 'lies'. its real node count, as compared to other engines, is some # of times higher -- probably as high or higher than stockfish.
what i can't be -sure- of is that rybka 4 doesn't have knowledge removed & this is what boosts its kn/s over rybka 3. this is a plausible theory because we've gotten a somehow 'crippled' version, but we don't know how yet.
however i think as vas states (because he should know) its more likely its just a separate entity with a distinct definition of nodes & possibly even depth. whereas rybka 3 was said to be d+3, rybka 4 may be d+2...or it may just work depth differently. stockfish for example can hit depth 30 in any position on a single core easily, but i don't think that's because it's a brute force beancounter - that wouldn't fly at all in modern computer chess. for whatever reason it just scales plies very fast. rybka 4 may be more aggressive in this regard; computer chess seems to be trending that way. but my main point is, afaik you can't really compare the positions said to be analyzed by different rybka versions. you can only compare among identical rybkas.
in the rybka world, until R4, 'faster' (as we see it on the screen anyway) was certainly not better. usually node counts were -halved- from version to version. but even tho R3 reported half the kn/s of R2, it was miles better tactically.
> no, no, no. you must be new here
New poster, but a long-time reader. I've been reading the Rybka forum for quite a while, and have been into computer chess since I originally bought my first Chessmaster (8000) a couple of years ago.
> first of all, & this i'm sure of, rybka does its own dance as far as node counts are concerned...to put it cheaply, it 'lies'. its real node count, as compared to other engines, is some # of times higher -- probably as high or higher than stockfish.
I'm aware of that, but as far as I remember, Vas mentioned somewhere that he didn't change node-counting between R3 and R4, so it is totally comparable in my example.
Vas posted above that the engines are totally different, and from a programming standpoint I believe that very much, but I think we can both agree that R4 isn't a rewrite from scratch. It uses a lot of old concepts, and probably also a lot of old code from Rybka 3 (even if that code was cleaned up along the way).
> what i can't be -sure- of is that rybka 4 doesn't have knowledge removed & this is what boosts its kn/s over rybka 3. this is a plausible theory because we've gotten a somehow 'crippled' version, but we don't know how yet.
This has been my entire point all along, so thanks for actually getting it :)
And this is also why I believe that engine matches based on fixed search depth aren't a mistake or a terrible concept. They are, in fact, a very decent way to test the "knowledge" of an engine and how well it evaluates each position (and bases its search on those evaluations), assuming of course that you take into account that each engine has a different concept of depth.
> in the rybka world, until R4, 'faster' (as we see it on the screen anyway) was certainly not better. usually node counts were -halved- from version to version. but even tho R3 reported half the kn/s of R2, it was miles better tactically.
As with everything, it's about finding the best compromise. The more evaluation criteria you kick into an engine, the slower it becomes at evaluating positions, but playing strength can still increase as a result.
It can also decrease.
> Vas mentioned somewhere that he didn't change node-counting between R3 and R4, so it is totally comparable in my example.
Pretty sure he said the opposite.
> Pretty sure he said the opposite.
"I didn't change the node counting"
First page of the Rybka 4 FAQ.
"I didn't change the node counting but whenever the engine changes, you can't really compare NPS directly any more. The only thing it's good for is measuring your hardware.
Therefore the only thing you can compare is Rybka 4 v Rybka 4 on different hardware."
I'm not comparing performance between the engines, which is why I'm using a fixed depth. I'm comparing evaluation strength based on a limited set of nodes (controlled by depth). How fast those nodes are reached is irrelevant.
You may well be right but I think there is another possibility.
Depth is a crude estimate of the number of positions evaluated, as the pruning may be different between versions. Couldn't it be that R4 has just as much or more knowledge than R3, but prunes harder so it can look deeper? At fixed depth it could still give worse results than R3 in this scenario.
If the first part is true, but the second part isn't, then imagine this: Rybka 3's evaluation with Rybka 4's evaluation speed (and of course all the new endgame heuristics etc. that make R4 an even better engine). That would be a terrifying engine to play with :)
> Rybka 3's evaluation
Impossible, the code no longer exists.
(though Frankensteinian ideas come to mind...)
> where Vas makes arbitrary changes in depth and node count which may be different between R3 and R4.
And which may be different between R4 and R4 depending on position.
The real question is, and has to be: can there be a new algorithm that goes beyond the current pruning methods? Hyperpruning, if you will. You start getting into the great mysteries of computer science here, where you practically have to make P = NP. Quantum computing may give some insights into this, but it is still in its infancy and we will have to wait a few more years for that.
> can there be a new algorithm that goes beyond the current pruning methods?
I'm sure new pruning methods will be developed in the future. Remember when null-move pruning was developed? That was more revolutionary than evolutionary.
F. J. Tipler, "The structure of the world from pure numbers," Reports on Progress in Physics, Vol. 68, No. 4 (April 2005), pp. 897-964. http://math.tulane.edu/~tipler/theoryofeverything.pdf Also released as "Feynman-Weinberg Quantum Gravity and the Extended Standard Model as a Theory of Everything," arXiv:0704.3276, April 24, 2007. http://arxiv.org/abs/0704.3276
Out of 50 articles, Prof. Tipler's above paper was selected as one of 12 for the "Highlights of 2005" accolade as "the very best articles published in Reports on Progress in Physics in 2005 [Vol. 68]. Articles were selected by the Editorial Board for their outstanding reviews of the field. They all received the highest praise from our international referees and a high number of downloads from the journal Website." (See Richard Palmer, Publisher, "Highlights of 2005," Reports on Progress in Physics. http://www.webcitation.org/5o9VkK3eE )
Reports on Progress in Physics is the leading journal of the Institute of Physics, Britain's main professional body for physicists. Further, Reports on Progress in Physics has a higher impact factor (according to Journal Citation Reports) than Physical Review Letters, which is the most prestigious American physics journal (one, incidentally, which Prof. Tipler has been published in more than once). A journal's impact factor reflects the importance the science community places in that journal in the sense of actually citing its papers in their own papers.
See also the below resource for further information on the Omega Point Theory:
Theophysics: God Is the Ultimate Physicist http://theophysics.110mb.com
Tipler is Professor of Physics and Mathematics (joint appointment) at Tulane University. His Ph.D. is in the field of global general relativity (the same rarefied field that Profs. Roger Penrose and Stephen Hawking developed), and he is also an expert in particle physics and computer science. His Omega Point Theory has been published in a number of prestigious peer-reviewed physics and science journals in addition to Reports on Progress in Physics, such as Monthly Notices of the Royal Astronomical Society (one of the world's leading astrophysics journals), Physics Letters, the International Journal of Theoretical Physics, etc.
>For example, over at CCC a couple weeks ago, there was a 7-man position being discussed and despite quite a bit of computer analysis, they could not prove whether it was a win or a draw. And that was only with 7 pieces!
Can you please post this position?
Hard to imagine the complexity of certain, say, 12-man positions!
> The TC algo that R4 is using seems very poor in such conditions and you will get much better results in 3_0 or even move-with-increment such as 4_2 or 5_1.
The time management for a repeating time control is simpler than for x+y or x+0. I'm sure Vasik Rajlich has optimized the TC parameters like other parameters. If Rybka gives better results at x+y or x+0, then it's not because Rybka's time management is worse at repeating TCs, but because the time management of other engines is worse at x+y and x+0.
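The difference between the two regimes is easy to see in a toy allocator (the numbers and policy here are illustrative assumptions only, not Rybka's actual scheme):

```python
def allocate_ms(remaining_ms, moves_to_go=None, increment_ms=0):
    """Very naive time allocation for a single move."""
    if moves_to_go:
        # Repeating TC like 40_4: the horizon is KNOWN, just divide it up.
        base = remaining_ms / moves_to_go
    else:
        # Sudden death / increment: the horizon must be GUESSED.
        # Assume roughly 30 moves remain (a made-up constant).
        base = remaining_ms / 30 + increment_ms
    return min(base, remaining_ms * 0.8)   # never burn almost the whole clock

# 40 moves in 4 minutes: a clean 6 seconds per move
print(allocate_ms(240_000, moves_to_go=40))   # 6000.0
```

The repeating-TC branch is the easy one precisely because moves-to-go is known; the tuning effort (and the variation between engines) lives almost entirely in the guessing branch.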
>CCRL and CEGT use "moves in time". So 40_4 would be 40 moves in 4 minutes. The TC algo that R4 is using seems very poor in such conditions and you will get much better results in 3_0 or even move-with-increment such as 4_2 or 5_1. The chess strength is there but the time management algo is not. Most likely for R4 to do well at such time controls, there will have to be a change in the TC parameters.
Yes, and is it the customer's problem to test one billion TC settings and find this hypothetical magical setting that will unleash Rybka's real chess strength? Let's get real.
> Most likely for R4 to do well at such time controls, there will have to be a change in the TC parameters.
What change? Anything specific?
> Yes, and is it the customer's problem to test one billion TC settings and find this hypothetical magical setting that will unleash Rybka's real chess strength? Let's get real.
It's so sad; I don't want to go through that pain. I want it to work perfectly fine out of the box :(
>2 years of waiting and we get only 30-40 Elo of improvement?
A 40 Elo improvement is impressive for what was already the strongest program.
>It is almost tempting to use the clone engines like the big H, the one now above everyone (even Rybka) by at least 80 Elo. I feel like the loyal sucker.
People who steal cars also get them cheaper than people who pay for them. Do you feel like a loyal sucker for not stealing your car?
> People who steal cars also get them cheaper than people who pay for them. Do you feel like a loyal sucker for not stealing your car?
+ 1 x (infinity)
This position should demonstrate why relying on any one engine, even Rybka, isn't the greatest idea. Grab a copy of Naum, or maybe Stockfish (with Zugzwang detection on), if you want to understand why this position is a win for white, rather than a draw, as Rybka claims.
Analysis by Deep Shredder 12 x64:
1. +- (3.26): 1.Qxh2+ Qxh2 2.Rb1 c4 3.Kc6 c3 4.bxc3 h4 5.Kb7 h3 6.Ka8 Qb8+ 7.Rxb8 Bf2 8.Rh8 g1Q 9.Nxg1 Bxg1 10.Rxh3+ Kg2 11.Rd3
2. -+ (-5.49): 1.Ra1 Qxf3+ 2.Ke6 c4 3.Qd8 Qe2+ 4.Kd7 Qxb2 5.Rd1 c3 6.Qg5 Qb7+ 7.Kd8 Qb8+ 8.Ke7 Qb4+ 9.Ke6 Qb3+ 10.Qd5 Qxd5+ 11.Kxd5 h4 12.Rc1 h3 13.Rb1 c2 14.Ra1
3. -+ (-9.88): 1.Nxh2 Qxh2 2.Qxh2+ Kxh2 3.Rd1 Bd4 4.b4 cxb4 5.Rxd4 g1Q 6.Rh4+ Kg2 7.Kc6 Qf2
4. -+ (-11.11): 1.Re1 Qxf3+ 2.Ke6 Qf1 3.Qg3 Bd4 4.b4 g1Q 5.Rxf1 Qxf1 6.bxc5 Bxc5 7.Kd7 Qd1+ 8.Ke8 Qe2+ 9.Kf7
5. -+ (-17.56): 1.Rxg1+ hxg1Q 2.Nxg1 Kxg1 3.Kxc5 Qf5+ 4.Kc4 Kf1 5.Qa8 g1Q 6.Qa1+ Kg2 7.Qa8+ Kh2 8.Qb8+ Qg3 9.Qxg3+ Kxg3 10.Kb4 Qf4+ 11.Kb5 Qb8+ 12.Kc5 h4 13.b4 h3
/* Steinar */
Some people only seem to rate an engine by its ability on playchess rather than whether it can play chess.
(There is a *big* difference)
Powered by mwForum 2.27.4 © 1999-2012 Markus Wichitill