Not logged inRybka Chess Community Forum
Up Topic Rybka Support & Discussion / Rybka Discussion / Anthony Cozzie speak !
1 2 3 4 5 6 Previous Next  
Parent - By BB (****) Date 2008-02-08 12:40

> Re: Fritz uses more power

I can tell which engine is on move in testing simply by fan noise differences in many cases. [Dan Bernstein had a quote about his various super-optimised code once: "I like to hear the gears kick in as the fan goes faster and faster" :)]
Parent - By turbojuice1122 (Gold) Date 2008-02-08 15:20
I think I heard something similar for Zappa Mexico, too.
Parent - - By Sesse (****) Date 2008-02-08 21:43
Yes, the manuals can be very useful for reference, although a bit dry to read from face to face...

By the way, if you just ask, Intel will ship you printed versions of their manuals. (There's a link next to the download link.) Free of charge, no questions asked. I just received a bunch of them -- when it's 2-3000 pages, it's just so much easier having it in paper form :-)

/* Steinar */
Parent - - By Vasik Rajlich (Silver) Date 2008-02-10 18:36
Thanks for the tip, that's nice.

I find the manuals very readable, actually. We had some memory access problems in late 2006 or so with the then-new Intel machines, and one of those manuals with a huge title pretty much had everything I needed about the topic.

Parent - - By POSITRON (**) Date 2010-02-10 12:18
HE live!!!!  :) ;)
Parent - - By Uly (Gold) Date 2010-02-10 12:20
No, that message was from 2 years ago.
Parent - - By POSITRON (**) Date 2010-02-10 15:24
oops, your right :)) the emotion dont let me saw that, only check day and month
Parent - - By Uly (Gold) Date 2010-02-11 02:58
Yes, if I didn't know the thread and was gone for a few days, I could have thought that it was posted recently ;)
Parent - By NATIONAL12 (Gold) Date 2010-02-11 23:25
he is probably on the razzle with a couple of birds if he has any sense.much more fun.
Parent - - By Sesse (****) Date 2008-02-06 04:36
By the way, the paper doesn't seem to mention what year the given two-fold speedup was, but given the dates elsewhere, I'd guess this was circa 1990. We've had almost twenty years of compiler development since then, and it really shows. :-)

/* Steinar */
Parent - By BB (****) Date 2008-02-06 11:10 Edited 2008-02-06 11:33

> By the way, the paper doesn't seem to mention what year the given two-fold speedup was

I think the time in quote is about 2002, when GMP 4.1 came out. [I can ask Zimmermann if you really want to know].

All the speed in GMP is still at the assembler level. You can build the "generic" code (no assembler support), and I think you'll get something about 4 times slower. [Unfortunately, I'm having problems testing this]. Indeed, Granlund has been trying to get "vendor sponsorship" from AMD (and others) to include his assembly level optimisations in the GMP distro - a fork has been resisted, but most people use, say, the patches of Gaudry, which, while maybe 10% slower than Granlund's (non-public) code, are still nearly twice as fast as the general ix86 code when used at the bottom level of GMP. [As an example, see the "Optimal" column versus "GMPbench" in the table here].
Parent - - By Sesse (****) Date 2008-02-06 04:22
I'm not all that sure your first optimization is a win; CPUs do _much_ more intelligent branch prediction than just simple yes/no, and the fact that the last branch is almost never taken probably fades when you have to consider that there are now three branches instead of one to take care of.

BTR is almost certainly a loss. Since at least the Pentium Pro, complex instructions are more expensive than short series of simple instructions. A decrement and an AND is absurdly cheap; I couldn't find cycle counts for modern CPUs offhand, but the Pentium needed at least seven cycles for BTR, and I doubt it's gotten much cheaper.

/* Steinar */
Parent - - By BB (****) Date 2008-02-06 10:33

>I'm not all that sure your first optimization is a win

As I say, the proof would be in the pudding...

> CPUs do _much_ more intelligent branch prediction than just simple yes/no

No matter how intelligent they are [I might agree with much, but not with its underlining] it cannot be a bad idea to reduce randomness in the input - the only question is whether code sprawl would become a worry.

> and the fact that the last branch is almost never taken

You could just simply ignore the while loop over the 3rd and higher bishops - in almost all positions this will not matter. My impression is that a 100% predicted branch should be nearly a NOP, at least when the condition is simple as here - on the other hand, avoidance of it would also allow a gain from avoiding the preceding BTR. There is also the issue that (say) the PHENOM has counters for only 3 branches per (aligned) 16-byte block.

> the Pentium needed at least seven cycles for BTR

BTR is 2 cycles on the PHENOM (assuming mask is in a register) -  see Table 13 in Appendix C here. Also, DEC/AND has pipeline dependencies, in that you are ANDing with the result of the DEC [my impression is that this doesn't matter, but getting AMD and especially Intel to be specific about the vagaries of micro-op decoding is a bit difficult]. Both BTR and DEC/ADD should be 2 bytes in opcodes, though I think DEC might itself become two bytes in 64-bit mode (here is the AMD info; see Table A-2 on the 48-4F row for DEC, with footnote 5 giving the info about REX in 64-bit mode, so the DEC becomes FF/1 with the ModRM extension). Finally, DEC/AND uses 2 registers instead of one.
Parent - - By Sesse (****) Date 2008-02-06 11:04
But the DEC and AND can go in parallel with other stuff.

Seriously, there's a reason why we don't use LOOP anymore, but instead DEC+JNZ. It's the same thing.

/* Steinar */
Parent - - By Sesse (****) Date 2008-02-06 11:14
...and looking at the docs, does BTR even do the same thing as DEC+AND? It seems to me that BTR does x &= ~(1 << y), not x &= (x-1), so you'd need to keep the result of the BSR around for a lot longer.

I guess we could recompile Strelka and measure, but I'm not going to do that anytime soon. :-P

/* Steinar */
Parent - By BB (****) Date 2008-02-06 11:22

> so you'd need to keep the result of the BSR around for a lot longer

I'm guessing (weasel-word) that having the BSR-result square around should not be too problematic, as the instruction can go anywhere in the code block. Indeed, the bitboard operations in computing (say) mobility will certainly use square a couple of times, and if you have any more code, there could be additional opportunities. I guess you are correct that my chastening of DEC/AND for needing two registers is a bit harsh, but that's because I was thinking that the BSR-result was readily available, and thus came for free.
Parent - - By BB (****) Date 2008-02-06 11:17

> But the DEC and AND can go in parallel with other stuff.

This is not untrue, but is there a relative advantage in parallel capability with DEC/AND versus BTR? I could be missing something, but I don't see it. [You have the whole WH_BISHOP_CODE block in which to schedule].

> we don't use LOOP anymore

Was it ever a good idea? :-P
Parent - By Sesse (****) Date 2008-02-06 17:55
It might have been faster on 8088, I guess. And besides, it's a byte less or something... :-)

/* Steinar */
Parent - - By wem511 (**) Date 2010-02-09 19:39
Woot Zappa's back!
Parent - - By NATIONAL12 (Gold) Date 2010-02-09 23:38
i just wondered if we mentioned clones here, if the whole thread would be moved to another place.
Parent - By Uly (Gold) Date 2010-02-10 02:04
Hopefully only the clone talk would be branched.
Parent - By turbojuice1122 (Gold) Date 2008-02-04 14:02

The comparison you make with Fritz 5 (which I translate to Fritz 5.32) is interesting.  It has been said by many that Strelka "unlocks" the nps the Rybka squelches in the output.  However, even so, Strelka still only calculates about half the nodes per second as Fritz 5.32 (and perhaps 25% or less than Goliath Light 1.5).  Does Strelka still somehow "internally" suppress the count by a factor of 2?  Or are you referring to something slightly different in the "bean-counting analysis"?

Good luck in computer science!  I'm in physics, trying to do the same thing...
Parent - - By BB (****) Date 2008-02-04 14:26

> Consider Fruit: you can take the new ideas from Fruit, add them to an existing chess program, and make it better.

What was particularly new in Fruit? The use of History vis-a-vis LateMoveReductions was a bit novel, I guess, but it need not be much of a strength gain. [LMR was already bandied about by Romstad and Markoff a year before Fruit appeared]. VR's comment about FL presents a Fruit-opinion similar to my own: "Fabien is a very good engineer, and also has a very clear and simple conception of how his search should behave." Or we can hear from FL himself: "I can't think of a search feature in it that was not described before. Ditto for evaluation terms (except perhaps a few drawish-endgame rules that activate in one game in a hundred). There are specific principles that I follow in Fruit that gives it a personality somewhat (like never truncating the PV and making sure that mate-depth claims are always correct), but they probably have no impact on strength at all and could even hurt a little."

> It packs most of the search and evaluation of Fruit/Crafty into something almost 3 times faster.

64-bit Strelka is about 3 times as fast as Fruit in NPS --- as Hyatt said back in 1994, eventually bitboards would gain the speed advantage. Also, FL never tried to optimise Fruit per se: "More concretely, in chess [clearer, uncomplicated code] leads to slightly-slower programs (say 25% lower NPS) that are easier to modify. For example Fruit spends a lot of time scanning board squares to evaluate mobility. I know several faster ways I could compute the same thing. However none of them would allow me to easily modify the mobility feature later."

> I did not add any eval terms from Strelka (my view being that Zappa's eval is way better than Strelka's anyway)

Strelka's eval seems quite minimal, largely mobility.
Parent - - By Anthony C (*) Date 2008-02-04 14:47
The true inventor of LMR was Stefan Meyer-Kahlen, of course, which is why S7-S9 completely dominated the computer chess world.  But Fabien gets credit because he published his work.  Stefan didn't, and random message board discussions certainly do not count.

I will also have to assume that you haven't done much programming in C, because you are really underestimating just how great of an optimization job is in Rybka.  Look at Fabien's own statement: He thought that Fruit could be optimized by 25%, not 200%.


Parent - - By Uri Blass (*****) Date 2008-02-04 14:56 Edited 2008-02-04 15:03
I do not think that there is one true inventor of LMR

I believe that many programmers invented it independently.
Movei0.07.99 is using LMR based on my memory(I will need to look at my old code if I saved it to be sure but I am sure I invented it some years ago independently of other people and maybe stefan made it earlier).

Based on my memory I invented it some time between 0.07a and 0.07.99
It is weak because of other reasons but LMR certainly made it stronger.

Parent - By BB (****) Date 2008-02-04 15:06

> I do not think that there is one true inventor of LMR

Indeed, it's just another one of those forward pruning techniques that the commercial guys are using. :)

> but LMR certainly made it stronger

Now we know why it won New-Year-Fun '03. :-P
Parent - - By Carl Bicknell (*****) Date 2008-02-04 17:10
I've just realised what this forum needs: a glossary of terms.
Parent - By richbell (**) Date 2008-02-04 19:35

> I've just realised what this forum needs: a glossary of terms.

I keep saying this Werewolf, no one bothers, sob, sob. :)
Parent - By BB (****) Date 2008-02-04 15:02

> because you are really underestimating just how great of an optimization job is in Rybka.

I am definitely impressed by the optimising in Strelka (it is about 50% faster than my first bitboard engine). But I do not see from what you can claim a 3x engineering gain, unless you compare apples to oranges vis-a-vis bitboards. On a 32-bit computer, Strelka is something like 50% faster (in NPS) than Fruit --- unsurprisingly, this becomes almost 3x at 64 bits. I do not find Strelka to be 3x faster than bitboard engines like Crafty and Glaurung.
Parent - - By BB (****) Date 2008-02-04 21:52

> S7-S9 completely dominated the computer chess world.

I'm not sure I quite agree with this description of history. Also, SM-K claims that a big push came in Shredder 8 (no idea about 7.04, which was noticeably better than 7): "Since Shredder 8 there was a completely new search in Shredder, which is extremely selective. As a result Shredder 8 always calculates a few halfmoves deeper than its predecessor. Of course the difference in the search depths of both versions are not simply comparable with each other, however I believe that the new search is much better and there is also clearly more room for further improvements."

>[ LMR] is why S7-S9

Is this an educated guess, or do you have Inside Info? Without any other information, one could speculate that his "forward pruning methods" might have been closer to that, say, of Björnsson and Marsland, for instance.

> because [Fabien] published his work

"For Fruit, which ... made even the densest of us aware of fail-low pruning." :)
Parent - - By Vasik Rajlich (Silver) Date 2008-02-08 09:35
Actually, I talked to Stefan about it and yes, he was using LMR in almost exactly the modern way already with Shredder 7. This means re-searches, exceptions for late-move captures and other special moves, etc.

In my view, Tord clearly deserves the main credit for LMR. Fabien probably deserves the #2 spot, since Fruit is what really showed beyond any doubt that this works.

Parent - By BB (****) Date 2008-02-08 12:48 Edited 2008-02-08 12:53

> [Shredder 7 already used LMR]

Which leaves us guessing how he gained the ELO in the next two versions. :)
[I think he said Shredderbases and eval changes with passed pawns and king safety for S10/S11, though I don't think bitbases are the whole story for S10 - sorry can't fact-check at this terminal].

> [Fruit showed LMR works]

I'd be a bit more circumspect: Fruit showed it could work - if you already had (say) a whole bunch of other reduction methods, the gain might be small. It also showed that "a simple approach works well" when considering reductions.
Parent - - By Uri Blass (*****) Date 2008-02-04 14:53
The average between opening evaluation and endgame evaluation based on the stage of the game was new for me.
I do not care if it was used by other programs in the past because I did not know it.

I also do not agree about definition of new.
If a chess player plays a strong move that was played 100 years ago and was evaluated wrongly then claiming that the player found nothing new is wrong.

The player found that the move that was played 100 years ago is a strong move and the same is about fruit.
Finding that it is good to use some combination of ideas that were suggested in the past is not finding nothing.

It is not something obvious otherwise other programmers could also do it and it is a fact that the commercial programmers at the time of fruit2.1 did not do it(otherwise they could easily be above the level of fruit together with their own ideas.

Parent - By richbell (**) Date 2008-02-04 19:18 Edited 2008-02-04 19:40
Thanks for your nice response Anthony, though i am not sure how far your claim on Rybka is true as for one i am a novice with chess engine programming, second, optimization of Rybka's code by itself could be a novel technic which other programmers are missing and Finally still only Vas can have a say at "What's the real secret".

However I have two requests for you. Please continue your endevour also with Zappa along with your very successful PhD and please visit our forum more often for your valuable insights. For a small community of chess fans (other than the elites) this is the only motivation for us.
Parent - - By Sesse (****) Date 2008-02-04 21:25
First, I think your post was very, very interesting. Thanks for sharing.

Second, even though I've spent less time in academia than you (master's degree vs. graduate student), I'm not so sure novelty rules all. If you cannot show that your novelty has at least a certain advantage in some situations over existing practice, it's not going to be that interesting.

/* Steinar */
Parent - By blubb (*) Date 2008-02-05 05:21
I think this is not his point. It is rather sport versus science. Do you think Leibniz or Newton had any idea how important differential calculus would be 500 years later? There are countless examples where a scientist is driven by pure curiosity - and in the end he found something completely new and useful. 
Academic Science is about inncocent curiosity (innocent in terms of not thinking about profit or usefulness but only about the beauty of thought). This is the difference to industrial research.

What about chess? Is it useful? Does it improve mankinds situation? Does it help to create a better world? Of course, one can use some far-fetched arguments like "trains your mind" and so on, but there are many chess lovers who do this mind-sport due to the intellectual challenge, the search for amazing ideas, for joy without thinking about 'usefulness'.

I cant blame A. Cozzie for his point of view. Even though he seems not to care too much about Zappa (this might be vanity), he created one of the strongest engines. I think this competition stimulates the development of chess engines further. Maybe Rybka 3 would not be as strong as it is according to L. Kaufmann if the defeat in Mexico would not have taken place.
Parent - - By Mark Young (*) Date 2008-02-05 05:27
"Finally, Rybka is better than Crafty because Vas has implemented something new and interesting that I have not yet discovered. Nothing more, nothing less."  ----Robert Hyatt
Parent - - By M ANSARI (*****) Date 2008-02-05 08:05
Is search important or evaluation more important?   That has been the question forever in computer chess ... which is more important.  The truth is that it is a combination of both ... and this is also true in human chess.  Unlike what most people think, the biggest difference between a strong player and a weak player is tactical ability which would equate to search in the case of computer chess.  Even a weak player will know to put his rook on the open file ... but a strong player will know that he is looking at that and might see a stronger move that works tactically.  I would equate that to a weak player having poor eyesight and a strong player having great eyesight.  Put glasses on the weak player (such as allow him to check his moves tactically with a computer first) and he will play very strong and maybe even stronger than the strong player.  This is why in many cases a nominal player will be able to compete with even the strongest GM if he has computer assistance ... even it the GM is allowed to have equivalent hardware to also check his moves.  We see this all the time in the freestyle events.  So in the end it is a combination of fast search and good evaluation that matters.

Also I am not so sure that all there is with Rybka is fast search.  Obviously much more is there.  If Rybka gets a 3X speed advantage that would mean that Rybka on 3X slower hardware handicap should be performing equal to Crafty or Fruit.  It would be interesting to put this to a test .... Rybka on say an Athlon 1.2Ghz against a Quadcore 3.2Ghz Crafty.  That would probably be around a 10X speed difference ... yet I think Rybka would easily come out on top.  I have a lot of respect for Anthony but I think he still hasn't seen all there is to see in Rybka.  The evaluation function in Zappa is great ... and in some cases superior to Rybka ... but in the majority of the games I see, Rybka has much better evaluation.  I think people way underrate the value of evaluation tables that are in Rybka ... these have been honed down with thousands of games ... they have been tested and slowly improved and seem to add tremendous knowledge to Rybka.  They are not perfect and probably will never be ... but they definetely are right more times than any other program.  So a combination of faster search with superior evaluation is what makes Rybka strong.

Rybka can still improve quite a bit I think.  Multiprocessor efficiency is still a work in progress and can only improve.  Also a better implementation of endgame tablebases (such as implementing bitbases) would allow Rybka to have all 6 piece EGTB's (and some 7) without any drop off in nps.  As more and more cores are added to CPU's ... being able to efficiently make use of these cores will be paramount to progress.  IMHO the way to do that is to develop the engine in a way where you split the engine into seperate independent modules on independent motherboards ... each module would be extremely good at a specific aspect of the game ... opening ... middle game ... tactics ... positional accuracy ... and ofcourse endgame play.  Then you would have communication between these modules and an algorithim that would change dynamically during the course of the game that would set a system to what move to choose and from which module.  While communicating between these seperate modules would probably have a large latency hit ... if it works and I am sure it would work ... a quick adaptation of these cores on a single machine with many cores would immediately remove the latency hit and thus give you a performance boost.
Parent - - By Not the real Kasparov (***) Date 2008-02-05 08:24 Edited 2008-02-05 08:34
Does anyone know of the Elo improvement with Zappa Mexico 2?  I've just lost 3 games in a row with superior hardware, and from 5 games have yet to win 1. I know it only a tiny sample but have there been big improvements in Zappa?, she appears very strong. I was using 2.3.2a.
Parent - - By Carl Bicknell (*****) Date 2008-02-05 08:31

Where are all the male programs?
Parent - - By Not the real Kasparov (***) Date 2008-02-05 08:34
All programs are female lol :)
Parent - - By Vempele (Silver) Date 2008-02-05 08:53
Shredder is male, at least according to SMK's tournament reports.
Parent - - By Carl Bicknell (*****) Date 2008-02-05 09:02
Fritz? Fritz is a bloke, right?
Parent - By JhorAVi (***) Date 2008-02-05 09:30
Fritz is GAY :) :)
Parent - - By Uly (Gold) Date 2008-02-05 09:45
I've always considered Zappa genderless...
Parent - By M ANSARI (*****) Date 2008-02-05 11:51
I am testing Zappa II now and it does seem to be much better at blitz than before although at slow time intervals it is about the same.  I am testing it at 5 0 blitz using Quadcore 3.2Ghz and 16 0 using Quadcore 4.4Ghz.  The 4.4Ghz machine seems to be having some problems loading Zappa engine sometimes and that could be due to overclocking.  The other machine has Rybka scoring only 63% so far after almost 90 games ... before Rybka would score around 70% at 5 0 blitz.  So I would say the improvement seems mostly to be at fast time controls.
Parent - - By Uri Blass (*****) Date 2008-02-05 11:50
I am sure that Crafty is going to lose with 10:1 time handicap against rybka.
Public Movei is losing against rybka32 bit in these conditions if you use slow time control.

Note that Movei can beat Rybka 32 bit with 25:1 time advantage but maybe the time control that I used was not slow enough for rybka(I did not work on movei in the last months and I prefer to spend my time in understanding strelka before rewriting movei).

If we talk about evaluation then my opinion is that in the majority of the game you cannot know if rybka has better evaluation or Zappa has better evaluation.

The only case when it is clear that program A has better evaluation is if there is a big difference in the evaluation for some moves and it is clear that one of them is right.

In most games it simply does not happen so you cannot say based on watching the games which program has better evaluation.
If program slightly improves her position then it may be because of better evaluation or because of deeper search and you have no way to know based on watching the game.

Parent - - By BB (****) Date 2008-02-05 13:50 Edited 2008-02-05 14:21

> I am sure that Crafty is going to lose with 10:1 time handicap against rybka.

I agree completely. Back in the late 90s, Rebel 8.0 gave Crafty 11.17 a 50:1 (or 100:1) handicap in the "NPS Challenge" - the match was aborted, but Rebel won the only game that was completed. I would opine that one game extant (with the analysis) gives a reasonable idea of why Crafty won't win with these time odds.
Parent - - By Uri Blass (*****) Date 2008-02-05 14:00
This single game is a bad example because of the following:
1)I doubt if rebel can continue in that way and one game proved nothing.
2)Crafty improved from that time.
I also think that Rybka can give 10:1 time handicap to rebel or prodeo and win.
Movei is slightly stronger than prodeo based on rating lists and if rybka can do it to movei it probably can do it to prodeo.

Parent - By BB (****) Date 2008-02-05 14:41

> This single game is a bad example because of the following:

I agree that the game result is a bit useless. However, the engine analysis can also be taken into consideration, for instance, how does Crafty's eval/PV change as depth increases [I thought the Crafty log was available, but I guess not - in fact, I think I was remembering a different 100:1 game (aborted at move 39), Crafty 11.19 versus Hiarcs 6 on, where the Crafty logs are extant]. Also, I was not trying to predict Crafty-Rybka at 10:1 per se, but rather to demonstrate that large time odds are not insuperable.
Up Topic Rybka Support & Discussion / Rybka Discussion / Anthony Cozzie speak !
1 2 3 4 5 6 Previous Next  

Powered by mwForum 2.27.4 © 1999-2012 Markus Wichitill