Rybka Chess Community Forum
Up Topic Rybka Support & Discussion / Rybka Discussion / hash idea for Rybka 3
- - By Ty Nance (**) [us] Date 2008-04-03 21:14
If this post belongs in computer chess, then a moderator can move it. I want to know what Vas thinks about this.

Okay, so I have this idea. R3 will have persistent hash stored on the hard drive, and when we first install R3 we will select a size for this file and we'll have this big blank file. But what if it wasn't blank?

Larry's evaluation work is done (or close enough) and Jeroen has a Rybka 3 book that's close enough to being done that he could give a copy of the nearly-complete book to Vas. I have a process in mind that I don't know how to automate at home, but which Vas has the knowhow for:

While you're improving the search prior to the release of R3, let your current best version (with persistent hash and Larry's best evaluation, maybe even his slow but accurate evaluation) play out Jeroen's book (not wasting time evaluating the positions that are in the book; we trust his work). Then, at the end of each book line, do what the Fritz 9 GUI calls "deep position analysis" with a branching of, say, 3, taking it only, say, 2 or 3 moves beyond the end of the book. Do this analysis at good depth.

What is good depth? On my 2-core laptop, I can get to depth 18 just fine. 19 in a bearable amount of time. I need to go read the Rybka forum while I wait for 20. I need to cook, eat a meal, do the dishes while I wait for 21. But on the forum I occasionally read people who get depths of 27 or so, and whose Rybkas are counting 1000 kiloNodes per second, but mine operates at around 100 kN/s.

So take the depth to 20 or 21, whatever you think is reasonable. After those 2 or 3 moves, it's time for a different branch in the opening tree.
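
The branching process described above can be sketched roughly as follows. This is only an illustration: `top_moves` and `play` are hypothetical callbacks standing in for an engine and a move generator (nothing here is a Rybka or Fritz API), and the "2 or 3 moves" are counted as plies, which is an assumption.

```python
from typing import Callable, List, Sequence, TypeVar

P = TypeVar("P")  # position type (hypothetical)
M = TypeVar("M")  # move type (hypothetical)


def expand_book_tips(
    tip: P,
    top_moves: Callable[[P, int], Sequence[M]],  # assumed: engine's best n moves
    play: Callable[[P, M], P],                   # assumed: apply a move
    branching: int = 3,
    plies: int = 3,
) -> List[P]:
    """Collect every position reached from a book tip by following the
    top `branching` moves for `plies` plies."""
    reached: List[P] = []
    frontier: List[P] = [tip]
    for _ in range(plies):
        next_frontier: List[P] = []
        for pos in frontier:
            for move in top_moves(pos, branching):
                child = play(pos, move)
                reached.append(child)
                next_frontier.append(child)
        frontier = next_frontier
    return reached


# Toy stand-ins: a "position" is a string of move letters.
def toy_top_moves(pos: str, n: int) -> List[str]:
    return list("abc")[:n]


def toy_play(pos: str, move: str) -> str:
    return pos + move


reached = expand_book_tips("", toy_top_moves, toy_play, branching=3, plies=3)
print(len(reached))  # 3 + 9 + 27 = 39 positions per book tip
```

With branching 3 and 3 plies, each book tip costs 39 deep searches, which gives a feel for how much of the book could be covered in a fixed amount of time.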

Now you will accumulate a hash file on the hard drive whose only content comes from the process I've outlined above: it will have depth-20 knowledge of many of the positions that occur in the first few moves right after Jeroen's book ends. Make this file a part of the Rybka package, something we can load into our newborn hash files on our own computers. If the advantage is as I believe, then it should be obvious: even without having played through all those openings on your PC, your R3 will have "out of the box" positional learning for many positions which I think are very relevant.

To the nay-sayers: I'm aware that there isn't time to go through all of Jeroen's book before R3 is released on Monday, 7 April (when my new octal machine arrives via UPS!), but this doesn't discount the free extra depth R3 will have in those positions which do get covered. And I'm also aware that the moment this automated process begins, Larry will make some improvement to his evaluation, which will not benefit this hash file. That's okay: when you get R3, it will have Larry's best evaluation function, and it will be writing to the hash file. Technical problem? Will there be an error if the evaluation function that put the position and its value into the hash file is not the same as the one which later (on my own PC at home) is making evaluations? Finally, I know that what I'm suggesting (a starter-kit hash file) is really data that we can produce for ourselves once we get R3, but as I see it, the data can be produced now (though a little slower than after the search is perfected) and production of this data can be automated in Vas' workshop.

ty
Parent - - By Vasik Rajlich (Silver) [hu] Date 2008-04-06 14:41
What you say is quite true: the persistent hash can help at the "tips" of opening theory, having the nice property that (unlike an opening book) it can never hurt. It can form a link between the opening book (where the quality threshold is quite high) and the engine play.

In theory, users can share and merge persistent hash files. I haven't worked out the details but merging two persistent hash files is a mathematically "sound" operation and we'll support it in some way, either via GUI or via some easy standalone app for expert users.
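
As a rough illustration of why such a merge is well-defined, here is a minimal sketch. The real file format has not been published, so the `Entry` layout, the integer position keys and the "deeper entry wins" rule are all assumptions made for the example.

```python
from typing import Dict, NamedTuple


class Entry(NamedTuple):
    depth: int  # search depth at which the position was stored
    score: int  # evaluation in centipawns (hypothetical unit)


def merge_hashes(a: Dict[int, Entry], b: Dict[int, Entry]) -> Dict[int, Entry]:
    """Merge two hash tables keyed by a position key, keeping the deeper
    entry whenever both tables know the same position."""
    merged = dict(a)
    for key, entry in b.items():
        current = merged.get(key)
        # On equal depth the entry from `a` is kept (an arbitrary choice).
        if current is None or entry.depth > current.depth:
            merged[key] = entry
    return merged


mine = {1: Entry(18, 20), 2: Entry(20, -5)}
yours = {2: Entry(21, 10), 3: Entry(15, 0)}
merged = merge_hashes(mine, yours)
```

Under these assumptions the result does not depend on the order of the two files as long as the depths differ, which is what makes repeated merging between many users unproblematic.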

Some other futuristic possibilities would be to let opening book authors work manually with the persistent hash. For instance, Jeroen could say that in some position, move X is at least as good as a 20-ply search but not as good as a 21-ply search, so the engine should play that move without hesitation only if there isn't enough time for a 21-ply search. There are a lot of possibilities like this. You can search this forum for some interesting discussions about all of this.

Vas
Parent - - By Ty Nance (**) [us] Date 2008-04-06 15:41
Nice!

That's really exciting. By the way, I'm not completely confident about the etiquette: if someone replies helpfully (as you have) to one of my posts, should I always follow it up with a "thanks" post, or is that just clutter and distraction? Brevity seems good, but I don't want people to think that their responses are ignored.

ty
Parent - - By Vasik Rajlich (Silver) [hu] Date 2008-04-06 16:10
I'd say that a little "thanks" doesn't hurt.

Vas
Parent - By tomski1981 (*****) [ca] Date 2008-04-06 22:45
thanks :)
Parent - - By Uly (Gold) [mx] Date 2008-04-06 19:11

> In theory, users can share and merge persistent hash files. I haven't worked out the details but merging two persistent hash files is a mathematically "sound" operation and we'll support it in some way, either via GUI or via some easy standalone app for expert users.


Hey, one idea is that you upload your hashes to a server, where they are automatically merged with other users' hashes, and then you download the big merged hash for your own use, containing all the information that all the users have gathered. This wouldn't work if the files are too big, however.
Parent - By Vasik Rajlich (Silver) [hu] Date 2008-04-14 12:37
This kind of thing is certainly possible. It won't need any special support from me and maybe it will even be best for it to be organized by some independent parties.

Vas
Parent - - By turbojuice1122 (Gold) [us] Date 2008-04-14 12:47
That will be the first next tactic for Freestylers: intentionally load wrong hash data so that opponents lose.
Parent - - By Uly (Gold) [mx] Date 2008-04-14 21:01

> That will be the first next tactic for Freestylers: intentionally load wrong hash data so that opponents lose.


Vas said that hash is the kind of information that would never hurt.
Parent - - By Permanent Brain (*****) Date 2008-04-14 21:23
It is imaginable that the info could be hacked, assigning bad evals to good moves, or good evals to bad moves, which can happen shortly after the typical book variations... but I think that's a "phantasy" scenario. But who knows.

The idea of being able to share and merge hash files, or learning results, sounds attractive. But actually, isn't it a bit of a technical overkill? On the other hand, creating and sharing program-specific data is a big topic and an important community feature among fans of many computer games (typical: custom maps, skins, so-called "mods" with alternative content for the same program...). In computer chess, we have of course had this all along, but so far it has mostly been chess games, moves and positions, and engine analysis output only. Maybe interchangeable hash content will be added to this list.
Parent - By Vasik Rajlich (Silver) [hu] Date 2008-04-19 12:07

> But actually, isn't it a bit of a technical overkill?


Definitely! Everything good is overkill.

> Maybe interchangeable hash content will be added to this list.


Absolutely. Hash info has the very nice property that merging it is a natural step.

Vas
Parent - - By Sesse (****) [no] Date 2008-04-14 21:38
There was an implicit assumption that the hash was indeed created by Rybka 3, not by some adversary (or by a buggy computer). :-)

/* Steinar */
Parent - - By Uly (Gold) [mx] Date 2008-04-14 21:51
Another idea is to feed Rybka variations she doesn't understand, for example some obscure variation of the King's Gambit that has a very hard refutation that Rybka cannot find, tricking Rybka into thinking that 2. f4 is totally winning in the opening. I wonder how difficult that would be; if it's easy enough, then allowing vandals to upload their hashes doesn't sound like a good idea...
Parent - By Vasik Rajlich (Silver) [hu] Date 2008-04-19 12:08
Ok, this is possible in theory - but it would be extremely difficult. I can't imagine that you could ever fool Rybka into thinking that the King's Gambit is winning, for example.

Vas
Parent - - By buffos (Silver) [gr] Date 2008-04-16 08:43
Persistent hash, and of course the ability to manipulate (edit) it in the future, will be a great tool not for those using engines to play chess but for those using them to analyze.
Let's say we analyze the Nf7 sacrifice from the Topalov-Kramnik game. We create a very big tree of analysis, going back and forth and drawing many conclusions, BUT we cannot make the engine aware of them. With an editable hash we could "load" the hash from the tree of analysis we did, WITH the evaluations we found.

Now imagine the possibilities. The engine is aware of very deep analysis and can actually help find hidden resources that it could not find because of mis evaluation.
Of course, for this to happen correctly, loaded hash entries should be modified by the engine only under certain conditions. For example, the engine without the hash would still not see that a move is bad 20 moves ahead and would be tempted to assign a different value to the hash table again. This should be avoided, and the engine should change values only if the extension of the new analysis is comparable to the length of the line that provided the current evaluation.

Anyway, I think that engines should not be used only as "chess" horses to compete in engine events; they are analysis tools, and it would be very nice to see some evolution in that field.
Parent - - By Vasik Rajlich (Silver) [hu] Date 2008-04-19 12:10

> Now imagine the possibilities. The engine is aware of very deep analysis and can actually help find hidden resources that it could not find because of mis evaluation.


> Of course, for this to happen correctly, loaded hash entries should be modified by the engine only under certain conditions. For example, the engine without the hash would still not see that a move is bad 20 moves ahead and would be tempted to assign a different value to the hash table again. This should be avoided, and the engine should change values only if the extension of the new analysis is comparable to the length of the line that provided the current evaluation.

Right. The way to say this is that entries are replaced only by entries with a higher depth.

There are some cases where inferior entries can replace superior ones - those are just the vagaries of search.

Vas
Parent - - By buffos (Silver) [gr] Date 2008-04-21 13:52
Yes, it's not simple. It needs a rule for replacement and it needs experimentation. Not an easy tool, and not a tool for the "masses", but certainly a tool to bring analysis to the next level.

For example, a rule can be devised that uses a) the distance from max depth (the engine is calculating a score and the end leaf node is D (distance) nodes away), and b) the score difference between the current node and the distant node; call it S. The depth of the current evaluation probably has to be taken into account too. Let's call d the depth of the current evaluation.

So a simple inequality like 1/(coeff1*d-D) < coeff2*S should hold for the score to change, or in compact form coeff < S*(coeff1*d-D), where coeff (which equals 1/coeff2 when coeff1*d-D is positive) should be determined.
Of course, to make it 100% correct I must include the absolute value of S and write the inequality as coeff < abs[S]*(coeff1*d-D).

So as the engine approaches end leaves and searches more and more deeply the search evaluation scores could be used to substitute the current scores

Trial and error can find the correct coeffs, but the simple coeff1 = 0.5 and coeff = 0.1 will probably do the job.
Parent - - By Vasik Rajlich (Silver) [hu] Date 2008-04-22 23:55
Sorry, you really lost me here :)

The mechanics of a persistent hash are actually quite simple. When a position is searched, it is stored in the persistent hash with the depth at which it was searched. Later, the result is used when it has a high-enough depth.
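
A minimal sketch of these mechanics, assuming an in-memory dictionary in place of the file on disk and hypothetical integer position keys. It is not Rybka's implementation, and the other conditions a real engine checks before using an entry are left out.

```python
from typing import Dict, Optional, Tuple


class PersistentHash:
    """Toy stand-in for a persistent hash: position key -> (depth, score)."""

    def __init__(self) -> None:
        self._table: Dict[int, Tuple[int, int]] = {}

    def store(self, key: int, depth: int, score: int) -> None:
        """Store a search result; an existing entry is only replaced by a
        result from a higher depth."""
        old = self._table.get(key)
        if old is None or depth > old[0]:
            self._table[key] = (depth, score)

    def probe(self, key: int, needed_depth: int) -> Optional[int]:
        """Return the stored score if it was searched at least as deeply
        as the current search needs, otherwise None."""
        entry = self._table.get(key)
        if entry is not None and entry[0] >= needed_depth:
            return entry[1]
        return None


ph = PersistentHash()
ph.store(42, depth=15, score=30)
print(ph.probe(42, needed_depth=15))  # 30: deep enough, the entry is used
print(ph.probe(42, needed_depth=16))  # None: too shallow, search again
```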

Vas
Parent - - By buffos (Silver) [gr] Date 2008-04-23 08:00
It's very simple. (The formula describes a rule for changing the persistent hash using the engine's calculated score.)
Imagine you are at position A. You start the engine and it reaches a position B.
Position B is also in the persistent hash table. Define D as the depth distance between A and B.
Also, the engine shows a score difference S, reached at calculated depth d.
We want to trust the engine's score if it really searches deeper AND shows a significant difference in score.
This is what the formula says.

The formula says   coeff < abs[S] * (coeff1*d-D):
                             ^    ^      ^
                             |    |      |
                           score AND deep enough

Let's say coeff1 = 0.5 and write it again:   coeff < abs[S]*(0.5*d-D)

(0.5*d-D) is positive if the engine's search depth at position B (which is D away) is more than double that distance.
So a big S and a big depth will be required to overwrite the persistent hash. Coeff is used to cut off low-score, low-depth combinations.
Parent - - By Vasik Rajlich (Silver) [hu] Date 2008-04-24 08:07
I still don't follow. What is the "score difference" - ie. what is subtracted from what?

In your example, when B is reached, the persistent hash entry is used for a cutoff if (among a few other things) the depth is high enough. If it isn't, the position is searched and the result is stored to the persistent hash.

Vas
Parent - - By buffos (Silver) [gr] Date 2008-04-24 12:56
Score difference = Score the engine is calculating right now - Score in the persistent hash

Actually, the formula just tries to parametrize "the depth is high enough" and "the score difference is important enough not to ignore".
Exactly your words in math (I guess so).

For example, the engine shows a +2 score after a depth of 20 and the persistent hash shows +0. The score difference is 2.
The difference in depth is, for example, 6 plies.
[We also don't know if the score in the persistent hash is human-assigned or comes from an engine search at a certain depth, so I ignore this parameter.]

The depth term is [20*coeff1-6], and if you choose coeff1 = 0.5 this is 10-6=4 {this quantifies the "deep enough"}.

So if we search "too" deep or have an "important" score difference, then we prefer the engine's numbers and not the hash.

Just a parametrized rule, so one can define the "too" and the "important"
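
The rule proposed in these posts, written out as a small function with the example numbers plugged in. The function name and argument names are made up for the illustration; the defaults coeff1 = 0.5 and coeff = 0.1 are the values suggested earlier in the thread. This is buffos's proposal, not how Rybka's persistent hash decides.

```python
def should_overwrite(
    engine_score: float,  # score the engine is calculating right now
    hash_score: float,    # score stored in the persistent hash
    d: int,               # depth of the engine's current evaluation
    D: int,               # depth distance to the hashed position
    coeff1: float = 0.5,
    coeff: float = 0.1,
) -> bool:
    """Proposed rule: overwrite when coeff < abs(S) * (coeff1*d - D),
    where S = engine score - hash score."""
    S = engine_score - hash_score
    return coeff < abs(S) * (coeff1 * d - D)


# Example from the post: engine +2 at depth 20, hash +0, distance 6 plies.
# abs(S) = 2 and 0.5*20 - 6 = 4, so the right-hand side is 8 > 0.1.
print(should_overwrite(2.0, 0.0, d=20, D=6))  # True
```

Note that with a shallow search (say d = 10, D = 6) the depth term turns negative and the rule never overwrites, whatever the score difference.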
Parent - - By Vasik Rajlich (Silver) [hu] Date 2008-05-01 08:06
Ok, I see what you mean.

In general, depth takes absolute precedence. If you have a depth = 15 entry, then (obviously) when searching at depth = 15 the entry is used (if other criteria are met). When searching at depth = 16, you simply have no choice but to ignore this entry from the point of view of having a cutoff.

Vas
Parent - - By Vempele (Silver) [fi] Date 2008-05-01 08:09
Unless it's a mate score outside the window, of course.
Parent - - By Vasik Rajlich (Silver) [hu] Date 2008-05-01 09:05
For cutoff purposes, mate scores have infinite depth.

(But not for replacement purposes - that was one of the bugs in Rybka 1.0)
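
A small sketch of that distinction. `MATE_THRESHOLD` and the function names are hypothetical, and the search window and bound-type conditions raised in this exchange are deliberately left out.

```python
import math

MATE_THRESHOLD = 30000  # hypothetical: scores at or beyond this mean mate


def is_mate_score(score: int) -> bool:
    return abs(score) >= MATE_THRESHOLD


def can_cut_off(entry_depth: int, entry_score: int, needed_depth: int) -> bool:
    """For cutoff purposes a mate score counts as infinitely deep."""
    effective_depth = math.inf if is_mate_score(entry_score) else entry_depth
    return effective_depth >= needed_depth


def should_replace(old_depth: int, new_depth: int) -> bool:
    """For replacement purposes only the real stored depths are compared,
    whether or not the old entry holds a mate score."""
    return new_depth > old_depth


print(can_cut_off(5, 31000, needed_depth=20))  # True: mate found at depth 5
print(can_cut_off(15, 100, needed_depth=16))   # False: ordinary score, too shallow
print(should_replace(old_depth=5, new_depth=10))  # True even if the old entry is a mate
```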

Vas
Parent - - By Vempele (Silver) [fi] Date 2008-05-01 13:09

> For cutoff purposes, mate scores have infinite depth.


With the exception of exact scores inside the window in PV nodes.
Parent - By Vasik Rajlich (Silver) [hu] Date 2008-05-03 15:56
Exact mate scores inside the window? Mate is mate!

(Only half-joking .. :))

Vas
Parent - - By Banned for Life (Gold) Date 2008-04-21 15:27
Best thing in my opinion would be to make the format of the permanent hash table public so that it can become a standard for this type of data and third parties can develop tools to use it in ways possibly not foreseen during the initial development.

Regards,
Alan
Parent - - By Vasik Rajlich (Silver) [hu] Date 2008-04-22 23:57
Yes, I agree.

If I end up making a standalone app for doing things like merging two persistent hashes, I'll probably make it open-source.

Vas
Parent - - By Banned for Life (Gold) Date 2008-04-24 14:13
Merging data from several files is certainly one thing you might want to do.

More interesting would be something along the lines of generating artificial persistent hash entries based on database results from positions. For example, if a database position has occurred 100 times and resulted in 95 wins and 5 losses, one could argue that this position deserves a very high score, regardless of what the eval says at any depth (in fact, it's even better if the eval is really wrong, because while you are trying to reach this position, your opponent will be helping you). Shredder actually allows you to assign a value for a position, but since the format of the file is not public, it's not possible to generate the files in an automated manner (at least it wasn't last time I checked, which was probably with S10).
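
One possible way to turn database results into a score, purely as an illustration. The Elo-style logistic mapping (400 * log10 of the odds), the centipawn unit and the cap of 1000 are assumptions chosen for the sketch; they do not come from Shredder, Rybka or this thread.

```python
import math


def score_from_results(wins: int, draws: int, losses: int, cap: int = 1000) -> int:
    """Map a win/draw/loss record to an artificial centipawn score using
    an Elo-style logistic curve, clamped to +/- cap."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games for this position")
    p = (wins + 0.5 * draws) / games  # expected score between 0 and 1
    if p <= 0.0:
        return -cap
    if p >= 1.0:
        return cap
    centipawns = 400.0 * math.log10(p / (1.0 - p))
    return int(max(-cap, min(cap, round(centipawns))))


# The example from the post: 95 wins and 5 losses out of 100 games.
print(score_from_results(95, 0, 5))
```

A tool like this would also need a rule for how few games are too few to trust, which is left open here.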

Regards,
Alan
Parent - - By Vasik Rajlich (Silver) [hu] Date 2008-04-28 13:51
This is a good example of something which could be done by a third party, once the format is public and accepted. It's not related to the engine and (obviously) it will need special interface support.

Also, keep in mind that the persistent hash isn't meant to replace the opening book, so this is probably a second-tier priority.

Vas
Parent - - By Banned for Life (Gold) Date 2008-04-28 17:03
One of the problems that I've always had with opening books is that the engine doesn't access them to find winning positions that are reachable from the current position (i.e. positions that have a high winning percentage without recent problems indicative of the line being cracked). Persistent hash of course should do this. The major problem that I see with current learning files is that they are not stored in a manner consistent with efficient access, which effectively limits the size of a "usable" file. With an efficient access method for the persistent hash file, it's not obvious to me what the drawbacks would be.

Regards,
Alan
Parent - - By Vasik Rajlich (Silver) [hu] Date 2008-05-01 08:03

> One of the problems that I've always had with opening books is that the engine doesn't access them to find winning positions that are reachable from the current position


Right - that's why this is a second priority and not a fourth priority.

> With an efficient access method for the persistent hash file, it's not obvious to me what the drawbacks would be.


The main one is the development time to support this.

Another issue is that once users start filling their persistent hashes with data of their choosing (rather than data of Rybka's choosing), there will be more possibilities for bad data.

Vas
Parent - - By Banned for Life (Gold) Date 2008-05-01 14:15
OK, so to summarize, the problem is not that the approach is flawed, it's that it will chew up too much development time and has the potential to confuse users if it is misused (which of course is guaranteed). Fair enough.

Regards,
Alan
Parent - By Vasik Rajlich (Silver) [hu] Date 2008-05-03 15:59

> it's that it will chew up too much development time and has the potential to confuse users if it is misused


Of course, we're talking here about Rybka 3. I can definitely imagine some ultra-enhanced persistent hash in the future, tied to the engine, interface-independent (or with interface-specific hookups via an open protocol), with all sorts of bells and whistles. If done right, that could be very very cool.

Vas