Gerd Isenberg wrote:
Hmm, what you quote as funny (Qxd4 Nxd4) looks very reasonable to me, and I would expect every engine to report each fail high at the root that way, since Qxd4 is the only capture and is tried first in the first iteration of an IID framework. What is the point?
There aren't many engines that report each fail high at the root that way, and especially not for the entire first iteration.
Just try it. I guess most people who developed their own engines got annoyed by this sort of output.
Then there is the fact that your move ordering must work in a specific way for that move to be ordered first... (it is never first for me, for example)
Last edited by Gian-Carlo Pascutto on Sat Feb 12, 2011 9:23 pm, edited 2 times in total.
Damir wrote:same evaluation, so what? No proof that engines are clones or of the same strength just because they show the exact same fail high at the root...
what's your point ?
If you cannot see the point of my first post in this thread I doubt you ever will.
Christopher Conkie wrote:
[d]rn1q1r2/p1p1ppkp/1p4p1/8/2BP4/2N2N2/PP3PPP/R1B1K2R b KQ - 0 11
Houdini 1.0 x32: 1/3 00:00 30 0 -8.70 Qxd4 Nxd4 (this bit is very funny indeed)
Chris
Hmm, what you quote as funny (Qxd4 Nxd4) looks very reasonable to me, and I would expect every engine to report each fail high at the root that way, since Qxd4 is the only capture and is tried first in the first iteration of an IID framework. What is the point?
Gerd
I would ask you to look again at the comparison Gerd.
Chris
Sorry, I don't understand it. You must be more patient and explicit.
You mean the same scores after Qxd4 Nxd4? Coincidence that a weighted sum of different features has the same (possibly rounded) value?
Cubeman wrote:How do the games from that position end? It would be interesting to run some test games between the so-called Ippo clones and the other traditional strong engines. A wrong evaluation would show up in game results. Sometimes I think human evaluations are not necessarily the absolute truth. I also imagine that there could be many engines out there, even before Rybka beta, that would evaluate this position with scores similar to Houdini and Critter.
I ran off a quick 100 games using the Monte Carlo feature of Rybka 4 at five ply (which is really 8 ply). White won by 78 to 22, confirming the human GM assessment. I imagine that really ancient engines might score this position around zero, but this should have no relevance to how current engines evaluate.
Christopher Conkie wrote:
[d]rn1q1r2/p1p1ppkp/1p4p1/8/2BP4/2N2N2/PP3PPP/R1B1K2R b KQ - 0 11
Houdini 1.0 x32: 1/3 00:00 30 0 -8.70 Qxd4 Nxd4 (this bit is very funny indeed)
Chris
Hmm, what you quote as funny (Qxd4 Nxd4) looks very reasonable to me, and I would expect every engine to report each fail high at the root that way, since Qxd4 is the only capture and is tried first in the first iteration of an IID framework. What is the point?
Gerd
I would ask you to look again at the comparison Gerd.
Chris
Sorry, I don't understand it. You must be more patient and explicit.
You mean the same scores after Qxd4 Nxd4? Coincidence that a weighted sum of different features has the same (possibly rounded) value?
Gerd
I think Gian-Carlo has answered part of what I am showing above. However, I would also point out that the moves are exactly the same through depths 1-9 in my original post and......
......that the draw score is also obtained at the same depth.
This is a major feature of all Ippolit derivatives (I mean the first iteration).
Upon examination, the Robbolito you see was/is the closest thing (well, it's the same thing really) to Houdini 1.0.
That is why we say the first Houdini was a direct copy of Robbolito. When SMP was added, that was taken from Ivanhoe.
I can only tell you what I/we see when we test engines. I hope this explains it a bit more clearly.
By the way, an underlying issue that causes the funny output is that Ippolit is using MVV/LVA scoring to order the moves at the root (in fact, they are passed from a function higher up), whereas they use (proper) SEE in the search tree, and correctly order Qxd4 backwards there.
This is why you won't find many engines with such output. You would have to have almost exactly the same bug.
Christopher Conkie wrote:
[d]rn1q1r2/p1p1ppkp/1p4p1/8/2BP4/2N2N2/PP3PPP/R1B1K2R b KQ - 0 11
Houdini 1.0 x32: 1/3 00:00 30 0 -8.70 Qxd4 Nxd4 (this bit is very funny indeed)
Chris
Hmm, what you quote as funny (Qxd4 Nxd4) looks very reasonable to me, and I would expect every engine to report each fail high at the root that way, since Qxd4 is the only capture and is tried first in the first iteration of an IID framework. What is the point?
Gerd
I would ask you to look again at the comparison Gerd.
Chris
His point is that starting with the queen blunder is normal, while your point is that the evals are identical. The analysis you quote does seem to prove beyond any reasonable doubt that Houdini 1.0 and Robbolito are nearly identical in eval, and perhaps also in low-depth search. The chance that two independent programs would produce the same moves at the same depths, with scores in the same ratio (apart from the identical 8.7 scores), is about the same as the chance of being hit by a meteor. Even the eval differences are fully accounted for by the score transformation formula posted here some time ago, which reduces the scores when the eval is modest but not when it is large.
Gian-Carlo Pascutto wrote:By the way, an underlying issue that causes the funny output is that Ippolit is using MVV/LVA scoring to order the moves at the root (in fact, they are passed from a function higher up), whereas they use (proper) SEE in the search tree, and correctly order Qxd4 backwards there.
This is why you won't find many engines with such output. You would have to have almost exactly the same bug.
If I enable showing the first iteration (disabled in the ini file by default), this is Gaviota's output:
I get Chris's point about similarities, but Qxd4 in iteration 1 is perfectly normal. A close-to-0.00 evaluation does not mean much. Gaviota does it, and it is a weaker engine. In addition, I disagree with the implications of the similarities to R2. Robbolito is stylistically similar to R3, not R2. In fact, Naum is VERY similar to R2 and not R3, and in this position it gets a +1 eval.
I do not believe the eval means anything here. Chris may have a point with the similarity in PVs + eval, but I disagree with Larry in the original post.