There is no doubt that R 3 eval was not a joke.
Hart wrote: from the id: "Strelka R-3-E (Rybka-3-Eval)", by you know who.
It's a couple of years old and the source was not included. Can't even remember where I downloaded it from.
Miguel
It was my first experiment with sources of Rybka 3.
Dann Corbit wrote: What's strelka_3r?
I've never heard of it.
Just a confirmation of your data.
Hart wrote: As far as RL/FB is concerned, they are most similar to Rybka 3, followed very closely by Rybka 3 Human, and the least like Rybka 3 Dynamic, when only comparing Rybka 3 flavors.
M ANSARI wrote: That is interesting. I do think, though, that an MPV output of, say, the best 2 moves is even better, or, as someone mentioned, to include the second ponder move. Also, the test set should not contain many positions where different moves can have equivalent scores, so the move chosen should be much better than the second best. With regards to Rybka, the contempt value set should not corrupt the engine output, so analyse contempt should be used. I would be very interested to see how different R3 human, default and dynamic are to, say, RL or other IPPOLIT based engines.
Naum stands out again. Interestingly, Rybka 2.2n2 is "closer" to Naum 4 than Rybka 1 or Rybka 3.
[Tree diagram; the ASCII layout is not recoverable. Leaves in printed order: zappa_1.1, zpa_mexico, fruit_21, toga_1.3.1, ptor_1.3.2, dh64_1.3.3, komodo_1.0, firebird_1, rlito_85g3, rybka_3, rybka_3H, rybka_3D, strelka_3r, naum_4.0, naum_4.1, rybka_22n2, rybka_1w32, strelka_2, gurung_2.2, skfish_1.4, skfish_1.6, spark_0.3a, bright_04a]
Extended majority rule consensus tree
CONSENSUS TREE:
the numbers on the branches indicate the number
of times the partition of the species into the two sets
which are separated by that branch occurred
among the trees, out of 1000.00 trees
                                                               +------naum 4.1
                                                        +995.0-|
                                                 +910.0-|      +------naum 4.0
                                                 |      |
                                   +--------1000-|      +-------------rybka 22n2
                                   |             |
                                   |             +--------------------strelka 2
                                   |
                            +885.0-|                           +------doch64 1.3
                            |      |      +---------------1000-|
                            |      |      |                    +------komodo 1.0
                            |      |      |
                            |      +828.0-|             +-------------rybka 3
                            |             |      +681.0-|
                            |             |      |      |      +------firebird 1
                            |             +964.0-|      +-1000-|
                     +416.0-|                    |             +------robbolito
                     |      |                    |
                     |      |                    +--------------------strelka 3r
                     |      |
                     |      |                                  +------glaurung 2
                     |      |                           +677.0-|
              +794.0-|      |                    +998.0-|      +------sf 1.4
              |      |      |                    |      |
              |      |      +--------------488.0-|      +-------------sf 1.6
              |      |                           |
              |      |                           +--------------------spark 0.3a
              |      |
       +997.0-|      |                                  +-------------bright 04a
       |      |      +----------------------------553.0-|
       |      |                                         |      +------zappa mexi
       |      |                                         +939.0-|
+------|      |                                                +------zappa 1.1
|      |      |
|      |      +-------------------------------------------------------protector
|      |
|      +--------------------------------------------------------------toga 1.3.1
|
+---------------------------------------------------------------------fruit 21
remember: this is an unrooted tree!
A question Yuri, does Strelka_3r search correctly with ply games or do you also have to deduct 3 ply?
Osipov Jury wrote: It was my first experiment with sources of Rybka 3.
Dann Corbit wrote: What's strelka_3r?
I've never heard of it.
I posted the link here:
http://talkchess.com/forum/viewtopic.ph ... ht=#321347
Thanks Jury
Osipov Jury wrote: Strelka_3R has the same search as Strelka 2.0B with correct ply, nodes and PV.
    Engine                 Score
01: Strelka3_R             141.0/208
02: Ippolit                113.5/209
03: IvanHoe_v81_w32        111.0/209
04: FireBird 1.0 beta w32  109.5/209
05: Igorrit_0086v          107.5/208
06: Robbolito_0085f1_IA32  105.5/208
07: RobboLito_0085g3_w32   104.5/208
08: Igorrit_0086v9_w32     103.5/208
09: Igorrit_0086v_Plus     103.5/208
10: Igorrit_0086v2         102.5/208
11: IvanHoe_v68_w32_JR     102.0/209
12: RobboLito_0085e1_w32   102.0/209
13: IvanHoe_v73_w32        100.0/209
14: IvanHoe999970           94.0/209
15: Tankist 1.2 32-bit      94.0/208
I hope STS can be used for the clone detection test as it also comes with the partial credit moves which just maximizes the probing and the choices of engines will be more easily and comprehensively assessed.
Don wrote: Suppose you ran 1000 random positions on many different versions of
the same program, then run the same positions on many versions of
other programs. What could be deduced statistically from how often
the various program versions picked the same move?
The 1000 positions are from a set of positions that Larry Kaufman and
I created long ago that are designed to compare chess programs to
humans in playing style. So few problems are blatantly tactical and
in many of these positions the choice of moves is going to be based on
preference more than raw strength.
The test compares any two programs by how often they pick the same
move, out of a sample of 1000 positions. I run each program to the
same time limit which in this case is 1/10 of a second.
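
The comparison Don describes can be sketched roughly as follows. This is only an illustration under assumptions: the engine names and moves are made up, and the step of actually running each engine for 1/10 of a second per position is left out, so the input is just one list of chosen moves per engine.

```python
from itertools import combinations


def agreement(moves_a, moves_b):
    """Fraction of positions on which two engines chose the same move."""
    if len(moves_a) != len(moves_b):
        raise ValueError("both engines must be run on the same positions")
    if not moves_a:
        raise ValueError("need at least one position")
    same = sum(a == b for a, b in zip(moves_a, moves_b))
    return same / len(moves_a)


def agreement_table(choices):
    """Map each unordered pair of engine names to its agreement fraction."""
    return {
        (x, y): agreement(choices[x], choices[y])
        for x, y in combinations(sorted(choices), 2)
    }


# Hypothetical data: the move each engine picked in four positions.
choices = {
    "engine_a": ["e2e4", "g1f3", "d2d4", "c2c4"],
    "engine_b": ["e2e4", "g1f3", "d2d4", "b1c3"],
    "engine_c": ["d2d4", "g1f3", "e2e4", "c2c4"],
}

table = agreement_table(choices)
for pair, fraction in sorted(table.items()):
    print(pair, fraction)
```

With 1000 positions per engine the same table gives the pairwise similarity figures that a tree-building program could then take as input, for example after converting each fraction f to a distance such as 1 - f (that conversion is an assumption here, not something stated in the thread).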
I thought the point of STS was that there was an objectively best move (as well as a possibly 2nd or 3rd best for partial credit). If this is the case, then the better the programs are, the more they would look like each other in terms of STS results. If anything, the positions you have rejected are more likely to be good test candidates, because presumably you rejected them as not having a clear best move. Or better yet, the positions you did not even consider using, because it is completely unclear what the best move might be.
swami wrote:
I hope STS can be used for the clone detection test as it also comes with the partial credit moves which just maximizes the probing and the choices of engines will be more easily and comprehensively assessed.
Next version will be released probably by the end of this month, and we will have 1000 positions.
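
To make the objection concrete, here is a small sketch of partial-credit scoring. Everything in it is hypothetical: the position ids, the moves, and the point values are invented for illustration and are not taken from STS. It only shows that two unrelated strong engines that both find the best moves end up with identical choices and identical scores, so agreement on such positions says little about shared origin.

```python
def suite_score(chosen, credit):
    """Sum the credit earned for the move chosen in each position."""
    return sum(credit[pos].get(move, 0) for pos, move in chosen.items())


# Hypothetical credit table: best move gets 10 points, alternatives fewer.
credit = {
    "pos1": {"f4f5": 10, "a2a4": 6},
    "pos2": {"d1d2": 10, "h2h3": 4, "a1c1": 2},
}

# Hypothetical choices: two strong engines and one weaker engine.
strong_a = {"pos1": "f4f5", "pos2": "d1d2"}
strong_b = {"pos1": "f4f5", "pos2": "d1d2"}
weaker = {"pos1": "a2a4", "pos2": "g2g4"}

print(suite_score(strong_a, credit), suite_score(strong_b, credit))
print(suite_score(weaker, credit))
print(strong_a == strong_b)  # identical choices without any shared code
```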