What’s the key factor to win in the 40/4 matches?

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

nkg114mc
Posts: 74
Joined: Sat Dec 18, 2010 5:19 pm
Location: Tianjin, China
Full name: Chao M.

Re: What’s the key factor to win in the 40/4 matches?

Post by nkg114mc »

PK wrote: https://sites.google.com/site/sungorus/

giving it the Fruit eval would require rewriting the piece/square code - at least this is what I did in an early version of Rodent. Sungorus uses only one set of piece/square tables, which are vertically symmetrical (!). It updates this score component while making and unmaking moves using one variable (no split between mg/eg and no split between colors). To "frutify" the eval you would need to split this variable and probably add an incrementally updated game-phase variable. At least this is what I did at the beginning of Rodent :)
Hi Pawel, thanks, and glad to see your reply! It reminds me that it might also be interesting to see a tournament between Rodent-SungEval and Sungorus, because since Rodent was developed from Sungorus, the Sungorus evaluation should be more compatible with Rodent than with Fruit.
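For anyone who has not seen this pattern before: the split Pawel describes is essentially a tapered evaluation. A minimal sketch, with illustrative names rather than Rodent's actual code:

Code: Select all

// Illustrative sketch of the mg/eg split, not Rodent's actual code.
// Instead of one incrementally updated piece/square score, keep two
// accumulators plus an incrementally updated game-phase counter.
int pstScoreMg[2];  // midgame piece/square sums, one per color
int pstScoreEg[2];  // endgame piece/square sums, one per color
int gamePhase;      // raised/lowered as pieces appear/disappear

const int MAX_PHASE = 24;

// In MakeMove you would do something like:
//   pstScoreMg[color] += tableMg[piece][to] - tableMg[piece][from];
//   pstScoreEg[color] += tableEg[piece][to] - tableEg[piece][from];
// and adjust gamePhase on captures/promotions (undone in UnmakeMove).

// Blend the two scores according to the current phase of the game.
int TaperedPstScore(int side) {
    int mg = pstScoreMg[side] - pstScoreMg[side ^ 1];
    int eg = pstScoreEg[side] - pstScoreEg[side ^ 1];
    int phase = gamePhase < MAX_PHASE ? gamePhase : MAX_PHASE;
    return (mg * phase + eg * (MAX_PHASE - phase)) / MAX_PHASE;
}

With a single vertically symmetrical table and one variable, as in Sungorus, none of this bookkeeping exists, which is why "frutifying" the eval needs that rewrite.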

I started doing this yesterday, replacing both "ReturnFull" and "ReturnFast" with the Sungorus evaluation (not sure whether that makes sense, but I just wanted to keep it simple :) ), and launched a similar 4000-round tournament. It is not complete yet (around 2600 games have been played), but the result is already clear: Rodent-SungEval scored around 80%. Here is the BayesElo output:

Code: Select all

Rank Name              Elo    +    - games score oppo. draws 
   1 rodent_sungeval   107    7    7  2357   79%  -107   25% 
   2 sungorus         -107    7    7  2357   21%   107   25% 
I am not sure it is proper to say that "Rodent 1.6 gained around 200 Elo purely from improving the search", but it is still interesting that Rodent performs much better than Fruit-SungEval :)

I will also share my implementation of Rodent-SungEval once the tournament is done. I compiled Rodent on Red Hat Linux, so I made some trivial changes to the code. Hope that does not introduce any bugs~
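Until then, the gist of the change is simply that both evaluation entry points delegate to the ported Sungorus scorer, so the search sees identical values on the fast and full paths. A rough sketch, with illustrative signatures that may not match Rodent exactly:

Code: Select all

// Sketch only: names and signatures are illustrative, not Rodent's
// actual code.
struct POS;                    // stand-in for Rodent's position type
int SungorusEvaluate(POS *p);  // the ported Sungorus scoring routine

int ReturnFast(POS *p) {
    return SungorusEvaluate(p);  // no separate lazy/fast evaluation
}

int ReturnFull(POS *p) {
    return SungorusEvaluate(p);
}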
nkg114mc
Posts: 74
Joined: Sat Dec 18, 2010 5:19 pm
Location: Tianjin, China
Full name: Chao M.

Re: What’s the key factor to win in the 40/4 matches?

Post by nkg114mc »

cdani wrote: Hello.
I have done the same test with Andscacs.

Andscacs with the evaluation of Sungorus easily outsearches Sungorus, obtaining 80% of the points.

Here I put a file with the two executables, two pgn files, and a cutechess bat file I used to do the tests:

www.andscacs.com/sungorus/sungorus.rar

So I suppose that something was wrong with Fruit-sungorus, or the evaluation function is very incompatible. Or maybe it's just that Andscacs is a lot stronger than the version of Fruit used.

I have done much shorter tests. Maybe someone wants to do better ones.
Hi Daniel,

Thanks very much for doing the test! This makes the discussion more interesting. I guess Andscacs also has a strong search. My original goal in investigating this problem was just to understand whether improvements to the search and to the evaluation of an engine can be measured independently, but it looks like this is quite a complicated problem. At least for now, given the same Sungorus evaluation, engines that all originally have 2600+ Elo at the 40/4 TC perform very differently. I hope I can find the main reason for these differences.
nkg114mc
Posts: 74
Joined: Sat Dec 18, 2010 5:19 pm
Location: Tianjin, China
Full name: Chao M.

Re: What’s the key factor to win in the 40/4 matches?

Post by nkg114mc »

Ferdy wrote:
Uri Blass wrote:
Ferdy wrote:
nkg114mc wrote:
Ferdy wrote:
nkg114mc wrote:Hi all,

The question came from an experiment that I did recently. I replaced the evaluation function in the Fruit 2.1 engine with the evaluation function of Sungorus 1.4. Let's call this hybrid engine "Fruit-SungorusEval". Here "replace" means I implemented an evaluation function for Fruit 2.1 that always returns exactly the same value as Sungorus given the same position. Then I launched a tournament at the 40 moves/4 minutes time control between Sungorus 1.4 and Fruit-SungorusEval, with 4000 rounds (with repeat) and 16-ply random openings. The result from BayesElo shows that Fruit-SungorusEval is 70 Elo weaker than Sungorus.

Code: Select all

Rank Name            Elo    +    - games score oppo. draws
   1 sungorus         35    6    6  4000   60%   -35   24%
   2 fruit-sungeval  -35    6    6  4000   40%    35   24%

This result is a little surprising to me, because I think Fruit has a more complicated search implementation. However, Fruit-SungorusEval lost the tournament by 70 Elo, so I came to ask these questions:

First, do you think this test setup supports the conclusion that Fruit-SungorusEval is 70 Elo weaker at the 40/4 time control?
Second, if the setup is OK, what is the main reason that Fruit-SungorusEval became weaker? I know that an evaluation function should be matched to the search algorithm in an engine, but I never expected a simplified evaluation to cause such a huge Elo drop. Has anyone seen similar results before?

Thanks for all suggestions!
To know the overall effect of using the Sungorus eval in Fruit, perhaps you could run the test Fruit_sungorus_eval vs Fruit_fruit_eval.

Another option is to run a new match between Sungorus and Fruit, then compare it with your existing result. That way we can also measure the total effect of the Sungorus eval when applied to Fruit.
Hi Ferdinand,

Thanks for the comment! The "Fruit_sungorus_eval vs Fruit_fruit_eval" match is exactly what I am running now.
Looking forward to the result of that test.
nkg114mc wrote: Results for standard Sungorus 1.4 and Fruit 2.1 can be found in the CCRL 40/4 list,
I checked the pgn and there were no games between the two engines.
nkg114mc wrote: where these two engines show around a 400 Elo difference, and Fruit is stronger. That's why I was surprised by the test result I got.
Let's say we use the data we have at this time, based on games against other engines:
Fruit 2.1 = 2685
Sungorus 1.4 = 2311
Diff = 2685-2311 = 374

So the effect of that eval change is 374+70 = 444 rating points. This is indeed surprising.

But it would be better to compare with the result of a direct match between the two.
I do not understand how you get 374+70.
The +70 is from the original post.

Code: Select all

Rank Name            Elo    +    - games score oppo. draws
   1 sungorus          35    6    6  4000   60%   -35   24%
   2 fruit-sungeval   -35    6    6  4000   40%    35   24%
at CCRL:
fruit = 2685
sungorus = 2311
+374 for fruit

new test:
+70 for sungorus over fruit_sung; fixing sungorus at 2311, we get
fruit_sung = 2311 - 70 = 2241
sungorus = 2311
Overall effect between fruit and fruit_sung = 2685 - 2241 = 444.
I understood the following:
Fruit 2.1 = 2685
Sungorus 1.4 = 2311
Fruit 2.1 (with the bad simple evaluation of Sungorus) = 2685 - 70 = 2615
Thanks for the explanation, Ferdinand. I guess Uri was confused by my bad description. I noticed that I used a resign option in my cutechess commands, and I hope it does not cause problems in the results. One of my jobs with matches between Rodent and Sungorus got stuck on the cluster because I forgot to comment out the InputAvailable() call in Rodent. Once this is resolved, I will run the tournament with Fruit 2.1, Sungorus 1.4, and Fruit-SungEval; I think that will make things clearer.
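In case anyone hits the same problem: the workaround amounts to compiling out the console poll during search. A hypothetical sketch (the NO_INPUT_POLL switch and the handler name are made up, not Rodent's actual source):

Code: Select all

// Hypothetical fragment from the search loop; the NO_INPUT_POLL
// switch and handler name are illustrative, not Rodent's source.
#ifndef NO_INPUT_POLL
    // Polling stdin mid-search can block or spin on a headless
    // cluster where no interactive console is attached.
    if (InputAvailable())
        HandleConsoleInput();  // placeholder for the real handler
#endif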