More accurate evaluation function leads to worse play?

xylist
Posts: 3
Joined: Fri Feb 07, 2025 6:06 am
Full name: Zhongle C. Qu

More accurate evaluation function leads to worse play?

Post by xylist »

Hi everyone. I've recently been trying to make my engine's evaluation function more accurate. To measure accuracy, I selected 50k quiet positions (side to move not in check, best move not a capture or check), analyzed them at low depth with both my engine and Stockfish, and compared the results with the static evaluation. I computed R^2 scores (coefficient of determination), and there is indeed an improvement (from 0.38 to 0.52). However, when I test the engine, the playing strength actually dropped (-180 Elo). This feels so counterintuitive. Shouldn't a more accurate evaluation function result in a gain in playing strength?
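
For readers who want to reproduce the measurement, here is a minimal sketch of the R^2 computation. It assumes the low-depth search score is the target and the static evaluation is the prediction, which is one reading of the description above. The function name and the sample numbers are made up for illustration and are not taken from the engine.

```python
def r_squared(targets, predictions):
    """Coefficient of determination of predictions against targets."""
    if len(targets) != len(predictions) or not targets:
        raise ValueError("need two equally sized, non-empty lists")
    mean_target = sum(targets) / len(targets)
    # Residual sum of squares: how far each prediction is from its target.
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    # Total sum of squares: spread of the targets around their own mean.
    ss_tot = sum((t - mean_target) ** 2 for t in targets)
    if ss_tot == 0:
        raise ValueError("targets have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical centipawn values for five positions (illustration only).
search_scores = [30, -120, 55, 0, 210]
static_evals = [25, -90, 80, -10, 150]
print(r_squared(search_scores, static_evals))
```

Note that R^2 rewards absolute agreement. An evaluation that is exactly twice the target in every position orders all positions correctly, yet `r_squared([1, 2, 3], [2, 4, 6])` is negative.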
benvining
Posts: 31
Joined: Fri May 30, 2025 10:18 pm
Full name: Ben Vining

Re: More accurate evaluation function leads to worse play?

Post by benvining »

Maybe not if the upgraded eval function takes way more time than the simpler one?
xylist
Posts: 3
Joined: Fri Feb 07, 2025 6:06 am
Full name: Zhongle C. Qu

Re: More accurate evaluation function leads to worse play?

Post by xylist »

benvining wrote: Thu Jul 17, 2025 5:30 am Maybe not if the upgraded eval function takes way more time than the simpler one?
The upgraded one is around 1.3x slower, so that shouldn't be a huge problem. However, I noticed that the branching factor has increased a little, and the engine is now looking at a lot more nodes. But I have no idea why this is happening.
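
One rough way to quantify the increase is the effective branching factor, estimated as the ratio of node counts between successive iterative-deepening iterations. The sketch below assumes the engine can report the number of nodes searched in each iteration (not a running total). The function name and the sample counts are hypothetical.

```python
def effective_branching_factors(nodes_per_iteration):
    """Ratio of nodes searched at depth d to nodes searched at depth d - 1."""
    pairs = zip(nodes_per_iteration, nodes_per_iteration[1:])
    return [current / previous for previous, current in pairs]


# Hypothetical node counts for depths 1..5, old and new evaluation.
old_eval_nodes = [40, 130, 400, 1250, 3900]
new_eval_nodes = [40, 150, 540, 2000, 7400]
print(effective_branching_factors(old_eval_nodes))
print(effective_branching_factors(new_eval_nodes))
```

Comparing the two lists on the same set of positions shows at which depths the search starts to grow faster.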
Bo Persson
Posts: 259
Joined: Sat Mar 11, 2006 8:31 am
Location: Malmö, Sweden
Full name: Bo Persson

Re: More accurate evaluation function leads to worse play?

Post by Bo Persson »

When you have a more "accurate" evaluation, you might also get a wider spread of distinct scores. If you have scores 10, 10, 10 you can get cut-offs from "no improvement", but scores 11, 10, 12 might require more search to tell them apart.

It is common to have to balance speed and "accuracy" in the program, and to realize that some evaluation terms might just be too expensive to compute. Getting the correct answer too late doesn't help.
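
The cut-off effect can be seen on a contrived two-ply tree. The sketch below is plain alpha-beta over nested lists, not any particular engine's search. The tree shapes and leaf scores are made up, with the first branch reusing the 10, 10, 10 and 11, 10, 12 figures from above.

```python
import math


def alphabeta(node, alpha, beta, maximizing, counter):
    # A leaf is an int (its static evaluation); an inner node is a list of children.
    if isinstance(node, int):
        counter[0] += 1  # count every leaf that actually gets evaluated
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # beta cut-off
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, best)
        if alpha >= beta:
            break  # this reply already shows the move is "no improvement"
    return best


def leaves_visited(tree):
    counter = [0]
    value = alphabeta(tree, -math.inf, math.inf, True, counter)
    return value, counter[0]


# Three root moves, three replies each (hypothetical scores).
coarse = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
fine = [[11, 10, 12], [12, 11, 9], [13, 12, 8]]
print(leaves_visited(coarse))  # (10, 5)
print(leaves_visited(fine))    # (10, 9)
```

Both trees have the same root value, but with the coarse scores the second and third root moves are refuted by their first reply, while the fine-grained scores force the search to look at every leaf. Real engines depend heavily on move ordering, so the size of the effect in practice will differ.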
Tibono
Posts: 129
Joined: Sat Aug 01, 2015 6:16 pm
Location: France
Full name: Eric Bonneau

Re: More accurate evaluation function leads to worse play?

Post by Tibono »

I think a better evaluation is one that your engine is more "comfortable" (i.e. efficient) with.
Getting closer to Stockfish's eval may have drifted it away from the positions it manages best.
Just my 2 cents...
xylist
Posts: 3
Joined: Fri Feb 07, 2025 6:06 am
Full name: Zhongle C. Qu

Re: More accurate evaluation function leads to worse play?

Post by xylist »

Tibono wrote: Thu Jul 17, 2025 6:35 pm I think a better evaluation is one that your engine is more "comfortable" (i.e. efficient) with.
Getting closer to Stockfish's eval may have drifted it away from the positions it manages best.
Just my 2 cents...
What do you mean by "positions it manages best"?
Tibono
Posts: 129
Joined: Sat Aug 01, 2015 6:16 pm
Location: France
Full name: Eric Bonneau

Re: More accurate evaluation function leads to worse play?

Post by Tibono »

xylist wrote: Fri Jul 18, 2025 7:29 am What do you mean by "positions it manages best"?
Like any engine, yours has strengths and weaknesses. I mean positions where it can show and rely on its strengths rather than expose itself to danger because of some potential weakness.
Tuning the eval towards SF's may lead to moves that are unnatural with regard to your engine's skills, IMHO.
Uri Blass
Posts: 10803
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: More accurate evaluation function leads to worse play?

Post by Uri Blass »

Changing the evaluation may change the engine's search, so even simply multiplying the evaluation by 2 can reduce or improve playing strength because of different pruning, even though the evaluation is exactly as accurate as before.
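
A small sketch of why scaling matters: pruning margins are usually fixed constants in centipawns, so they do not scale along with the evaluation. The example uses a futility-pruning style test. Whether the engine in question uses this technique is not known, and the margin of 200 and the function name are hypothetical.

```python
FUTILITY_MARGIN = 200  # hypothetical fixed margin in centipawns


def futility_prune(static_eval, alpha, margin=FUTILITY_MARGIN):
    """True if quiet moves would be skipped because even eval + margin cannot reach alpha."""
    return static_eval + margin <= alpha


# Original evaluation: the node is 150 below alpha, inside the margin, so it is searched.
print(futility_prune(static_eval=-50, alpha=100))   # False

# Same position with every score doubled: alpha (which comes from doubled
# evaluations elsewhere in the tree) doubles too, but the margin stays at 200.
print(futility_prune(static_eval=-100, alpha=200))  # True
```

The same applies to other constants tuned against the old score range, such as aspiration windows or razoring and null-move margins, so a rescaled or reshaped evaluation may need those constants retuned before a strength test is meaningful.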
Uri Blass
Posts: 10803
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: More accurate evaluation function leads to worse play?

Post by Uri Blass »

I can add that an evaluation that is more accurate in absolute terms can be less accurate on the question of whether position A is better than position B.

If an engine is too optimistic in every position, it may still know that position A is better than B, and know it better than after you change the evaluation to be correct in only some of the cases.

The engine may now prefer a losing move over a drawing move that it knows is a draw, whereas earlier it preferred the drawing move because it scored it as +2 and the losing alternative as only +1.
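
The +2 / +1 example can be put into numbers. In the sketch below the scores are in pawns. The "true" value of -3 for the losing move is an assumption added for illustration, and the dictionary and function names are hypothetical.

```python
# Assumed ground truth: the drawing move is worth 0, the losing move -3.
true_value = {"drawing move": 0, "losing move": -3}

# Old evaluation: too optimistic everywhere, but the order is right.
optimistic = {"drawing move": 2, "losing move": 1}

# New evaluation: now correct about the draw, still optimistic about the loss.
partly_fixed = {"drawing move": 0, "losing move": 1}


def mean_squared_error(estimate, truth):
    return sum((estimate[m] - truth[m]) ** 2 for m in truth) / len(truth)


def preferred_move(estimate):
    # The engine picks the move with the highest score.
    return max(estimate, key=estimate.get)


print(mean_squared_error(optimistic, true_value), preferred_move(optimistic))
# 10.0 drawing move
print(mean_squared_error(partly_fixed, true_value), preferred_move(partly_fixed))
# 8.0 losing move
```

The partly fixed evaluation has the lower error, so an error-based score such as R^2 would report an improvement, yet it is the one that picks the losing move.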