So what do we miss in the traditional evaluation?

mclane
Posts: 18755
Joined: Thu Mar 09, 2006 6:40 pm
Location: US of Europe, germany
Full name: Thorsten Czub

Re: So what do we miss in the traditional evaluation?

Post by mclane »

Guenther wrote: Fri Jan 29, 2021 11:08 pm
Ferdy wrote: Fri Jan 29, 2021 3:57 pm We have the passed pawn, mobility, king safety, piece value, PST, piece threats etc. There must be something big that we failed to include in the traditional eval, as the gap between NNUE and non-NNUE is huge.
Like others already more or less indicated, it's the transition between structures and piece values/mobility/paths.
Usually we just evaluate for opening/middlegame/endgame phase, but this is a bit like Black and White only, while obviously NN and NNUE and even PST tuning have shown that they are interacting in such a complex way, that they always will prevail against handwritten eval.

Near-tablebase positions might currently still be an exception though. I think with fewer and fewer pieces, concrete evaluation principles might still be better than abstract endgame 'knowledge' from current training, but that's just my opinion.
This „transition“, as you call it, is the plan that the programs don't make.
All this evaluation must lead to something that is more than the material win or the positional win.
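
For readers who have not seen it, the "opening/middlegame/endgame phase" evaluation mentioned in the quote is often done as a tapered eval: one middlegame score and one endgame score, blended by how much material is left. A minimal sketch follows. The phase weights and the function names are assumptions for illustration (a widely used convention), not taken from any engine named in this thread.

```python
# Phase weight per piece type (assumed convention, not from this thread).
PHASE_WEIGHTS = {"N": 1, "B": 1, "R": 2, "Q": 4}
# Full board: 4 knights + 4 bishops + 4 rooks * 2 + 2 queens * 4 = 24.
MAX_PHASE = 24


def game_phase(piece_counts):
    """Return the phase in 0..MAX_PHASE from piece counts of both sides.

    piece_counts maps a piece letter to how many are on the board;
    pawns and kings are ignored.
    """
    phase = sum(PHASE_WEIGHTS[p] * n
                for p, n in piece_counts.items()
                if p in PHASE_WEIGHTS)
    return min(phase, MAX_PHASE)  # promotions could push it past the maximum


def tapered_eval(mg_score, eg_score, phase):
    """Linearly blend middlegame and endgame scores (centipawns)."""
    return (mg_score * phase + eg_score * (MAX_PHASE - phase)) // MAX_PHASE
```

The blend is a single straight line between two hand-written scores, which is the "Black and White only" limitation described in the quote: there is no term for how structure, piece values and mobility interact along the way.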
What seems like a fairy tale today may be reality tomorrow.
Here we have a fairy tale of the day after tomorrow....
mclane
Posts: 18755
Joined: Thu Mar 09, 2006 6:40 pm
Location: US of Europe, germany
Full name: Thorsten Czub

Re: So what do we miss in the traditional evaluation?

Post by mclane »

Even more, you have to consider WHY you put or move a piece here or there.
One primitive way is: I have to put it there because it is said to be good. A knight in the center gives good control.
But this is low-level thinking, as primitive as increasing the evaluation score for that piece.

Better: the knight on e5 could, together with the queen on h5 and the rook coming via f1, f3, g3, create a mating net.
Because mate ends the game immediately, whereas the knight on e5 by itself only increases the evaluation.

So next comes how to prepare this plan in the given position.

It may be that you have to sacrifice the main line (the maximised score) at some stage to carry your plan through.
abgursu
Posts: 91
Joined: Thu May 14, 2020 3:34 pm
Full name: A. B. Gursu

Re: So what do we miss in the traditional evaluation?

Post by abgursu »

Dann Corbit wrote: Fri Jan 29, 2021 10:42 pm I think mobility may be undervalued in traditional evaluations.

When we think of the size of the evaluation data calculated for programs like LC0, I guess we are evaluating 0.1% of the terms that LC0 evaluates.
So in that case we are missing almost everything.
In the case of NNUE type evaluations, we might be examining as much as 1% of the terms.

Since SF nnue was not 1000 Elo stronger than SF, I guess that the additional terms don't add a gigantic amount, but they are clearly very important and more than 100 Elo stronger than the regular SF eval for nearly even positions.

So what do we gain?
We gain the freedom of not chasing down a zillion tedious terms.

So what do we lose?
We do not know what the darn thing is doing, so we gain zero knowledge about how to play better chess.
I had the same thought. I tried setting the mobility values higher something like a billion times last year. I changed everything I could think of but couldn't gain even a single Elo point. I still think mobility must be valued higher, but I couldn't prove it.
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Re: So what do we miss in the traditional evaluation?

Post by Alayan »

The issue is differentiating "useless mobility" from "active/threatening mobility" and from "illusory mobility" where the piece must stay static to defend something.
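
To make the distinction concrete, here is a toy sketch of a mobility term that treats the three kinds differently. Everything in it is an assumption for illustration: the function name, the bonus values, and the idea that the engine already knows which squares are worth attacking and whether the piece is tied to a defensive job. Working those two things out is the hard part, and this sketch does not solve it.

```python
def weighted_mobility(reachable, enemy_targets, tied_to_defence,
                      active_bonus=4, quiet_bonus=1):
    """Score one piece's mobility, separating the three kinds.

    reachable:       set of squares the piece can move to
    enemy_targets:   set of squares worth attacking (hypothetical input)
    tied_to_defence: True if the piece must stay put to defend something
    Bonus values are made-up numbers, not tuned.
    """
    if tied_to_defence:
        # "Illusory" mobility: the moves exist but cannot really be played.
        return 0
    active = len(reachable & enemy_targets)  # "active/threatening" squares
    quiet = len(reachable) - active          # plain squares, possibly "useless"
    return active * active_bonus + quiet * quiet_bonus
```

A plain move count would score all three cases the same, which may be one reason why simply raising the mobility weights does not help.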
Harald
Posts: 318
Joined: Thu Mar 09, 2006 1:07 am

Re: So what do we miss in the traditional evaluation?

Post by Harald »

Maybe this would help:
- Use evaluation patterns and features that last longer and are not overruled by short-term tactics.
- Give the engine enough evaluation patterns that it is never lost in any situation in the game.
- Give it steps and bases in the position space that it can reach one by one and 'follow a plan'.
Like the people playing anti-computer chess. First close the center and one side with pawns.
Then move your pieces to the other side and attack the king.
dkappe
Posts: 1631
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: So what do we miss in the traditional evaluation?

Post by dkappe »

Some practical observations from the training of the Toga III net.

1) training the net on 930m d8 (depth 8) evals, with no game result: it is already some 150 Elo stronger than Toga II's HCE, playing in the original Toga II search.
2) training the resulting net on the same data, but at a much lower learning rate and using lambda 0.5 (i.e. 50% game result): it gets another 50 Elo.
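
As a rough illustration of what the lambda in step 2 does: the training target for each position is a mix of the search score (squashed to an expected result) and the actual game result. This is a minimal sketch under assumptions. The logistic scale of 400 and the function names are made up for the example and are not the values or code used for the Toga III net.

```python
import math


def win_probability(score_cp, scale=400.0):
    """Map a centipawn score to an expected result in (0, 1).

    The logistic scale is an assumed constant for illustration.
    """
    return 1.0 / (1.0 + math.exp(-score_cp / scale))


def blended_target(score_cp, game_result, lam=0.5, scale=400.0):
    """Mix the search eval and the game outcome into one training target.

    lam = 1.0 uses only the eval (as in step 1),
    lam = 0.5 gives the eval and the game result equal weight (as in step 2).
    game_result is 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    """
    return lam * win_probability(score_cp, scale) + (1.0 - lam) * game_result
```

With lam = 1.0 the net only learns to imitate the d8 search score; lowering lam pulls the target toward what actually happened in the game.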

Looking at #1, this yields a function approximation of d8 scores. Would you expect 150 extra Elo from adding 8 ply?
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
Harald
Posts: 318
Joined: Thu Mar 09, 2006 1:07 am

Re: So what do we miss in the traditional evaluation?

Post by Harald »

Please have a look at another thread:
Playing with "The Secret of Chess"
http://www.talkchess.com/forum3/viewtop ... =2&t=76453
Patrice Duhamel
Posts: 193
Joined: Sat May 25, 2013 11:17 am
Location: France
Full name: Patrice Duhamel

Re: So what do we miss in the traditional evaluation?

Post by Patrice Duhamel »

Can we learn something from NNUE networks, using some kind of "reverse engineering"?

Maybe it's a bad idea, but for example: take thousands of winning positions with passed pawns, look at the highest values in the hidden layers, then look at the inputs used to produce these values. Is it possible to find new ideas for traditional evaluation this way?
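
A toy sketch of the procedure described above, to show what I mean. It assumes a single hidden layer with binary input features and a plain ReLU; the function names and the tiny weight layout are hypothetical, and a real NNUE net has a different input encoding and far more inputs and neurons.

```python
def hidden_activations(weights, active_inputs):
    """ReLU activations of one hidden layer for one position.

    weights[h][i] is the weight from input feature i to hidden neuron h;
    active_inputs lists the indices of the features present in the position.
    """
    return [max(0.0, sum(row[i] for i in active_inputs)) for row in weights]


def most_active_neuron(weights, positions):
    """Index of the hidden neuron with the largest total activation."""
    totals = [0.0] * len(weights)
    for active_inputs in positions:
        for h, a in enumerate(hidden_activations(weights, active_inputs)):
            totals[h] += a
    return max(range(len(totals)), key=totals.__getitem__)


def strongest_inputs(weights, neuron, k=2):
    """The k input features with the largest weights into the given neuron."""
    row = weights[neuron]
    return sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
```

Running most_active_neuron over the set of passed-pawn positions and then strongest_inputs on the result gives a list of input features to look at. Whether those features mean anything a human can turn into an eval term is the open question.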
Anything that can go wrong will go wrong.
Paloma
Posts: 1167
Joined: Thu Dec 25, 2008 9:07 pm
Full name: Herbert L

Re: So what do we miss in the traditional evaluation?

Post by Paloma »

I don't think so
mclane
Posts: 18755
Joined: Thu Mar 09, 2006 6:40 pm
Location: US of Europe, germany
Full name: Thorsten Czub

Re: So what do we miss in the traditional evaluation?

Post by mclane »

Alayan wrote: Sat Jan 30, 2021 2:04 pm The issue is differentiating "useless mobility" from "active/threatening mobility" and from "illusory mobility" where the piece must stay static to defend something.
Useless mobility is mobility that does not support the plan.
Active, threatening mobility is most often part of a plan.

The plan is, IMO, the key to the progress of neural nets and NNUE.