So what do we miss in the traditional evaluation?

Ferdy · Post by **Ferdy** » Fri Jan 29, 2021 3:57 pm

We have passed pawns, mobility, king safety, piece values, PST, piece threats, etc. There must be something big that we failed to include in the traditional eval, as the gap between NNUE and non-NNUE is huge.
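For reference, a hand-written eval of the kind listed above usually boils down to a weighted sum of independent terms. The sketch below only illustrates that shape: the feature names, weights and position values are made up for the example and are not taken from any engine.

```python
# Hypothetical weights (centipawns per unit of each feature), for illustration only.
WEIGHTS = {
    "material": 1.0,
    "pst": 1.0,
    "mobility": 4.0,
    "passed_pawns": 20.0,
    "king_safety": 1.0,
    "threats": 1.0,
}


def evaluate(features):
    """Return a score as a weighted sum of independent feature values.

    Each feature value is assumed to be (white minus black).
    """
    return sum(WEIGHTS[name] * value for name, value in features.items())


# Made-up feature values for a single position.
position = {
    "material": 100,     # one pawn up
    "pst": 15,
    "mobility": 3,       # three more legal moves than the opponent
    "passed_pawns": 1,
    "king_safety": -30,
    "threats": 0,
}

score = evaluate(position)
```

Each term contributes on its own here. Nothing in this form lets, say, mobility count for more when the king is exposed, unless someone adds that combined term by hand.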

cdani · Post by **cdani** » Fri Jan 29, 2021 4:35 pm

Ferdy wrote: ↑Fri Jan 29, 2021 3:57 pm We have passed pawns, mobility, king safety, piece values, PST, piece threats, etc. There must be something big that we failed to include in the traditional eval, as the gap between NNUE and non-NNUE is huge.

Probably the coordination between a lot of different parameters is much more important than thought. Nothing obvious or easy to extract.

Ferdy · Post by **Ferdy** » Fri Jan 29, 2021 7:24 pm

cdani wrote: ↑Fri Jan 29, 2021 4:35 pm
Ferdy wrote: ↑Fri Jan 29, 2021 3:57 pm We have passed pawns, mobility, king safety, piece values, PST, piece threats, etc. There must be something big that we failed to include in the traditional eval, as the gap between NNUE and non-NNUE is huge.
Probably the coordination between a lot of different parameters is much more important than thought. Nothing obvious or easy to extract.

I think we did try something similar in the past, but were not persistent.

Angle · Post by **Angle** » Fri Jan 29, 2021 9:08 pm

I think that most of all we miss some very important nuances of endgames: pawn structure, piece/pawn location and coordination, opposition, fortress detection, zugzwang motifs, fake and real piece activity, piece paralysis, passed pawns that are really dangerous and others that only seem dangerous, king support of passed pawns, and many other sophisticated things which make some endgames won and others (with the same material, piece mobility, centralization, passed pawn advancement) drawn.

mclane · Post by **mclane** » Fri Jan 29, 2021 10:23 pm

Evaluation is not playing chess. And maximising evaluation is not playing chess.

If you see pictures of people and evaluate them on a scale of 1-10, or if you want 1-32768 or whatever number, you can order the people according to your evaluation.
But if you meet these people in real life, will this scale, this evaluation, help you to understand them?!

Chess is not about evaluation.
Just as life is not about giving people a number value.

That is the thing we miss.

Chess is about finding a plan to mate the opponent.

Sometimes you need to sacrifice a move that has a high evaluation and make a weaker move, if this helps your plan continue successfully on the chess board.

If the engine believes that maximising the evaluation is the point of chess, then this leads to what we see today.

Normal AB programs maximising the evaluation score.

But this is not chess.

Just as evaluating people is not life.

Life is not about evaluating but about finding a plan to increase the spirit/synchronicity in your life.

Without a plan, without perspective, without sense and wisdom, life is boring and a mistake.

E.g. building a family is such a PLAN. A house, a tree, a wife and a child.

Chess programs need to find a plan, and believe me, the plan is NOT the main line of today’s AB engines.

That is the thing we miss, IMO.

Dann Corbit · Post by **Dann Corbit** » Fri Jan 29, 2021 10:42 pm

I think mobility may be undervalued in traditional evaluations.

When we think of the size of the evaluation data calculated for programs like LC0, I guess we are evaluating 0.1% of the terms that LC0 evaluates.
So in that case we are missing almost everything.
In the case of NNUE type evaluations, we might be examining as much as 1% of the terms.

Since SF nnue was not 1000 Elo stronger than SF, I guess that the additional terms don't add a gigantic amount, but they are clearly very important and more than 100 Elo stronger than the regular SF eval for nearly even positions.

So what do we gain?
We gain the freedom of not chasing down a zillion tedious terms.

So what do we lose?
We do not know what the darn thing is doing, so we gain zero knowledge about how to play better chess.

Guenther · Post by **Guenther** » Fri Jan 29, 2021 10:44 pm

mclane wrote: ↑Fri Jan 29, 2021 10:23 pm Evaluation is not playing chess. And maximising evaluation is not playing chess.

If you see pictures of people and evaluate them on a scale of 1-10, or if you want 1-32768 or whatever number, you can order the people according to your evaluation.
But if you meet these people in real life, will this scale, this evaluation, help you to understand them?!

Chess is not about evaluation.
Just as life is not about giving people a number value.

That is the thing we miss.

Chess is about finding a plan to mate the opponent.

Sometimes you need to sacrifice a move that has a high evaluation and make a weaker move, if this helps your plan continue successfully on the chess board.

If the engine believes that maximising the evaluation is the point of chess, then this leads to what we see today.

Normal AB programs maximising the evaluation score.

But this is not chess.

Just as evaluating people is not life.

Life is not about evaluating but about finding a plan to increase the spirit/synchronicity in your life.

Without a plan, without perspective, without sense and wisdom, life is boring and a mistake.

E.g. building a family is such a PLAN. A house, a tree, a wife and a child.

Chess programs need to find a plan, and believe me, the plan is NOT the main line of today’s AB engines.

That is the thing we miss, IMO.

I did not miss your posts - they just show a deep misunderstanding (for decades).
And chess is not life, so much for your analogy.

Frank Quisinsky · Post by **Frank Quisinsky** » Fri Jan 29, 2021 11:05 pm

Hi there,

Dann Corbit wrote: ↑Fri Jan 29, 2021 10:42 pm I think mobility may be undervalued in traditional evaluations.

That's the point!
We are learning that more aggressive pawn play gives more dynamics and mobility in the middlegame.

I often think that in Benoni or Sicilian Defence openings, among others, the game is at best a draw for Black if an important pawn in the middle of the board is backward. Over time, engines like Stockfish or Dragon by Komodo play against the weak point very consistently. A human would have no chance of a draw in such openings.

I am quite sure that our opening theory must be reworked.
And all the strong engines can help a lot with that.

Very often I see attacking moves with the h-pawn or g-pawn against the opponent's castled position very early in games.
For years we thought of that as coffee-house chess, the kind Junior sometimes liked to play.

We are learning that bishops are much weaker than knights in closed pawn endgames.
How often I have seen a knight pair hold the advantage in closed endgame positions with many pawns when the rooks are off the board. Many engines try to keep the bishop pair in closed positions and lose the games.

Looking at engine-engine games is much more interesting than in the times when Rybka or Shredder were in first place. Today we see a completely different computer chess, and what we see is just great, fantastic!

Best
Frank

Guenther · Post by **Guenther** » Fri Jan 29, 2021 11:08 pm

Ferdy wrote: ↑Fri Jan 29, 2021 3:57 pm We have passed pawns, mobility, king safety, piece values, PST, piece threats, etc. There must be something big that we failed to include in the traditional eval, as the gap between NNUE and non-NNUE is huge.

Like others already more or less indicated, it's the transition between structures and piece values/mobility/paths.
Usually we just evaluate for opening/middlegame/endgame phase, but this is a bit like black and white only, while NN, NNUE and even PST tuning have shown that these factors interact in such a complex way that learned evaluations will always prevail against handwritten eval.

Near-tablebase positions might currently still be an exception, though. I think with fewer and fewer pieces, concrete evaluation principles might still be better than abstract endgame 'knowledge' from current training, but that's just my opinion.
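To make the phase-based scheme above concrete: it is commonly implemented as a tapered eval, where a middlegame score and an endgame score are blended by one phase number derived from the remaining material. This is a minimal sketch, assuming the conventional phase weights (knight 1, bishop 1, rook 2, queen 4, 24 in total); the function names are hypothetical.

```python
# Conventional phase weights per non-pawn piece (an assumption for this sketch).
PHASE_WEIGHTS = {"N": 1, "B": 1, "R": 2, "Q": 4}
MAX_PHASE = 24  # 4 knights + 4 bishops + 4 rooks * 2 + 2 queens * 4


def game_phase(piece_counts):
    """Map total piece counts (both sides) to a phase in 0..MAX_PHASE.

    MAX_PHASE means all pieces are on the board, 0 means only kings and pawns.
    """
    phase = sum(
        PHASE_WEIGHTS[piece] * count
        for piece, count in piece_counts.items()
        if piece in PHASE_WEIGHTS
    )
    return min(phase, MAX_PHASE)  # promotions could otherwise exceed the cap


def tapered(mg_score, eg_score, phase):
    """Linearly interpolate between middlegame and endgame scores."""
    return (mg_score * phase + eg_score * (MAX_PHASE - phase)) / MAX_PHASE


# A passed pawn worth 40 in the middlegame and 100 in the endgame,
# seen in a position where only two rooks per side remain.
phase = game_phase({"R": 4})
bonus = tapered(40, 100, phase)
```

The only thing steering the blend is total material, which is the "black and white" coarseness described above: pawn structure, piece placement and everything else have no say in how the two scores are mixed.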

Uri Blass · Post by **Uri Blass** » Fri Jan 29, 2021 11:51 pm

mclane wrote: ↑Fri Jan 29, 2021 10:23 pm Evaluation is not playing chess. And maximising evaluation is not playing chess.

If you see pictures of people and evaluate them on a scale of 1-10, or if you want 1-32768 or whatever number, you can order the people according to your evaluation.
But if you meet these people in real life, will this scale, this evaluation, help you to understand them?!

Chess is not about evaluation.
Just as life is not about giving people a number value.

That is the thing we miss.

Chess is about finding a plan to mate the opponent.

Sometimes you need to sacrifice a move that has a high evaluation and make a weaker move, if this helps your plan continue successfully on the chess board.

If the engine believes that maximising the evaluation is the point of chess, then this leads to what we see today.

Normal AB programs maximising the evaluation score.

But this is not chess.

Just as evaluating people is not life.

Life is not about evaluating but about finding a plan to increase the spirit/synchronicity in your life.

Without a plan, without perspective, without sense and wisdom, life is boring and a mistake.

E.g. building a family is such a PLAN. A house, a tree, a wife and a child.

Chess programs need to find a plan, and believe me, the plan is NOT the main line of today’s AB engines.

That is the thing we miss, IMO.

Evaluation is not playing chess, but it is part of how engines play chess, including non-traditional engines.
I consider planning part of the search, because basically you think about putting the pieces on squares other than the ones they are on, even if it is not only searching legal moves as engines do.

Say I want my knight on g1 to go to d5, so I see that a possible path is g1-e2-c3-d5. That is not the way engines calculate, but it is basically the same type of thinking, because you place pieces in your imagination on squares other than the ones they occupy, and that is searching.
Humans still need to check the opponent's moves after they find the line g1-e2-c3-d5 and evaluate the final position.
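The knight example can be written down as a tiny search of its own. This is only a sketch of the "path planning" idea, assuming an empty board and no opponent moves; the helper names are made up for the illustration. It runs a breadth-first search over knight jumps and returns one shortest route, which need not be the exact g1-e2-c3-d5 line.

```python
from collections import deque

# The eight (file, rank) offsets a knight can jump.
KNIGHT_JUMPS = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def to_coords(square):
    """Convert a name like 'g1' to zero-based (file, rank)."""
    return ord(square[0]) - ord("a"), int(square[1]) - 1


def to_square(file, rank):
    """Convert zero-based (file, rank) back to a name like 'g1'."""
    return "abcdefgh"[file] + str(rank + 1)


def knight_path(start, goal):
    """Return one shortest knight route from start to goal on an empty board."""
    start_c, goal_c = to_coords(start), to_coords(goal)
    parent = {start_c: None}  # also serves as the visited set
    queue = deque([start_c])
    while queue:
        current = queue.popleft()
        if current == goal_c:
            break
        for df, dr in KNIGHT_JUMPS:
            nxt = (current[0] + df, current[1] + dr)
            if 0 <= nxt[0] < 8 and 0 <= nxt[1] < 8 and nxt not in parent:
                parent[nxt] = current
                queue.append(nxt)
    # Walk back from the goal to the start through the parent links.
    path = []
    node = goal_c
    while node is not None:
        path.append(to_square(*node))
        node = parent[node]
    return path[::-1]


route = knight_path("g1", "d5")
```

The route comes out three moves long, like the human plan. What the sketch leaves out is exactly the part mentioned above: checking the opponent's replies along the way and evaluating the final position.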
