Thoughts on eval terms

Kempelen · Post by **Kempelen** » Mon Mar 31, 2014 1:05 pm

Hi,

I don't know if you have observed the same thing as I have. In my engine, when I have an eval component and I try to improve it, or remove some part of it, I don't get any Elo gain or loss. The only thing that matters is having it. For example, I have code that gives a bonus if the side has the right to move when not in check. If I try giving it always (in check or not), I get the same result in tests. Only if I remove that evaluation term do I get an Elo loss.

This has happened to me in the past with other eval terms, and it puzzles me. It is as if the most important thing were to have a good evaluation function, but not a precise one. That said, as I have more important terms in my eval, I don't know how to improve them, since every attempt of this kind fails to give a good result.

Do you have any opinions or experience on this topic?

regards
Fermin

Henk · Post by **Henk** » Mon Mar 31, 2014 1:24 pm

I have trouble measuring improvement when modifying eval. Adding a mobility term appeared to give no gain, because I already counted mobility in the piece-square table. At first I thought it was a gain, but later I ran some extra tests and got contrary results. What is left is a bad-bishop penalty, and even then I am not totally sure that it is a gain.

Other eval terms gave trouble too, for instance a centre pawn blocked by its own pieces. The same holds for isolated and weak pawns. Maybe I have to run thousands of games to know for sure whether something is a gain or not. On the other hand, I don't like my chess program playing awkward moves.
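To put a number on "thousands of games", here is a minimal sketch of how a match result can be turned into an Elo estimate with an approximate 95% interval. It assumes the usual logistic Elo model and a normal approximation for the mean score; the function names and the example win/draw/loss counts are hypothetical, not taken from any engine's test log.

```python
import math

def elo_from_score(score):
    """Convert a score fraction in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high): point estimate and an approximate 95% interval."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0 points per game).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(score),
            elo_from_score(score - z * stderr),
            elo_from_score(score + z * stderr))

# Hypothetical match: 1000 games with a 52% score for the modified version.
elo, low, high = elo_with_margin(wins=370, draws=300, losses=330)
print(f"{elo:+.1f} Elo, 95% interval [{low:+.1f}, {high:+.1f}]")
```

With these made-up counts the interval still straddles zero, which is one way "contrary results" between two test runs can arise even when the term really is a small gain.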

Uri Blass · Post by **Uri Blass** » Mon Mar 31, 2014 2:26 pm

Henk wrote:I have trouble measuring improvement when modifying eval. Adding a mobility term appeared to give no gain, because I already counted mobility in the piece-square table. At first I thought it was a gain, but later I ran some extra tests and got contrary results. What is left is a bad-bishop penalty, and even then I am not totally sure that it is a gain.

Other eval terms gave trouble too, for instance a centre pawn blocked by its own pieces. The same holds for isolated and weak pawns. Maybe I have to run thousands of games to know for sure whether something is a gain or not. On the other hand, I don't like my chess program playing awkward moves.

How can you count mobility in a piece-square table?
I think it is impossible, because the same piece on the same square can have different mobility in different positions.

I think that mobility is something that you need in a chess program and without it the program is going to be weaker.

A piece-square table is not a good replacement for mobility, and you probably need both. But if I had to choose only one of them, I guess it would be better to choose mobility (except when we are talking about pawns).

Stan Arts · Post by **Stan Arts** » Mon Mar 31, 2014 2:33 pm

The sad truth is (I am finding this out again, as lately I've finally regained some interest in computer chess and in writing something again) that search is what wins games: 2-3 real plies extra, with a completely empty eval apart from a single PSQT for all the pieces, seems about enough to overcome a maturely developed evaluation. Search just fills a ton of common knowledge gaps.

So a lot of evaluation code is special-case: the stuff that short-term search doesn't really fill. You watch it play, you notice a gap, you fix it. But the pattern may only happen once every 10, 30 or 100 games. It likely doesn't translate directly into a huge Elo gain, but you still need it. Can you measure an Elo gain for trapped-bishop code? If so, it must be rather small, but it's pretty important to have in a practical sense, because it'll happen in exactly the important games.
I'm sure, though, that all that code combined DOES lead to a rather substantial Elo gain, right? Right..?

Henk · Post by **Henk** » Mon Mar 31, 2014 2:34 pm

Uri Blass wrote:
Henk wrote:I have trouble measuring improvement when modifying eval. Adding a mobility term appeared to give no gain, because I already counted mobility in the piece-square table. At first I thought it was a gain, but later I ran some extra tests and got contrary results. What is left is a bad-bishop penalty, and even then I am not totally sure that it is a gain.

Other eval terms gave trouble too, for instance a centre pawn blocked by its own pieces. The same holds for isolated and weak pawns. Maybe I have to run thousands of games to know for sure whether something is a gain or not. On the other hand, I don't like my chess program playing awkward moves.
How can you count mobility in a piece-square table?
I think it is impossible, because the same piece on the same square can have different mobility in different positions.

I think that mobility is something that you need in a chess program and without it the program is going to be weaker.

A piece-square table is not a good replacement for mobility, and you probably need both. But if I had to choose only one of them, I guess it would be better to choose mobility (except when we are talking about pawns).

Mobility in the piece-square table is just the number of moves the piece has on an empty board.
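As an illustration of that idea, here is a minimal sketch that builds such a table for a knight by counting its moves from every square of an empty board. The function name and the square indexing (a1 = 0, h8 = 63) are assumptions made for the example, not code from any engine in this thread.

```python
# The eight (file, rank) offsets a knight can jump.
KNIGHT_STEPS = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def knight_empty_board_mobility():
    """Return a 64-entry table: knight move count from each square on an empty board."""
    table = []
    for rank in range(8):
        for file in range(8):
            count = sum(1 for df, dr in KNIGHT_STEPS
                        if 0 <= file + df < 8 and 0 <= rank + dr < 8)
            table.append(count)
    return table

table = knight_empty_board_mobility()
# Print with rank 8 at the top, as a board is usually drawn.
for rank in reversed(range(8)):
    print(" ".join(str(table[rank * 8 + f]) for f in range(8)))
```

The table is computed once and never changes during the game, so it costs nothing at search time, but for the same reason it cannot see blocking pieces, which is the objection raised in the quoted post.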

If I compute mobility in eval, I lose at least one ply. And nothing beats an extra ply, they say. So, for instance, my eval would be 0.1 pawns more accurate, but in some positions it wouldn't see that it loses a piece or gets checkmated.

Stan Arts · Post by **Stan Arts** » Mon Mar 31, 2014 2:49 pm

Henk wrote: If I compute mobility in eval, I lose at least one ply. And nothing beats an extra ply, they say. So, for instance, my eval would be 0.1 pawns more accurate, but in some positions it wouldn't see that it loses a piece or gets checkmated.

Losing a ply is a 200-300% slowdown. That sounds a bit off.
I can calculate mobility ten times and raytrace a dancing dinosaur before that happens.
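The arithmetic behind that can be sketched as follows, assuming the time to reach depth d grows roughly like EBF^d, where EBF is the effective branching factor. The EBF values of 2 and 3 are an assumption read off the "200-300%" figure, and `plies_lost` is a hypothetical helper name.

```python
import math

def plies_lost(slowdown, ebf):
    """Plies of depth lost when the search runs `slowdown` times slower.

    Assumes time-to-depth grows like ebf ** depth.
    """
    return math.log(slowdown) / math.log(ebf)

for ebf in (2.0, 3.0):
    for slowdown in (1.1, 1.5, 2.0, 3.0):
        print(f"EBF {ebf}: {slowdown}x slower -> {plies_lost(slowdown, ebf):.2f} ply")
```

Under this model a 10% slower engine gives up only a fraction of a ply; a full ply costs a slowdown equal to the effective branching factor itself.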

Ferdy · Post by **Ferdy** » Tue Apr 01, 2014 12:19 am

Fermin wrote:I have code that gives a bonus if the side has the right to move when not in check. If I try giving it always (in check or not), I get the same result in tests.

There should not be a problem with this. If in check, you really need the move; the same holds if not in check, where the engine most often needs the move to get ahead with active piece placement, unless the position is a zugzwang. Did you brutally test this with 30k games or more?

There are eval features that require more games before they can be retired. It is even more subtle if the feature requires more games at a longer time control (TC).

I have implemented an eval feature that works at TC 40/15s and 40/60s but does not work at 40/180s.
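For a rough idea of why a figure like 30k games comes up, here is a minimal sketch that estimates how many games are needed before a given Elo difference stands out from noise. It assumes two nearly equal engines, a fixed draw ratio (30% here, purely illustrative), a normal approximation, and a 95% two-sided margin; `games_needed` is a hypothetical helper.

```python
import math

def games_needed(elo_diff, draw_ratio=0.3, z=1.96):
    """Approximate games needed for `elo_diff` to exceed the z-sigma error margin.

    Assumes near-equal engines, so wins and losses each take (1 - draw_ratio) / 2.
    """
    # Expected score for the given Elo difference (logistic model).
    score = 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))
    edge = score - 0.5
    # Per-game standard deviation of the result when wins and losses are balanced.
    sigma = math.sqrt((1.0 - draw_ratio) / 4.0)
    return math.ceil((z * sigma / edge) ** 2)

for elo in (20, 10, 5, 3, 2):
    print(f"{elo:>2} Elo: about {games_needed(elo)} games")
```

The required number of games grows with the inverse square of the Elo difference, so halving the effect you want to detect roughly quadruples the test length.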

Henk · Post by **Henk** » Tue Apr 01, 2014 12:38 am

Stan Arts wrote:
Henk wrote: If I compute mobility in eval, I lose at least one ply. And nothing beats an extra ply, they say. So, for instance, my eval would be 0.1 pawns more accurate, but in some positions it wouldn't see that it loses a piece or gets checkmated.
Losing a ply is a 200-300% slowdown. That sounds a bit off.
I can calculate mobility ten times and raytrace a dancing dinosaur before that happens.

If I have 16 pieces and, say, 40 moves, I get a factor of 40/16. That's more than 200%.

Uri Blass · Post by **Uri Blass** » Tue Apr 01, 2014 1:40 am

Henk wrote:
Stan Arts wrote:
Henk wrote: If I compute mobility in eval, I lose at least one ply. And nothing beats an extra ply, they say. So, for instance, my eval would be 0.1 pawns more accurate, but in some positions it wouldn't see that it loses a piece or gets checkmated.
Losing a ply is a 200-300% slowdown. That sounds a bit off.
I can calculate mobility ten times and raytrace a dancing dinosaur before that happens.
If I have 16 pieces and, say, 40 moves, I get a factor of 40/16. That's more than 200%.

I do not understand how you get anything close to that, unless you have only a piece-square table and mobility in the evaluation. Even in that case, you are only measuring the speed of a cheap evaluation function, when most of the time is not spent in the evaluation function.
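A minimal sketch of this argument: if the evaluation takes only a fraction of total running time, making the eval 40/16 = 2.5 times more expensive slows the whole engine by much less than 2.5 times. The eval-time shares below are made-up illustrations, and `overall_slowdown` is a hypothetical helper.

```python
def overall_slowdown(eval_fraction, eval_cost_factor):
    """Whole-engine slowdown when only the eval becomes `eval_cost_factor` times slower.

    `eval_fraction` is the share of total time spent in eval before the change.
    """
    return (1.0 - eval_fraction) + eval_fraction * eval_cost_factor

# Illustrative eval shares: 10%, 30% and 100% of total time (assumed, not measured).
for fraction in (0.1, 0.3, 1.0):
    print(f"eval share {fraction:.0%}: engine {overall_slowdown(fraction, 2.5):.2f}x slower")
```

Only in the extreme case where eval is all of the running time does the engine slow down by the full 2.5 times.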

jdart · Post by **jdart** » Tue Apr 01, 2014 4:02 am

Intuition and experiment used to be the way to tune this stuff, but the state of the art nowadays is to have an automated testing framework and where possible also do automatic optimization of parameter values. You still need to have a reasonable starting point for an eval, and you need to feed in new ideas, but the idea is that it is possible to automate the testing and tuning of these.

Besides CLOP, which is one optimization tool, I have also done some experimentation with SMAC (http://www.cs.ubc.ca/labs/beta/Projects/SMAC/) and am looking at Opal (https://github.com/dpo/opal).

--Jon
