importance of evaluation in chess

Uri Blass · Post by **Uri Blass** » Wed Mar 05, 2008 9:46 am

I changed the strelka code to use only a piece square table evaluation with a small bonus for the side to move, and tested it against the normal strelka in fixed-depth matches. (Using a piece square table means that strelka still knows to centralize the king in the endgame, because the table is an average of the opening and endgame tables.)
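As a rough illustration of what such a stripped-down evaluation looks like, here is a minimal Python sketch. It is not Strelka's code: the table values, the `TEMPO_BONUS` constant, and the helper names are all hypothetical, and only the king and knight are covered to keep it short.

```python
# Sketch of a piece-square-table-only evaluation (NOT Strelka's actual code).
# All table values and the tempo bonus are made-up illustrative numbers.

TEMPO_BONUS = 10  # hypothetical small bonus for the side to move, in centipawns


def centre_bonus(square):
    """0 in the corners, larger toward the centre; square is 0..63 (a1=0, h8=63)."""
    file, rank = square % 8, square // 8
    return min(file, 7 - file) + min(rank, 7 - rank)


# Hypothetical king tables: the opening table prefers the edge,
# the endgame table prefers the centre.
KING_OPENING = [-10 * centre_bonus(sq) for sq in range(64)]
KING_ENDGAME = [20 * centre_bonus(sq) for sq in range(64)]

# One averaged table, as described in the post. With these numbers the
# endgame preference outweighs the opening one, so the king is still
# drawn toward the centre.
KING_PST = [(o + e) // 2 for o, e in zip(KING_OPENING, KING_ENDGAME)]

# Hypothetical knight table: material value plus a centralization bonus.
KNIGHT_PST = [300 + 5 * centre_bonus(sq) for sq in range(64)]

PST = {"K": KING_PST, "N": KNIGHT_PST}


def evaluate(white, black, white_to_move):
    """Return the score from the point of view of the side to move.

    white, black: lists of (piece_letter, square) tuples.
    Black squares are mirrored vertically (square ^ 56) so that one
    table serves both colours.
    """
    score = sum(PST[piece][sq] for piece, sq in white)
    score -= sum(PST[piece][sq ^ 56] for piece, sq in black)
    if not white_to_move:
        score = -score
    return score + TEMPO_BONUS
```

With everything else removed, the only positional knowledge left is whatever the tables encode, which is why the choice of table values matters so much in a test like this.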

The results were

normal 1 vs weak 1: 26-3 and 11 draws
weak 2 vs normal 1: 28-7 and 5 draws
normal 2 vs weak 2: 27-10 and 3 draws
weak 3 vs normal 2: 22-11 and 7 draws
normal 3 vs weak 4: 19-17 and 4 draws
normal 6 vs weak 8: 21-17 and 2 draws

The last result means that the normal version was used at a fixed depth of 6 plies against 8 plies for the piece square table version, and similarly for the other results.
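To compare the matches on one scale, the results can be converted to a score percentage and an approximate rating difference. The sketch below assumes that each result is given as wins-losses for the first-named side, and it uses the standard logistic Elo formula, which the post itself does not use. With only 40 games per match the error margins are wide, so the numbers are indicative at best.

```python
import math

# (label, wins, losses, draws) for the first-named side, copied from the post.
RESULTS = [
    ("normal 1 vs weak 1", 26, 3, 11),
    ("weak 2 vs normal 1", 28, 7, 5),
    ("normal 2 vs weak 2", 27, 10, 3),
    ("weak 3 vs normal 2", 22, 11, 7),
    ("normal 3 vs weak 4", 19, 17, 4),
    ("normal 6 vs weak 8", 21, 17, 2),
]


def score_fraction(wins, losses, draws):
    """Fraction of available points scored, counting a draw as half a point."""
    return (wins + 0.5 * draws) / (wins + losses + draws)


def elo_difference(score):
    """Rating difference implied by a score under the logistic Elo model.

    Only defined for 0 < score < 1.
    """
    return -400.0 * math.log10(1.0 / score - 1.0)


for label, wins, losses, draws in RESULTS:
    s = score_fraction(wins, losses, draws)
    print(f"{label}: {s:.1%} ({elo_difference(s):+.0f} Elo)")
```

The two matches closest to an even score are normal 3 vs weak 4 and normal 6 vs weak 8, which is what the 3-to-4 ply comparison rests on.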

Note that the piece square table is not optimal: I did not include the bonus for pushing pawns that strelka has in its passed pawn code, so the weak version does not know to push passed pawns forward.

My conclusion is that 4 plies of the very simple evaluation are probably close to equivalent to 3 plies of strelka's normal evaluation.

I am going to see if this theory continues to hold when I test 9 plies of normal strelka against 12 plies of weak strelka.

Uri

Uri Blass · Post by **Uri Blass** » Wed Mar 05, 2008 11:55 am

I finished the 9 against 12 match, and in this case 9 plies won more convincingly: 23-13 and 4 draws. So maybe 4 plies of simple evaluation are worth slightly less than 3 plies of normal strelka, and maybe simple strelka needs 5 plies at higher depths to compensate for 3 plies of normal evaluation.

Uri

bob · Post by **bob** » Wed Mar 05, 2008 7:37 pm

Uri Blass wrote: I finished the 9 against 12 match, and in this case 9 plies won more convincingly: 23-13 and 4 draws. So maybe 4 plies of simple evaluation are worth slightly less than 3 plies of normal strelka, and maybe simple strelka needs 5 plies at higher depths to compensate for 3 plies of normal evaluation.

Uri

I think the explanation is simpler than that. At shallow depths, you depend solely on your evaluation to keep you out of tactical trouble, but that is not what it is particularly well-suited to accomplish. So a significantly deeper search will usually win. But as the depths increase, the better evaluation begins to become more effective because the deeper search enables it to see more tactics and avoid the trouble seen at shallower depths. A sort of "diminishing returns" scenario.

I've done this test in the past myself, and it is one reason I won't draw any conclusions based on the kind of testing someone claimed was used on Rybka (80,000 one second games to determine if a new eval term was good or bad).

hgm · Post by **hgm** » Wed Mar 05, 2008 9:26 pm

I think you have to be very careful with Pawn evaluation, to make sure that there is a tendency to push Pawns in the end-game. Especially in fixed-depth games, where it might not have the promotion within the horizon. Earlier versions of micro-Max suffered from reluctance to push Pawns, and so it frequently happened that they drew end-games like KNPPPPPK, if none of the Pawns was advanced beyond the 3rd rank.

Not being able to win such totally won end-games would seriously compromise the test results. In the most favorable case they would be extremely sensitive to how exactly the piece-square table of the pawns looks. What seems to work very well in uMax is to make a Pawn on 6th rank worth about 180 cP, and on 7th about 260 cP. uMax also needs a Pawn-push bonus that increases as material disappears, to compensate for a penalty it gets when pushing a pawn when the pawn two squares left or right of it is missing. This penalty prevents it from recklessly pushing its complete pawn shield forwards in the opening, although pushing two or three pawns can be done without incurring the penalty. But in the end-game, where the Pawns get few and far between, this penalty would effectively paralyse the pawns. If you don't have such a pawn-structure term, you also wouldn't need to increase the pawn-push bonus during the game. All that matters is that pushing Pawns always scores slightly positive, so that the engine starts pushing them once its other pieces are optimally placed.
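The requirement that pushing a pawn always scores slightly positive can be checked mechanically. The following Python sketch is only an illustration, not uMax code: the 6th-rank and 7th-rank values are the ones quoted above, while the values for ranks 2 to 5 and the helper names are hypothetical.

```python
# Pawn value by rank, from the pawn owner's point of view, in centipawns.
# Only the 6th-rank (180) and 7th-rank (260) values come from the post;
# the values for ranks 2-5 are hypothetical fillers chosen to rise steadily.
PAWN_VALUE_BY_RANK = {2: 100, 3: 105, 4: 115, 5: 135, 6: 180, 7: 260}


def push_gain(rank, table=PAWN_VALUE_BY_RANK):
    """Evaluation change from advancing a pawn one square from `rank`."""
    return table[rank + 1] - table[rank]


def always_rewards_pushing(table):
    """True if every single-step push strictly increases the pawn's value."""
    ranks = sorted(table)
    return all(table[a] < table[b] for a, b in zip(ranks, ranks[1:]))
```

A table that is flat on the lower ranks fails this check, which matches the situation described above where pawns that have not advanced beyond the 3rd rank never get pushed.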

Uri Blass · Post by **Uri Blass** » Wed Mar 05, 2008 11:21 pm

hgm wrote: I think you have to be very careful with Pawn evaluation, to make sure that there is a tendency to push Pawns in the end-game. Especially in fixed-depth games, where it might not have the promotion within the horizon. Earlier versions of micro-Max suffered from reluctance to push Pawns, and so it frequently happened that they drew end-games like KNPPPPPK, if none of the Pawns was advanced beyond the 3rd rank.

Not being able to win such totally won end-games would seriously compromise the test results. In the most favorable case they would be extremely sensitive to how exactly the piece-square table of the pawns looks. What seems to work very well in uMax is to make a Pawn on 6th rank worth about 180 cP, and on 7th about 260 cP. uMax also needs a Pawn-push bonus that increases as material disappears, to compensate for a penalty it gets when pushing a pawn when the pawn two squares left or right of it is missing. This penalty prevents it from recklessly pushing its complete pawn shield forwards in the opening, although pushing two or three pawns can be done without incurring the penalty. But in the end-game, where the Pawns get few and far between, this penalty would effectively paralyse the pawns. If you don't have such a pawn-structure term, you also wouldn't need to increase the pawn-push bonus during the game. All that matters is that pushing Pawns always scores slightly positive, so that the engine starts pushing them once its other pieces are optimally placed.

Note that even in fixed-depth games the 50 move rule is going to force strelka to push pawns, so there is no chance of a draw in a KNPPPPPK endgame even if the pawns are on the third rank.

I believe that even at small fixed depth without the 50 move rule there would be no problem in this case: once strelka has maximized the piece square table values of the king and the knight, any further move by those pieces makes the evaluation worse, so it will start pushing pawns.

This does not mean there is never a problem winning a won endgame at small depth when you use only piece square tables. I remember a case where the piece square table strelka, as the stronger side, was not able to win a KRB vs KB endgame and simply sacrificed the bishop after seeing that otherwise the game would be drawn by the 50 move rule.

Uri

Uri Blass · Post by **Uri Blass** » Wed Mar 05, 2008 11:44 pm

bob wrote:
Uri Blass wrote: I finished the 9 against 12 match, and in this case 9 plies won more convincingly: 23-13 and 4 draws. So maybe 4 plies of simple evaluation are worth slightly less than 3 plies of normal strelka, and maybe simple strelka needs 5 plies at higher depths to compensate for 3 plies of normal evaluation.

Uri
I think the explanation is simpler than that. At shallow depths, you depend solely on your evaluation to keep you out of tactical trouble, but that is not what it is particularly well-suited to accomplish. So a significantly deeper search will usually win. But as the depths increase, the better evaluation begins to become more effective because the deeper search enables it to see more tactics and avoid the trouble seen at shallower depths. A sort of "diminishing returns" scenario.

I've done this test in the past myself, and it is one reason I won't draw any conclusions based on the kind of testing someone claimed was used on Rybka (80,000 one second games to determine if a new eval term was good or bad).

Note that I expected the better evaluation to be more effective at long time control, but the question is how much.

Suppose x plies of normal evaluation are equivalent to f(x) plies of piece square table evaluation.

The interesting question for me is not whether f(x)-x is an increasing function but whether f(x)/x is an increasing function of x.

The answer would help decide what is more important at long time control: improving the evaluation, or improving the branching factor through better move ordering (assuming both give the same improvement at fast time control).
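One way to see why the ratio f(x)/x is the relevant quantity is the following rough model. It assumes that search time grows as a constant times b^d for effective branching factor b and depth d, which is a simplification not stated in the thread; the symbols T, c, b, b', d and d' are introduced here for illustration only.

```latex
% Assumption: time to reach depth d is T \approx c\, b^{d}.
% Better move ordering lowers b to b' < b at the same time budget T:
\[
  d = \frac{\ln (T/c)}{\ln b}, \qquad
  d' = \frac{\ln (T/c)}{\ln b'}, \qquad
  \frac{d'}{d} = \frac{\ln b}{\ln b'} = \text{constant}.
\]
% A better evaluation makes x plies worth f(x) plies of the simple one:
\[
  \text{depth ratio gained from evaluation} = \frac{f(x)}{x}.
\]
```

Under this model a branching factor improvement multiplies the reachable depth by a fixed ratio at every time control, so the evaluation improvement only pulls ahead at long time control if f(x)/x grows with x. The difference f(x)-x can grow even when the ratio does not: the matches above suggest roughly f(3) = 4 and f(6) = 8, which gives differences of 1 and 2 but the same ratio of 4/3, while the 9 against 12 result hints that f(9) is above 12.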

Uri
