Centipawns and Millipawns

bob · Post by **bob** » Wed Sep 09, 2009 7:46 pm

rjgibert wrote:I can't imagine what a centipawn means. What is the reason for using such a fine-grained evaluation unit? Is there really any benefit? I understand mate-in-N evaluations and maybe what a decipawn means, but that is as far as I can go with my 2234 rating.

The only other thing I can relate to is a judgement call that position X appears to be better than position Y even though I might quantify them as having the same evaluation.

So my question is, do centipawn evaluations really help chess engines, and why? What would you lose by using, say, 1/20th of a pawn as the finest grain for your evaluations instead of something finer? It appears to me that chess programmers are measuring something to the nearest hundredth while having a standard deviation of as much as a quarter of a pawn.

In engineering, 1.234 +/- 0.1 is just silly. Why isn't it just as silly in computer chess?

And then there even are engines that use millipawns. What's that about?

Having done all three, my take on the issue is this:

(1) decipawns (0.1) is too coarse. Not every positional consideration is worth 0.1 pawns, so you either have to round the score up to 0.1, or else throw it out since it would be zero.

(2) millipawns (0.001) is too fine. I do not believe that my evaluation has any ideas in it that are accurate to .001. As you spread the evaluation scores out, your tree search becomes less efficient (for example, compare the tree size of a program with no positional scoring to that of one with positional scoring).

(3) centipawns (0.01) is reasonable. One could make the argument that maybe .05 is better (1/20th of a pawn), or some other number. But my intuition after trying all three during the development of Crafty is that the right value lies in the interval [0.01, 0.1]. Whether it is at one end or the other, or somewhere in the middle, is a matter of conjecture. It is hard to test the idea, although I suppose I could just do a normal eval and then, at the end, reduce it to the accuracy needed. But it is not easy to test increments smaller than .01, since no increments in Crafty would be smaller than .01...
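
The experiment mentioned at the end of this post (run the normal centipawn eval, then reduce the result to a coarser resolution) could be sketched in C roughly as follows. This is only an illustration: the helper name `quantize_score` and the choice to round to the nearest multiple, with halves going away from zero, are assumptions and not taken from Crafty.

```c
#include <stdlib.h>

/* Hypothetical helper (not from Crafty): round a centipawn score to the
 * nearest multiple of `grain` centipawns, where grain >= 1.
 * Halves round away from zero, so +x and -x are treated symmetrically.
 * Assumes scores are far from INT_MIN, as engine scores normally are. */
int quantize_score(int score, int grain)
{
    int magnitude = abs(score);
    int rounded = ((magnitude + grain / 2) / grain) * grain;
    return score < 0 ? -rounded : rounded;
}
```

With a grain of 5 (1/20th of a pawn), a score of 37 becomes 35 and a score of -37 becomes -35; with a grain of 1 the score is returned unchanged. Unlike a bit mask, this form works for grains that are not powers of two, such as 5 or 10.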

rjgibert · Post by **rjgibert** » Thu Sep 10, 2009 12:50 am

bob wrote:
Having done all three, my take on the issue is this:

(1) decipawns (0.1) is too coarse. Not every positional consideration is worth 0.1 pawns, so you either have to round the score up to 0.1, or else throw it out since it would be zero.

(2) millipawns (0.001) is too fine. I do not believe that my evaluation has any ideas in it that are accurate to .001. As you spread the evaluation scores out, your tree search becomes less efficient (for example, compare the tree size of a program with no positional scoring to that of one with positional scoring).

(3) centipawns (0.01) is reasonable. One could make the argument that maybe .05 is better (1/20th of a pawn), or some other number. But my intuition after trying all three during the development of Crafty is that the right value lies in the interval [0.01, 0.1]. Whether it is at one end or the other, or somewhere in the middle, is a matter of conjecture. It is hard to test the idea, although I suppose I could just do a normal eval and then, at the end, reduce it to the accuracy needed. But it is not easy to test increments smaller than .01, since no increments in Crafty would be smaller than .01...

1/10th of a pawn is about what my brain can process, so I suggested 1/20th just as you suggested might be right. So it looks like my intuition may not be far off on this issue. Thanks.

BubbaTough · Post by **BubbaTough** » Thu Sep 10, 2009 1:00 am

How much would a queen being able to move to 1 square be worth? My guess is it's above 0...and it's below 1/10 of a pawn....and it's probably below 1/20th of a pawn....hmmmm...perhaps centipawns are a good thing. Of course I use 1/1000 of a pawn scale, but mostly because it's really fun.

-Sam

Edsel Apostol · Post by **Edsel Apostol** » Thu Sep 10, 2009 1:33 am

I have tried this idea just recently. I'm still using my old scoring of centipawns but at the end of the eval I use something like:

return score &= ~((1<<grain)-1);

where grain is a number 0 to 4.

A grain of 0 leaves the score unchanged; 1 rounds it down to a multiple of 2, 2 to a multiple of 4, 3 to a multiple of 8, and 4 to a multiple of 16.
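
To make the effect of the mask concrete, here is a small C sketch of it. The function names are hypothetical, and the second function is an added variant that does not appear in the post. One thing the sketch brings out: assuming two's-complement `int` (which mainstream compilers use), the plain mask rounds negative scores toward minus infinity rather than toward zero, so positive and negative scores are not treated symmetrically.

```c
/* The mask from the post: clears the low `grain` bits of the score.
 * Assumes 0 <= grain <= 4 and two's-complement int. */
int mask_grain(int score, int grain)
{
    return score & ~((1 << grain) - 1);
}

/* Added variant (an assumption, not from the post): apply the mask to
 * the magnitude so that +x and -x lose the same amount. */
int mask_grain_symmetric(int score, int grain)
{
    int mask = ~((1 << grain) - 1);
    return score < 0 ? -(-score & mask) : (score & mask);
}
```

With a grain of 2, a score of 13 becomes 12 under both functions, while a score of -13 becomes -16 under the plain mask and -12 under the symmetric variant. Whether that asymmetry matters in play is not something this sketch shows.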

I have only tried grain values 0, 2, and 3. Grain 0 still gives the best result, though I have played only 1200 games with each and the Elo difference between 0 and 2 or 3 is only 10. I will test with 1 to see if it performs better.

Using eval grain in Stockfish works because it uses 256 as Pawn value and it is too coarse for the search so it scales it by 4, meaning the finest unit for its search is 1/64. I think this is the optimal value.

BubbaTough · Post by **BubbaTough** » Thu Sep 10, 2009 1:48 am

Edsel Apostol wrote:I have tried this idea just recently. I'm still using my old scoring of centipawns but at the end of the eval I use something like:
return score &= ~((1<<grain)-1);
where grain is a number 0 to 4.

A grain of 0 leaves the score unchanged; 1 rounds it down to a multiple of 2, 2 to a multiple of 4, 3 to a multiple of 8, and 4 to a multiple of 16.

I have only tried grain values 0, 2, and 3. Grain 0 still gives the best result, though I have played only 1200 games with each and the Elo difference between 0 and 2 or 3 is only 10. I will test with 1 to see if it performs better.

Using eval grain in Stockfish works because it uses 256 as Pawn value and it is too coarse for the search so it scales it by 4, meaning the finest unit for its search is 1/64. I think this is the optimal value.

I tried variations of that, and for me...even with a scale of 1/1000...it hurt performance.

-Sam

bob · Post by **bob** » Thu Sep 10, 2009 1:56 am

rjgibert wrote:
1/10th of a pawn is about what my brain can process, so I suggested 1/20th just as you suggested might be right. So it looks like my intuition may not be far off on this issue. Thanks.

I have a hard time with very small or very large numbers. It is hard to comprehend the idea of a billion somethings, or one-billionth of a something.

I might well try your test of rounding to .02, .04, .05, even all the way to .1 just for fun to see what it does... I'll post the numbers. It might be tomorrow before I can do this, as Tracy just sent three new king safety versions to test (I hope to get a 23.1 released soon, but new ideas keep slipping in).

bob · Post by **bob** » Thu Sep 10, 2009 1:59 am

Edsel Apostol wrote:I have tried this idea just recently. I'm still using my old scoring of centipawns but at the end of the eval I use something like:
return score &= ~((1<<grain)-1);
where grain is a number 0 to 4.

A grain of 0 leaves the score unchanged; 1 rounds it down to a multiple of 2, 2 to a multiple of 4, 3 to a multiple of 8, and 4 to a multiple of 16.

I have only tried grain values 0, 2, and 3. Grain 0 still gives the best result, though I have played only 1200 games with each and the Elo difference between 0 and 2 or 3 is only 10. I will test with 1 to see if it performs better.

Using eval grain in Stockfish works because it uses 256 as Pawn value and it is too coarse for the search so it scales it by 4, meaning the finest unit for its search is 1/64. I think this is the optimal value.

Is zero a legit shift amount? Lots of hardware will handle that in unexpected ways. Some will not shift at all, some will shift as if that were the word size, which means a value of zero will result...

bob · Post by **bob** » Thu Sep 10, 2009 2:02 am

BubbaTough wrote:
I tried variations of that, and for me...even with a scale of 1/1000...it hurt performance.

-Sam

I think the conclusion from that is that your values are probably reasonably well tuned. I have seen some programs where every score is an even number, for example, so that the rightmost bit is never 1. They are using 1/50th of a pawn without thinking about it....

I switched from millipawns to centipawns way back in Crafty. I started with millipawns because that is what we used in Blitz/CrayBlitz since the early days. I found centipawns resulted in less arbitrary scores (is this a .01, a .011, or a .009 score?). My tests with .1 were worse, but now I can quantify exactly how much worse on the cluster and am going to run this test, hopefully tomorrow.
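
The remark about programs whose scores are all even suggests a simple check: the effective resolution of an evaluation is the greatest common divisor of its term weights. The C sketch below illustrates that idea under the assumption that the weights are available as a plain array of centipawn values; the names `gcd_int` and `effective_grain` are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Greatest common divisor of two non-negative ints; gcd_int(0, x) == x. */
static int gcd_int(int a, int b)
{
    while (b != 0) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Hypothetical helper: the largest step that divides every weight.
 * A result of 2 with a pawn value of 100 means the evaluation really
 * works in units of 1/50th of a pawn. Returns 0 for an empty array. */
int effective_grain(const int *weights, size_t count)
{
    int g = 0;
    for (size_t i = 0; i < count; i++)
        g = gcd_int(g, abs(weights[i]));
    return g;
}
```

For weights of 100, 12, -8, and 30 the result is 2, which is the all-even case described above. Replacing one weight with an odd value such as 7 brings the result back to 1.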

Edsel Apostol · Post by **Edsel Apostol** » Thu Sep 10, 2009 2:03 am

BubbaTough wrote:
I tried variations of that, and for me...even with a scale of 1/1000...it hurt performance.

-Sam

I can't edit my post anymore. In my last sentence I meant "too fine" rather than "too coarse" for the search.

I think the best value for the basic unit is somewhere between 0.01 and 0.05. Stockfish uses 0.03125.

There are some exceptions though. Take, for example, Strelka, which uses 3399 as its pawn value. Its basic unit is about 0.0003.

In the end, it all boils down to whose eval function is well tuned, whether it uses 1/1000 or 1/100.
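
The unit sizes quoted in this thread all come from the same arithmetic: the smallest score step divided by the pawn value. A minimal C sketch, with the hypothetical name `unit_in_pawns`, makes the figures easy to compare. It only reproduces the arithmetic; it makes no claim about which grain any particular engine actually uses.

```c
/* Hypothetical helper: size of the smallest score step, in pawns.
 * pawn_value is the internal value of one pawn; grain is the smallest
 * step the search ever sees (1 if scores are not rounded). */
double unit_in_pawns(int pawn_value, int grain)
{
    return (double)grain / (double)pawn_value;
}
```

A pawn value of 256 with a grain of 4 gives 1/64 = 0.015625, while 0.03125 is 1/32, which would correspond to a grain of 8 at the same pawn value. A pawn value of 3399 with no rounding gives roughly 0.000294, and a pawn value of 100 gives the familiar 0.01.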

Edsel Apostol · Post by **Edsel Apostol** » Thu Sep 10, 2009 2:10 am

bob wrote:
Is zero a legit shift amount? Lots of hardware will handle that in unexpected ways. Some will not shift at all, some will shift as if that were the word size, which means a value of zero will result...

I am not sure if zero is a legitimate shift; I don't have much exposure to hardware other than the PC. I assume that since it's zero the hardware will be intelligent enough not to shift anything. Thanks for pointing it out.
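
On the shift question, as far as the C language itself is concerned, a shift count of 0 is well defined and leaves the value unchanged. What the C standard leaves undefined is a negative count, or a count greater than or equal to the width of the promoted left operand. The sketch below builds the grain mask with unsigned arithmetic and guards the upper boundary explicitly; the name `grain_mask` and the choice to return 0 for an oversized grain are assumptions made for illustration.

```c
#include <limits.h>

/* Hypothetical helper: mask that clears the low `grain` bits.
 * grain == 0 yields all ones, so ANDing with it changes nothing.
 * A grain at or beyond the width of unsigned would be an undefined
 * shift in C, so it is handled separately (here: clear everything). */
unsigned grain_mask(unsigned grain)
{
    if (grain >= sizeof(unsigned) * CHAR_BIT)
        return 0u;
    return ~((1u << grain) - 1u);
}
```

For the grain values 0 to 4 discussed in this thread the guard never triggers, and `grain_mask(0)` is all ones, so a score ANDed with it is returned unchanged.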
