Leela Policy
Moderator: Ras
-
- Posts: 80
- Joined: Fri Jul 29, 2022 1:30 am
- Full name: Aaron Li
Leela Policy
Out of curiosity, why does Leela/A0's policy head output move probabilities rather than expected scores?
-
- Posts: 28395
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Leela Policy
These values are used to direct search effort to the moves. And there need not be much correlation between the search effort required to obtain a reliable score and the expected score. E.g. you might statistically know that a 7th-rank passer has a 10% probability of promoting and a 90% probability of being eliminated. The expected centipawn score would then be 0, which might not make it stand out over other quiet moves. Yet it would be very beneficial to first figure out (by search) whether you will be able to force promotion.
In other words, search effort should correlate with score uncertainty (complexity of the position), which is an independent quantity from the score average/expectation.
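To put numbers on the passer example, here is a minimal sketch that computes both the expected score and its spread for two hypothetical moves. The outcome values are my own illustrative assumptions, not taken from any engine: +900 cp if the pawn promotes, -100 cp if it is lost (chosen so the 10%/90% split averages to 0, as in the example above), and a quiet move that ends at either +10 or -10 cp. The function name `mean_and_spread` is likewise just for illustration.

```python
import math

def mean_and_spread(outcomes):
    """Return (expected score, standard deviation) in centipawns.

    outcomes: list of (probability, score_cp) pairs; probabilities sum to 1.
    """
    mean = sum(p * s for p, s in outcomes)
    variance = sum(p * (s - mean) ** 2 for p, s in outcomes)
    return mean, math.sqrt(variance)

# Assumed outcome values, for illustration only.
passer_push = [(0.10, 900), (0.90, -100)]  # promotes vs. pawn gets eliminated
quiet_move = [(0.50, 10), (0.50, -10)]     # nothing much happens either way

passer_mean, passer_sd = mean_and_spread(passer_push)
quiet_mean, quiet_sd = mean_and_spread(quiet_move)

print(f"passer push: mean {passer_mean:.1f} cp, spread {passer_sd:.1f} cp")
print(f"quiet move:  mean {quiet_mean:.1f} cp, spread {quiet_sd:.1f} cp")
```

Under these assumptions both moves have an expected score of 0 cp, but the passer push has a standard deviation of 300 cp against 10 cp for the quiet move. Ranking by expected score alone cannot tell them apart, while the spread shows which one deserves the search effort.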