Leela Policy

Discussion of chess software programming and technical issues.

Moderator: Ras

AAce3
Posts: 80
Joined: Fri Jul 29, 2022 1:30 am
Full name: Aaron Li

Leela Policy

Post by AAce3 »

Out of curiosity, why does Leela/A0's policy head output move probabilities rather than expected scores?
hgm
Posts: 28395
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Leela Policy

Post by hgm »

These values are used to direct search effort toward particular moves, and the search effort required to obtain a reliable score need not correlate much with the expected score. E.g. you might statistically know that a 7th-rank passer has a 10% probability of promoting and a 90% probability of being eliminated. The expected centipawn score would then be about 0, which might not make the move stand out over other quiet moves. Yet it would be very beneficial to first figure out (by search) whether you will be able to force promotion.

In other words, search effort should correlate with score uncertainty (the complexity of the position), which is a quantity independent of the score average/expectation.
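
To put rough numbers on the passer example, here is a minimal Python sketch that computes both the expectation and the spread of a move's score. The centipawn values (+900 for promoting, -100 for losing the pawn), the comparison "quiet" move, and the helper name `mean_and_std` are all assumptions picked for illustration so that the mean comes out near 0; none of this is taken from Leela's code.

```python
import math

def mean_and_std(outcomes):
    """Return (mean, standard deviation) of a discrete score distribution.

    outcomes: list of (probability, score_in_centipawns) pairs
    whose probabilities sum to 1.
    """
    mean = sum(p * s for p, s in outcomes)
    variance = sum(p * (s - mean) ** 2 for p, s in outcomes)
    return mean, math.sqrt(variance)

# Hypothetical 7th-rank passer: 10% promotes (+900 cp), 90% is lost (-100 cp).
passer = [(0.10, 900.0), (0.90, -100.0)]

# Hypothetical quiet move: the score barely moves either way.
quiet = [(0.50, 10.0), (0.50, -10.0)]

passer_mean, passer_std = mean_and_std(passer)
quiet_mean, quiet_std = mean_and_std(quiet)

print(f"passer: mean={passer_mean:.1f} cp, std={passer_std:.1f} cp")
print(f"quiet:  mean={quiet_mean:.1f} cp, std={quiet_std:.1f} cp")
```

With these assumed numbers both moves have an expected score of about 0, but the passer's standard deviation is about 300 cp against about 10 cp for the quiet move. A signal based only on the expectation would not separate the two, while the spread shows where search has the most to resolve.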