1. Train some deep NN on 7-men tablebases.
2. Assume some trusted men limit L, i.e. the piece count up to which the NN's evaluation is trusted; initially L = 7.
3. Create an empty minibatch for the NN.
4. Generate a random (L + 1)-men position and perform an N-ply trust search on it (the optimal N value is debatable). Pseudocode for the search follows; a sketch of the root call comes right after it.
Code:
// Negate a value from the side to move's point of view (negamax).
val neg_value(val value)
{
    if (value == indefinite) return indefinite;
    if (value == draw) return draw;
    if (value == loss) return win; else return loss;
}
// Cutoff test: has alpha reached beta? The ordering is loss < draw < win;
// indefinite is not comparable to anything, so it never allows a cutoff.
bool is_at_least(val alpha, val beta)
{
    if (alpha == win) return true;
    if (beta == loss) return true;
    if (alpha == loss) return false;
    if (alpha == indefinite) return false;
    if (beta == indefinite) return false;
    if (beta == win) return false;  // only alpha == draw is left, and draw < win
    return true;                    // alpha == draw, beta == draw
}
val trust_search(int depth, val alpha, val beta)
{
    if (checkmate()) return loss;               // side to move is mated
    if (stalemate()) return draw;
    if (repetition()) return draw;
    if (insufficient_material()) return draw;
    if (TB_pos()) return TB_eval();             // exact tablebase value
    if (piece_count <= L) return NN_eval();     // few enough men: trust the NN
    if (depth <= 0) return indefinite;          // out of depth: value unknown
    gen_moves();
    foreach (move in moves)
    {
        make_move(move);
        val v = neg_value(trust_search(depth - 1, neg_value(beta), neg_value(alpha)));
        unmake_move(move);
        if (v == win) return win;               // best possible result, stop
        if (alpha == loss)
            alpha = v;                          // replace a so-far lost score with this move's value
        else if (v == indefinite)
            alpha = indefinite;                 // an unresolved move: the node value cannot be pinned down
        if (is_at_least(alpha, beta))
            return alpha;
    }
    return alpha;
}
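For the root call in step 4 the pseudocode leaves two things open: the initial window and what to do when the result is indefinite. Below is a minimal sketch in the same style, assuming the search starts with the widest window (alpha = loss, beta = win) and that positions whose value stays indefinite are simply discarded; the value enum and the helpers gen_random_position(), set_position(), root_position() and add_to_minibatch() are made-up names of mine, not part of the recipe.
Code:
enum val { loss, draw, win, indefinite };       // assumed value type used throughout

// Produce one labelled training example for steps 3-5 (hypothetical helpers).
void generate_example()
{
    set_position(gen_random_position(L + 1));   // step 4: random (L + 1)-men position becomes the root
    val v = trust_search(N, loss, win);         // widest window: nothing is known yet
    if (v == indefinite) return;                // assumption: drop unresolved positions
    add_to_minibatch(root_position(), v);       // step 5: (position, value) training pair
}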
5. Add the position, labelled with the value the search returned, to the minibatch; repeat steps 4-5 until the minibatch is full.
6. Perform one learning step for the NN with this minibatch.
7. Repeat steps 3-7 until the NN error is less than some bound B (its value is debatable).
8. L = L + 1
9. Repeat steps 3-9 until L = 32 (a sketch of this outer loop is given after the list).
10. PROFIT!
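Putting steps 3-9 together, the outer bootstrap loop could look like the sketch below (same pseudocode style; batch_size, clear_minibatch(), minibatch_size(), nn_train_step() and nn_error() are placeholder names, since the post only says that one learning step is done per minibatch and that training continues until the error drops below B).
Code:
// Sketch of the whole bootstrap, steps 3-9 (hypothetical helper names).
void bootstrap()
{
    L = 7;                                  // step 2: start at the tablebase boundary
    while (L < 32)                          // step 9: grow until all of chess is covered
    {
        do
        {
            clear_minibatch();              // step 3
            while (minibatch_size() < batch_size)
                generate_example();         // steps 4-5, see the sketch above
            nn_train_step();                // step 6: one learning step on this minibatch
        } while (nn_error() >= B);          // step 7: stop once the error is below B
        L = L + 1;                          // step 8: trust the NN one man deeper
    }
}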