lithander wrote: ↑Mon Mar 01, 2021 2:58 am
Mike Sherwin wrote: ↑Fri Feb 26, 2021 4:36 pm
Alpha-beta search works best when it has more accurate scores to backup and prune against. And the better the evaluation function is the better the pruning will be and the deeper the search will search in a given time.
I have now implemented support for PSTs and am happy to see the same synergy that Marcel reported: PSTs together with QSearch creates stronger play than the sum of its parts would suggest. The strength is pretty much where I'd expect it at this point as opposed to when I used QSearch only with a very simple evaluation.
But when I look at the runtime performance I measure the opposite of what you predicted: when I search a set of test positions to a fixed depth, the PST version takes
twice as long as the version that counts only material. To make the test really fair, both use the exact same code; the only difference is that for the material-counting version the PSTs are filled with zeros. So the speed difference must come from pruning. When using the PSTs, my metrics show that I generate and play more moves to reach the same depth. So the simple eval has the better pruning!
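For context, the setup looks roughly like this (a minimal sketch, not MinimalChess's actual code; the board format and names are made up for illustration). With zero-filled tables the PST evaluation reduces exactly to a material count, which is what makes the comparison fair:

```python
# Sketch only: a PST evaluation is material plus a per-square offset.
PIECE_VALUES = {'P': 100, 'N': 320, 'B': 330, 'R': 500, 'Q': 900, 'K': 0}

# Hypothetical 64-entry tables indexed by square (0..63).
# Filled with zeros, the eval below degenerates to pure material counting.
PST = {piece: [0] * 64 for piece in PIECE_VALUES}

def evaluate(board):
    """board: iterable of (piece, square, is_white) tuples -- a toy format."""
    score = 0
    for piece, square, is_white in board:
        value = PIECE_VALUES[piece] + PST[piece][square]
        score += value if is_white else -value
    return score
```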
With PSTs (from Sunfish)
Code: Select all
Performance: 1135kN/sec
A total of 44,377,312 moves were generated. 4,010,561 were played. 9%
Operation took 50.8739s
Counting Material
Code: Select all
Performance: 1152kN/sec
A total of 12,334,260 moves were generated. 1,352,387 were played. 10%
Operation took 25.8926s
If you're right I need to go looking for a problem. But to me the result seems logical:
Take the opening position, for example: without PSTs almost all lines will receive the same evaluation after quiescence: 0.
This leads to bad play, but it should also mean a lot of early cutoffs where alpha == beta == 0. It just doesn't matter what move sequence you play as long as you don't lose a piece. And as long as the scores of positions are that homogeneous, the search should prune better than if they are all different because they have the PST offsets applied, right?
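That intuition can be demonstrated on a toy tree (my own sketch, nothing to do with either engine's code): plain negamax alpha-beta on a uniform branching-4, depth-3 tree collapses to the minimal tree when every leaf scores 0, because each cutoff test alpha >= beta fires the moment both bounds hit 0. With distinct leaf scores and no ordering, it typically visits more nodes:

```python
import random

def alphabeta(depth, alpha, beta, leaves, counter, branching=4):
    """Plain negamax alpha-beta over a uniform toy tree; leaves is an
    iterator of leaf scores consumed in visit order."""
    counter[0] += 1
    if depth == 0:
        return next(leaves)
    best = -10**9
    for _ in range(branching):
        score = -alphabeta(depth - 1, -beta, -alpha, leaves, counter, branching)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff
    return best

def count_nodes(scores):
    counter = [0]
    alphabeta(3, -10**9, 10**9, iter(scores), counter)
    return counter[0]

flat = count_nodes([0] * 64)                           # all leaves equal
random.seed(1)
diverse = count_nodes(random.sample(range(1000), 64))  # all leaves distinct
# With equal leaves only 19 of the 64 leaves are visited (31 nodes total,
# the Knuth-Moore minimum for b=4, d=3); distinct leaves can't do better
# and without good ordering usually do worse.
```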
Why should accurate predictions help? Maybe the effect you predicted and I didn't observe requires some other advanced technique, like the history heuristic, that I don't have yet? MVV-LVA doesn't consider evaluations, so I don't see where the alpha-beta search would benefit from accurate but diverse evaluations.
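For reference, MVV-LVA ordering really is eval-blind; a minimal sketch (piece values and the move format are made up for illustration):

```python
# MVV-LVA: sort captures by Most Valuable Victim first, breaking ties
# with the Least Valuable Attacker. No static eval (and no PST) involved.
VALUES = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

def mvv_lva_key(move):
    attacker, victim = move  # a capture as (attacker, victim) piece letters
    return VALUES[victim] * 10 - VALUES[attacker]

captures = [('Q', 'P'), ('P', 'Q'), ('N', 'R')]
captures.sort(key=mvv_lva_key, reverse=True)
# Order: PxQ (pawn takes queen), then NxR, then QxP
```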
Let's think about the example of the knight and the bishop. Tord had this problem with Glaurung. He could not understand why his search was running so slow when both bishops and knights were in the position and so much faster when only bishops or only knights were in the position. The problem was that knights and bishops were so close in evaluation that the search could not make up its "mind" whether it wanted to move the knight or the bishop, and was constantly thrashing the PV, switching between them at all depths of the search.
So alpha-beta should have an easy time with material-only scores. But what happens when each new ply depth is searched and the material evaluation changes? There is nothing to guide the search toward better positional moves that have more stable material consequences. When I wrote RomiChess, the eval was done before Qsearch was written, and it ran slow: the node rate was great but it did not search very deep at all. As soon as I added Qsearch it was the deepest single-threaded searcher that I knew of at the time of its release. Even today it is not bad.
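For readers who haven't implemented it: quiescence search extends the search at the horizon with captures only, using the static eval as a "stand pat" bound. A minimal sketch (the toy position format is made up for illustration; this is not RomiChess's code):

```python
def qsearch(pos, alpha, beta):
    """pos: {'eval': score from side to move, 'captures': [child positions]}."""
    stand_pat = pos['eval']            # static eval as a lower bound
    if stand_pat >= beta:
        return beta                    # already good enough: fail high
    alpha = max(alpha, stand_pat)
    for child in pos['captures']:      # tactical moves only, no quiet moves
        score = -qsearch(child, -beta, -alpha)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha

# A quiet position just returns its static eval; a position with a winning
# capture returns the capture's value instead of the misleading stand-pat,
# so the scores backed up to the full-width search are tactically settled.
```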
FEN: rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1
RomiChess64P3n:
1 00:00 40 40 +0.44 c2c4
2 00:00 224 224 +0.36 e2e4 g8f6
3 00:00 377 377 +0.66 e2e4 g8f6
4 00:00 1k 1k +0.29 e2e4 g8f6 b1c3 e7e5
5 00:00 3k 3k +0.79 e2e4 g8f6
6 00:00 8k 8k +0.29 e2e4
7 00:00 15k 15k +0.50 e2e4 d7d5 b1c3 g8f6 e4d5 c8g4 f1b5
8 00:00 61k 6,062k +0.15 e2e4 g8f6 b1c3 d7d5 e4d5 c8g4 f1e2 g4e2
9 00:00 108k 5,396k +0.15 e2e4 g8f6 b1c3 d7d5 e4d5 c8g4 f1e2 g4e2
10 00:00 222k 7,393k +0.21 e2e4 d7d5 e4d5 g8f6 b1c3 c8g4 f1e2 g4e2 g1e2 b8d7
11 00:00 392k 6,529k +0.42 e2e4 d7d5 e4d5 g8f6 b1c3 f6d5 f1c4 d5c3 d2c3 d8d1
12 00:00 806k 6,718k +0.30 e2e4 d7d5 e4d5 g8f6 b1c3 f6d5 f1c4 d5c3 b2c3 b8d7 d1h5 g7g6
13 00:00 1,784k 7,134k +0.48 e2e4 d7d5 e4d5 g8f6 b1c3 f6d5 f1c4 d5c3 b2c3 b8d7 d1h5 g7g6 h5d5
14 00:00 2,642k 7,142k +0.50 e2e4 d7d5 e4d5 g8f6 d2d4 c8g4 f2f3 g4f5 f1b5 b8d7 b1c3 g7g6 c1f4 f8g7 g1e2
15 00:00 4,625k 7,227k +0.47 e2e4 d7d5 e4d5 g8f6 d2d4 c8g4 f2f3 g4f5 f1b5 b8d7 c2c4 g7g6 c1f4 f8g7 d1b3 b7b6
16 00:01 8,337k 7,313k +0.42 e2e4 d7d5 e4d5 g8f6 d2d4 f6d5 c2c4 d5b6 c1e3 e7e5 b1c3 c8e6 g1f3 e5d4 f3d4 e6c4
17 00:01 12,559k 7,344k +0.48 e2e4 d7d5 e4d5 g8f6 d2d4 f6d5 c2c4 d5f6 c1f4 c7c5 g1f3 d8b6 b2b3 c8d7 d4c5 b6c5 b1d2
18 00:03 25,110k 7,407k +0.41 e2e4 d7d5 e4d5 g8f6 d2d4 f6d5 c2c4 d5f6 c1e3 f6g4 e3g5 c7c5 f1e2 d8d4 d1d4
19 00:07 52,982k 7,452k +0.42 e2e4 d7d5 e4d5 g8f6 d2d4 f6d5 c2c4 d5f6 b1c3 e7e5 g1f3 e5d4 d1d4 d8d4 f3d4 c7c5 d4b5 c8g4 c1f4
20 00:33 253,957k 7,531k +0.32 e2e4 e7e6 d2d4 d7d5 b1c3 f8b4 d1g4 g8f6 g4g7 h8g8 g7h6 g8g6 h6h4 g6g4 h4h6 g4g6 h6h4 g6g4 h4h6 g4g6
The difference between SF 11 and SF 12 was that the node rate of SF 12 NNUE was almost cut in half, and yet it searched more deeply.