Lazy SMP ideas
Posted: Wed Aug 22, 2018 3:53 pm
So the new engine rewrite is going well. Only material and PST for now with a basic search (just aspiration window PVS, LMR and Null move). Around Sunguros level. I managed to implement a basic Lazy SMP as well. Hannibal SMP was a modified YBWC but it is not that efficient with large number of threads due to non support of NUMA (I don't have a test machine for this), and the inherent limitation of YBWC especially in the endgame where there is not much number of moves to split work with. So right now I'm playing with Lazy SMP.
I first implemented a basic shared hash table, but I noticed that the threads seems to finish the iteration almost at the same time, returning the same score and principal variation. This is probably due to the fact that the single thread search is deterministic still and they are sharing the transposition table and the principal variation hash table.
I then implemented having every other thread start at depth 2 (odd thread ids start at depth 1), but they still converge after some time.
I also implemented an improvement over the previous one where each thread when started at depth 1 will search odd depths (1,3,5,...) and threads that started depth 2 will search even depths(2,4,6,..) but this seems too extreme as threads are not helping each other finish the iterations faster.
I don't quite get what SF is doing. It seems to take into account game ply, thread ids, and iteration depth to skip one depth. I also don't quite like the idea of lazy SMP and can't believe that it works at all. It seems too primitive. I'll probably implement a modified ABDADA.
So what's the best SMP algorithm nowadays?
Here's the link to the new engine by the way, if someone wants to play with a basic PST and mat only eval engine:
https://drive.google.com/file/d/1R_zvU3 ... sp=sharing
Let me know how it scales in large number of threads.
I'll probably share the source code as well some day when I release the first official version.
I first implemented a basic shared hash table, but I noticed that the threads seems to finish the iteration almost at the same time, returning the same score and principal variation. This is probably due to the fact that the single thread search is deterministic still and they are sharing the transposition table and the principal variation hash table.
I then implemented having every other thread start at depth 2 (odd thread ids start at depth 1), but they still converge after some time.
I also implemented an improvement over the previous one where each thread when started at depth 1 will search odd depths (1,3,5,...) and threads that started depth 2 will search even depths(2,4,6,..) but this seems too extreme as threads are not helping each other finish the iterations faster.
I don't quite get what SF is doing. It seems to take into account game ply, thread ids, and iteration depth to skip one depth. I also don't quite like the idea of lazy SMP and can't believe that it works at all. It seems too primitive. I'll probably implement a modified ABDADA.
So what's the best SMP algorithm nowadays?
Here's the link to the new engine by the way, if someone wants to play with a basic PST and mat only eval engine:
https://drive.google.com/file/d/1R_zvU3 ... sp=sharing
Let me know how it scales in large number of threads.
I'll probably share the source code as well some day when I release the first official version.