Daniel Shawul wrote:
Hey Daniel,
Just want to mention that I doubled my move gen performance simply by using the vector datatype "long4" for the QuadBitboards.
That is good. GPUs don't have SSE instructions so I suppose your speedup must have come from memory optimizations.
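For readers who have not met the layout: a quad-bitboard stores the whole board in four 64-bit words, with the 4-bit piece code of each square spread "vertically" across the four words. That is why it maps naturally onto a four-lane vector type such as OpenCL's long4, which can be loaded and stored as one unit. Below is a minimal sketch in plain C++, not the engine's actual code. The struct, the function names and the piece-code values are all hypothetical, and a plain array stands in for the vector type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a four-lane 64-bit vector such as OpenCL's long4.
struct QuadBitboard {
    std::uint64_t plane[4] = {0, 0, 0, 0};
};

// Store a 4-bit piece code on square sq (0..63).
// Bit i of the code is written to bit sq of plane i.
inline void set_piece(QuadBitboard& qb, int sq, unsigned code) {
    assert(sq >= 0 && sq < 64 && code < 16);
    const std::uint64_t bit = std::uint64_t{1} << sq;
    for (int i = 0; i < 4; ++i) {
        if (code & (1u << i)) {
            qb.plane[i] |= bit;
        } else {
            qb.plane[i] &= ~bit;
        }
    }
}

// Read the 4-bit piece code back from square sq.
inline unsigned piece_at(const QuadBitboard& qb, int sq) {
    assert(sq >= 0 && sq < 64);
    unsigned code = 0;
    for (int i = 0; i < 4; ++i) {
        code |= static_cast<unsigned>((qb.plane[i] >> sq) & 1u) << i;
    }
    return code;
}

// A square is occupied if any of its four code bits is set
// (this assumes code 0 means "empty", which is an assumption of this sketch).
inline std::uint64_t occupied(const QuadBitboard& qb) {
    return qb.plane[0] | qb.plane[1] | qb.plane[2] | qb.plane[3];
}
```

On a GPU the four planes would sit in one vector register or one aligned 32-byte memory slot, so a board copy is a single vector move rather than four scalar ones. That fits the guess above that the gain comes from memory access rather than arithmetic.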
Running a simple loop with the starting position I get about 300 Knps per SIMD unit;
if I turn off the legality check it goes up to 1 Mnps.
Yes, legality testing can be a killer. For my GGP engine Nebiyu I got a 100% speedup by allowing the king to be captured. If your attack detection is slow you may want to try that.
How does your move generator perform?
I don't have a full move generator, only a random legal move generator. Unlike you, I also don't store moves in global memory. Your engine is probably memory bound because of that. 300 Knps is not that much when you divide it by the number of threads in a SIMD unit. If you decide to give up relying on registers (& shared mem), you should bear in mind that the kernel may end up no faster than a single CPU. Hashtables are typical examples of things that are better handled on the CPU. Chess is probably not suited to GPU computation because Monte Carlo misses a lot of tactics, but my effort is concentrated on MCTS and games that are suitable for that approach. The MC part is really needed to harness the power of the GPU and to hide global memory latency as well.
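To put the "divide it by the number of threads" remark in numbers, here is a small sketch. The thread counts are assumptions, not figures from the thread: 64 threads per SIMD unit matches an AMD wavefront and 32 matches an Nvidia warp. The function name is hypothetical.

```cpp
// Nodes per second that each thread contributes, given the throughput
// of one SIMD unit and the number of threads running on it.
constexpr double nps_per_thread(double nps_per_simd, int threads) {
    return nps_per_simd / threads;
}

// Assumed thread counts: 64 (AMD wavefront) and 32 (Nvidia warp).
constexpr double legal_64   = nps_per_thread(300000.0, 64);   // 4687.5
constexpr double legal_32   = nps_per_thread(300000.0, 32);   // 9375.0
constexpr double nolegal_64 = nps_per_thread(1000000.0, 64);  // 15625.0
```

Under these assumptions each thread searches only a few thousand nodes per second with legality checking on. That is the sense in which 300 Knps per SIMD unit "is not that much".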
----
cheers
You just shouldn't use Monte Carlo or UCT for chess.
Of course chess can work brilliantly on GPUs; it's just a lot of work and you need 3 layers of SMP:
one within a single compute unit, one between compute units (only works for Nvidia) and one between the GPUs (using the RAM on the motherboard).
For those who wonder: you can easily stack a bunch of GPUs onto a single machine with riser cards. You can give every few GPUs their own PSU.
Nvidia works brilliantly there; with AMD you see massive numbers of bug reports, but I guess everyone here is programming for Nvidia, not AMD, anyway.
Riser cards are pretty cheap, btw. I ordered a few from Hong Kong for a few dollars apiece.
They will probably take a few weeks to arrive here, though.
