bhamadicharef wrote:
Ankan ... could you please give us your expert GPU view on the following:
1) Would perft_gpu run on a Tesla K80?

I haven't personally tried it, but it should run. The Tesla K80 is a dual-GPU card: it has 2x GK110 GPUs, each of which should be close to a GTX 780 in performance (or slightly slower, since Workstation/Tesla cards have lower clocks than consumer cards).
bhamadicharef wrote:
I think you are using some 3x Titan X GPUs for perft14, from what I read on the forum.

That's right. For my perft(14) computation I used 3x Titan X (Pascal GP102) GPUs. The computation took slightly more than a week. The result is here: https://raw.githubusercontent.com/ankan ... 4_run4.txt
bhamadicharef wrote:
2) Could you estimate how long it would take for 8x Tesla K80 (in a Xeon-based 64-bit server with 6 cores and 64 GB RAM) to do perft14 from the initial position, or from the unique(7) positions?
Brahim @ Singapore

Your setup with 8x Tesla K80 has 16 GPUs, and each GK110 GPU is about 1/4th of a GP102 for perft, so in raw throughput your setup should be slightly faster than mine. However, the scaling of my program is not perfect because the part of the hash table kept in device memory is not shared between GPUs. I would expect it to take longer than it took on my setup, but less than 2 weeks for perft14 from the initial position.
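As a quick back-of-envelope check of that raw-throughput comparison, here is a small sketch using only the numbers above; the 1/4 ratio is the estimate from this post, and the snippet deliberately ignores the multi-GPU hash-table sharing loss mentioned above:

```cpp
// Rough throughput comparison in GP102-equivalents (assumes perfect scaling,
// which the post notes is not the case in practice).
#include <cstdio>

int main() {
    const double gk110_per_gp102 = 0.25;  // one K80 GPU relative to one GP102 (estimate from the post)
    const int    k80_cards       = 8;     // 8x Tesla K80
    const int    gpus_per_k80    = 2;     // dual-GPU card -> 16 GPUs total

    const double k80_setup   = k80_cards * gpus_per_k80 * gk110_per_gp102;  // ~4.0
    const double titan_setup = 3.0;                                         // 3x Titan X (GP102)

    printf("8x K80    ~ %.1f GP102-equivalents\n", k80_setup);
    printf("3x TitanX = %.1f GP102-equivalents\n", titan_setup);
    printf("raw ratio ~ %.2fx\n", k80_setup / titan_setup);  // ~1.33x before scaling losses
    return 0;
}
```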
For my program it's faster to compute perft(14) directly from the start position than from the intermediate unique(7) records, likely because I get better hash-table utilization at the shallow levels, and most of the unique(7) positions fit in the regular hash table for the deeper levels anyway.
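To illustrate why hash-table utilization drives this, here is a minimal, generic CPU-side sketch of perft with one transposition table per remaining depth; it is not the perft_gpu code, and Position, hash_key() and moves() are hypothetical placeholders for a real move generator:

```cpp
// Sketch only: perft counting with per-depth transposition tables.
// A real engine supplies hash_key() (e.g. a Zobrist key) and moves()
// (legal successor positions); tables must be sized to depth+1 entries.
#include <cstdint>
#include <unordered_map>
#include <vector>

template <typename Position>
uint64_t perft(const Position& pos, int depth,
               std::vector<std::unordered_map<uint64_t, uint64_t>>& tables) {
    if (depth == 0) return 1;                        // leaf: count one node
    auto& table = tables[depth];                     // table for this remaining depth
    if (auto it = table.find(pos.hash_key()); it != table.end())
        return it->second;                           // transposition hit: reuse stored count
    uint64_t nodes = 0;
    for (const Position& child : pos.moves())        // expand every legal move
        nodes += perft(child, depth - 1, tables);
    table.emplace(pos.hash_key(), nodes);            // remember the count for this subtree
    return nodes;
}
```

The deeper levels (small remaining depth) see by far the most transpositions, which is roughly why starting from the root and letting the tables fill naturally can beat splitting the work at the unique(7) level.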
Let me know if you are planning to run it. I will provide you with binaries with the hash-table sizes adjusted for your setup (currently they are hardcoded; I will tune them to utilize the 64 GB of RAM).