Search found 418 matches

by Rémi Coulom
Sun Apr 28, 2019 12:31 pm
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Wouldn't it be nice if C++ GPU
Replies: 24
Views: 3949

Re: Wouldn't it be nice if C++ GPU

Yes, it makes sense to try to lower the batch size when using multiple GPUs. My Scorpio currently barely scales to 4 GPUs in terms of nps using 128 threads per GPU (i.e. for a total of 512). I have tried to optimize the performance of parallel MCTS by making it completely lockless even when allocating...
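
Lockless, batched parallel MCTS of the kind mentioned above usually relies on virtual loss so that successive descents diversify while a whole GPU batch of leaves is being gathered. A minimal single-threaded sketch of that batching idea — all names are hypothetical, this is not Scorpio's actual code:

```python
# Sketch: collect a batch of leaves for one GPU evaluation, using a
# virtual-loss counter so repeated descents spread over different leaves.
import math

VIRTUAL_LOSS = 1.0  # assumed penalty per pending (not yet evaluated) visit

class Node:
    def __init__(self):
        self.children = {}   # move -> Node
        self.visits = 0
        self.value_sum = 0.0
        self.pending = 0     # virtual-loss counter

    def ucb(self, parent_visits, c=1.4):
        n = self.visits + self.pending
        if n == 0:
            return float("inf")
        # pending visits count as losses, steering later descents elsewhere
        q = (self.value_sum - VIRTUAL_LOSS * self.pending) / n
        return q + c * math.sqrt(math.log(parent_visits + 1) / n)

def collect_batch(root, batch_size):
    """Descend the tree batch_size times, marking each chosen leaf
    with virtual loss; the returned leaves go to the GPU in one call."""
    leaves = []
    for _ in range(batch_size):
        node = root
        while node.children:
            total = node.visits + node.pending
            node = max(node.children.values(), key=lambda ch: ch.ucb(total))
        node.pending += 1
        leaves.append(node)
    return leaves
```

Because each selected leaf is marked with a pending virtual loss, its UCB score drops and the next descent tends to pick a different leaf, so the batch fills with distinct positions instead of one leaf repeated.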
by Rémi Coulom
Sun Apr 28, 2019 6:41 am
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Question to Remi about CrazyZero
Replies: 5
Views: 1677

Re: Question to Remi about CrazyZero

Hi, thanks for the suggestion. That looks like a fun thing to do, but I am busy with many more important projects. In order to support another game, I would have to program the rules, as well as the network input encoding and output decoding. And it would take 2-3 days of calculation to get a reasonable...
by Rémi Coulom
Sat Apr 27, 2019 10:05 pm
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: On-line engine blitz tourney April
Replies: 17
Views: 3277

Re: On-line engine blitz tourney April

Thanks for organizing the tournament. CrazyZero is my generic AlphaZero implementation (I currently have networks for Go, shogi, gomoku, Othello, renju, and chess). I spent only a few days training a network. It has 20 layers of 128 units. It was running with two Volta GPUs (V100 + TitanV). It has goo...
by Rémi Coulom
Sat Apr 27, 2019 12:31 pm
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Training using 1 playout instead of 800
Replies: 12
Views: 2347

Re: Training using 1 playout instead of 800

b) I train the policy head to match the actual game result. This is quite different from the AlphaZero algorithms, because there the training of the policy head is by imitation regardless of the outcome. So I do not train the policy head from moves made by the losing side. WDL are weighted by 1, 0.5 and 0 b...
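
The WDL weighting described above amounts to a result-weighted imitation loss: weight 1 for a win, 0.5 for a draw, 0 for a loss, so losing-side moves contribute nothing to the policy gradient. A hedged illustration with hypothetical names — this is not CrazyZero's actual training code:

```python
# Sketch: cross-entropy toward the played move, scaled by the game
# result from the mover's point of view (W=1, D=0.5, L=0).
import math

def policy_loss(policy_logits, played_move, result_for_mover):
    """policy_logits: dict move -> raw logit; returns weighted loss."""
    weight = {"win": 1.0, "draw": 0.5, "loss": 0.0}[result_for_mover]
    # numerically stable softmax log-probability of the played move
    m = max(policy_logits.values())
    z = sum(math.exp(v - m) for v in policy_logits.values())
    log_prob = (policy_logits[played_move] - m) - math.log(z)
    return -weight * log_prob
```

With weight 0, a losing side's moves produce zero loss and hence no gradient, which matches the "do not train the policy head from moves made by the losing side" behavior described in the post.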
by Rémi Coulom
Sat Apr 27, 2019 12:12 pm
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Training using 1 playout instead of 800
Replies: 12
Views: 2347

Re: Training using 1 playout instead of 800

b) I train the policy head to match the actual game result. This is quite different from the AlphaZero algorithms, because there the training of the policy head is by imitation regardless of the outcome. So I do not train the policy head from moves made by the losing side. WDL are weighted by 1, 0.5 and 0 b...
by Rémi Coulom
Sat Apr 27, 2019 11:57 am
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Wouldn't it be nice if C++ GPU
Replies: 24
Views: 3949

Re: Wouldn't it be nice if C++ GPU

Well, going from a batch size of 128 to 16, my nps goes down by a factor of 4, so I haven't really bothered to measure whether the increased selectivity from a smaller batch size can compensate for the loss in nps. However, I have now started using a single-thread search for generating training games. When I...
by Rémi Coulom
Sat Apr 27, 2019 12:20 am
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Wouldn't it be nice if C++ GPU
Replies: 24
Views: 3949

Re: Wouldn't it be nice if C++ GPU

Which begs the question: why use small batch sizes at all? I don't use a batch size of less than 128. Even launching 128 to 256 threads for multi-threaded batching on a 4-core CPU I see no problems... Lc0 uses single-threaded batching and defaults to a batch size of 256 -- though a smaller batch size of...
by Rémi Coulom
Sat Apr 27, 2019 12:13 am
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Wouldn't it be nice if C++ GPU
Replies: 24
Views: 3949

Re: Wouldn't it be nice if C++ GPU

Which begs the question: why use small batch sizes at all? I don't use a batch size of less than 128. Even launching 128 to 256 threads for multi-threaded batching on a 4-core CPU I see no problems... Lc0 uses single-threaded batching and defaults to a batch size of 256 -- though a smaller batch size of...
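
A rough sketch of the multi-threaded batching scheme described above — all names are assumptions, not Scorpio's or Lc0's actual code: many search threads submit positions to a shared queue and block, while a single evaluator thread groups requests into a batch before running one (here stubbed) GPU inference for all of them.

```python
# Sketch: search threads enqueue positions; one evaluator drains the
# queue into a batch and answers every request in a single pass.
import queue, threading

BATCH_SIZE = 4  # tiny for the demo; the post suggests 128-256 in practice

def evaluator(requests, stop):
    while not stop.is_set():
        batch = []
        try:
            batch.append(requests.get(timeout=0.1))
        except queue.Empty:
            continue
        while len(batch) < BATCH_SIZE:
            try:
                batch.append(requests.get(timeout=0.01))
            except queue.Empty:
                break  # flush a partial batch rather than stall searches
        # one "GPU call" for the whole batch (stubbed with a constant value)
        for position, reply in batch:
            reply.put(0.0)  # stand-in for the network's value output

def search_thread(requests, results, position):
    reply = queue.Queue(maxsize=1)
    requests.put((position, reply))
    results.append(reply.get())  # block until the batch is evaluated
```

Flushing a partial batch on timeout trades some GPU efficiency for latency; with enough search threads in flight, batches stay close to full, which is consistent with the post's observation that hundreds of threads on a 4-core CPU cause no problems since they spend most of their time blocked.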
by Rémi Coulom
Fri Apr 26, 2019 8:26 am
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Wouldn't it be nice if C++ GPU
Replies: 24
Views: 3949

Re: Wouldn't it be nice if C++ GPU

@Remi Why do you need to write even a single CUDA kernel? cuDNN has lots of convolution kernels to choose from anyway, and the TensorRT performance is as good as Ankan's hand-written CUDA kernels, as I have detailed here http://talkchess.com/forum3/viewtopic.php?f=2&t=69885&hilit=Scorpio...
by Rémi Coulom
Thu Apr 25, 2019 6:51 pm
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Wouldn't it be nice if C++ GPU
Replies: 24
Views: 3949

Re: Wouldn't it be nice if C++ GPU

smatovic wrote:
Thu Apr 25, 2019 5:38 pm
https://github.com/ankan-ban/ConvTest
Very interesting, thanks. I will try to make a tensor-core version.