AB search with NN on GPU...
Posted: Thu Aug 13, 2020 10:11 am
Maybe this is of interest to someone: I tried, but could not figure out a way
to do it.
https://eta-chess.app26.de/
In short, the host-device latencies, aka kernel-launch overhead, are currently
in the range of 5 microseconds up to 100s of microseconds, so you end up with
at most ~200K kernel calls per second. This is primarily not caused by the PCIe
connection (maybe 10s of ns?) but (speculation) by the embedded CPU controller
on the GPU which launches the kernels. So you need to couple tasks into batches
to be executed in one run, which does not conform well with the serial nature
of AlphaBeta. Maybe upcoming architectures will have lower latencies, dunno.
Another path could be to drop the search part completely, encode everything in
another kind of mega NN structure, and perform only a depth-1 search for
evaluation, maybe with multiple kinds of NNs as a replacement for the ID loop...
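A minimal sketch of that depth-1 idea, assuming the NN eval can score all successor positions in one batched call. `gen_moves`, `apply`, and `evaluate_batch` are toy placeholders standing in for a real move generator and NN inference, just to show the shape of it:

```python
# Depth-1 "search": generate all successors of the root, evaluate them
# all in ONE batched call (standing in for a single NN/GPU launch),
# then pick the best-scoring move. No recursion, no per-node launches.

def gen_moves(pos):
    # placeholder move generator: pretend every position has three moves
    return ["a", "b", "c"]

def apply(pos, move):
    # placeholder: successor state is just the concatenation
    return pos + move

def evaluate_batch(positions):
    # stand-in for one batched NN inference on the GPU;
    # a toy score so the sketch actually runs
    return [len(p) + (1 if p.endswith("b") else 0) for p in positions]

def depth1_best_move(pos):
    moves = gen_moves(pos)
    children = [apply(pos, m) for m in moves]
    scores = evaluate_batch(children)  # one launch instead of len(moves)
    best = max(range(len(moves)), key=lambda i: scores[i])
    return moves[best]

print(depth1_best_move("root"))  # -> "b"
```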
--
Srdja