Rc2 is now ready for download, and the issue is reportedly fixed. Rc2 should also run faster.
mwyoung wrote: ↑Fri Oct 02, 2020 1:00 am
No, there is an issue with 0.26.3-rc1. I tested it the day it came out in a 200-game blitz match. It was faster, but it also crashed about 23 times in 200 games, causing a big loss in the match. I hope this will be corrected in rc2.
AdminX wrote: ↑Thu Oct 01, 2020 10:14 pm
I found it works better with this format:
lc0_v263rc1.exe benchmark --minibatch-size=240 --backend=cudnn-fp16
As you can see, all I did was invert the two arguments.
Laskos wrote: ↑Thu Oct 01, 2020 1:50 pm
Worth remarking is the excellent result of the DX12 backend, which by NPS seems vastly superior to the other two. A glitch occurred with this command line:
lc0_v263rc1.exe benchmark --backend=cudnn-fp16 --minibatch-size=240
which sometimes exits with this error message:
Position: 1/34 rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
Unhandled exception in worker thread: CUDA error: an illegal memory access was encountered (c:\projects\lc0\src\neural\cuda\network_cudnn.cc:789)
Unhandled exception in worker thread:
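Since the failure is intermittent ("sometimes exits", "about 23 times in 200 games"), a repeated run gives a better picture than a single attempt. Below is a minimal sketch, assuming a crashed benchmark ends with a non-zero exit code, which the thread does not confirm. The helper name `count_failures` is hypothetical; the two command lines are the ones quoted in the posts.

```python
import subprocess


def count_failures(command, runs):
    """Run `command` `runs` times and return how many runs exited non-zero.

    Assumption: a crash shows up as a non-zero exit code.
    """
    failures = 0
    for _ in range(runs):
        # Capture output so a long benchmark log does not flood the terminal.
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            failures += 1
    return failures


# The two argument orders quoted in the thread. The binary is assumed to be
# in the current directory; nothing is executed until count_failures is called.
ORDERS = [
    ["lc0_v263rc1.exe", "benchmark", "--backend=cudnn-fp16", "--minibatch-size=240"],
    ["lc0_v263rc1.exe", "benchmark", "--minibatch-size=240", "--backend=cudnn-fp16"],
]
```

Calling `count_failures(ORDERS[0], runs=20)` and `count_failures(ORDERS[1], runs=20)` would show whether the argument order really matters or whether both orders fail at a similar rate.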
Checking the backends with the new lc0 binary
-
- Posts: 2727
- Joined: Wed May 12, 2010 10:00 pm
Re: Checking the backends with the new lc0 binary
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
-
- Posts: 6340
- Joined: Mon Mar 13, 2006 2:34 pm
- Location: Acworth, GA
Re: Checking the backends with the new lc0 binary
Links not working, files may no longer be there.
Laskos wrote: ↑Sat Oct 03, 2020 7:00 pm
AdminX wrote: ↑Thu Oct 01, 2020 10:55 pm
I was able to replicate your results on 2070 Super
Code: Select all
DX12
lc0.exe benchmark --minibatch-size=240 --threads=2 --backend-opts=gpu=0
===========================
Total time (ms) : 341461
Nodes searched : 3506458
Nodes/second : 10269
Code: Select all
Cudnn-fp16
lc0.exe benchmark --minibatch-size=240 --threads=2 --backend-opts=gpu=0
===========================
Total time (ms) : 341514
Nodes searched : 2762948
Nodes/second : 8090
Someone directed me to this test version with CUDA 11.1 and cuDNN 8.0.4:
https://appveyorcidatav2.blob.core.wind ... a-cuda.zip
and replace lc0 with this one:
https://appveyorcidatav2.blob.core.wind ... ld/lc0.exe
I am getting very much improved results (50%+ faster) for cudnn-fp16 and cuda-fp16:
cudnn-fp16
Total time (ms) : 341515
Nodes searched : 3547152
Nodes/second : 10386
cuda-fp16
Total time (ms) : 341370
Nodes searched : 3630548
Nodes/second : 10635
dx12
Total time (ms) : 341409
Nodes searched : 3077528
Nodes/second : 9014
Cuda-fp16 now seems even faster than cudnn-fp16, and both are above DX12.
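The three results above can be cross-checked from the raw node counts and times. This is a minimal sketch, not lc0's own calculation: it simply divides nodes by elapsed seconds, and the function names are made up for illustration. The recomputed values can differ from the reported ones by a node per second. The earlier baseline behind the "50%+ faster" remark is not listed in the post, so it is not recomputed here.

```python
def nodes_per_second(nodes, total_ms):
    """Nodes per second from a node count and an elapsed time in milliseconds."""
    return nodes * 1000 / total_ms


def percent_faster(nps_a, nps_b):
    """How much faster a is than b, in percent."""
    return (nps_a / nps_b - 1) * 100


# (nodes searched, total time in ms) copied from the results in the post.
results = {
    "cudnn-fp16": (3547152, 341515),
    "cuda-fp16": (3630548, 341370),
    "dx12": (3077528, 341409),
}

nps = {name: nodes_per_second(n, ms) for name, (n, ms) in results.items()}

# Print the backends from fastest to slowest.
for name, value in sorted(nps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:11s} {value:8.0f} nps")

gain = percent_faster(nps["cuda-fp16"], nps["dx12"])
print(f"cuda-fp16 vs dx12: {gain:.1f}% faster")
```

Because all three runs took almost the same total time, comparing node counts directly gives nearly the same ranking.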
"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Checking the backends with the new lc0 binary
Try these:
https://appveyorcidatav2.blob.core.wind ... 3A24Z&sp=r
https://appveyorcidatav2.blob.core.wind ... 3A28Z&sp=r
AdminX wrote: ↑Sat Oct 03, 2020 9:15 pm
Links not working, files may no longer be there.
Laskos wrote: ↑Sat Oct 03, 2020 7:00 pm
AdminX wrote: ↑Thu Oct 01, 2020 10:55 pm
I was able to replicate your results on 2070 Super
Code: Select all
DX12
lc0.exe benchmark --minibatch-size=240 --threads=2 --backend-opts=gpu=0
===========================
Total time (ms) : 341461
Nodes searched : 3506458
Nodes/second : 10269
Code: Select all
Cudnn-fp16
lc0.exe benchmark --minibatch-size=240 --threads=2 --backend-opts=gpu=0
===========================
Total time (ms) : 341514
Nodes searched : 2762948
Nodes/second : 8090
Someone directed me to this test version with CUDA 11.1 and cuDNN 8.0.4:
https://appveyorcidatav2.blob.core.wind ... a-cuda.zip
and replace lc0 with this one:
https://appveyorcidatav2.blob.core.wind ... ld/lc0.exe
I am getting very much improved results (50%+ faster) for cudnn-fp16 and cuda-fp16:
cudnn-fp16
Total time (ms) : 341515
Nodes searched : 3547152
Nodes/second : 10386
cuda-fp16
Total time (ms) : 341370
Nodes searched : 3630548
Nodes/second : 10635
dx12
Total time (ms) : 341409
Nodes searched : 3077528
Nodes/second : 9014
Cuda-fp16 now seems even faster than cudnn-fp16, and both are above DX12.
-
- Posts: 122
- Joined: Tue Oct 29, 2019 4:14 pm
- Location: Canada
- Full name: Ron Doughie
Re: Checking the backends with the new lc0 binary
I still cannot get them, either.
Code: Select all
<Error>
<Code>AuthenticationFailed</Code>
<Message>
Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. RequestId:48280c10-101e-0054-34c2-997f87000000 Time:2020-10-03T20:22:43.9719130Z
</Message>
<AuthenticationErrorDetail>
Signature not valid in the specified time frame: Start [Sat, 03 Oct 2020 19:23:24 GMT] - Expiry [Sat, 03 Oct 2020 19:29:24 GMT] - Current [Sat, 03 Oct 2020 20:22:43 GMT]
</AuthenticationErrorDetail>
</Error>
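The error detail explains why the download fails: the link appears to be a time-limited signed URL, and the request arrived after its expiry. A minimal sketch that reads the three timestamps out of the message and measures the gap (the helper name `parse_window` is hypothetical; the message text is copied from the error above):

```python
import re
from email.utils import parsedate_to_datetime

DETAIL = (
    "Signature not valid in the specified time frame: "
    "Start [Sat, 03 Oct 2020 19:23:24 GMT] - "
    "Expiry [Sat, 03 Oct 2020 19:29:24 GMT] - "
    "Current [Sat, 03 Oct 2020 20:22:43 GMT]"
)


def parse_window(detail):
    """Return the Start, Expiry and Current timestamps as datetimes."""
    stamps = dict(re.findall(r"(Start|Expiry|Current) \[([^\]]+)\]", detail))
    return {name: parsedate_to_datetime(text) for name, text in stamps.items()}


window = parse_window(DETAIL)
valid_for = window["Expiry"] - window["Start"]    # how long the link worked
late_by = window["Current"] - window["Expiry"]    # how late the request was

print("link was valid for:", valid_for)
print("request arrived after expiry by:", late_by)
```

With a validity window this short, a link copied into a forum post is likely to be dead by the time anyone else clicks it, which matches what the other posters report.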
-
- Posts: 2727
- Joined: Wed May 12, 2010 10:00 pm
Re: Checking the backends with the new lc0 binary
All seems to be working fine with 0.26.3-rc2. I am now getting much higher speed with CUDA 11.1 on the big 384x30 nets. The game average right now is 38.2 Knps on a 2080 Ti with default settings.
mwyoung wrote: ↑Sat Oct 03, 2020 9:05 pm
Rc2 is now ready for download, and the issue is reportedly fixed. Rc2 should also run faster.
mwyoung wrote: ↑Fri Oct 02, 2020 1:00 am
No, there is an issue with 0.26.3-rc1. I tested it the day it came out in a 200-game blitz match. It was faster, but it also crashed about 23 times in 200 games, causing a big loss in the match. I hope this will be corrected in rc2.
AdminX wrote: ↑Thu Oct 01, 2020 10:14 pm
I found it works better with this format:
lc0_v263rc1.exe benchmark --minibatch-size=240 --backend=cudnn-fp16
As you can see, all I did was invert the two arguments.
Laskos wrote: ↑Thu Oct 01, 2020 1:50 pm
Worth remarking is the excellent result of the DX12 backend, which by NPS seems vastly superior to the other two. A glitch occurred with this command line:
lc0_v263rc1.exe benchmark --backend=cudnn-fp16 --minibatch-size=240
which sometimes exits with this error message:
Position: 1/34 rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
Unhandled exception in worker thread: CUDA error: an illegal memory access was encountered (c:\projects\lc0\src\neural\cuda\network_cudnn.cc:789)
Unhandled exception in worker thread:
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
-
- Posts: 3657
- Joined: Wed Nov 18, 2015 11:41 am
- Location: hungary
Re: Checking the backends with the new lc0 binary
I also got only error messages from there.
Laskos wrote: ↑Sat Oct 03, 2020 9:30 pm
Try these:
https://appveyorcidatav2.blob.core.wind ... 3A24Z&sp=r
https://appveyorcidatav2.blob.core.wind ... 3A28Z&sp=r
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Checking the backends with the new lc0 binary
The links I have no longer seem to work, and I am unable to upload these large files.
Get the CUDA backend with all DLLs from the new official 0.26.3-rc2 release. It is the fastest, faster than the cuDNN backend in all my tests with different nets.
corres wrote: ↑Sun Oct 04, 2020 10:39 am
I also got only error messages from there.
Laskos wrote: ↑Sat Oct 03, 2020 9:30 pm
Try these:
https://appveyorcidatav2.blob.core.wind ... 3A24Z&sp=r
https://appveyorcidatav2.blob.core.wind ... 3A28Z&sp=r
-
- Posts: 3657
- Joined: Wed Nov 18, 2015 11:41 am
- Location: hungary
Re: Checking the backends with the new lc0 binary
Thanks.
Laskos wrote: ↑Sun Oct 04, 2020 11:10 am
The links I have no longer seem to work, and I am unable to upload these large files.
Get the CUDA backend with all DLLs from the new official 0.26.3-rc2 release. It is the fastest, faster than the cuDNN backend in all my tests with different nets.
corres wrote: ↑Sun Oct 04, 2020 10:39 am
I also got only error messages from there.
Laskos wrote: ↑Sat Oct 03, 2020 9:30 pm
Try these:
https://appveyorcidatav2.blob.core.wind ... 3A24Z&sp=r
https://appveyorcidatav2.blob.core.wind ... 3A28Z&sp=r
-
- Posts: 6340
- Joined: Mon Mar 13, 2006 2:34 pm
- Location: Acworth, GA
Re: Checking the backends with the new lc0 binary
Or grab this one: https://ci.appveyor.com/project/LeelaCh ... /artifacts
Laskos wrote: ↑Sun Oct 04, 2020 11:10 am
The links I have no longer seem to work, and I am unable to upload these large files.
Get the CUDA backend with all DLLs from the new official 0.26.3-rc2 release. It is the fastest, faster than the cuDNN backend in all my tests with different nets.
corres wrote: ↑Sun Oct 04, 2020 10:39 am
I also got only error messages from there.
Laskos wrote: ↑Sat Oct 03, 2020 9:30 pm
Try these:
https://appveyorcidatav2.blob.core.wind ... 3A24Z&sp=r
https://appveyorcidatav2.blob.core.wind ... 3A28Z&sp=r
"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
-
- Posts: 122
- Joined: Tue Oct 29, 2019 4:14 pm
- Location: Canada
- Full name: Ron Doughie
Re: Checking the backends with the new lc0 binary
Thanks very much Kai and Ted.
I downloaded (from AppVeyor) and quickly tried both v0.26.3-rc2 (3787) CUDA and cuDNN binaries using the 384x30-t60-4619.pb (Sergio) net and an RTX 2080 GPU. I am using the CUDA 11.1 Toolkit and v8.0.4.30 cuDNN libraries.
Will try the DX12 binary tonight.