Schooner Version 2.2 Release

Alayan
Posts: 128
Joined: Tue Nov 19, 2019 7:48 pm
Full name: Alayan Feh

Re: Schooner Version 2.2 Release

Post by Alayan » Thu Dec 26, 2019 6:54 pm

I'd be happy to see how the latest Schooner does in tournaments. I remember it had poor SMP scaling a year ago; has this been improved? I hope someone will volunteer to be the contact for these tournaments. I can't do it as I'm already doing so for Stockfish, but from that experience I can tell it's usually not too demanding.

D Sceviour
Posts: 496
Joined: Mon Jul 20, 2015 3:06 pm

Re: Schooner Version 2.2 Release

Post by D Sceviour » Thu Dec 26, 2019 7:09 pm

Alayan wrote:
Thu Dec 26, 2019 6:54 pm
I'd be happy to see how the latest Schooner does in tournaments. I remember it had poor SMP scaling a year ago; has this been improved?
What do you mean by poor SMP scaling? Could you be more specific? The threading was crashing sometimes during multiple instances of concurrent games on CuteChess. A change was made to the polling thread to fix this. Maybe that was the problem you experienced.

Alayan
Posts: 128
Joined: Tue Nov 19, 2019 7:48 pm
Full name: Alayan Feh

Re: Schooner Version 2.2 Release

Post by Alayan » Thu Dec 26, 2019 7:25 pm

By poor SMP scaling, I mean that the Elo gains from additional cores/threads were significantly lower for Schooner than for most engines. This is not really a worry at 2C or 4C, where the gains are still good, but it becomes significant on tournament hardware with dozens of cores.

D Sceviour
Posts: 496
Joined: Mon Jul 20, 2015 3:06 pm

Re: Schooner Version 2.2 Release

Post by D Sceviour » Thu Dec 26, 2019 7:36 pm

Alayan wrote:
Thu Dec 26, 2019 7:25 pm
By poor SMP scaling, I mean that the Elo gains from additional cores/threads were significantly lower for Schooner than for most engines. This is not really a worry at 2C or 4C, where the gains are still good, but it becomes significant on tournament hardware with dozens of cores.
Unfortunately, I do not have dozens of cores to test with. Schooner uses Shared Hash Table threading, and there is noticeable contention on the memory bus as the number of cores increases. Threading is theoretically available up to 128 cores, but eight is probably the most threads that are useful. On the other hand, Shared Hash Table threading is easy to implement and debug. Prefetch() is used in the SSE version, which might help a little.
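For readers unfamiliar with the approach, here is a minimal sketch of a hash table shared by all search threads, with a prefetch hint issued before the probe. It is not Schooner's code: the entry layout, the XOR validation trick, and the names TTEntry, tt_init, tt_prefetch, tt_store and tt_probe are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical entry layout: two 64-bit words, validated by XOR. */
typedef struct {
    _Atomic uint64_t check;  /* key XOR data */
    _Atomic uint64_t data;   /* packed move/score/depth; 0 means "empty" in this sketch */
} TTEntry;

static TTEntry *tt = NULL;   /* one table shared by every search thread */
static uint64_t tt_mask = 0;

/* Allocate a zeroed table; `entries` must be a power of two. Returns 1 on success. */
static int tt_init(size_t entries)
{
    if (entries == 0 || (entries & (entries - 1)) != 0) return 0;
    tt = calloc(entries, sizeof *tt);
    if (tt == NULL) return 0;
    tt_mask = (uint64_t)entries - 1;
    return 1;
}

/* Hint the cache to load the slot early, e.g. right after making a move. */
static void tt_prefetch(uint64_t key)
{
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(&tt[key & tt_mask]);
#else
    (void)key;               /* no prefetch hint on other compilers in this sketch */
#endif
}

/* Store without a lock: a torn write is caught by the XOR check on probe. */
static void tt_store(uint64_t key, uint64_t data)
{
    assert(tt != NULL);
    TTEntry *e = &tt[key & tt_mask];
    atomic_store_explicit(&e->check, key ^ data, memory_order_relaxed);
    atomic_store_explicit(&e->data, data, memory_order_relaxed);
}

/* Returns 1 and fills *data on a valid hit, 0 on a miss or a torn entry. */
static int tt_probe(uint64_t key, uint64_t *data)
{
    assert(tt != NULL);
    TTEntry *e = &tt[key & tt_mask];
    uint64_t check = atomic_load_explicit(&e->check, memory_order_relaxed);
    uint64_t d = atomic_load_explicit(&e->data, memory_order_relaxed);
    if (d == 0 || (check ^ d) != key) return 0;
    *data = d;
    return 1;
}
```

Every thread reads and writes the same slots, so each probe can pull a cache line that another core has just modified. That traffic is one plausible source of the contention described above, and the prefetch only hides part of the latency.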

Alayan
Posts: 128
Joined: Tue Nov 19, 2019 7:48 pm
Full name: Alayan Feh

Re: Schooner Version 2.2 Release

Post by Alayan » Wed Jan 01, 2020 3:48 pm

Most engines nowadays use "LazySMP"-based approaches, which derive from the shared hash table concept and actually scale quite well Elo-wise with many threads (the best approach known for 16+, and good for 8 or fewer). Open source code to study is available from Stockfish, Ethereal, and many others, and if you made a more detailed post about your current approach and issues in the technical discussion subforum, I'm sure you'd find other authors willing to give advice.

D Sceviour
Posts: 496
Joined: Mon Jul 20, 2015 3:06 pm

Re: Schooner Version 2.2 Release

Post by D Sceviour » Wed Jan 01, 2020 4:36 pm

Alayan wrote:
Wed Jan 01, 2020 3:48 pm
Most engines nowadays use "LazySMP"-based approaches, which derive from the shared hash table concept and actually scale quite well Elo-wise with many threads (the best approach known for 16+, and good for 8 or fewer). Open source code to study is available from Stockfish, Ethereal, and many others, and if you made a more detailed post about your current approach and issues in the technical discussion subforum, I'm sure you'd find other authors willing to give advice.
Not everybody's advice seems to work well for me. For example, Bob Hyatt recommended assigning CPU affinity to each thread. I tried again using SetProcessAffinityMask() just today, but the results were disastrous. I suspect the Windows thread handlers for my i9 expect something different. The DWORD_PTR mask allows a maximum of 32 or 64 threads for affinity. This is the Windows code segment:

Code:

#if !defined(LINUX)
 #include <windows.h>
#else
 #include <sched.h>
#endif

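HANDLE process;
DWORD_PTR processAffinityMask;
BOOL success;

  process = GetCurrentProcess();
  /* Cast before shifting so the mask is built at DWORD_PTR width, not int width. */
  processAffinityMask = (DWORD_PTR)1 << sd->ID;

  /* Note: this call sets the affinity of the whole process, i.e. all of its threads. */
  success = SetProcessAffinityMask(process, processAffinityMask);

  printf("ID %d affinity success %d\n", sd->ID, (int) success);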
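One possible explanation for the disastrous result, offered as an assumption rather than a diagnosis: SetProcessAffinityMask() applies to every thread in the process, so if each search thread calls it with its own one-bit mask, the last call wins and the whole engine is confined to a single core. Per-thread pinning is normally done with SetThreadAffinityMask() on Windows and sched_setaffinity() on Linux. The sketch below shows that variant; affinity_mask_for() and pin_current_thread() are hypothetical helper names, and the Linux branch is compiled only when the GNU CPU_SET macros are visible.

```c
#if defined(_WIN32)
  #include <windows.h>
#else
  #ifndef _GNU_SOURCE
    #define _GNU_SOURCE      /* needed before <sched.h> for CPU_SET and sched_setaffinity */
  #endif
  #include <sched.h>
#endif
#include <assert.h>
#include <stdint.h>

/* One-bit mask for logical CPU `id`; returns 0 if `id` does not fit in 64 bits. */
static uint64_t affinity_mask_for(unsigned id)
{
    return id < 64 ? (uint64_t)1 << id : 0;
}

/* Pin the calling thread (not the process) to CPU `id`. Returns 1 on success, 0 on failure. */
static int pin_current_thread(unsigned id)
{
    uint64_t mask = affinity_mask_for(id);
    if (mask == 0) return 0;
    assert((mask & (mask - 1)) == 0);   /* exactly one bit is set */
#if defined(_WIN32)
    /* SetThreadAffinityMask returns the previous mask, or 0 on failure. */
    return SetThreadAffinityMask(GetCurrentThread(), (DWORD_PTR)mask) != 0;
#elif defined(CPU_SET)
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(id, &set);
    /* A pid of 0 means "the calling thread" on Linux. */
    return sched_setaffinity(0, sizeof(set), &set) == 0;
#else
    return 0;                           /* no affinity API available in this build */
#endif
}
```

Each search thread would call pin_current_thread(sd->ID) once at startup. Note that on hyper-threaded CPUs consecutive logical IDs may share a physical core, so a straight ID-to-bit mapping is not always the best placement.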
My first attempt at threading years ago used Young Brothers Wait, since it seems the natural way to get an improvement from threading, but I abandoned it in favor of the superior and simpler Shared Hash Table.

I tried using Andrew Grant's staggered-depth method for threads (some people might call this lazy SMP) but found no strength increase. Andrew used to discuss chess a lot in the TLCV chat window, but apparently now has difficulty logging on.
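For context, the staggered-depth idea can be sketched as follows. This is one simple variant chosen for illustration (odd-numbered helper threads search one ply deeper than the main thread); it is not taken from Ethereal's or Schooner's source, and staggered_depth() is a hypothetical helper name.

```c
#include <assert.h>

/* Depth that thread `thread_id` searches when the main thread (id 0) is at `depth`.
   Odd-numbered helpers look one ply ahead; even-numbered ones match the main thread. */
static int staggered_depth(int thread_id, int depth)
{
    assert(thread_id >= 0 && depth > 0);
    return depth + (thread_id & 1);
}
```

In an iterative-deepening loop, each thread would search staggered_depth(id, d) instead of d. Because all threads share the hash table, results from the deeper helpers become available to the main thread when it reaches that depth.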

You can find a lot of discussion about threading posted by me in various places, for example:

viewtopic.php?f=7&t=68154&p=780051&hilit=thread#p780051

This one on aspiration was eventually a failure.

Jamal Bubker
Posts: 268
Joined: Mon May 24, 2010 2:32 pm

Re: Schooner Version 2.2 Release

Post by Jamal Bubker » Wed Jan 01, 2020 10:34 pm

Thank you very much, Dennis!!
