New Stockfish with Lazy_SMP, but what about the TC bug ?

ernest · Post by **ernest** » Mon Oct 26, 2015 9:14 pm

On Oct 20 I can see in http://abrok.eu/stockfish/ the Development Version of Stockfish with Lazy_SMP,

but has the Time Control bug (seen at TCEC) been taken care of ?
(seen nothing about that, and where is the corresponding patch ?)

Tomcass · Post by **Tomcass** » Tue Oct 27, 2015 10:36 am

ernest wrote:On Oct 20 I can see in http://abrok.eu/stockfish/ the Development Version of Stockfish with Lazy_SMP,

but has the Time Control bug (seen at TCEC) been taken care of ?
(seen nothing about that, and where is the corresponding patch ?)

Hi Ernest,

I have just finished testing this Lazy version and have not observed any Time Control bug at either Fixed or Incremental time controls. Obviously, I have not watched all 1,920 games of this test. I posted the first leg (960 games at Fixed Time Control) in the Tournament thread. I will post the second leg later today.

Best regards from Barcelona.

Tom.

Leto · Post by **Leto** » Tue Oct 27, 2015 1:13 pm

Apparently the Stockfish 070915 version in TCEC 8 just lost a game on time, this time in a drawn position against Hannibal at move 99.

So maybe a part of the tc bug was present even before the Lazy version.

Dann Corbit · Post by **Dann Corbit** » Tue Oct 27, 2015 7:01 pm

Shortly after the bug was seen, Lucas analyzed the code and stated that the bug was also present in the previous version.

bob · Post by **bob** » Tue Oct 27, 2015 7:20 pm

Tomcass wrote:
ernest wrote:On Oct 20 I can see in http://abrok.eu/stockfish/ the Development Version of Stockfish with Lazy_SMP,

but has the Time Control bug (seen at TCEC) been taken care of ?
(seen nothing about that, and where is the corresponding patch ?)
Hi Ernest,

I have just finished testing this Lazy version and have not observed any Time Control bug at either Fixed or Incremental time controls. Obviously, I have not watched all 1,920 games of this test. I posted the first leg (960 games at Fixed Time Control) in the Tournament thread. I will post the second leg later today.

Best regards from Barcelona.

Tom.

I have reported here MANY times about SMP issues in my code. Things that worked flawlessly at 8 cores got "iffy" at 12 and downright unreliable at 20. More cores = more things being done simultaneously, exposing narrow-window race conditions that won't happen in 2- and 4-core tests. And of course, "timing windows" are affected by "timing variations", and testing on Linux while running on Windows is problematic as well...

The main point to ponder here is that as you play, you usually accumulate time up front from book moves and ponder hits. Once you accumulate 10 seconds, take that out of the timing discussion and decisions so that you don't try to use that very last second. No Elo to gain, and LOTS of Elo to lose when the flag falls.
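The advice about setting a reserve aside can be sketched roughly as follows. This is only an illustration in Python, not code from any engine: the function names, the 30-move horizon, and the rule of holding back half the clock when less than twice the reserve is left are all assumptions made for the example. Only the 10-second figure comes from the post.

```python
def plannable_ms(remaining_ms: int, reserve_ms: int = 10_000) -> int:
    """Return the part of the clock the time manager may plan with.

    The reserve is kept out of all timing decisions. When the clock is too
    short to afford the full reserve, this sketch holds back half of what is
    left instead (an assumption, not something stated in the thread).
    """
    held_back = min(reserve_ms, remaining_ms // 2)
    return remaining_ms - held_back


def move_budget_ms(remaining_ms: int, moves_to_go: int = 30,
                   reserve_ms: int = 10_000) -> int:
    """Spread the plannable time evenly over the expected remaining moves."""
    return plannable_ms(remaining_ms, reserve_ms) // max(1, moves_to_go)
```

With 130 seconds on the clock, the sketch plans with 120 seconds and never budgets the last 10, so a late timer wake-up eats into the reserve rather than into the flag.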

syzygy · Post by **syzygy** » Tue Oct 27, 2015 9:06 pm

ernest wrote:On Oct 20 I can see in http://abrok.eu/stockfish/ the Development Version of Stockfish with Lazy_SMP,

but has the Time Control bug (seen at TCEC) been taken care of ?
(seen nothing about that, and where is the corresponding patch ?)

There is no real bug. SF simply needs to leave a higher safety margin when it allocates time, in particular on Windows. Setting the Move Overhead parameter to a higher value will effectively do that.
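To illustrate what a per-move safety margin does, here is a minimal Python sketch. It is not Stockfish's actual time-management formula: the function name, the even split over `moves_to_go`, and the clamping are assumptions chosen for the example. Only the idea of subtracting a configurable overhead from each move's allocation comes from the post.

```python
def allocate_ms(remaining_ms: int, increment_ms: int,
                moves_to_go: int, move_overhead_ms: int) -> int:
    """Return a thinking-time budget for one move, in milliseconds.

    The overhead models everything that happens outside the search itself
    (GUI latency, late timer wake-ups), so it is subtracted from the target
    and also from the hard cap given by the clock.
    """
    target = remaining_ms // max(1, moves_to_go) + increment_ms - move_overhead_ms
    hard_cap = remaining_ms - move_overhead_ms
    return max(0, min(target, hard_cap))
```

For example, with 60 seconds left, a 1-second increment and 30 moves to go, raising the overhead from 30 ms to 500 ms shrinks the budget from 2970 ms to 2500 ms per move, which is the larger margin the post describes.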

Dann Corbit · Post by **Dann Corbit** » Tue Oct 27, 2015 10:01 pm

syzygy wrote:
ernest wrote:On Oct 20 I can see in http://abrok.eu/stockfish/ the Development Version of Stockfish with Lazy_SMP,

but has the Time Control bug (seen at TCEC) been taken care of ?
(seen nothing about that, and where is the corresponding patch ?)
There is no real bug. SF simply needs to leave a higher safety margin when it allocates time, in particular on Windows. Setting the Move Overhead parameter to a higher value will effectively do that.

The TCEC contest specifically disallows monkeying with UCI parameters other than a small, fixed set of them.

So I suggest that the default value should be changed.
Clearly, the chosen value is not a good one. None of the other entrants seem to be losing on time, so they have chosen more wisely.

Dann Corbit · Post by **Dann Corbit** » Wed Oct 28, 2015 2:27 am

This is from the TCEC rules form:

Engine Specific Configuration
UCI and Xboard (Winboard) engines are supported. To identify the protocol an engine is using you can click the gears next to the engine logo during a game. This info will be saved for the archive, but does not work for Season 5 or older games.

Compiles
Many engines come with different .exe files. 64-bit .exes are always preferred over 32-bit. Also, compiles that support SSE 4.2, AVX/AVX2 or similar instruction sets are preferred.

Large Pages
Some engines can utilize a function in Windows called large pages, that gives a speed boost. However, after a while the memory in the computer will be fragmented, so that one engine might receive large pages and the opponent won't, which would be unfair. Therefore, this has been disabled. Large pages are often useful when you are running infinite analysis on a position.

Number of Cores / Threads
Each engine can use up to all 20 cores of the processors, if this is supported. Some engines have a prefix like "deep", but this has been omitted from the engine name to make it shorter. When watching a game you can see the number of cores that are in use for the engines currently playing by clicking the gears next to the engine logo.

Split Depth
The split depth parameter can be adjusted, with advice from each of the engine programmers. Basically it defines the minimum depth for work to be split between threads. If no instructions are given from the programmer, the default value will be used.

Main Hash Size
Each engine is allowed to use up to 32768 MB of hash. Not all engines support this much hash, so the maximum for that engine will be used in this case, typically 2048 MB or 4096 MB. When watching a game you can see the size of the hash that is in use for the engines currently playing by clicking the gears next to the engine logo.

Minor Hash Sizes
Some engines have an option to configure the size of other hash tables, often called pawn hash or evaluation hash. The combined, total limit for hash types like these is 4096 MB.

Own Opening Book
All opening books shipped with the engines are removed and/or disabled.

Endgame Tablebases
For Season 8, tablebases that are available for the engines are: 5-men Syzygy, 5-men Nalimov and 5-men Gaviota (cp2), depending on which each specific engine can use. For some engines specifying more than one type is possible, but here only one is allowed. Nalimov is preferred. For Stage 1 they were hosted in a RAM-drive, for Stage 2 onwards they are hosted on an SSD drive. When watching a game you can see the type of the tablebases (if any) that is in use for the engines currently playing by clicking the gears next to the engine logo.

Tablebase Cache
For Season 8, each engine is configured with a tablebase cache of 32 MB. This is really irrelevant anyway since they are all hosted in RAM.

Ponder / Permanent Brain
Basically this means that the engines can think during their opponent's turn. It is not allowed, so it has been disabled because of performance limitations with only 1 computer. This might change in a future Season of TCEC.

Contempt / Draw Score
Some engines have a setting that can adjust their own view of the positions throughout a game to avoid draws. This setting is not changed from the default unless it is a request from the programmer.

Other Settings
Any configurable options not described above are not adjusted in any way, except that "keep hash tables" or similar is usually enabled.

Hence, if the Stockfish team relies upon fiddling with a UCI setting, SF will lose games on time again, because it is specifically not allowed.

lucasart · Post by **lucasart** » Wed Oct 28, 2015 12:19 pm

Finally, the problem is understood, in such a way that we can reproduce and measure it:
https://groups.google.com/forum/?fromgr ... PNocZQkW-4

We are relying on the OS sleep function, which oversleeps on large machines under heavy system load. The oversleeping we managed to measure is beyond our imagination. If you ask the OS to sleep for 5ms, it can easily sleep for 500ms (on large machines under heavy system load).

The problem is not Windows specific, but instead specific to large machines (many cores):
1/ Elan produced some significant oversleeping on a 4 core Windows machine.
2/ Joost produced some even larger oversleeping on a 32 core Linux machine.
3/ On my 4 core Linux machine, no oversleeping at all.
=> your mileage may vary...
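A measurement of this kind can be approximated with a few lines of Python. This is a hedged sketch, not the test program used by the Stockfish developers: the function names and sample counts are assumptions, and Python's `time.sleep` only stands in for whatever sleep primitive the engine calls. The 5 ms request mirrors the figure quoted above.

```python
import time


def measure_oversleep(requested_s: float = 0.005, samples: int = 100) -> list:
    """Sleep `samples` times and return how much longer each sleep took
    than requested, in seconds (measured with a monotonic clock)."""
    overshoots = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested_s)
        elapsed = time.perf_counter() - start
        overshoots.append(elapsed - requested_s)
    return overshoots


def summarize(overshoots_s: list) -> dict:
    """Convert a list of overshoots in seconds to mean and max in milliseconds."""
    return {
        "mean_ms": sum(overshoots_s) / len(overshoots_s) * 1000.0,
        "max_ms": max(overshoots_s) * 1000.0,
    }
```

Running `summarize(measure_oversleep())` on an idle machine and again while all cores are busy should show whether a given system oversleeps; the numbers will differ from machine to machine, as the list above suggests.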

Technically there is no bug in SF, but in the OS/hardware. However, we need to code a workaround in SF to avoid this bad OS behaviour, as it is SF that gets penalized for it in the end...

PS: Well, that's at least ONE problem. Maybe once we fix it, we will discover another one hidden behind it.
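One possible shape for such a workaround, shown purely as an illustration, is to stop depending on a sleeping timer and instead poll a monotonic clock from inside the working loop every so many steps. The post does not say which workaround was chosen, so everything in this Python sketch (the function name, the `check_every` interval, the injectable `clock`) is an assumption.

```python
import time


def search_with_deadline(deadline_s, work_step, check_every=1024,
                         clock=time.perf_counter):
    """Call work_step() repeatedly and stop once the deadline has passed.

    The clock is read only every `check_every` steps to keep the polling
    cost low. Returns the number of steps performed.
    """
    start = clock()
    steps = 0
    while True:
        work_step()
        steps += 1
        if steps % check_every == 0 and clock() - start >= deadline_s:
            return steps
```

The trade-off is that the loop can overrun the deadline by at most the time `check_every` steps take, which the searching thread controls, rather than by however long the OS decides to delay a wake-up.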

Waschbaer · Post by **Waschbaer** » Wed Oct 28, 2015 5:19 pm

As far as I know, Unix, Linux and Windows are not real-time operating systems, so there is no guaranteed response time, and for this reason you can't call it a bug of the OS.
You, the application programmer, have to be aware of it.

EDIT
If the heavy system load comes from a lot of programs/threads you are responsible for, don't let the OS choose which thread has to be stopped; do it yourself. But then the lazy concept looks wrong, doesn't it?
