Core behaviour

Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Core behaviour

Post by Rebel »

bob wrote:My take here is that there is little use in worrying about something you can't fix.
But of course you can fix it when the scheduler, in its wisdom, makes bad decisions (as shown in the real-life examples above) by setting the affinities of the running programs the right way. Call it bug-fixing :lol:

Consider that most of the changes we try are worth what? An expected 2-3 Elo at most? Then if the scheduler doesn't function as it should, the results are misleading. One can accept a change as an improvement (and never look back) while in reality it is a regression, and we are back in the dark ages of the 70s, 80s and 90s.
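To put those 2-3 Elo in perspective, here is a rough sketch in Python using the standard logistic Elo formula. The function names are made up for illustration; nothing here comes from any engine or testing tool mentioned in this thread.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score (0..1) for a player rated `elo_diff` above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_from_score(score: float) -> float:
    """Inverse of expected_score; `score` must be strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # Show how small the score shift is for the differences discussed here.
    for diff in (2, 3, 5):
        print(f"{diff} Elo -> expected score {expected_score(diff):.4%}")
```

A 3 Elo edge corresponds to an expected score of only about 50.4%, so a scheduler-induced penalty of 5 Elo on one side is larger than the effect being measured.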

If you look at the statistics given in the OP, I know from experience that the 240 version is hurt by at least 5 Elo because the scheduler was having a bad day, which CoreTemp makes visible through the load percentages of the cores.

Either you test and do it right, or don't test at all.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Core behaviour

Post by bob »

I was really thinking of testing done by OTHERS when I was writing that. In my case, with my own testing, I very carefully set up a testbed that introduced as little noise as possible.

But when the testing goes outside of my reach, such as SSDF, TCEC, and all the other events and tests run by others, there's not much that can be done. It is VERY difficult to fix many of these problems given the different versions of Windows, Linux and MacOS that are out there, where by "fix" I mean that the program itself does whatever is needed to avoid thread bouncing and such. And there's nothing to do when a malicious GUI or opponent wants to cause problems, except to not run when the program notices competition for resources. Programs can directly interfere with the GUI as well, as we have seen in past nonsense: flood the GUI with messages, which can make it slow to respond to the opponent's legitimate messages and interfere with time usage.

I would NEVER consider testing under windows. I want a tightly controlled environment that will work consistently each and every time.
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Core behaviour

Post by cdani »

bob wrote:I would NEVER consider testing under windows. I want a tightly controlled environment that will work consistently each and every time.
I understand your concerns, but believe me, you can test on Windows and it works more than reasonably well :-)
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Core behaviour

Post by bob »

cdani wrote:
bob wrote:I would NEVER consider testing under windows. I want a tightly controlled environment that will work consistently each and every time.
I understand your concerns, but believe me, you can test on Windows and it works more than reasonably well :-)
Sorry, but not even close. And if you are talking windows 10, not at all.
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Core behaviour

Post by cdani »

bob wrote: Sorry, but not even close. And if you are talking windows 10, not at all.
I use Windows 7. I have never tested with Linux, so I cannot compare.
But I can say that my test results are consistent enough when I retest groups of patches, which I do several times between versions. Sometimes I get surprises, of course, but Andscacs advances well in general.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Core behaviour

Post by Rebel »

cdani wrote:
bob wrote:I would NEVER consider testing under windows. I want a tightly controlled environment that will work consistently each and every time.
I understand your concerns, but believe me, you can test on Windows and it works more than reasonably well :-)
Agreed, just strip everything down on a separate PC. When my PC is idle, CoreTemp shows 0% load on all 8 cores; every now and then something steals a few percent, only to disappear at the next display refresh. I guess all OSes have that.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Core behaviour

Post by Rebel »

Here is a simple solution that fixes the capriciousness of the Windows scheduler via a batch file. Take, for example, a match between ProDeo and Crafty (Hi Bob) using 4 cores. Start CoreTemp and then the match, and watch how the CoreTemp load percentages fluctuate. Then run the batch file below and notice that the fluctuation is much smaller, since each of the 4 matches is now pinned to a fixed core via affinity. In the Task Manager, check the affinities of each executable to see how things are arranged.

Code:

REM Match 1: mask 3 = binary 00000011, logical CPUs 0 and 1
PowerShell "$Process = Get-Process ProDeo1; $Process.ProcessorAffinity=3"
PowerShell "$Process = Get-Process Crafty1; $Process.ProcessorAffinity=3"

REM Match 2: mask 12 = binary 00001100, logical CPUs 2 and 3
PowerShell "$Process = Get-Process ProDeo2; $Process.ProcessorAffinity=12"
PowerShell "$Process = Get-Process Crafty2; $Process.ProcessorAffinity=12"

REM Match 3: mask 48 = binary 00110000, logical CPUs 4 and 5
PowerShell "$Process = Get-Process ProDeo3; $Process.ProcessorAffinity=48"
PowerShell "$Process = Get-Process Crafty3; $Process.ProcessorAffinity=48"

REM Match 4: mask 192 = binary 11000000, logical CPUs 6 and 7
PowerShell "$Process = Get-Process ProDeo4; $Process.ProcessorAffinity=192"
PowerShell "$Process = Get-Process Crafty4; $Process.ProcessorAffinity=192"
So you need to set up your matches a bit differently, but the reward might be big.
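The affinity values in the batch file are bitmasks: bit n set means the process may run on logical CPU n, so 3 covers CPUs 0-1, 12 covers CPUs 2-3, and so on. As a rough sketch (Python, with hypothetical helper names `pair_mask` and `batch_lines`, and assuming the executables are named engine name plus match number as above), the lines could be generated like this:

```python
def pair_mask(slot: int, width: int = 2) -> int:
    """Bitmask selecting `width` consecutive logical CPUs for match `slot` (0-based)."""
    return ((1 << width) - 1) << (slot * width)


def batch_lines(engines, matches, width=2):
    """Build one PowerShell affinity line per engine copy, per match."""
    lines = []
    for slot in range(matches):
        mask = pair_mask(slot, width)
        for name in engines:
            # Process names like ProDeo1, Crafty1 follow the naming used in the post.
            lines.append(
                f'PowerShell "$Process = Get-Process {name}{slot + 1}; '
                f'$Process.ProcessorAffinity={mask}"'
            )
    return lines


if __name__ == "__main__":
    for line in batch_lines(["ProDeo", "Crafty"], matches=4):
        print(line)
```

With `matches=4` this reproduces the masks 3, 12, 48 and 192 used above; a different `width` would give each match more or fewer logical CPUs.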
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Core behaviour

Post by Rebel »

Here is the batch file I use on my 2-node NUMA machine (2 x 4 cores, 16 threads in total) running a self-play match between ProDeo 2.2 (PD22x) and ProDeo 2.4 (PD24x).

Code:

REM Match 1: mask 3, logical CPUs 0 and 1
PowerShell "$Process = Get-Process PD221; $Process.ProcessorAffinity=3"
PowerShell "$Process = Get-Process PD241; $Process.ProcessorAffinity=3"

REM Match 2: mask 12, logical CPUs 2 and 3
PowerShell "$Process = Get-Process PD222; $Process.ProcessorAffinity=12"
PowerShell "$Process = Get-Process PD242; $Process.ProcessorAffinity=12"

REM Match 3: mask 48, logical CPUs 4 and 5
PowerShell "$Process = Get-Process PD223; $Process.ProcessorAffinity=48"
PowerShell "$Process = Get-Process PD243; $Process.ProcessorAffinity=48"

REM Match 4: mask 192, logical CPUs 6 and 7
PowerShell "$Process = Get-Process PD224; $Process.ProcessorAffinity=192"
PowerShell "$Process = Get-Process PD244; $Process.ProcessorAffinity=192"

REM Match 5: mask 768, logical CPUs 8 and 9
PowerShell "$Process = Get-Process PD225; $Process.ProcessorAffinity=768"
PowerShell "$Process = Get-Process PD245; $Process.ProcessorAffinity=768"

REM Match 6: mask 3072, logical CPUs 10 and 11
PowerShell "$Process = Get-Process PD226; $Process.ProcessorAffinity=3072"
PowerShell "$Process = Get-Process PD246; $Process.ProcessorAffinity=3072"

REM Match 7: mask 12288, logical CPUs 12 and 13
PowerShell "$Process = Get-Process PD227; $Process.ProcessorAffinity=12288"
PowerShell "$Process = Get-Process PD247; $Process.ProcessorAffinity=12288"

REM Match 8: mask 49152, logical CPUs 14 and 15
PowerShell "$Process = Get-Process PD228; $Process.ProcessorAffinity=49152"
PowerShell "$Process = Get-Process PD248; $Process.ProcessorAffinity=49152"
Note CoreTemp before running the batch file (bad) and after (good).
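For anyone adapting the masks to a different machine, a quick sanity check is to decode each mask back into logical CPU numbers and confirm that no two matches share a CPU. A minimal sketch in Python follows; `cpus_in_mask` and `masks_overlap` are illustrative names, and whether a given pair of logical CPUs belongs to the same physical core depends on how the OS numbers them, which this does not check.

```python
def cpus_in_mask(mask: int) -> list:
    """Return the logical CPU numbers whose bits are set in an affinity mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def masks_overlap(masks) -> bool:
    """True if any logical CPU appears in more than one mask."""
    seen = 0
    for mask in masks:
        if seen & mask:
            return True
        seen |= mask
    return False


if __name__ == "__main__":
    # Masks taken from the 16-thread batch file above.
    masks = [3, 12, 48, 192, 768, 3072, 12288, 49152]
    for number, mask in enumerate(masks, start=1):
        print(f"match {number}: mask {mask} -> CPUs {cpus_in_mask(mask)}")
    print("overlap:", masks_overlap(masks))
```

For the eight masks above, each one decodes to two adjacent logical CPUs, together covering 0 through 15 with no overlap.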

[Images: CoreTemp screenshots before and after running the batch file]
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Core behaviour

Post by Rebel »

Nobody cares?

Or do I miss something?
Ras
Posts: 2487
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Core behaviour

Post by Ras »

bob wrote:I want a tightly controlled environment that will work consistently each and every time.
From my perspective as a (mainly) bare-metal developer, thinking of Linux as tightly controlled seems somewhat astonishing.