A new chess engine : m8 (comming not so soon)

Discussion of chess software programming and technical issues.

Moderators: hgm, Harvey Williamson, bob

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
petero2
Posts: 560
Joined: Mon Apr 19, 2010 5:07 pm
Location: Sweden
Contact:

Re: A new chess engine : m8 (comming not so soon)

Post by petero2 » Sun Feb 01, 2015 7:38 pm

mar wrote:
mathmoi wrote:Isn't lazy SMP the technique that use the TT as a communication device between threads that search the same tree?
Basically yes, but it has been improved since the old days (also we're using larger TTs today):
resync on each ID iteration, early termination when one of the helpers finishes and run each other thread on depth+1 (as proposed by Dan Homan).
My understanding was that it gave little speedup passed 2 CPU. Am I missing something?
Where did you read that? :shock: There's evidence that it works and in fact it works very well.
If claim sun in blue will you believe it? If I keep repeating it over and over again? Or will you trust what you see?
I have tested this algorithm using Cheng 0.38 on a 16 core Dell PowerEdge T620 computer. Both hyperthreading and turbo boost are enabled. The computer runs Fedora 19. Here are the results, computed by bayeselo:

Code: Select all

Cheng 4c vs Cheng 1c, 1+0.08:
Rank Name             Elo    +    - games score oppo. draws 
   1 cheng4_038_4c     68    8    8  1457   70%   -68   35% 
   2 cheng4_038       -68    8    8  1457   30%    68   35% 

Cheng 4c vs Cheng 1c, 8+0.64:
Rank Name             Elo    +    - games score oppo. draws 
   1 cheng4_038_4c     66   10   10  1011   71%   -66   42% 
   2 cheng4_038       -66   10   10  1011   29%    66   42% 

Cheng 8c vs Cheng 4c, 1+0.08:
Rank Name             Elo    +    - games score oppo. draws 
   1 cheng4_038_8c     28    8    8  1475   59%   -28   44% 
   2 cheng4_038_4c    -28    8    8  1475   41%    28   44% 

Cheng 16c vs Cheng 8c, 1+0.08:
Rank Name             Elo    +    - games score oppo. draws 
   1 cheng4_038_16c     4    8    8  1400   51%    -4   46% 
   2 cheng4_038_8c     -4    8    8  1400   49%     4   46% 

Cheng 16c vs Cheng 8c, 2+0.16:
Rank Name             Elo    +    - games score oppo. draws 
   1 cheng4_038_16c    13    8    8  1433   54%   -13   50% 
   2 cheng4_038_8c    -13    8    8  1433   46%    13   50% 

Cheng 16c vs Cheng 8c, 4+0.32:
Rank Name             Elo    +    - games score oppo. draws 
   1 cheng4_038_16c     6    8    8  1151   52%    -6   53% 
   2 cheng4_038_8c     -6    8    8  1151   48%     6   53% 

Cheng 16c vs Cheng 8c, 180+1:
Rank Name             Elo    +    - games score oppo. draws 
   1 cheng4_038_16c    17   20   20   184   56%   -17   53% 
   2 cheng4_038_8c    -17   20   20   184   44%    17   53% 
Even at hyper bullet speed (1s+0.08s/move) it scales well up to 8 cores. +130 elo from 1 to 4 cores, and +56 elo from 4 to 8 cores. At 16 cores it seems hyper bullet speed is too fast for the algorithm to be effective, but it still seems to work well, +34 elo, at 3m+1s/move time control, even though the number of games (184) is too low to be really sure.

Feeding all 16 vs 8 cores games into bayeselo gives +16 elo and LOS=99.9%, so it is pretty clear that the algorithm works also for 16 cores even if it is unclear exactly how well it works.

Some other results for comparison, even though the low number of games and different 1 core ratings make it hard to draw any definite conclusions:

Code: Select all

Texel 16c vs Texel 8c, 180+1:
Rank Name          Elo    +    - games score oppo. draws 
   1 Texel16c       21   19   18   190   58%   -21   69% 
   2 Texel8c       -21   18   19   190   42%    21   69% 

Texel 16c vs Texel 16c_nonuma, 180+1:
Rank Name          Elo    +    - games score oppo. draws 
   1 Texel16c        7   19   19   175   53%    -7   71% 
   2 Texel16cnn     -7   19   19   175   47%     7   71% 

Komodo 8 16c vs Komodo 8 8c, 180+1:
Rank Name          Elo    +    - games score oppo. draws 
   1 komodo8_16c    12   18   18   196   55%   -12   78% 
   2 komodo8_8c    -12   18   18   196   45%    12   78% 

mar
Posts: 1832
Joined: Fri Nov 26, 2010 1:00 pm

Re: A new chess engine : m8 (comming not so soon)

Post by mar » Sun Feb 01, 2015 8:14 pm

Thanks for the data Peter :)
Not bad for something that's lousy beyond 2 cores I guess.

mathmoi
Posts: 265
Joined: Mon Mar 13, 2006 4:23 pm
Location: Québec
Contact:

Re: A new chess engine : m8 (comming not so soon)

Post by mathmoi » Sun Feb 01, 2015 9:39 pm

mar wrote:
mathmoi wrote:Isn't lazy SMP the technique that use the TT as a communication device between threads that search the same tree?
Basically yes, but it has been improved since the old days (also we're using larger TTs today):
resync on each ID iteration, early termination when one of the helpers finishes and run each other thread on depth+1 (as proposed by Dan Homan).
My understanding was that it gave little speedup passed 2 CPU. Am I missing something?
Where did you read that? :shock: There's evidence that it works and in fact it works very well.
If claim sun in blue will you believe it? If I keep repeating it over and over again? Or will you trust what you see?
Hi Martin,

I'm not sure where I read that. I always though lazy SMP did not gave a good speedup. Obviously before I implement an SMP algorithm I will have to educate myself on what algorithms are available and how they perform.

I just looked on CPW and did not find much information about lazy SMP (or even YBWC). Is there a good resource online or maybe on CCC?

mp

Volker Annuss
Posts: 163
Joined: Mon Sep 03, 2007 7:15 am

Re: A new chess engine : m8 (comming not so soon)

Post by Volker Annuss » Sun Feb 01, 2015 10:36 pm

m8 (read mate)
Nice name. The german word for 8 is acht.
The german word Macht means power/force.

AlvaroBegue
Posts: 884
Joined: Tue Mar 09, 2010 2:46 pm
Location: New York

Re: A new chess engine : m8 (comming not so soon)

Post by AlvaroBegue » Sun Feb 01, 2015 11:03 pm

Volker Annuss wrote:
m8 (read mate)
Nice name. The german word for 8 is acht.
The german word Macht means power/force.
In Spanish it doesn't work so well: "mocho" is an adjective used for animals and it means that their horns are rounded or blunt. :)

User avatar
Evert
Posts: 2898
Joined: Fri Jan 21, 2011 11:42 pm
Location: NL
Contact:

Re: A new chess engine : m8 (comming not so soon)

Post by Evert » Sun Feb 01, 2015 11:23 pm

mathmoi wrote:
Evert wrote:
Ferdy wrote:Perhaps you may try chess variant, say capablanca, 8 rows x 10 columns. Two pieces each are added in each side, one piece is a combination of knight and rook the other piece is a combination of knight and bishop. It is fun and interesting to explore the possibilities of these two new pieces over the expanded board.
I can only add my support for this suggestion. If you want to stick with 8x8, consider Seirawan Chess: it adds the same two pieces as Capablanca chess, and regular chess is just a subset Seirawan Chess anyway.

EDIT: think of it this way: another chess engine is just that, they're a dime a dozen. Another chess variant engine is something special. There may be fewer people interested in chess variant engines, but because there are so few of them out there, people who are interested will be interested in yours.
Hi Evert,

I'll look into this. As I said, I'm interested in implementing search variant, but I am probably limited to 8 x 8 variants since I want to use bitboards.
SjaakII uses bitboards and it plays on boards larger than 8x8 (the limitation on board size is no more than 128 squares and no more than 16 ranks/files), so it can be done. It's not necessarily simple though (but it is fun because it means you have to figure out how to do some things for yourself) and I don't think you would want to use something like magic generation.

Note that I'm not saying it's necessarily optimal, just that it's possible.

mar
Posts: 1832
Joined: Fri Nov 26, 2010 1:00 pm

Re: A new chess engine : m8 (comming not so soon)

Post by mar » Mon Feb 02, 2015 8:41 am

mathmoi wrote:Is there a good resource online or maybe on CCC?
Of course, you can start here for example (original thread by Dan Homan): http://www.talkchess.com/forum/viewtopic.php?t=46858

User avatar
lucasart
Posts: 2957
Joined: Mon May 31, 2010 11:29 am
Contact:

Re: A new chess engine : m8 (comming not so soon)

Post by lucasart » Mon Feb 02, 2015 10:16 am

mar wrote:Thanks for the data Peter :)
Not bad for something that's lousy beyond 2 cores I guess.
This is remarkable for an algorithm that is so simple. Considering that most people have 4 cores or less, it seems like an ideal tradeoff. Even 4 to 8 scaling is not too bad.

One technicality I don't understand: 1 thread runs depth d, while n-1 run d+1. seems easy enough, but… how do you manage aspiration windows? or does running depth d include the aspiration loop?
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.

User avatar
lucasart
Posts: 2957
Joined: Mon May 31, 2010 11:29 am
Contact:

Re: A new chess engine : m8 (comming not so soon)

Post by lucasart » Mon Feb 02, 2015 10:39 am

It's quite possible that this lazy SMP can be generalized to perform better with 16 threads. Something to try (though requires a 16 core machine which I don't posess):
* current version: 1 thread at depth d and n-1= 15 threads at depth d+1
* test version: 1 thread at depth d, 3 threads at depth d+1, 12 threads at depth d+2
(or variations along those lines)
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.

mar
Posts: 1832
Joined: Fri Nov 26, 2010 1:00 pm

Re: A new chess engine : m8 (comming not so soon)

Post by mar » Mon Feb 02, 2015 11:57 am

lucasart wrote: This is remarkable for an algorithm that is so simple. Considering that most people have 4 cores or less, it seems like an ideal tradeoff. Even 4 to 8 scaling is not too bad.

One technicality I don't understand: 1 thread runs depth d, while n-1 run d+1. seems easy enough, but… how do you manage aspiration windows? or does running depth d include the aspiration loop?
Actually each other thread runs depth d+1.
So you have 1 "master" and n-1 helpers (or slaves if you want).
master starts at d, first helper at d+1, second helper d again and so on.

Actually I don't do anything special, still having the same aspiration loop.
Helpers are simply started each iteration within aspiration loop, at root.
Each time a helper finishes (it can happen that a helper finishes before "master", all helpers (+master) are aborted and I simply return score, (bestmove and PV if available) from the thread that finishes first.
Each thread has local eval cache/pawn hash (but in my case it was quite simple because I had search already wrapped in a class, including local caches).

Post Reply