Help with Crafty....please!

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Help with Crafty....please!

Post by zullil »

It seems there is no problem with the binary named crafty_230_win32_uci_ja.exe. (Is this what you are using?) I ran it from the command line (using wine, since I have a MacBook) and it used the number of threads specified in crafty.rc. (I used the crafty.rc you posted here, which is copied below. I only have two cores, so I used smpmt=1 and then smpmt=2 to test.) I used the unix top command to monitor the binary's CPU usage, by the way. I also noted a doubling of nps when I switched from smpmt=1 to smpmt=2.

Here's the command line interaction with the engine:

Code: Select all

Procyon: ~/Desktop/craftyJA/crafty-23.0_ja/speed compiles/win32] wine ./crafty_230_win32_uci_ja.exe 
uci
id name Crafty 23.0 uci ja
id author Robert Hyatt
option name Program type string default crafty_230_win32_ja.exe
option name InitString type string default <empty>
option name Hash type spin default 2 min 0 max 1024
option name HashCommand type string default <empty>
option name HashFormula type string default 
option name HashOnCommandline type check default false
option name InitTime type spin default 2 min 0 max 30
option name StartTime type spin default 1 min 0 max 30
option name Delay type spin default 0 min 0 max 1000
option name MateScore type spin default 0 min 0 max 100000
option name LevelType type spin default 1 min 1 max 2
option name SlowDown type spin default 1 min 1 max 100
option name Edit type combo default setboard var setboard var edit var cb-edit
option name Ponder type check default true
option name AlwaysMoveOnStop type check default false
option name OwnBook type check default false
option name ShowThinkingMove type check default false
option name Analyze type check default true
option name UseUndo type check default true
option name WhiteScore type check default false
option name Logfile type check default false
option name Priority type combo default Normal var Normal var BelowNormal var Low
option name RunIdle type check default false
option name Computer type check default false
option name SimulateHint type check default false
option name LevelExtend type combo default None var Progressive var Strict var Failsafe var None
option name Protocol type spin default 2 min 1 max 2
option name Noise type spin default 0 min -1 max 99
option name Help type button
option name UCI_Opponent type string default <empty>
option name UCI_LimitStrength type check default false
option name UCI_Elo type spin default 1000 min 1 max 1000
option name CpuPower type string default 100.0
uciok
ucinewgame
position startpos
isready
readyok
go infinite
info depth 1 score cp 1
info depth 11 score cp 30 time 150 nodes 246341 nps 1642273 pv g1f3 e7e6 e2e3 g8f6 b1c3 b8c6 f1d3 f8d6 e1g1 e8g8 f3g5
info depth 11 score cp 30 time 170 nodes 283918 nps 1670105 pv g1f3 e7e6 e2e3 g8f6 b1c3 b8c6 f1d3 f8d6 e1g1 e8g8 f3g5
info depth 12 score cp 24 time 240 nodes 406645 nps 1694354 pv g1f3 e7e6 e2e3 g8f6 b1c3 b8c6 f1d3 f8d6 e1g1 e8g8 f3g5 g7g6
info depth 12 score cp 24 time 300 nodes 519776 nps 1732586 pv g1f3 e7e6 e2e3 g8f6 b1c3 b8c6 f1d3 f8d6 e1g1 e8g8 f3g5 g7g6
info depth 13 score cp 9 time 450 nodes 801175 nps 1780388 pv g1f3 e7e6 e2e3 g8f6 b1c3 b8c6 f1c4 f8d6 e1g1 e8g8 c3b5 d6c5 d2d3
info depth 13 score cp 9 time 1010 nodes 1897633 nps 1878844 pv g1f3 e7e6 e2e3 g8f6 b1c3 b8c6 f1c4 f8d6 e1g1 e8g8 c3b5 d6c5 d2d3
info depth 14 score cp 11 time 2640 nodes 5177174 nps 1961050 pv g1f3 g8f6 e2e4 f6e4 b1c3 e4c3 d2c3 e7e6 f1d3 f8d6 e1g1 b8c6 c1g5 f7f6
info depth 14 score cp 19 time 4310 nodes 2939706 nps 682066 pv e2e4 e7e6 b1c3 b8c6 g1f3 f8b4 a2a3 b4c3 d2c3 g8f6 e4e5 f6e4 f1d3 f7f5
info depth 14 score cp 19 time 4310 nodes 8640223 nps 2004692 pv e2e4 e7e6 b1c3 b8c6 g1f3 f8b4 a2a3 b4c3 d2c3 g8f6 e4e5 f6e4 f1d3 f7f5
info depth 15 score cp 17 time 9660 nodes 19852494 nps 2055123 pv e2e4 e7e5 b1c3 b8c6 g1f3 g8f6 f1b5 f8c5 e1g1 e8g8 d2d3 d7d6 b5c6 b7c6 c1e3
info depth 15 score cp 17 time 10460 nodes 21416468 nps 2047463 pv e2e4 e7e5 b1c3 b8c6 g1f3 g8f6 f1b5 f8c5 e1g1 e8g8 d2d3 d7d6 b5c6 b7c6 c1e3
stop
bestmove e2e4
quit
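
(As an aside, the handshake above can also be scripted. Here's a minimal POSIX sketch, not part of Crafty, that spawns a UCI engine over a pair of pipes, sends uci and isready, and echoes the replies until uciok and readyok; the ./crafty path is just a placeholder.)

Code: Select all

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void) {
    int to_engine[2], from_engine[2];
    if (pipe(to_engine) != 0 || pipe(from_engine) != 0) {
        perror("pipe");
        return 1;
    }

    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return 1; }

    if (pid == 0) {                                 /* child: become the engine */
        dup2(to_engine[0], STDIN_FILENO);
        dup2(from_engine[1], STDOUT_FILENO);
        close(to_engine[1]);
        close(from_engine[0]);
        execlp("./crafty", "crafty", (char *)NULL); /* placeholder engine path */
        perror("exec");
        _exit(1);
    }

    close(to_engine[0]);
    close(from_engine[1]);
    FILE *in  = fdopen(from_engine[0], "r");
    FILE *out = fdopen(to_engine[1], "w");
    setvbuf(out, NULL, _IONBF, 0);                  /* UCI is line-based; don't buffer */

    char line[1024];

    fputs("uci\n", out);                            /* expect id/option lines, then uciok */
    while (fgets(line, sizeof line, in) && strncmp(line, "uciok", 5) != 0)
        fputs(line, stdout);

    fputs("isready\n", out);                        /* expect readyok */
    while (fgets(line, sizeof line, in) && strncmp(line, "readyok", 7) != 0)
        fputs(line, stdout);

    fputs("quit\n", out);
    waitpid(pid, NULL, 0);
    return 0;
}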

And here's the crafty.rc I used:

Code: Select all

#egtb 
#adaptive 750K 24M 192M 24M 48M 
#tbpath=d:\progra~1\arena\nalimov 
cache=32M 
ponder off 
# 
# (for use with ponder on) 
# mode=tournament 
# 
# (Allows Crafty to try to win drawn games (according to Endgame Tables)) 
swindle on 
# 
# (default) 
learn 7 
# 
# (default) 
book random 1 
# 
bookw freq 0.7 
bookw ratio 0.8 
bookw eval 0.6 
bookw learn 1 
bookw cap 0.5 
book width 4 
# 
#(default = book on) 
book on 
# 
show book 
log=off 
# 
# (Increases Crafty's MaxThreads to 2 for a dual-CPU computer)
#mt=4 
smpmt=2 
# 
# (Make Crafty not use CPU on the opponent's time)
#smpnice=1 
# 
# (Makes crafty use a lot more time on the first 8 moves out of book) 
timebook 80 8 
#adaptive NPS a 
hash=512M 
hashp=64M 
exit
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Help with Crafty....please!

Post by bob »

Werewolf wrote:uh-oh. Only Jim would know the answer to that I guess...
Start Crafty in a command prompt window, type "mt=4", and see if you get an error message back. If not, it should work. If you do get an error, it will tell you the binary was not compiled for parallel search.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Help with Crafty....please!

Post by bob »

Dann Corbit wrote:Change:
smpmt=4
to:
mt=4
Either one works. The preferred command is "smpmt=n", as all of the parallel-search controls are accessed through "smp"-prefixed commands...
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Apple cores

Post by sje »

Jim Ablett wrote:Crafty 23.0 JA compiled with support for 1024 cores :)
Sedat Canbaz asked for this feature to enable 'future proofing!'
The current Linux kernel allows for up to 64 cores (or 64 hyperthreaded pseudo-cores).

Apple produces a couple of different 8 core (or 16 hyperthreaded core) machines; all have prices above US$3,300 and are not really in the consumer class.

Symbolic uses the Linux core limit number. There is no special version for non-SMP boxes.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Help with Crafty....please!

Post by bob »

Jim Ablett wrote:
Werewolf wrote:uh-oh. Only Jim would know the answer to that I guess...
Hi Carl,

Crafty 23.0 JA compiled with support for 1024 cores :)
Sedat Canbaz asked for this feature to enable 'future proofing!' :shock:

Jim.
I hope not. That will make the memory footprint _huge_...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Apple cores

Post by bob »

sje wrote:
Jim Ablett wrote:Crafty 23.0 JA compiled with support for 1024 cores :)
Sedat Canbaz asked for this feature to enable 'future proofing!'
The current Linux kernel allows for up to 64 cores (or 64 hyperthreaded pseudo-cores).
Actually, that has been stretched a lot. The current kernel we are using (2.6.28.8) has had its tables stretched to 256 cores...

Apple produces a couple of different 8 core (or 16 hyperthreaded core) machines; all have prices above US$3,300 and are not really in the consumer class.

Symbolic uses the Linux core limit number. There is no special version for non-SMP boxes.
The only reason I have a compile-time option is that too many try to run multiple threads on a single-cpu machine, which is horrible for performance.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Help with Crafty....please!

Post by bob »

bob wrote:
Jim Ablett wrote:
Werewolf wrote:uh-oh. Only Jim would know the answer to that I guess...
Hi Carl,

Crafty 23.0 JA compiled with support for 1024 cores :)
Sedat Canbaz asked for this feature to enable 'future proofing!' :shock:

Jim.
I hope not. That will make the memory footprint _huge_...
For the record, a "split block" is about 50K bytes in Crafty. The static memory allocated for split blocks is something like this:

#cpus * 64 * sizeof(splitblock). For 1024 CPUs, that becomes 1024 * 64 * 50000 = 65536 * 50000 bytes, or 3+ gigabytes just for the split blocks. Not good. Not good at all. :)

A more normal 8 CPUs gives 8 * 64 * 50000 bytes, about 25 megabytes.
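
To make the arithmetic concrete, here is a small sketch (not Crafty's actual code; the ~50K-bytes-per-block figure and the 64-blocks-per-CPU factor are simply taken from the numbers above):

Code: Select all

#include <stdio.h>

int main(void) {
    /* Figures quoted above: ~50K bytes per split block, 64 blocks per CPU. */
    const long long SPLIT_BLOCK_BYTES = 50000;
    const long long BLOCKS_PER_CPU    = 64;
    const long long cpus[] = { 8, 1024 };

    for (int i = 0; i < 2; i++) {
        long long blocks = cpus[i] * BLOCKS_PER_CPU;
        long long bytes  = blocks * SPLIT_BLOCK_BYTES;
        printf("%4lld cpus: %6lld split blocks, %8.1f MB\n",
               cpus[i], blocks, bytes / 1e6);
    }
    /* Prints roughly:
          8 cpus:    512 split blocks,     25.6 MB
       1024 cpus:  65536 split blocks,   3276.8 MB (3+ GB)   */
    return 0;
}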
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Apple cores

Post by sje »

bob wrote:
sje wrote:
Jim Ablett wrote:Crafty 23.0 JA compiled with support for 1024 cores :)
Sedat Canbaz asked for this feature to enable 'future proofing!'
The current Linux kernel allows for up to 64 cores (or 64 hyperthreaded pseudo-cores).
Actually that has been stretched a lot. Current kernel we are using (2.6.28.8) has had tables stretched to 256 cores...

Apple produces a couple of different 8 core (or 16 hyperthreaded core) machines; all have prices above US$3,300 and are not really in the consumer class.

Symbolic uses the Linux core limit number. There is no special version for non-SMP boxes.
The only reason I have a compile-time option is that too many try to run multiple threads on a single-cpu machine, which is horrible for performance.
Symbolic's 64-core limit is an upper bound on the active worker thread count (the mate finder, path counter, and move searcher each get at most 64 worker threads, but only a maximum of 64 total are active at any one time). The total number of threads is further limited by PTHREAD_MAX_THREADS (or something like that).

The actual worker thread limit is the number of configured cores as returned by sysconf() (subject to the 64-thread upper bound), so a user can't run more active worker threads than there are cores. Alas, sysconf() does not distinguish between cores and hyperthreaded pseudo-cores.

It might be better to use sysctl() rather than sysconf(); the numbers are the same, but the latter might be more portable.
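
A minimal sketch of that clamp, assuming the common sysconf() extension available on Linux and OS X (the WORKER_LIMIT name is invented here; this is not Symbolic's actual code):

Code: Select all

#include <stdio.h>
#include <unistd.h>

#define WORKER_LIMIT 64   /* upper bound on active worker threads */

int main(void) {
    /* Configured cores, including hyperthreaded pseudo-cores. */
    long cores = sysconf(_SC_NPROCESSORS_CONF);
    if (cores < 1)
        cores = 1;        /* sysconf() returns -1 on failure */

    long workers = (cores > WORKER_LIMIT) ? WORKER_LIMIT : cores;
    printf("configured cores: %ld, worker thread limit: %ld\n", cores, workers);
    return 0;
}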

The only user-supplied compile-time definition that has any effect is -DNDEBUG (for the <cassert> assert() macro), and I might even remove that. I note that the latest xboard protocol specification allows a user to specify an upper limit on the number of cores to use; Symbolic currently ignores this, and that behavior might also be changed.

Other than phones, handhelds, and ultra-cheap notebooks, are there any recent consumer machines that don't have at least two cores, or at least two pseudo-cores? The same question could be asked about 64-bit vs. 32-bit arithmetic operations.

If someone wanted future-proofing, they should worry more about the 2.037K bug than a 1,024-core limit. Symbolic's time values span over half a million years with microsecond resolution, but they are still dependent upon the old 32-bit library time routines.
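
For illustration, the 32-bit time limit mentioned above can be shown in a few lines of C: a signed 32-bit time_t runs out at 03:14:07 UTC on 19 January 2038.

Code: Select all

#include <stdio.h>
#include <stdint.h>
#include <time.h>

int main(void) {
    /* The largest second a signed 32-bit time_t can represent. */
    time_t last = (time_t)INT32_MAX;          /* 2147483647 seconds after the epoch */

    printf("sizeof(time_t) here: %zu bytes\n", sizeof(time_t));
    printf("last 32-bit second:  %s", asctime(gmtime(&last)));
    /* Prints: Tue Jan 19 03:14:07 2038 */
    return 0;
}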
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Apple cores

Post by bob »

sje wrote:
bob wrote:
sje wrote:
Jim Ablett wrote:Crafty 23.0 JA compiled with support for 1024 cores :)
Sedat Canbaz asked for this feature to enable 'future proofing!'
The current Linux kernel allows for up to 64 cores (or 64 hyperthreaded pseudo-cores).
Actually that has been stretched a lot. Current kernel we are using (2.6.28.8) has had tables stretched to 256 cores...

Apple produces a couple of different 8 core (or 16 hyperthreaded core) machines; all have prices above US$3,300 and are not really in the consumer class.

Symbolic uses the Linux core limit number. There is no special version for non-SMP boxes.
The only reason I have a compile-time option is that too many try to run multiple threads on a single-cpu machine, which is horrible for performance.
Symbolic's 64 core limit is an upper bound for the active worker thread limit (mate finder, path counter, and move searcher all get (at most) 64 worker threads each. but only a maximum of 64 total are active at any one time). The total number of threads is further limited PTHREAD_MAX_THREADS (or something like that).

The actual worker thread limit is the number of configured cores as returned by sysconf() (and is subject to the 64 upper bound number). So, a user can't run more active worker threads than there are cores. Alas, sysconf() does not distinguish between cores and hyperthread pseudocores.

It might be better to use sysctl() than sysconf(), the numbers are the same but the latter might be more portable.

The only user supplied compile time definition that has any effect is -DNDEBUG (for the <cassert> assert() macro) and I might even remove that. I note that the latest xboard protocol specification allows for a user to specify an upper limit on core count utilization. Symbolic currently ignores this and that behavior might also be changed.

Other than phones, handhelds and ultra cheap notebooks, are there any recent consumer machines that don't have at least two cores? Or at least two pseudocores? The same question could be asked for 64 bit vs 32 arithmetic operations.

If someone wanted future proofing, they should worry more about the 2.037K bug than a 1,024 core limit. Symbolic's time values span over a half million years with microsecond resolution, but they are still dependent upon the old 32 bit library time routines.
I've shied away from too much Unix dependence, and since hyper-threading adds yet another level of complexity, I've just left that to the user. If one wanted to be anal about it, one could use the "cpuid" instruction to discover how many physical cores there are and whether hyper-threading is enabled. That might actually be a more portable solution, since it is not dependent on Linux or Windows. Maybe, one day.

However, there are other interesting issues, and I don't want to spend weeks trying to make the compile operation option-free. For example, Nehalem's SSE4.2 (finally) has the popcnt instruction, which I now use when I run on that platform. But I am not relying on the compiler to tell me about SSE4.2, because older compilers don't understand it, so I added a -DPOPCNT option that says to use the hardware popcnt instruction, which will crash and burn on a non-SSE4.2 box.
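
As a rough sketch of the kind of compile-time switch described here (this is not Crafty's actual source), building with -DPOPCNT and -msse4.2 makes the compiler builtin map to the hardware popcnt instruction, while the default build falls back to a portable loop:

Code: Select all

#include <stdio.h>

#ifdef POPCNT
/* Built with e.g.  gcc -DPOPCNT -msse4.2 popcnt_demo.c
   The builtin compiles to the SSE4.2 popcnt instruction; the resulting
   binary will fault on a CPU without SSE4.2. */
static int pop_count(unsigned long long bb) {
    return __builtin_popcountll(bb);
}
#else
/* Portable fallback: clear the lowest set bit until none remain. */
static int pop_count(unsigned long long bb) {
    int n = 0;
    while (bb) {
        bb &= bb - 1;
        n++;
    }
    return n;
}
#endif

int main(void) {
    /* All sixteen pawns on their starting squares (ranks 2 and 7). */
    unsigned long long pawns = 0x00FF00000000FF00ULL;
    printf("bits set: %d\n", pop_count(pawns));   /* prints 16 */
    return 0;
}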
Werewolf
Posts: 2032
Joined: Thu Sep 18, 2008 10:24 pm

Re: Help with Crafty....please!

Post by Werewolf »

SOLVED THE PROBLEM!

I was using the 23.0 engine from the 'speed compile' list that Jim provides, which, ironically, doesn't seem to support multithreading.

I'm now using the full-featured one, and it does.

Thanks!