Crafty 23.1 scaling problem on Nehalem octa

Jim Ablett
Posts: 2284
Joined: Fri Jul 14, 2006 7:56 am
Location: London, England
Full name: Jim Ablett

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Jim Ablett »

Hugo wrote: Hello Harun,

Yes, I tried, but it did not change anything on my machine.
Thanks

Clemens
I'm compiling a mingw64 version for you to test. I'll send it to you soon.

Jim.
Hugo
Posts: 782
Joined: Tue Dec 01, 2009 11:10 am

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Hugo »

Jim,

Thanks in advance :)

Very best regards

Clemens
Peter Skinner
Posts: 1763
Joined: Sun Feb 26, 2006 1:49 pm
Location: Edmonton, Alberta, Canada
Full name: Peter Skinner

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Peter Skinner »

Hi Clemens,

Can you try something for me?

Download my compiled version of Crafty here: http://www.webkikr.com/files/crafty-23.1-win64.exe.zip

Save it to your desktop, and extract it there. Use no crafty.rc file for this test.

Once extracted, double-click the executable so it starts in console mode. Once it starts, type the commands below. Where you see [enter], just press Enter after the command.

bench [enter]

This will test the system using 1 processor. Once completed, make note of the output Crafty gives, then type:

mt 2 [enter]
bench [enter]

Log the output from Crafty again. Then:

mt 4 [enter]
bench [enter]

Once completed, make note of the output Crafty gives, then type:

mt 8 [enter]
bench [enter]

Once completed, make note of the output Crafty gives, then post all the results here. This will give us some numbers to help work out where the issue lies.
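To compare the four bench runs, the NPS figures can be turned into speedup and efficiency values. The following is only a minimal Python sketch: the function name `scaling_report` and the dictionary layout are assumptions made for illustration and are not part of Crafty.

```python
def scaling_report(nps_by_threads):
    """Return (threads, speedup, efficiency) tuples relative to the 1-thread run.

    nps_by_threads maps a thread count (the value given to `mt`) to the
    nodes-per-second figure read off the corresponding bench output.
    """
    base = nps_by_threads[1]  # the plain `bench` run using one processor
    report = []
    for threads in sorted(nps_by_threads):
        speedup = nps_by_threads[threads] / base
        efficiency = speedup / threads  # 1.0 would be perfect linear scaling
        report.append((threads, round(speedup, 2), round(efficiency, 2)))
    return report
```

Call it with the NPS value from each run, keyed by 1, 2, 4 and 8. An efficiency that drops sharply between `mt 4` and `mt 8` would match the symptom described in this thread.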

Peter
I was kicked out of Chapters because I moved all the Bibles to the fiction section.
Jim Ablett
Posts: 2284
Joined: Fri Jul 14, 2006 7:56 am
Location: London, England
Full name: Jim Ablett

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Jim Ablett »

Hi Clemens,

Here's an Intel x64 compile for you to test. I'm unable to compile a mingw64 version at present; there are too many build errors. :(
Using PGO made the exe unstable, so I did a little manual inlining to speed it up a bit. It is not quite as fast as the MSVC builds with PGO, though.

http://www.mediafire.com/?tdg1ycy0jmm

Jim.
Hugo
Posts: 782
Joined: Tue Dec 01, 2009 11:10 am

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Hugo »

Hello Peter

Your compile gives the same slow node rate on more than 4 CPUs as the others before.
The first success I see is now with Jim's compile.

Regards, Clemens
Hugo
Posts: 782
Joined: Tue Dec 01, 2009 11:10 am

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Hugo »

Hello Jim

first success:

usage: bookpath|perspath|logpath|tbpath <path>
unable to open book file [./book.bin].
book is disabled
unable to open book file [./books.bin].

Initializing multiple threads.
System is NUMA. 2 nodes reported by Windows
Node 0 CPUs: 0 1 2 3
Node 1 CPUs: 4 5 6 7
Current ideal CPU is 7
Exchanging nodes 0 and 1
EGTB cache memory = 32M bytes.
pondering enabled.
playing a computer!
use 'settc' command if a game is restarted after Crafty
has been terminated for any reason.
tournament mode.
book learning disabled
book file disabled.
max threads set to 8.
SMP keep extra threads spinning when idle.
hash table memory = 1024M bytes.
pawn hash table memory = 256M bytes.


Crafty v23.1 (8 cpus)

White(1): go depth 20
time surplus 0.00 time limit 57.00 (+27.00) (3:00)
depth time score variation (1)
starting thread 1
Starting thread on node 1 CPU mask 15
starting thread 2
Starting thread on node 0 CPU mask 240
starting thread 3
Starting thread on node 1 CPU mask 15
starting thread 4
Starting thread on node 0 CPU mask 240
starting thread 5
Starting thread on node 1 CPU mask 15
starting thread 6
Starting thread on node 0 CPU mask 240
starting thread 7
Starting thread on node 1 CPU mask 15
12 0.06 0.23 1. Nf3 Nc6 2. Nc3 Nf6 3. e3 e5 4. Bb5
Bd6 5. O-O O-O 6. d3 Re8
12-> 0.08 0.23 1. Nf3 Nc6 2. Nc3 Nf6 3. e3 e5 4. Bb5
Bd6 5. O-O O-O 6. d3 Re8 (s=2)
13 0.14 0.14 1. Nf3 Nc6 2. Nc3 Nf6 3. e4 e6 4. Bb5
Bb4 5. d3 a6 6. Bxc6 Bxc3+ 7. bxc3
dxc6
13-> 0.22 0.14 1. Nf3 Nc6 2. Nc3 Nf6 3. e4 e6 4. Bb5
Bb4 5. d3 a6 6. Bxc6 Bxc3+ 7. bxc3
dxc6 (s=3)
14 0.30 0.13 1. Nf3 Nc6 2. Nc3 Nf6 3. e4 e6 4. Bb5
Bb4 5. O-O O-O 6. d3 d5 7. e5 Ng4 (s=2)
14-> 0.34 0.13 1. Nf3 Nc6 2. Nc3 Nf6 3. e4 e6 4. Bb5
Bb4 5. O-O O-O 6. d3 d5 7. e5 Ng4
15 0.49 0.16 1. Nf3 Nc6 2. Nc3 Nf6 3. e4 e6 4. Bb5
Bb4 5. O-O O-O 6. d3 a6 7. Bxc6 dxc6
8. Bf4
15-> 0.58 0.16 1. Nf3 Nc6 2. Nc3 Nf6 3. e4 e6 4. Bb5
Bb4 5. O-O O-O 6. d3 a6 7. Bxc6 dxc6
8. Bf4
16 1.25 0.17 1. Nf3 Nf6 2. Nc3 Nc6 3. e4 e5 4. Bb5
Bb4 5. O-O O-O 6. d3 d6 7. Be3 Bg4
8. a3 Ba5 <HT>
16 2.00 0.19 1. e4 Nc6 2. Nf3 e5 3. Bb5 Nf6 4. d3
Bc5 5. Nc3 O-O 6. O-O d6 7. Bxc6 bxc6
8. Be3 Bb6 <HT>
16-> 2.00 0.19 1. e4 Nc6 2. Nf3 e5 3. Bb5 Nf6 4. d3
Bc5 5. Nc3 O-O 6. O-O d6 7. Bxc6 bxc6
8. Be3 Bb6 <HT> (s=2)
17 3.17 0.25 1. e4 Nc6 2. Nf3 e5 3. Bb5 Nf6 4. Nc3
Bc5 5. O-O O-O 6. Nxe5 Re8 7. Nd3 Bd4
8. Re1 Bxc3 9. dxc3 Nxe4
17-> 3.38 0.25 1. e4 Nc6 2. Nf3 e5 3. Bb5 Nf6 4. Nc3
Bc5 5. O-O O-O 6. Nxe5 Re8 7. Nd3 Bd4
8. Re1 Bxc3 9. dxc3 Nxe4
18 9.25 0.25 1. e4 e5 2. Nf3 Nc6 3. Bb5 Nf6 4. Nc3
Bc5 5. O-O O-O 6. Nxe5 Re8 7. Nd3 Bd4
8. Re1 Bxc3 9. dxc3 Nxe4 <HT>
18-> 10.05 0.25 1. e4 e5 2. Nf3 Nc6 3. Bb5 Nf6 4. Nc3
Bc5 5. O-O O-O 6. Nxe5 Re8 7. Nd3 Bd4
8. Re1 Bxc3 9. dxc3 Nxe4 <HT>
19 13.28 0.25 1. e4 e5 2. Nf3 Nc6 3. Bb5 Nf6 4. O-O
Bc5 5. Nc3 O-O 6. Nxe5 Re8 7. Nxc6
dxc6 8. Bd3 Bg4 9. Be2 Qd7 10. Kh1
<HT>
19-> 14.11 0.25 1. e4 e5 2. Nf3 Nc6 3. Bb5 Nf6 4. O-O
Bc5 5. Nc3 O-O 6. Nxe5 Re8 7. Nxc6
dxc6 8. Bd3 Bg4 9. Be2 Qd7 10. Kh1
<HT>
20 23.38 0.21 1. e4 e5 2. Nf3 Nc6 3. Bb5 Nf6 4. O-O
Bc5 5. c3 Bb6 6. d4 O-O 7. dxe5 Nxe4
8. Qd5 Nc5 9. Bg5 Ne7 10. Qd4 d6 11.
exd6 cxd6
20-> 28.72 0.21 1. e4 e5 2. Nf3 Nc6 3. Bb5 Nf6 4. O-O
Bc5 5. c3 Bb6 6. d4 O-O 7. dxe5 Nxe4
8. Qd5 Nc5 9. Bg5 Ne7 10. Qd4 d6 11.
exd6 cxd6
21 46.74 0.24 1. e4 e5 2. Nf3 Nc6 3. Bb5 Nf6 4. O-O
Bc5 5. c3 Bb6 6. d4 O-O 7. dxe5 Nxe4
8. Qd5 Nc5 9. Bg5 Ne7 10. Bxe7 Qxe7
11. Nbd2
time=1:05 mat=0 n=1330219086 fh=92% nps=20.2M
extensions=34.1M qchecks=36.2M reduced=139.1M pruned=447.7M
predicted=0 evals=647.7M 50move=0 EGTBprobes=0 hits=0
SMP-> splits=132148 aborts=17132 data=47/512 elap=1:05
White(1): e4
time used: 1:05
time remaining (white): 0:28:54 (59 more moves)
time remaining (black): 0:30:00 (60 more moves)
if clocks are wrong, use 'clock' command to adjust them
Black(1): e5 [pondering]
time surplus 0.00 time limit 52.90 (+23.51) (2:56)
depth time score variation (20)
20 19.62 0.26 2. Nf3 Nc6 3. Bb5 Nf6 4. O-O Bc5 5.
Nxe5 Nxe5 6. d4 a6 7. dxe5 axb5 8.
exf6 Qxf6 9. Nc3 O-O 10. Nxb5 Qc6 11.
Qd3 Re8
20-> 25.89 0.26 2. Nf3 Nc6 3. Bb5 Nf6 4. O-O Bc5 5.
Nxe5 Nxe5 6. d4 a6 7. dxe5 axb5 8.
exf6 Qxf6 9. Nc3 O-O 10. Nxb5 Qc6 11.
Qd3 Re8
21 57.65 0.30 2. Nf3 Nc6 3. Bb5 a6 4. Bxc6 dxc6 5.
O-O Bg4 6. h3 Bh5 7. Na3 Qf6 8. g4
Bg6 9. d3 h5 10. Bg5 Qe6 11. Nh4 hxg4
12. Nxg6 fxg6 13. Qxg4 Qxg4+ 14. hxg4
21-> 1:08 0.30 2. Nf3 Nc6 3. Bb5 a6 4. Bxc6 dxc6 5.
O-O Bg4 6. h3 Bh5 7. Na3 Qf6 8. g4
Bg6 9. d3 h5 10. Bg5 Qe6 11. Nh4 hxg4
12. Nxg6 fxg6 13. Qxg4 Qxg4+ 14. hxg4
22 1:48 1/29? 2. Nf3 (23.4Mnps)

I am getting more nodes now (20000 Knps) than before (8000-9000 Knps) on 8 CPUs.
It is still slower than the Skulltrail machine, but it is much better than before.
Someone told me I must switch my computer into a NUMA system... I don't know about that. Is that nonsense, or what must I do?

Kind regards, and thanks a lot.
Clemens
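
The `CPU mask` values in the log above are bit masks with one bit per logical CPU. A small Python sketch shows which CPUs each mask covers; the helper name `cpus_in_mask` is made up for illustration and does not come from Crafty.

```python
def cpus_in_mask(mask):
    """List the CPU indices whose bit is set in an affinity bit mask."""
    return [cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)]

print(cpus_in_mask(15))   # [0, 1, 2, 3]
print(cpus_in_mask(240))  # [4, 5, 6, 7]
```

Reading the log literally, the threads alternate between the two groups of four CPUs that Windows reported as NUMA nodes. After the "Exchanging nodes 0 and 1" step, the log labels CPUs 0-3 as node 1 and CPUs 4-7 as node 0.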
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by bob »

Hugo wrote:Hello Jim

Someone told me I must switch my computer into a NUMA system... I don't know about that. Is that nonsense, or what must I do?

Kind regards, and thanks a lot.
Clemens
Core i7 is _not_ NUMA. That is only AMD at present.

I just ran on a dual-socket quad-core (8 cores total) box and the NPS looked perfectly normal. However, this was all run under Linux. Windows is a different world and I don't really live there. :)
Peter Skinner
Posts: 1763
Joined: Sun Feb 26, 2006 1:49 pm
Location: Edmonton, Alberta, Canada
Full name: Peter Skinner

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Peter Skinner »

Hugo wrote: nps=20.2M
No, you are getting 20.2Mnps. Not 20000.

Can you start Crafty in console mode without a crafty.rc file and type:

mt 8
bench

Then post the output here?

Peter
Peter Skinner
Posts: 1763
Joined: Sun Feb 26, 2006 1:49 pm
Location: Edmonton, Alberta, Canada
Full name: Peter Skinner

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by Peter Skinner »

Try this for an rc file as well:

mt=8
egtb
tbpath=
hash=1024M
hashp=256M
cache=32M
ponder on
computer
mode tournament
swindle off
learn 0
book off
log=off
smpnice=0
timebook 80 8
exit

Peter
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Crafty 23.1 scaling problem on Nehalem octa

Post by zullil »

Peter Skinner wrote:
Hugo wrote: nps=20.2M
No, you are getting 20.2Mnps. Not 20000.

Peter
Didn't he write 20000K?
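
The two figures can be put on the same scale with a few lines of Python. This is only an illustrative sketch: `parse_rate` is a hypothetical helper, and the inputs are the values quoted in this thread (`nps=20.2M`, `n=1330219086`, `time=1:05`, and the 20000K from Hugo's post).

```python
SUFFIXES = {"K": 1_000, "M": 1_000_000}

def parse_rate(text):
    """Convert a figure such as '20.2M' or '20000K' to plain nodes per second."""
    return round(float(text[:-1]) * SUFFIXES[text[-1].upper()])

reported = parse_rate("20.2M")   # from Crafty's summary line
quoted = parse_rate("20000K")    # the figure in Hugo's post
print(reported, quoted)          # 20200000 20000000

# Cross-check against the raw node count and the elapsed time of 1:05.
nodes, seconds = 1330219086, 65
print(round(nodes / seconds / 1e6, 1))  # 20.5
```

Both values come to roughly 20 million nodes per second, so 20000K and 20.2M describe the same speed. The small gap between 20.5 and the reported 20.2M is plausibly because the printed time is shown in whole seconds.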