Perft speed optimization (renamed)

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Terje
Posts: 347
Joined: Tue Nov 19, 2019 4:34 am
Location: https://github.com/TerjeKir/weiss
Full name: Terje Kirstihagen

Re: Can't get Minic to run perft?

Post by Terje »

mvanthoor wrote: Mon May 04, 2020 9:56 pm Rustic has no move ordering yet, but it does incremental material updates. How could I add scores without running the evaluation?
Captures in Weiss are scored by MvvLva . Quiet moves by killers and history, both of which will be 0 in perft, but they are still added.
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Can't get Minic to run perft?

Post by mvanthoor »

O, that.

No, I don't have that yet. After I released version Alpha 1, Mvv/Lva will probably be the first addition.

I'll also look into (optional) PEXT bitboards at some point, but they are not a very high priority.
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
Terje
Posts: 347
Joined: Tue Nov 19, 2019 4:34 am
Location: https://github.com/TerjeKir/weiss
Full name: Terje Kirstihagen

Re: Can't get Minic to run perft?

Post by Terje »

mvanthoor wrote: Tue May 05, 2020 12:16 pm O, that.

No, I don't have that yet. After I released version Alpha 1, Mvv/Lva will probably be the first addition.

I'll also look into (optional) PEXT bitboards at some point, but they are not a very high priority.
Removing it gives Weiss about a 7% speed increase in perft :)

(and triples the amount of nodes searched by bench to depth 13 haha)
Daniel Anulliero
Posts: 759
Joined: Fri Jan 04, 2013 4:55 pm
Location: Nice

Re: Can't get Minic to run perft?

Post by Daniel Anulliero »

Hello
Very interresting thread !
Actually I'm rewriting my engine (Isa) from scratch, to make it all bitboard friendly (last Isa is Array based and started from Firstchess.c) .
While debuging my new movegen , I'll follow your discussion with great interest !
Even your mistake about the castles gen was very useful , as I did the same mistake as yours 😊
All the best
Dany
Isa download :
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Can't get Minic to run perft?

Post by mvanthoor »

The rook capture was a nasty omission. Until I thought long and hard about what could be wrong with the move generator, make or unmake, it suddenly dawned on me that you could lose castling permissions by having a rook captured before it moved. I just plainly forgot about it, because I have never experienced this in an over the board game (except maybe 30 years ago as a pure beginner). Even if it happens, no one would even think about castling to that side because the rook is just gone.

Good luck writing your engine :)

I'll have a look at your engine. I'm going to release my engine as soon as it's able to play a bunch of tournaments under a GUI. (The first version will be extremely weak because it won't have any features... but it'll be a good starting point for other engine authors, if they don't want to start from scratch.)
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
Terje
Posts: 347
Joined: Tue Nov 19, 2019 4:34 am
Location: https://github.com/TerjeKir/weiss
Full name: Terje Kirstihagen

Re: Can't get Minic to run perft?

Post by Terje »

I have refactored so move scoring isn't done during movegen. ~7% speedup of perft, and a tiny speedup of normal search! Do I break into the 190s range for kiwipete now? :D
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Can't get Minic to run perft?

Post by mvanthoor »

Hi Terje,

I wonder where you've put the move ordering code. I haven't looked at it yet. Probably its now just behind where you create the move list, so you don't need to use that code in perft.

Your engine got faster in perft with removing the code. I did a few runs for the start position and KiwiPete, and I'll report the fastest run. (Variance is about +/- 0.5 seconds around an average with both engines.)

Without PEXT.
Startpos perft 7: 75.77 seconds
Kiwipete perft 6: 202.43 seconds

So no, you didn't drop into the 190 range with KiwiPete6. Not by a long shot :D You did gain 12 seconds in non-pext mode. I haven't tested PEXT-mode yet, but I do suspect it will gain enough to get there; somewhere around 195 seconds, as PEXT gains you 3-4%.

Rustic's speed (without PEXT):
Startpos7: 78.10
KiwiPete6: 200.77

Even though Rustic is a bit slower than Weiss in Startpos7, it is a bit faster in KiwiPete6. If we're both compiling without PEXT-mode, I think we're now squarely down to being at the mercy of compiler and toolchain differences.

When I compile Rustic with different toolchains (gnu+ld-linker, msvc+ms-linker, msvc+llvm-linker), the difference between the fastest (gnu/ld) and the slowest (msvc/ms-link) is almost 4 (!) seconds in Startpos7; much more in KiwiPete6. msvc/llvm-linker is in between, being about 2 seconds slower than gnu/ld.

One thing I'm going to quickly try over the weekend and then put away for later improvement is to do a pin-check on the knight. The knight is the -only- piece that can't move anywhere when it's pinned, so it's really easy to check:

- When generating knight moves...
- First remove the knight from the board.
- Is the king now in check?
- If no, put back the knight, and generate moves.
- If yes, put back the knight, and done. It can't move.

It will cost you the actions of removing the knight, checking for check, and putting it back again, but -if- the knight was pinned, you'll have 8 moves less to try. It'll spare 8 useless make_moves (that will all fail), all the unmakes, and all the extra stuff that comes with it.

Other pieces are harder to check, but if you manage this with a queen...
https://lichess.org/editor/3kq3/2n5/8/8 ... _w_-_-_0_1

This queen would normally have 26 moves if not pinned, but now it only has 6. You would need to do extra checks and stuff to have the queen move along the pin ray, but you'd spare 20 useless makes and unmakes and all the stuff that comes with it.

I assume this will make an engine faster (all saved make/unmake calculations > all pin calculations), but as the move generator and even make/unmake are a relatively small part of code execution compared to the evaluation function, this will be one of the (much) later improvements, after lots of stuff in the search.

But... the knight is tempting to try and then construct a position with two knight pins in it to see how much it gains, because it's so easy to implement.

Weiss:

Code: Select all

Marcel@WORKSTATION MINGW64 /c/code/weiss/src
$ ./weiss-dev.exe
perft 7 rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1

Starting perft to depth 7

move 1 : a2a3 : 106743106
....

Perft complete:
Time  : 75768ms
Leaves: 3195901860
LPS   : 42180100

Code: Select all

Marcel@WORKSTATION MINGW64 /c/code/weiss/src
$ ./weiss-dev.exe
perft 6 r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - 0 1

Starting perft to depth 6

move 1 : g2h3 : 158328615
....

Perft complete:
Time  : 202429ms
Leaves: 8031647685
LPS   : 39676368
Rustic:

Code: Select all

Benchmarking perft 1-7:

8   r n b q k b n r
7   i i i i i i i i
6   . . . . . . . .
5   . . . . . . . .
4   . . . . . . . .
3   . . . . . . . .
2   I I I I I I I I
1   R N B Q K B N R

    A B C D E F G H

Zobrist key:        819aa694337673fb
Active Color:       White
Castling:           KQkq
En Passant:         -
Half-move clock:    0
Full-move number:   1

Perft 1: 20 (0 ms, inf leaves/sec)
Perft 2: 400 (0 ms, inf leaves/sec)
Perft 3: 8902 (0 ms, inf leaves/sec)
Perft 4: 197281 (5 ms, 39456200 leaves/sec)
Perft 5: 4865609 (120 ms, 40546741 leaves/sec)
Perft 6: 119060324 (2979 ms, 39966540 leaves/sec)
Perft 7: 3195901860 (78776 ms, 40569486 leaves/sec)
Total time spent: 81880 ms
Execution speed: 40547562 leaves/second

Code: Select all

Benchmarking perft 1-6:

8   r . . . k . . r
7   i . i i q i b .
6   b n . . i n i .
5   . . . I N . . .
4   . i . . I . . .
3   . . N . . Q . i
2   I I I B B I I I
1   R . . . K . . R

    A B C D E F G H

Zobrist key:        fb86acbc2034870e
Active Color:       White
Castling:           KQkq
En Passant:         -
Half-move clock:    0
Full-move number:   1

Perft 1: 48 (0 ms, inf leaves/sec)
Perft 2: 2039 (0 ms, inf leaves/sec)
Perft 3: 97862 (2 ms, 48931000 leaves/sec)
Perft 4: 4085603 (101 ms, 40451514 leaves/sec)
Perft 5: 193690690 (4772 ms, 40588996 leaves/sec)
Perft 6: 8031647685 (200773 ms, 40003624 leaves/sec)
Total time spent: 205648 ms
Execution speed: 40017524 leaves/second
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
Terje
Posts: 347
Joined: Tue Nov 19, 2019 4:34 am
Location: https://github.com/TerjeKir/weiss
Full name: Terje Kirstihagen

Re: Can't get Minic to run perft?

Post by Terje »

mvanthoor wrote: Wed May 06, 2020 10:30 pm Hi Terje,

I wonder where you've put the move ordering code. I haven't looked at it yet. Probably its now just behind where you create the move list, so you don't need to use that code in perft.
Yeah, I loop over the list and add scores after generating. The code makes more sense where it is located now, and shortens movegen which is one of the longer files so that's a nice bonus :)

Damn not 200, but getting close!
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Can't get Minic to run perft?

Post by mvanthoor »

Terje wrote: Wed May 06, 2020 11:24 pm Damn not 200, but getting close!
I think it's going to be very hard to optimize reliably. I'm at the point where I can't even change the order of some statements and have perft become 10% slower.

Even this... (pseudo-code)

Code: Select all

board.swap_color();

is_legal = !king_in_check();
if !is_legal { unmake(); return false; }

return true;
This will first swap colors (which is one of the more expensive things to do in make() in my engine) and then unmake everything if the move turns out to be illegal. So, If I only swap colors if the move is legal...

Code: Select all


is_legal = !king_in_check();
if !is_legal { unmake(); return false };

board.swap_color();

return true;
I should be saving "swap_colors()" on illegal moves. Guess what? Everything becomes 10% slower. Startpos7 increases from 78.xx to about 84 seconds.

I have also refactored something in add_move(), just on a hunch, which dropped about 1.5 seconds off of KiwiPete6, but not Startpos7... First, I checked if a move was a promotion because a pawn ended on the 8th rank. If not, I'd add the "NO_PIECE" to the promotion flag and push the move to the move list. If it was, I would put four moves into the list; one for Q, R, K, and B.

It is actually faster to just add "NO_PIECE" to the move data, and THEN check if it was a promotion... and if it was, remove NO_PIECE again, and then add four promotion moves. It seems the CPU's branch predictor is hell-bent on doing the no-promotion case because it occurs the most. Checking each time is more expensive than just assuming promotion didn't happen, and then fixing it when it suddenly does happen.

Current CPU's compilers and CPU's are too unreliable to try and do micro-optimizations. The best way seems to be to code as straight forward as possible for the case that will probably happen the most, and then make an exception (or even undo some stuff) if something else needs to happen. This can apparently be faster, even if the straight forward case sometimes does more work.




Where stuff is in memory, and the current CPU's with their branch predictors and out of order executions are too unpredictable.
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
Terje
Posts: 347
Joined: Tue Nov 19, 2019 4:34 am
Location: https://github.com/TerjeKir/weiss
Full name: Terje Kirstihagen

Re: Can't get Minic to run perft?

Post by Terje »

Got another 1.5% on my machine, and also tested well for real games :)