Devlog of Leorik

lithander · Post by **lithander** » Wed Jan 05, 2022 11:51 pm

Mike Sherwin wrote: ↑Wed Jan 05, 2022 11:25 pm MakeMove(t, m);
if (depth > 1) UpdateHash(t, m);
// call search
if (depth > 1) UpdateHash(t, m);
TakeBack(t, m);

Results were 204,152,123m/14.79s finishing depth 11 and a node rate of 13,803,389 nodes / sec.

I didn't even consider updating the hash so separately from making the move. In my current implementation it would mean doing some work twice but it could probably be refactored. Thanks for the idea!

emadsen · Post by **emadsen** » Thu Jan 06, 2022 1:41 am

lithander wrote: ↑Wed Jan 05, 2022 2:57 pmIn MinimalChess I'm using the hash for a transposition table but I don't set or query the TT in QSearch and QSearch is where the bulk of the work is spend... So it made me wonder... when I make the zobrist calculation incremental I'll suddenly also do a lot of extra work in qsearch where I don't ever need it.

I examine the Zobrist hash of previous positions in qsearch to determine if a position is a repeat position. If so, it's scored as a draw. Interesting idea though. I doubt the expense of incremental update of Zobrist hash outweighs the loss of knowledge of draw-by-repeat in qsearch, but who knows? I've never tested it. Repeats should not occur within qsearch (being captures and check evasions) unless the repeat occurs across the main search / qsearch boundary.

On the other hand, the idea bothers me because of wanton disregard for correctness. Seems like a bug factory to leave incorrect position hashes. Then again, chess engine design is all about accepting "approximately correct" techniques and deferring work to boost speed.

Regarding testing for repeat positions: You don't have to enumerate all moves played in the game. Just iterate back (2 ply at a time) to the last pawn move or capture. Both are irreversible moves so looking beyond them is unnecessary: no repeats will be found there.

mvanthoor · Post by **mvanthoor** » Thu Jan 06, 2022 4:53 pm

lithander wrote: ↑Wed Jan 05, 2022 11:51 pm ...

Congratulations with the development of the new engine.

Rustic is a bit stuck at the moment due to lack of time. Massive project at work that had to be finished before the end of the year due to changes in government regulations. No Christmas holiday as a result. The last commit (apart from changing the copyright year) was weeks ago. Fortunately the XBoard-protocol is on the verge of being finished, so I can start on the tuner.. (Rustic did receive a _massive_ refactoring during the implementation of the XBoard protocol, and that isn't yet finished.)

61M leaves/sec is very fast. Rustic runs at 41M leaves/s on my computer (but that is an older i7-6700K. Even though Passmark says the performance per thread is about equal in their benchmarks, I expect it to be faster for chess engines. If the CPU's had been equal thread-for-thread, your 6-core should score around 12.000 points, but it scores over 17.000.)

Expect the speed to drop when you start keeping incremental stuff. At the moment Rustic keeps the following data incrementally:

- Game State
- Zobrist hash
- Game phase
- Both mid-game and end-game PST's for both white and black (so 4 PST values in total)

I can provide the following data for Rustic versions if this is of any help:

- No incremental data (Rustic was just a pert-tool back then): 41 M nodes/s
- Alpha 1 (Incremental gamestate, zobrist, 2 PST's): 38 M nodes/s (-7.4% compared to no incremental updates)
- Alpha 2 (Same as alpha 1, but 2x if-check in perft so TT can be disabled): 36.2 M leaves/s (-11.8% vs no incremental updates)
- Alpha 3 (Same as alpha 2, but with extra game phase and 2 more PST's): 33 M leaves/s (-19.6% vs no incrementals)

Next versions are going to have at least a zobrist hash for the pawn hash table and some evaluation terms such as the bishop pair. (It has to only be recalculated if a bishop is captured or a piece promotes to a bishop, so keeping this incrementally is faster than looking at the bishops each time during evaluation. You don't have the bishop pair if both are on the same color square.)

After I got perft as fast as it could go (optimizing the move generator, make/unmake, and perft itself) I stopped trying to make perft faster and just added to the engine what I need for chess. I expect perft to become slower and slower as I add more incremental updates; but those updates help while playing chess, because the evaluation doesn't have to calculate everything from the start. It will have all information readily available. (Updating PST values one move at a time is faster than recalculating the value from scratch every time. Things such as open lines and isolated pawns don't change except if pawns are moved or captured, so the pawn hash will be faster than recalculating those things from scratch.)

After you get perft to be as fast as possible in an otherwise completely bare engine you know you've optimized perft, make, unmake and the move generator; after that, perft speed becomes unimportant. Nodes/second during the search is at that point much more indicative of the real engine speed, and that speed will go UP as keep MORE incremental values, despite perft going DOWN because of those same incremental values.

lithander · Post by **lithander** » Sun Jan 09, 2022 2:51 am

Good to hear you haven't abandoned Rustic, Marcel! I was getting worried

mvanthoor wrote: ↑Thu Jan 06, 2022 4:53 pm I can provide the following data for Rustic versions if this is of any help:

- No incremental data (Rustic was just a pert-tool back then): 41 M nodes/s
- Alpha 1 (Incremental gamestate, zobrist, 2 PST's): 38 M nodes/s (-7.4% compared to no incremental updates)
- Alpha 2 (Same as alpha 1, but 2x if-check in perft so TT can be disabled): 36.2 M leaves/s (-11.8% vs no incremental updates)
- Alpha 3 (Same as alpha 2, but with extra game phase and 2 more PST's): 33 M leaves/s (-19.6% vs no incrementals)

That's very interesting stats. Your perft speed has barely suffered from adding zobrist and eval etc!

My raw perft speed may be currently faster than your original version but that might just be because my move generator really only generates the from and to square and the moving piece. Whether the move is a capture and the type of the captured piece if it is, that's information I'm still missing. (because I don't need it to play the move in a perft situation)

Without that information however when updating the zobrist hash I have to interact with the bitboards to find out whether there was something captured and then fetch the piece's type so that I can remove it from the zobrist hash. This is really not cheap. So either I'm going to pay the price later or I have to revisit the move generator and add all the information that was not needed for perft but is required for an actual chess engine. In any case I expect my perft performance to drop quickly and I'll be happy enough if I'll end up with something akin to Rustics performance in the end!

mvanthoor · Post by **mvanthoor** » Sun Jan 09, 2022 3:38 am

lithander wrote: ↑Sun Jan 09, 2022 2:51 am Good to hear you haven't abandoned Rustic, Marcel! I was getting worried

I very seldomly completely abandon stuff, but I may put it aside for a while, from time to time. Chess programming is just a hobby, just as chess playing, reading, piano playing, and photography. Sometimes my interest swings from one hobby to the other for a few weeks or months. (I REALLY need to get some solid piano practice in though; let's say half an hour, or I'll be stuck at my intermediate level forever.)

Last year was just very hectic at work with regard to that huge project I mentioned. (Try to do a work day, then upgrading a system from 0:00 to 2:00 in the morning, then be online again to catch the first people starting to use the new stuff at 8:00 and try to fix any super-weird edge cases before 9:00 before the largest part of people comes in... do that a few times a week and you'd rather go to bed at 21:00 if you get the chance than do some chess programming.)

I hope it's going to be a bit quieter now so I can get some work done on Rustic again

lithander wrote: ↑Sun Jan 09, 2022 2:51 am That's very interesting stats. Your perft speed has barely suffered from adding zobrist and eval etc!

A 20% drop isn't nothing... but still, on this somewhat oldish-feeling computer, Rustic searches at up to 7.8M nodes/second during games.(I count every node, even if it is later decided to be cut off, but I do take care to NOT count a node and then count it again in QSearch.)

Even after this long a time of running testing games on behalf of the XBoard protocol I still think it's funny that some engines Rustic tests against have null move, often LMR, and sometimes even other features such as evaluation terms implemented to reach the same strength. I really look forward to picking up development again. I hope to be able to do so before the end of the month. I can't wait to release version 4 (with XBoard and my own tuned tables), which should play at ~2160. Then I'll see what new features will bring. I wonder if null move brings more Elo for a fast engine than it does for a slow one.

I'll see what happens after implementing null move.

My raw perft speed may be currently faster than your original version but that might just be because my move generator really only generates the from and to square and the moving piece. Whether the move is a capture and the type of the captured piece if it is, that's information I'm still missing. (because I don't need it to play the move in a perft situation)

In that case you're indeed missing lots of information that comes in very handy later.

Without that information however when updating the zobrist hash I have to interact with the bitboards to find out whether there was something captured and then fetch the piece's type so that I can remove it from the zobrist hash. This is really not cheap.

Indeed. I added a mailbox list to Rustic (I actually forgot to mention that, as it is also incrementally kept) ASAP. At first, after moving a piece, the engine iterated though all the pieces to see what it captured (which could also be "none"). That took so much time that I added that mailbox list, so the engine just has to look up piece_type[square] after a move and it immediately knows if and what it captured.

That gained so much speed that I immediately decided to keep everything that I can incrementally where possible. I see many engines that calculate PST values from scratch every time, or don't use a pawn hash and thus have to find out which lines are open, which pawns are isolated, over and over again for each move... even though the pawn structure doesn't change. So after implementing null move and static null move for version 5, then version 6 will probably have a pawn hash if I'm going to do something with the evaluation

So either I'm going to pay the price later or I have to revisit the move generator and add all the information that was not needed for perft but is required for an actual chess engine. In any case I expect my perft performance to drop quickly and I'll be happy enough if I'll end up with something akin to Rustics performance in the end!

In the beginning, perft was very important to me; before the engine actually played chess. However, after I felt that all the movegen functions and make/unmake were as fast as they'd possibly going to get, I decided it was 'good enough'; therefore I didn't spend any time to implement functions that specifically make perft faster, such as bulk counting.

I was thinking about a completely legal move generator to prevent some make/unmake's for illegal moves, but... the engine is also going to have staged move generation. 90+ of the time the TT-move will be executed and cause a cutoff, so 90% of the time no moves will be generated. I wonder if it is useful to implement a legal move generator for the last 10%. Maybe if I ever get to the point where I'm struggling for the next 10 Elo to reach 3500... but not now.

I'm wondering to see what you can do with a bitboard engine in C#. I'd love to see it having exactly the same features as MinimalChess, and then see how much stronger it is due to the extra speed. I expect it to be at least 70 Elo.

lithander · Post by **lithander** » Sun Jan 09, 2022 2:23 pm

mvanthoor wrote: ↑Sun Jan 09, 2022 3:38 am I added a mailbox list to Rustic (I actually forgot to mention that, as it is also incrementally kept) ASAP. At first, after moving a piece, the engine iterated though all the pieces to see what it captured (which could also be "none"). That took so much time that I added that mailbox list, so the engine just has to look up piece_type[square] after a move and it immediately knows if and what it captured.

Encountering that problem was the first time that I missed my old mailbox engine where looking up the piece on a square was indeed trivial. My current solution to the problem with purely bitboards looks like this:

Code: Select all

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public Piece GetPiece(int square)
        {
            /*
            2nd Bit = White or Black? (implies the Piece bit to be set to)
            Black = 1,      //01
            White = 3,      //11

            3rd+ Bits = Type of Piece
            Pawn = 4,       //00100
            Knight = 8,     //01000
            Bishop = 12,    //01100
            Rook = 16,      //10000
            Queen = 20,     //10100
            King = 24,      //11000
            */
            return (Piece)(Bit(Black | White, square, 0) | 
                           Bit(White, square, 1) | 
                           Bit(Pawns | Bishops | Queens, square, 2)| 
                           Bit(Knights | Bishops | Kings, square, 3) | 
                           Bit(Kings | Rooks | Queens, square, 4));
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private int Bit(ulong bb, int square, int shift) => (int)((bb >> square) & 1) << shift;

...I'm not sure if I'm proud of that solution or scared but at least it doesn't depend on a bunch of branching statements which was my first idea. Operating on the level of individual bits it feels much more like I always imagined bitboard engines to operate under the hood.

mvanthoor wrote: ↑Sun Jan 09, 2022 3:38 am I'd love to see it having exactly the same features as MinimalChess, and then see how much stronger it is due to the extra speed. I expect it to be at least 70 Elo.

That's exactly what I have planned for version 1.0. Half of the inspiration for the new engine's name is Leoric the Skeleton King, an iconic boss in one of my favorite video games: Diablo. So there's this mental image that my bare-bones (skeletal) engine MMC is rising from it's grave stronger than ever. It's probably an inside joke nobody else cares about but it implies (to me) that Leorik and MinimalChess should be equally strong when constrained to the same search depth or node count but that the new engine is much faster when unshackled from such constraints.

How much faster exactly I'm curious to find out!

tcusr · Post by **tcusr** » Sun Jan 09, 2022 2:42 pm

i consider this setup the best

Code: Select all

uint64_t piece[6], color[2];

to find what piece is on a square i loop through the 6 piece bitboards. this may seem slow but the loop will be unrolled and the 2 loops (to find piece on "from" and "to") will be merged in a jump table, you can verify this on compiler explorer.
also to find if a move is a capture you just need to check if the to bit is set in the occupancy bitboard (this is a bit risky because if there is a bug in move gen it will also break this)

lithander · Post by **lithander** » Sun Jan 09, 2022 3:04 pm

tcusr wrote: ↑Sun Jan 09, 2022 2:42 pm i consider this setup the best
Code: Select all
uint64_t piece[6], color[2];
to find what piece is on a square i loop through the 6 piece bitboards. this may seem slow but the loop will be unrolled and the 2 loops (to find piece on "from" and "to") will be merged in a jump table, you can verify this on compiler explorer.
also to find if a move is a capture you just need to check if the to bit is set in the occupancy bitboard (this is a bit risky because if there is a bug in move gen it will also break this)

I'm already using basically the same setup (2 bitboards for the color and 6 for the pieces) but not as arrays.

Code: Select all

        public ulong White;
        public ulong Black;
        public ulong Pawns;
        public ulong Knights;
        public ulong Bishops;
        public ulong Rooks;
        public ulong Queens;
        public ulong Kings;

In safe code, a C# struct that contains an array doesn't contain the array elements. The struct contains a reference to the elements instead. You can embed an array of fixed size in a struct (but not a class) when it's used in an unsafe code block. More info here. Personally I think a C# chess engine should not depend on unsafe code. It's a managed language after all and if you don't want that you should pick C/C++ imo.

So I'm aware that the last few percent of speed I'll only get when I port Leorik to C++ at some point in the far future!

But for now I'd rather attempt to write the strongest C# chess engine than the 100th strongest C/C++ one, hehe.

Mike Sherwin · Post by **Mike Sherwin** » Sun Jan 09, 2022 7:13 pm

lithander wrote: ↑Sun Jan 09, 2022 3:04 pm
tcusr wrote: ↑Sun Jan 09, 2022 2:42 pm i consider this setup the best
Code: Select all
uint64_t piece[6], color[2];
to find what piece is on a square i loop through the 6 piece bitboards. this may seem slow but the loop will be unrolled and the 2 loops (to find piece on "from" and "to") will be merged in a jump table, you can verify this on compiler explorer.
also to find if a move is a capture you just need to check if the to bit is set in the occupancy bitboard (this is a bit risky because if there is a bug in move gen it will also break this)
I'm already using basically the same setup (2 bitboards for the color and 6 for the pieces) but not as arrays.
Code: Select all
        public ulong White;
        public ulong Black;
        public ulong Pawns;
        public ulong Knights;
        public ulong Bishops;
        public ulong Rooks;
        public ulong Queens;
        public ulong Kings;
In safe code, a C# struct that contains an array doesn't contain the array elements. The struct contains a reference to the elements instead. You can embed an array of fixed size in a struct (but not a class) when it's used in an unsafe code block. More info here. Personally I think a C# chess engine should not depend on unsafe code. It's a managed language after all and if you don't want that you should pick C/C++ imo.

So I'm aware that the last few percent of speed I'll only get when I port Leorik to C++ at some point in the far future! But for now I'd rather attempt to write the strongest C# chess engine than the 100th strongest C/C++ one, hehe.

Hi Thomas,

I never understood the pure bitboard preference. Is there an advantage. Why not have an int board[64] with piece types? My piece types are enum {EMPTY, WP, WN, WB, WRC, WR, WQ, WKC, WK, BP ...}. The types with 'C' can castle. If the WKC (WRC) moves it becomes a WK (WR) and becomes a WKC (WRC) again when the move is retracted. That means that as soon as a WKC becomes a WK no more cycles are wasted on castling. The savings are negligible, I guess, but castling becomes more simple. In my engine when a WP move on the fifth is generated that may be an en-passant capture it becomes a We type, not on the board, but for make and unmake so a switch can jump directly to the en-passant code. This code from Bricabrac handles en-passant correctly, uses no "if" statements and doesn't ever know if it is an en-passant capture or not. I'm curious what your thoughts are.

Code: Select all

case We:
    sq = m->ts - ((epbb[ply] == (one << m->ts)) << 3);
    ctype = board[sq];
    mat[BLACK] -= value[ctype];
    pos[WHITE] += (pcePST[WP][m->ts] - pcePST[WP][m->fs]);
    pos[BLACK] -= pcePST[ctype][sq];
    piece[BLACK] ^= (u64)(ctype != EMPTY) << sq;
    board[sq] = EMPTY;
    board[m->ts] = WP;
    break;

lithander · Post by **lithander** » Tue Jan 11, 2022 12:08 am

My piece types are enum {EMPTY, WP, WN, WB, WRC, WR, WQ, WKC, WK, BP ...}. The types with 'C' can castle. If the WKC (WRC) moves it becomes a WK (WR) and becomes a WKC (WRC) again when the move is retracted. That means that as soon as a WKC becomes a WK no more cycles are wasted on castling. [...] I'm curious what your thoughts are.

MinimalChess uses a 64-slot array of pieces as a board representation, too. But just the normal pieces and so it needs two more variables: castling rights is a bitfield (flags) and enPassantSquare is the square that enPassant can capture on. I was never too bothered about these extra variables but I really like your idea of having additional "virtual" pieces that encode castling rights and en passant on the board. That's pretty clever. Whether I prefer it I could only say after trying it out in MMC so that I'd know all implications.

I never understood the pure bitboard preference. Is there an advantage. Why not have an int board[64] with piece types?

Why not use mailbox in combination with the bitboard representation, you mean? Well, because you now have redundant information that you need to keep in sync. It makes the board representation much larger, too, so that Copy/Make is probably not an option anymore. And then you have to implement an Unmake method that you also have to keep in sync with the Make method. All in all it strikes me as a bit too complicated to be my first choice.

Maybe the performance benefits make it all worth it but when in doubt simple is best! (for me)

But I really don't want to argue against any of your solutions or claim that what I am doing would be superior. It's part of the beauty of this hobby that there seems to be no obviously best solution. And my engine is first and foremost just an expression of my personal preferences or testament to whatever piqued my interests at the time of writing. And to my ignorance, oftentimes, I'm sure.

Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik

Re: Devlog of Leorik