Progress on Rustic

mvanthoor · Post by **mvanthoor** » Wed Mar 29, 2023 12:24 am

Hi

I've been away from chess programming for some time. Nothing serious: just life getting in the way with lots of work and moving across the city from an apartment to a house, which obviously requires more work to get set up at the new place.

Everything's calming down now and it looks like I'll have some free time that I actually want to spend at the computer. Therefore I'll be picking up chess programming again as well. So, Rustic is not dead. (It probably never will be, because I've made chess programming the rest-of-my-life hobby, so I always have something to program if I want to.)

Rustic 4, while originally planned for release around this time last year, has had a massive refactoring compared to version Alpha 3. It also includes XBoard now, for which I only need to implement one more feature to get it on par with UCI. The last thing to do is to write a tuner (which I have been putting off for quite some time, to be honest, because I don't yet understand enough about it to explain it in my Rustic book).

I've also maintained Rustic Alpha 1, 2 and 3 so they compile cleanly with the latest versions of Rust and the latest versions of the libraries they depend on. The rustic-chess.org website has seen some updates in the last year as well, and the broken releases on GitHub have been fixed.

Besides that, I've also decided which Rust libraries I'm going to use to write that Picochess replacement and start a database program of my own, which will be based on an SQL database. (For that, I'm certainly going to re-read the chess database SQL topic that should still be floating around.) Obviously Rustic 5 or 6's business code will be transformed into a library itself, so it can be used as the backend of both the Picochess replacement and the database program. I'm not going to rewrite or copy/paste the move generator and related stuff.

Kind regards,
Me

Mike Sherwin · Post by **Mike Sherwin** » Wed Mar 29, 2023 1:15 am

Good to see you back! You have some catching up to do as both Blunder and Leorik are now 2700+ engines. Leorik is even ~2950 in blitz and ~2800 at slower time controls.

I've been sidelined because of a hemorrhagic stroke, Bell's Palsy, and 257/139 blood pressure. That got me a week's stay in the hospital. Luckily, though, the brain bleed stopped on its own and I did not need brain surgery.

Despite all that I did manage to write a new move generator that is a little bit faster than Black Magic bitboards in Daniel's moveGen compare test.

I combined ideas from Kindergarten and SISSY to make Kindergarten Super SISSY bitboards, lol. But there does not seem to be much interest.

Good luck with your future endeavors!

algerbrex · Post by **algerbrex** » Wed Mar 29, 2023 6:11 am

Great news, Marcel, good to have you back!

mvanthoor · Post by **mvanthoor** » Wed Mar 29, 2023 1:05 pm

Mike Sherwin wrote: ↑Wed Mar 29, 2023 1:15 am Good to see you back! You have some catching up to do as both Blunder and Leorik are now 2700+ engines. Leorik is even ~2950 in blitz and ~2800 at slower time controls.

Fortunately for me, I don't have any rivals in the chess engine community. My goal is still to write the best-documented chess engine in the world (and I might even at some point convert the online book to a real one and self-publish it somehow), and to make it as strong as possible with the least amount of functionality. (I.e.: each feature should add the maximum playing strength possible for that feature.)

Mike Sherwin wrote: ↑Wed Mar 29, 2023 1:15 am I've been sidelined because of a hemorrhagic stroke, Bell's Palsy, and 257/139 blood pressure. That got me a week's stay in the hospital. Luckily, though, the brain bleed stopped on its own and I did not need brain surgery.

Stop it... you're scaring the hell out of me. One of my goals in life is to become at least 91 years old in perfect mental and physical health and then just spontaneously die in the middle of the night.

Mike Sherwin wrote: ↑Wed Mar 29, 2023 1:15 am Despite all that I did manage to write a new move generator that is a little bit faster than Black Magic bitboards in Daniel's moveGen compare test. I combined ideas from Kindergarten and SISSY to make Kindergarten Super SISSY bitboards, lol. But there does not seem to be much interest.

Some day... you must stop inventing new bitboards. It took me enough time to understand the Magic Bitboards, and with you adding another bitboard variant every other week, I feel inadequate.

Mike Sherwin wrote: ↑Wed Mar 29, 2023 1:15 am Good luck with your future endeavors!

Thanks

algerbrex wrote: ↑Wed Mar 29, 2023 6:11 am Great news Marcel, good to have you back

Thanks

I hope to be able to finish Rustic 4 somewhere in the coming months. If I actually find the time to build that new system of which the parts have been lying around here in their boxes, testing should go four times as fast. I'm moving up from a 4-core i7-6700K to a 16-core 7950X, specifically because of chess programming. Otherwise I would probably have kept my current computer for another 2-3 years.

lithander · Post by **lithander** » Wed Mar 29, 2023 2:28 pm

mvanthoor wrote: ↑Wed Mar 29, 2023 12:24 am I'll be picking up chess programming again as well. So, Rustic is not dead.

Great to have you back as an active forum member!

mvanthoor wrote: ↑Wed Mar 29, 2023 1:05 pm I'm moving up from a 4-core i7-6700K to a 16-core 7950X, specifically because of chess programming.

That's going to be super helpful! In the beginning I thought that writing a chess engine was just a huge programming task, but then more and more of my time has been spent on testing, tuning and data generation. Being able to do that at scale is not only going to speed up the development of your engine but also going to protect your sanity and mental well-being.

Mike Sherwin wrote: ↑Wed Mar 29, 2023 1:15 am Despite all that I did manage to write a new move generator that is a little bit faster than Black Magic bitboards in Daniel's moveGen compare test. I combined ideas from Kindergarten and SISSY to make Kindergarten Super SISSY bitboards, lol. But there does not seem to be much interest.

I guess that for many of us move generation is just something we're happy to have solved in our own engines and we don't look back. Also, it's been hard to follow the progress on your new move generator because I feel the relevant information is scattered over many posts and in at least two threads. Not even the name is easy to digest.

But for what it's worth, I think that if I wanted to improve the speed of my engine right now, your algorithm would be my favorite. It's relatively cache-friendly and achieves great performance without relying on super advanced intrinsics that are only available on a small subset of the hardware of our target audience (e.g. PEXT or the Galois Field transforms). Maybe I should actually try it in Leorik after having been teased about low-hanging fruit...

But this thread isn't about me! It's about Rustic! Welcome back Marcel! I look forward to playing Rustic 4 as soon as it's ready!

mvanthoor · Post by **mvanthoor** » Wed Mar 29, 2023 2:49 pm

lithander wrote: ↑Wed Mar 29, 2023 2:28 pm Great to have you back as an active forum member!

To be honest, it's good to be back. It's nice to have a forum where you can read back what was said in the past (which can't be done easily with Discord AFAIK, and it's a mosh-pit of chats instead of threads). Sometimes, chess programming has to take a back seat to life stuff that actually is important. Like... furnishing a house so you can live in it. We're now far enough along that we don't have to do huge stuff anymore. Just dotting the i's on the weekends, basically.

lithander wrote: ↑Wed Mar 29, 2023 2:28 pm That's going to be super helpful! In the beginning I thought that writing a chess engine was just a huge programming task, but then more and more of my time has been spent on testing, tuning and data generation. Being able to do that at scale is not only going to speed up the development of your engine but also going to protect your sanity and mental well-being.

Yes. Testing Rustic in a gauntlet took up to 10 hours during the night. On this CPU, it will take 2.5 hours and then I'm not even using any of the logical threads. I'm still unsure about that. In the past using SMT / HT threads for chess engine testing was bad practice, but that may have changed.

And yes, after the base engine is done (which would be Rustic 4, in my case, with the minimum search functionality and a tuner), it comes down to adding more features. Each new feature has less and less gain in playing strength and thus requires more and more testing. I hope I can just use Rayon and leverage its parallel iterators (e.g. "par_iter") to auto-parallelize things like the tuner.

lithander wrote: ↑Wed Mar 29, 2023 2:28 pm But this thread isn't about me! It's about Rustic! Welcome back Marcel! I look forward to playing Rustic 4 as soon as it's ready!

For all intents and purposes, it's ready now. It is missing just one XBoard feature to put that on par with UCI (it is missing what I call "GameTime"; as opposed to MoveTime, Depth, Nodes, etc). After that only the tuner remains so I can replace MinimalChess' evaluation tables with my own.

algerbrex · Post by **algerbrex** » Wed Mar 29, 2023 9:34 pm

mvanthoor wrote: ↑Wed Mar 29, 2023 2:49 pm And yes, after the base engine is done (which would be Rustic 4, in my case, with the minimum search functionality and a tuner), it comes down to adding more features. Each new feature has less and less gain in playing strength and thus requires more and more testing. I hope I can just use Rayon and leverage its parallel iterators (e.g. "par_iter") to auto-parallelize things like the tuner.

I think that plan should work well enough starting out, particularly since Rust is a faster language than Go, since you're upgrading your setup, and since I never made much of an effort to optimize my original Texel tuner, which used the naive local optimization algorithm found on the CPW.

But if the speed isn't up to par (no pun intended), basic gradient descent does work quite well, and I got better values when I switched over to it. Just something to think about.

mvanthoor · Post by **mvanthoor** » Sun Apr 02, 2023 9:45 pm

Yesterday and today I finally built my new computer. AMD 7950X, 64GB of RAM, 2x 2TB NVMe SSDs, and an RX 6750 XT in case I decide to play a new game such as Baldur's Gate III at some point.

I have to say: I have never in my life connected so many power cables for a system that only has a mainboard, a CPU and a graphics card. The power usage of current-day systems is extreme. 24-pin ATX-cable, 6-pin PCI-E cable to power the USB4-c / Thunderbolt ports, 2x 8-pin for the CPU, and 2x 8-pin for the graphics card. And this isn't even the highest-end you could go... (RTX 4090, Xeon or Epyc CPUs, etc, etc...)

Now I "just" need to set up Linux again (Debian 12 Bookworm Testing, which will roll into Stable automatically as soon as it is released), and a Windows-VM for testing Windows-compiles of my software, and to use Chessbase, Capture One, and (maybe) Affinity Photo. These three programs just won't work under any version of Wine.

All hardware still needs to be tested: network, sound, front-panel I/O, etc, stress-test... but that can only be done when the OS is up and running. Then I can set up the BIOS for ECO mode and stability, and I can finally get on with some chess programming AND run tests in a decent amount of time.

I hope this will carry me through the next decade without issues or changes. (Except for, maybe, a newer graphics card and/or more storage. Assuming that there will still be graphics cards that don't draw 2000 Watts of power.)

Having a separate PSU compartment in the bottom with a cable shroud and an extra half an inch between the right panel and the back for cable management was a nice-to-have in 2016, and completely non-existent in 2006. For current-day computers, it is an utter necessity.

algerbrex wrote: ↑Wed Mar 29, 2023 9:34 pm But if the speed isn't up to par (no pun intended), basic gradient descent does work quite well, and I got better values when I switched over to it. Just something to think about.

I'll probably look at your tuner to see how you implemented it. I understand gradient descent (and the idea of Texel tuning, using the pseudo-code example at CPW). I can find a local minimum or optimum in a graph using gradient descent, but with regard to chess programming, when hundreds of values are involved, I can't picture how this would work.

algerbrex · Post by **algerbrex** » Mon Apr 03, 2023 2:27 am

mvanthoor wrote: ↑Sun Apr 02, 2023 9:45 pm I'll probably look at your tuner to see how you implemented it.

Yea, please feel free to. If you do choose to do so, make sure to check out the implementation in the 9.0.0-dev branch (https://github.com/algerbrex/blunder/tree/9.0.0-dev) rather than the main branch. I re-wrote the tuner code a couple of months ago. The underlying logic and math are the same, but the implementation IMO is much cleaner and relies much less on hardcoded indexes and magic numbers scattered throughout the code.

mvanthoor wrote: ↑Sun Apr 02, 2023 9:45 pm I understand gradient descent (and the idea of Texel tuning, using the pseudo-code example at CPW). I can find a local minimum or optimum in a graph using gradient descent, but with regard to chess programming, when hundreds of values are involved, I can't picture how this would work.

The idea is essentially the same. The only difference is that instead of working with a single-variable function, which we can plot on the 2D Cartesian plane, we're working with multi-variable functions.

The mean-squared error calculation that's used in Texel tuning can be thought of as a function of the evaluation weights, and the output is the total error calculated after going through every position you use in tuning. And we can compute the gradient by computing the partial derivative with respect to each weight parameter. The actual derivation is a little messy, but the implementation isn't terrible. Feel free to message if you have any questions.
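For anyone who wants to see the derivation spelled out, here is one way it might go. This is only a sketch under assumptions that are not stated in this thread: the evaluation $s_i$ of position $i$ is linear in the weights $w_j$ with feature values $f_{ij}$, the sigmoid is the logistic function with a scaling constant $k$, $R_i$ is the game result of position $i$, and $N$ is the number of positions.

```latex
\begin{aligned}
s_i &= \sum_j w_j f_{ij}, \qquad \sigma(x) = \frac{1}{1 + \exp(-kx)}, \qquad \sigma'(x) = k\,\sigma(x)\bigl(1 - \sigma(x)\bigr), \\
E(w) &= \frac{1}{N} \sum_{i=1}^{N} \bigl(\sigma(s_i) - R_i\bigr)^2, \\
\frac{\partial E}{\partial w_j} &= \frac{2}{N} \sum_{i=1}^{N} \bigl(\sigma(s_i) - R_i\bigr)\, k\,\sigma(s_i)\bigl(1 - \sigma(s_i)\bigr)\, f_{ij}.
\end{aligned}
```

Gradient descent then updates every weight at once with $w_j \leftarrow w_j - \alpha \, \partial E / \partial w_j$, where $\alpha$ is the step size.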

lithander · Post by **lithander** » Mon Apr 03, 2023 1:39 pm

algerbrex wrote: ↑Mon Apr 03, 2023 2:27 am And we can compute the gradient by computing the partial derivative with respect to each weight parameter. The actual derivation is a little messy, but the implementation isn't terrible.

You obviously enjoy the benefit of a recent math class.

We old folks have to make do with what we can remember, and then deriving the partial derivative of something involving multi-dimensional vectors can easily seem like an insurmountable obstacle.

But if you look at the final implementation and just forget that it implements gradient descent and that it involves partial derivatives you can still try to make sense of why the code works intuitively.

Code:

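        // One gradient descent step: a single pass over the data adjusts all coefficients.
        public static void Minimize(List<TuningData> data, float[] coefficients, float evalScaling, float alpha)
        {
            // Per-coefficient sum of (error * feature value) over all positions.
            float[] accu = new float[AllWeigths];
            foreach (TuningData entry in data)
            {
                // Signed difference between the predicted and the labeled result.
                float eval = Evaluate(entry, coefficients);
                float error = Sigmoid(eval, evalScaling) - entry.Result;

                // Only the features present in this position contribute.
                foreach (Feature f in entry.Features)
                    accu[f.Index] += error * f.Value;
            }

            // Move each coefficient against its averaged error, scaled by the step size alpha.
            for (int i = 0; i < AllWeigths; i++)
                coefficients[i] -= alpha * accu[i] / data.Count;
        }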

This is the heart of a gradient descent tuner. You have a feature-vector (entry.Features) and for each feature you have an associated weight (coefficients), so you also have a weight vector. These vectors are not 2D or 3D vectors but have several hundred dimensions. But it doesn't change the fact that Evaluate() is just a dot product of two vectors. Each (feature, coefficient) pair will affect the evaluation in a small way and the sum of these changes is the eval value. We are tuning with labeled positions so we also have an expected value (entry.Result) and from the difference we can compute the error of this particular position compared to its label.

The error can be positive or negative, and we are trying to find coefficients that minimize the average error over the whole training set.

To do that we will remember how each (feature, coefficient) pair contributed to that error in the 'accu'. When the feature was present we just multiply the feature with the error and adjust the value in the accumulator.

In a perfectly tuned evaluation we would not assume that there is no error. That's unrealistic. But we can expect that a well tuned (feature, coefficient) pair is sometimes contributing too much and sometimes too little and that the magnitude and direction of the individual "errors" we accumulated cancel out.

If that is not the case we now have the chance to adjust the coefficient based on the accumulated error. The further the coefficient is from the equilibrium the more we move it in the "right" direction.

Code:

coefficients[i] -= alpha * accu[i] / data.Count;

Alpha controls how much we adjust the coefficients. Generally you'll want to find values for alpha that are as large as possible without 'overshooting' and preventing convergence. There's no right value here that fits well for all coefficients. Algorithms like AdaGrad try to address this problem.

The power of Gradient Descent stems from the fact that with just one pass over the training data you gather information that allows you to adjust all coefficients at once. Also, instead of always incrementing or decrementing a coefficient by 1 as in Texel's local search, you can now adjust it based on how far from the equilibrium the coefficient currently is.

Explaining it like that I feel like you don't even have to know what a derivative is to understand why the above code works well in tuning the weights to minimize error on a given dataset.
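For anyone who would rather write the tuner in Rust, the routine above might translate roughly as follows. Treat it as a sketch, not as code from Leorik or Rustic: `Feature`, `TuningData`, `evaluate`, `sigmoid`, `mean_squared_error` and the toy data are names and values made up for illustration, and the logistic form of the sigmoid (including how `scaling` enters it) is an assumption. Like the C# routine, it accumulates `error * value` without multiplying by the sigmoid's derivative.

```rust
/// One active feature of a position: which weight it uses and how strongly.
/// (Hypothetical type, mirroring `Feature` in the C# snippet.)
#[derive(Clone, Copy)]
struct Feature {
    index: usize,
    value: f32,
}

/// A labeled position: its sparse feature list and the game result
/// (0.0 = loss, 0.5 = draw, 1.0 = win). Hypothetical type.
struct TuningData {
    features: Vec<Feature>,
    result: f32,
}

/// The evaluation is a dot product of the sparse feature vector and the weights.
fn evaluate(entry: &TuningData, weights: &[f32]) -> f32 {
    entry.features.iter().map(|f| f.value * weights[f.index]).sum()
}

/// Maps an evaluation to an expected result in (0, 1).
/// Assumption: a logistic curve where `scaling` divides the evaluation.
fn sigmoid(eval: f32, scaling: f32) -> f32 {
    1.0 / (1.0 + (-eval / scaling).exp())
}

/// Mean squared error between predicted and labeled results.
fn mean_squared_error(data: &[TuningData], weights: &[f32], scaling: f32) -> f32 {
    let sum: f32 = data
        .iter()
        .map(|e| (sigmoid(evaluate(e, weights), scaling) - e.result).powi(2))
        .sum();
    sum / data.len() as f32
}

/// One pass over the data that adjusts every weight at once.
fn minimize(data: &[TuningData], weights: &mut [f32], scaling: f32, alpha: f32) {
    let n = data.len() as f32;
    // Per-weight sum of (error * feature value) over all positions.
    let mut accu = vec![0.0f32; weights.len()];
    for entry in data {
        let error = sigmoid(evaluate(entry, weights), scaling) - entry.result;
        for f in &entry.features {
            accu[f.index] += error * f.value;
        }
    }
    // Move each weight against its averaged error, scaled by the step size.
    for (w, a) in weights.iter_mut().zip(accu.iter()) {
        *w -= alpha * *a / n;
    }
}

fn main() {
    // Toy data: feature 0 shows up in won games, feature 1 in lost games.
    let data = vec![
        TuningData { features: vec![Feature { index: 0, value: 1.0 }], result: 1.0 },
        TuningData { features: vec![Feature { index: 1, value: 1.0 }], result: 0.0 },
        TuningData {
            features: vec![Feature { index: 0, value: 1.0 }, Feature { index: 1, value: 1.0 }],
            result: 0.5,
        },
        TuningData { features: vec![Feature { index: 0, value: 2.0 }], result: 1.0 },
    ];
    let mut weights = vec![0.0f32; 2];
    println!("error before: {}", mean_squared_error(&data, &weights, 1.0));
    for _ in 0..200 {
        minimize(&data, &mut weights, 1.0, 0.5);
    }
    println!("error after:  {}", mean_squared_error(&data, &weights, 1.0));
    println!("weights: {:?}", weights);
}
```

On this toy data the weight of the "winning" feature should drift upwards and the weight of the "losing" feature downwards, which is the intuition from the explanation above: each weight moves in the direction that cancels out its accumulated error.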
