Error handling in chess engines

mvanthoor · Post by **mvanthoor** » Sat Jan 06, 2024 4:23 pm

Hi

One thing I've noticed over the years is that in many chess engines, error handling doesn't seem to be much of a thing.

Very often only the happy path is written. For example: if a UCI-string containing an error comes in, the engine will crash. If UCI-commands are given in a wrong order and/or omitted, it can happen that the engine expects to have received a command but didn't, and will crash on the subsequent command.

I'm noticing this again while writing my Texel tuner (*). I've skimmed through a few tuners, and most basically just read the data file and assume that all data is 100% correct. Such a tuner might therefore end up with corrupted FEN-strings and crash (or, in the better case, skip the position).
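
To make the "skip the position" option concrete, here is a minimal Rust sketch of a loader that validates each record and skips bad ones instead of crashing. The record format (`<FEN>;<result>`) and all names (`Sample`, `parse_line`, `load_samples`) are assumptions for illustration, not taken from any particular tuner, and the FEN check is only structural.

```rust
// Hypothetical record format (an assumption): "<FEN>;<result>",
// where result is 1.0, 0.5 or 0.0.
#[derive(Debug, PartialEq)]
struct Sample {
    fen: String,
    result: f64,
}

/// Structural check of the piece-placement field only: 8 ranks, each
/// describing exactly 8 squares, using only valid piece letters.
fn placement_is_well_formed(placement: &str) -> bool {
    let ranks: Vec<&str> = placement.split('/').collect();
    if ranks.len() != 8 {
        return false;
    }
    ranks.iter().all(|rank| {
        let mut squares = 0u32;
        for c in rank.chars() {
            match c {
                '1'..='8' => squares += c.to_digit(10).unwrap(),
                'p' | 'n' | 'b' | 'r' | 'q' | 'k' => squares += 1,
                'P' | 'N' | 'B' | 'R' | 'Q' | 'K' => squares += 1,
                _ => return false,
            }
        }
        squares == 8
    })
}

/// Parses one record, returning a reason string when it is rejected.
fn parse_line(line: &str) -> Result<Sample, String> {
    let (fen, result) = line
        .trim()
        .rsplit_once(';')
        .ok_or_else(|| "missing ';' separator".to_string())?;
    let fields: Vec<&str> = fen.split_whitespace().collect();
    if fields.len() != 6 {
        return Err(format!("expected 6 FEN fields, found {}", fields.len()));
    }
    if !placement_is_well_formed(fields[0]) {
        return Err("malformed piece placement".to_string());
    }
    if fields[1] != "w" && fields[1] != "b" {
        return Err("side to move must be 'w' or 'b'".to_string());
    }
    let result = match result.trim() {
        "1.0" => 1.0,
        "0.5" => 0.5,
        "0.0" => 0.0,
        other => return Err(format!("unknown result '{}'", other)),
    };
    Ok(Sample { fen: fen.trim().to_string(), result })
}

/// Keeps the good records and collects (line number, reason) for the rest.
fn load_samples(data: &str) -> (Vec<Sample>, Vec<(usize, String)>) {
    let mut samples = Vec::new();
    let mut rejected = Vec::new();
    for (index, line) in data.lines().enumerate() {
        if line.trim().is_empty() {
            continue;
        }
        match parse_line(line) {
            Ok(sample) => samples.push(sample),
            Err(reason) => rejected.push((index + 1, reason)),
        }
    }
    (samples, rejected)
}

fn main() {
    // The second record has an invalid rank ("9") and gets skipped.
    let data = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1;0.5\n\
                rnbqkbnr/pppppppp/9/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1;1.0\n";
    let (samples, rejected) = load_samples(data);
    for sample in &samples {
        println!("{} -> {}", sample.fen, sample.result);
    }
    for (line_no, reason) in &rejected {
        eprintln!("skipped line {}: {}", line_no, reason);
    }
}
```

Reporting the skipped line numbers makes it easy to see whether a data file is mostly sound or badly damaged.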

In short, a rather large part of the code in my engine consists of error handling. (In Rust, you can NOT just SKIP error handling without jumping through hoops. You MUST handle any possible error, or specifically state that you won't.) Thus I check and sanitize all the data that comes into the engine; be it UCI-commands, typed at the keyboard on the command-line, or when reading the tuning data file. The engine should therefore always either reject incoming stuff and do nothing, or be working with correct data.
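
As an illustration of "reject incoming stuff and do nothing", here is a minimal Rust sketch of a command parser that returns a `Result` instead of panicking. It is not the actual code of any engine: `Command`, `UciError` and `parse_command` are hypothetical names, and only a tiny subset of UCI is covered.

```rust
#[derive(Debug, PartialEq)]
enum Command {
    Uci,
    IsReady,
    GoDepth(u32),
    Quit,
}

#[derive(Debug, PartialEq)]
enum UciError {
    Empty,
    UnknownCommand(String),
    MissingArgument(&'static str),
    InvalidNumber(String),
    UnsupportedGo,
}

/// Turns one input line into a command, or explains why it was rejected.
/// The caller is forced by the type system to look at the error case.
fn parse_command(line: &str) -> Result<Command, UciError> {
    let mut tokens = line.split_whitespace();
    match tokens.next() {
        None => Err(UciError::Empty),
        Some("uci") => Ok(Command::Uci),
        Some("isready") => Ok(Command::IsReady),
        Some("quit") => Ok(Command::Quit),
        Some("go") => match (tokens.next(), tokens.next()) {
            (Some("depth"), Some(value)) => value
                .parse::<u32>()
                .map(Command::GoDepth)
                .map_err(|_| UciError::InvalidNumber(value.to_string())),
            (Some("depth"), None) => Err(UciError::MissingArgument("depth")),
            // This sketch only understands "go depth <n>".
            _ => Err(UciError::UnsupportedGo),
        },
        Some(other) => Err(UciError::UnknownCommand(other.to_string())),
    }
}

fn main() {
    for line in ["uci", "go depth 5", "go depth five", "flibber"] {
        match parse_command(line) {
            Ok(command) => println!("accepted: {:?}", command),
            // Rejected input is reported and otherwise ignored.
            Err(error) => println!("info string error ({:?})", error),
        }
    }
}
```

Because every failure is a value of `UciError`, the main loop can decide in one place whether to report it, ignore it, or stop.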

Do you spend time in error checking everything, or do you assume that some stuff will just be correct? For example: UCI: assume the GUI will do it correctly, tuning data: no need to check because I'm the only one making the data.

Not having to do the error checks would make writing the code about twice as fast, and half as long...

(*) It is actually much less difficult than I thought. I've FINALLY taken the time in my Christmas holiday to read up on everything I collected over the years. Now that I know what steps to take, writing the code to do so is rather easy. Stuff that stumped me in the past were sentences such as: "Create a vector of mutable coefficients." Darn. Just write "make a list of the variables you want to tune."

JVMerlino · Post by **JVMerlino** » Mon Jan 08, 2024 3:19 am

mvanthoor wrote: ↑Sat Jan 06, 2024 4:23 pm (*) It is actually much less difficult than I thought. I've FINALLY taken the time in my Christmas holiday to read up on everything I collected over the years. Now that I know what steps to take, writing the code to do so is rather easy. Stuff that stumped me in the past were sentences such as: "Create a vector of mutable coefficients." Darn. Just write "make a list of the variables you want to tune."

I was in the same spot for years, trying to read papers that were filled with (to me) incomprehensible math and jargon. Then I finally saw a video (can't remember where) that explained it in simple terms, with pseudo-code, and after that it was easy for me to create a VERY rudimentary implementation.

fasterik · Post by **fasterik** » Mon Jan 08, 2024 8:52 am

The UCI protocol seems designed to put as much responsibility as possible on the GUI for input validation and error handling. So for a UCI chess engine, I think it's reasonable to assume that you're receiving valid input. This is especially true if your primary goal is to compete with other engines in a specific tournament environment, as opposed to being used by humans in a wide variety of situations.

More generally, error handling is a cost vs. benefit thing. The costs associated with error handling are cognitive overhead, code maintenance, and API complexity. I'm a fan of crashing (i.e. exiting with a non-zero exit code) as an error handling strategy because it has a low cost on all of these dimensions. Of course it's not always appropriate to crash on an error, but when it is you can greatly simplify the surrounding code.
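
A minimal Rust sketch of that strategy, assuming a development tool that reads one game result per line (the format and the name `parse_results` are made up for illustration): inner code just propagates the first error, and `main` is the only place that decides to exit with a non-zero code.

```rust
use std::process;

/// Parses one game result per line and stops at the first bad line.
fn parse_results(data: &str) -> Result<Vec<f64>, String> {
    data.lines()
        .enumerate()
        .map(|(index, line)| {
            line.trim()
                .parse::<f64>()
                .map_err(|e| format!("line {}: '{}': {}", index + 1, line, e))
        })
        // Collecting into a Result short-circuits on the first Err.
        .collect()
}

fn main() {
    let data = "1.0\n0.5\n0.0\n";
    let results = match parse_results(data) {
        Ok(results) => results,
        Err(message) => {
            // "Crash" deliberately: report and exit with a non-zero code.
            eprintln!("fatal: {}", message);
            process::exit(1);
        }
    };
    println!("parsed {} results", results.len());
}
```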

Within a single program, we tend to have some modules that validate input, and others that trust their input. We can apply the same principle to multiple programs in some pipeline or ecosystem. If it's receiving input from a specific, highly trusted source, then input validation might be a waste of time. On the other hand, programs designed for general purpose use should have robust error handling and reporting, otherwise they won't be pleasant to use.

I stumbled across a series of blog posts a few years ago that has influenced my thinking about error handling:

https://bitsquid.blogspot.com/2012/01/s ... art-1.html
https://bitsquid.blogspot.com/2012/01/s ... art-2.html
https://bitsquid.blogspot.com/2012/01/s ... art-3.html

mvanthoor · Post by **mvanthoor** » Mon Jan 08, 2024 3:57 pm

JVMerlino wrote: ↑Mon Jan 08, 2024 3:19 am I was in the same spot for years, trying to read papers that were filled with (to me) incomprehensible math and jargon. Then I finally saw a video (can't remember where) that explained it in simple terms, with pseudo-code, and after that it was easy for me to create a VERY rudimentary implementation.

Probably this page: Texel Tuning, WukongJS

It's by Maksim Korzh. IIRC, he got his explanation from the author of Rofchade. I basically figured out everything on my own by piecing together stuff from papers, sites, and tutorials, except for the "Create a vector of variable coefficients" part, which turns out to just be a list of all the parameters (PST values, etc.) from your evaluation function.

So I refactored my evaluation a bit; it no longer uses constants directly, but accepts an incoming struct holding the PSTs and other weights. So I'm now writing three functions:

- one that can convert the struct to a flat list (so the tuner can loop over it and change parameters)
- one that converts the flat list into the struct, so changes can be passed to the evaluation function
- one that dumps the struct to the hard disk at the end of each run (each time all variables have been tuned +/- 1)

That should work.
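
The three functions could look roughly like this in Rust. This is a sketch under assumptions: the `EvalParams` layout (six piece values plus six 64-square PSTs), the starting values, and the one-value-per-line dump format are all hypothetical and not taken from the engine discussed here.

```rust
use std::io::{self, Write};

const PIECES: usize = 6;
const SQUARES: usize = 64;

// Hypothetical parameter struct; a real evaluation has more weights.
#[derive(Debug, Clone, PartialEq)]
struct EvalParams {
    piece_values: [i32; PIECES],
    pst: [[i32; SQUARES]; PIECES],
}

impl EvalParams {
    const LEN: usize = PIECES + PIECES * SQUARES;

    /// Struct -> flat list, so the tuner can loop over every parameter.
    fn to_flat(&self) -> Vec<i32> {
        let mut flat = Vec::with_capacity(Self::LEN);
        flat.extend_from_slice(&self.piece_values);
        for table in &self.pst {
            flat.extend_from_slice(table);
        }
        flat
    }

    /// Flat list -> struct; a list of the wrong length is rejected.
    fn from_flat(flat: &[i32]) -> Result<Self, String> {
        if flat.len() != Self::LEN {
            return Err(format!("expected {} values, got {}", Self::LEN, flat.len()));
        }
        let mut params = EvalParams {
            piece_values: [0; PIECES],
            pst: [[0; SQUARES]; PIECES],
        };
        params.piece_values.copy_from_slice(&flat[..PIECES]);
        for (table, chunk) in params.pst.iter_mut().zip(flat[PIECES..].chunks_exact(SQUARES)) {
            table.copy_from_slice(chunk);
        }
        Ok(params)
    }

    /// Writes one value per line to any writer (a file, stdout, a buffer).
    fn dump<W: Write>(&self, mut out: W) -> io::Result<()> {
        for value in self.to_flat() {
            writeln!(out, "{}", value)?;
        }
        Ok(())
    }
}

fn main() -> io::Result<()> {
    let mut params = EvalParams {
        piece_values: [100, 320, 330, 500, 900, 0], // made-up starting values
        pst: [[0; SQUARES]; PIECES],
    };
    let mut flat = params.to_flat();
    flat[0] += 1; // the tuner nudges one parameter by +1
    params = EvalParams::from_flat(&flat).expect("length is unchanged");
    // A std::fs::File could be passed here instead to dump to disk.
    params.dump(io::stdout())
}
```

Keeping `to_flat` and `from_flat` as exact inverses means a round trip can be checked in a unit test, which catches ordering mistakes between the two.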

Ras · Post by **Ras** » Mon Jan 08, 2024 6:58 pm

mvanthoor wrote: ↑Sat Jan 06, 2024 4:23 pm Do you spend time in error checking everything, or do you assume that some stuff will just be correct?

In normal game play over UCI, I regard any engine crash as a bug. My engine is not supposed to crash no matter how malformed the input is, and without such checks it would crash, e.g. on a position where the side to move is giving check. In languages such as C/C++, any crash (in the sense of e.g. segfaults) is a potential security issue, after all. The best way to build robust systems is defence in depth. That's also why pushing the blame onto the GUI is the wrong way of thinking IMO.

Upon starting, i.e. after receiving uci, I just set the engine position to the starting position so that even an immediate go command will not cause trouble. If my engine gets a critical position, it will refuse to move and also indicate the reason:

position fen 8/8/3k3R/8/3K4/8/8/8 w - - 0 1
info string error (illegal position: side to move giving check)
go depth 5
info string error (illegal position)
bestmove 0000
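
For illustration, the "side to move giving check" test can be sketched in Rust as follows. This is not the engine's actual code (which is not shown in this thread): the board representation and the names `parse_placement`, `is_attacked` and `check_position` are assumptions, and only the first two FEN fields are examined.

```rust
type Board = [[char; 8]; 8]; // board[0] is rank 8, '.' marks an empty square

fn parse_placement(placement: &str) -> Option<Board> {
    let ranks: Vec<&str> = placement.split('/').collect();
    if ranks.len() != 8 {
        return None;
    }
    let mut board = [['.'; 8]; 8];
    for (row, rank) in ranks.iter().enumerate() {
        let mut col = 0usize;
        for c in rank.chars() {
            match c {
                '1'..='8' => col += c as usize - '0' as usize,
                _ if col < 8 && "pnbrqkPNBRQK".contains(c) => {
                    board[row][col] = c;
                    col += 1;
                }
                _ => return None,
            }
        }
        if col != 8 {
            return None;
        }
    }
    Some(board)
}

fn piece_at(board: &Board, row: i32, col: i32) -> Option<char> {
    if (0..8).contains(&row) && (0..8).contains(&col) {
        Some(board[row as usize][col as usize])
    } else {
        None
    }
}

/// Is the square (row, col) attacked by the given colour?
fn is_attacked(board: &Board, row: i32, col: i32, by_white: bool) -> bool {
    // Piece letter in the attacker's case (upper case = White).
    let own = |p: char| if by_white { p.to_ascii_uppercase() } else { p };
    // White pawns capture towards row 0, so they sit one row below the target.
    let pawn_row = if by_white { row + 1 } else { row - 1 };
    if [-1, 1].iter().any(|dc| piece_at(board, pawn_row, col + dc) == Some(own('p'))) {
        return true;
    }
    const KNIGHT: [(i32, i32); 8] =
        [(1, 2), (2, 1), (-1, 2), (-2, 1), (1, -2), (2, -1), (-1, -2), (-2, -1)];
    if KNIGHT.iter().any(|(dr, dc)| piece_at(board, row + dr, col + dc) == Some(own('n'))) {
        return true;
    }
    for dr in -1..=1 {
        for dc in -1..=1 {
            if dr == 0 && dc == 0 {
                continue;
            }
            if piece_at(board, row + dr, col + dc) == Some(own('k')) {
                return true;
            }
            // Walk along the ray until the first occupied square.
            let slider = if dr == 0 || dc == 0 { own('r') } else { own('b') };
            let (mut r, mut c) = (row + dr, col + dc);
            while let Some(p) = piece_at(board, r, c) {
                if p != '.' {
                    if p == slider || p == own('q') {
                        return true;
                    }
                    break;
                }
                r += dr;
                c += dc;
            }
        }
    }
    false
}

/// Rejects positions where the side NOT to move is already in check.
fn check_position(fen: &str) -> Result<(), String> {
    let mut fields = fen.split_whitespace();
    let board = fields
        .next()
        .and_then(parse_placement)
        .ok_or("illegal position: bad piece placement")?;
    let white_to_move = match fields.next() {
        Some("w") => true,
        Some("b") => false,
        _ => return Err("illegal position: bad side to move".to_string()),
    };
    let target = if white_to_move { 'k' } else { 'K' };
    for row in 0..8 {
        for col in 0..8 {
            if board[row][col] == target
                && is_attacked(&board, row as i32, col as i32, white_to_move)
            {
                return Err("illegal position: side to move giving check".to_string());
            }
        }
    }
    Ok(())
}

fn main() {
    for fen in ["8/8/3k3R/8/3K4/8/8/8 w - - 0 1", "8/8/3k3R/8/3K4/8/8/8 b - - 0 1"] {
        match check_position(fen) {
            Ok(()) => println!("position accepted"),
            Err(reason) => println!("info string error ({})", reason),
        }
    }
}
```

The first FEN is the one from the transcript and is rejected; the same position with Black to move is an ordinary check and is accepted.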

I don't currently run that sanity checking in the tuner, but that's not critical since the tuner is only for development, and the tuner code is not compiled into the release builds.

Position aspects such as castling rights and the ep square are silently sanitised. Parameters in the go command or via setoption are silently clamped to the valid range. E.g. negative time values are auto-corrected to 0, and the hash size is limited to what the engine announces to the GUI.
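
A minimal Rust sketch of that kind of silent clamping. The limits and function names are hypothetical; the only fixed point is that the clamp range matches the `min`/`max` the engine announces in its `option` line.

```rust
// Hypothetical limits; they must match what the engine announces to the GUI.
const HASH_MIN_MB: i64 = 1;
const HASH_MAX_MB: i64 = 1024;
const HASH_DEFAULT_MB: i64 = 16;

/// Negative time values are corrected to 0 instead of being rejected.
fn sanitise_time_ms(requested: i64) -> u64 {
    requested.max(0) as u64
}

/// The hash size is limited to the announced range.
fn sanitise_hash_mb(requested: i64) -> i64 {
    requested.clamp(HASH_MIN_MB, HASH_MAX_MB)
}

fn main() {
    println!(
        "option name Hash type spin default {} min {} max {}",
        HASH_DEFAULT_MB, HASH_MIN_MB, HASH_MAX_MB
    );
    println!("wtime -250 is treated as {} ms", sanitise_time_ms(-250));
    println!("Hash 1000000 is treated as {} MB", sanitise_hash_mb(1_000_000));
}
```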
