Unit tests for engine code?

dangi12012 · Post by **dangi12012** » Fri Sep 16, 2022 9:31 pm

I wanted to ask whether anyone here writes tests.
Do you use a test framework like gtest, or just write one function that calls all the functions to be tested?

hgm · Post by **hgm** » Fri Sep 16, 2022 9:52 pm

If I implement complex incremental updates, e.g. of an attack map, I start with writing a function that calculates it from scratch, and one to compare that with the incrementally updated one. Otherwise, in something as simple as an engine, I just test it by having it play games, and look for blunders in those. Sometimes I add some code to copy all variables that contain or depend on the game state, and to test those after each UnMake to make sure these are still the same.
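
A minimal sketch of the compare-against-scratch check described above. It uses an incrementally updated hash key on a toy 64-square board as a stand-in for an attack map, only because that is shorter to write down. `Board`, `Zobrist`, `make_move`, `unmake_move` and the piece codes are hypothetical names for illustration, not taken from any engine.

```cpp
#include <array>
#include <cstdint>
#include <random>

// Toy board: 0 = empty, 1..12 = hypothetical piece codes.
struct Board {
    std::array<int, 64> piece{};
    std::uint64_t key = 0;  // incrementally updated hash key
};

// Random numbers per (piece, square), filled from a fixed seed.
struct Zobrist {
    std::uint64_t table[13][64];
    explicit Zobrist(std::uint64_t seed) {
        std::mt19937_64 rng(seed);
        for (auto& row : table)
            for (auto& v : row) v = rng();
    }
};

// Reference implementation: recompute the key from scratch.
std::uint64_t key_from_scratch(const Board& b, const Zobrist& z) {
    std::uint64_t k = 0;
    for (int s = 0; s < 64; ++s)
        if (b.piece[s]) k ^= z.table[b.piece[s]][s];
    return k;
}

// Move the piece on `from` to `to` (assumes `from` is occupied and
// from != to). Returns the captured piece code, or 0.
int make_move(Board& b, const Zobrist& z, int from, int to) {
    int moved = b.piece[from], captured = b.piece[to];
    b.key ^= z.table[moved][from];
    if (captured) b.key ^= z.table[captured][to];
    b.key ^= z.table[moved][to];
    b.piece[to] = moved;
    b.piece[from] = 0;
    return captured;
}

// Exact inverse of make_move.
void unmake_move(Board& b, const Zobrist& z, int from, int to, int captured) {
    int moved = b.piece[to];
    b.key ^= z.table[moved][to];
    if (captured) b.key ^= z.table[captured][to];
    b.key ^= z.table[moved][from];
    b.piece[from] = moved;
    b.piece[to] = captured;
}

// The check itself: incremental value must equal the from-scratch value.
bool key_is_consistent(const Board& b, const Zobrist& z) {
    return b.key == key_from_scratch(b, z);
}
```

Calling `key_is_consistent` after every make, and comparing the board against a saved copy after every unmake, covers both checks mentioned in the post.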

fasterik · Post by **fasterik** » Fri Sep 16, 2022 11:16 pm

I do something like this:

Code:

bool run_tests() {
	int num_failures = 0;
	num_failures += test_a();
	num_failures += test_b();
	num_failures += test_c();
	// ...
	
	if (num_failures)
		printf("%d tests failed.\n", num_failures);
	else
		printf("All tests succeeded.\n");
	
	return num_failures == 0;
}

int main(int argc, char **argv) {
	if (!run_tests())
		return 1;
	
	// Main program.
	// ...

	return 0;
}

Each test function can run one or more tests and return the number of failures. These will also use macros to output the line number when a test condition fails, for example:

Code:

#define TestAssert(x) if (!(x)) { printf("Test assertion failed (%s:%d):\n\n\t%s\n", __FILE__, __LINE__, #x); num_failures++; }
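
A small sketch of how one test function might use such a macro. The macro relies on a local variable named `num_failures` being in scope; `popcount64` and `test_popcount` are hypothetical names chosen for the example.

```cpp
#include <cstdint>
#include <cstdio>

// Same shape as the macro above: report file, line and the failed
// expression, then bump the caller's local num_failures counter.
#define TestAssert(x) if (!(x)) { printf("Test assertion failed (%s:%d):\n\n\t%s\n", __FILE__, __LINE__, #x); num_failures++; }

// Hypothetical function under test: count set bits in a bitboard.
static int popcount64(std::uint64_t b) {
    int n = 0;
    while (b) {
        b &= b - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}

// One test function: runs several checks, returns the number of failures.
int test_popcount() {
    int num_failures = 0;
    TestAssert(popcount64(0) == 0);
    TestAssert(popcount64(0xFFULL) == 8);
    TestAssert(popcount64(0x8000000000000001ULL) == 2);
    return num_failures;
}
```

The return value can be added to the running total in `run_tests()` exactly like `test_a()` and friends.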

I also like the idea of randomized testing or fuzzing because I think it covers a lot more cases than a couple of hand-picked tests can. I find it useful to write a function that plays several thousand games of random legal moves and feeds those positions into other tests.
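
A rough sketch of such a random-game driver. To stay self-contained it uses a tic-tac-toe-like toy position instead of a chess position (it only fills the board and has no win detection); `ToyPosition`, `fuzz_random_games` and `stones_consistent` are hypothetical names, and a real engine would plug in its own position type and move generator.

```cpp
#include <array>
#include <cstddef>
#include <random>
#include <vector>

// Toy stand-in for a chess position: a 3x3 board that fills up.
struct ToyPosition {
    std::array<int, 9> cell{};  // 0 = empty, 1 or 2 = player
    int side = 1;               // side to move
    int stones = 0;             // incrementally maintained stone count

    std::vector<int> legal_moves() const {
        std::vector<int> moves;
        for (int i = 0; i < 9; ++i)
            if (!cell[i]) moves.push_back(i);
        return moves;
    }
    void make(int sq) {
        cell[sq] = side;
        side = 3 - side;
        ++stones;
    }
};

// Plays `games` games of uniformly random legal moves and calls
// check(pos) on every position reached, including the start position.
// Returns the number of positions for which the check failed.
template <class Check>
int fuzz_random_games(int games, unsigned seed, Check check) {
    std::mt19937 rng(seed);  // fixed seed keeps failures reproducible
    int failures = 0;
    for (int g = 0; g < games; ++g) {
        ToyPosition pos;
        for (;;) {
            if (!check(pos)) ++failures;
            std::vector<int> moves = pos.legal_moves();
            if (moves.empty()) break;  // game over: board is full
            std::uniform_int_distribution<std::size_t> pick(0, moves.size() - 1);
            pos.make(moves[pick(rng)]);
        }
    }
    return failures;
}

// Example check: the incremental counter must match a recount.
bool stones_consistent(const ToyPosition& p) {
    int n = 0;
    for (int c : p.cell)
        if (c) ++n;
    return n == p.stones;
}
```

In an engine, the check passed in would be one of the other tests, for example a make/unmake round trip or an incremental-versus-scratch comparison.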

j.t. · Post by **j.t.** » Sat Sep 17, 2022 12:04 am

I simply code everything correctly first try, this way I don't need to write any tests.

Fabio Gobbato · Post by **Fabio Gobbato** » Sat Sep 17, 2022 10:41 am

I don't use a test framework, but I have individual functions for testing perft, hash move validation, SEE, TB probing code, and Polyglot book probing.

smatovic · Post by **smatovic** » Sat Sep 17, 2022 11:29 am

Fabio Gobbato wrote: ↑Sat Sep 17, 2022 10:41 am I don't use a test framework, but I have individual functions for testing perft, hash move validation, SEE, TB probing code, and Polyglot book probing.

+1

I have a selftest function for perft and hashes, and write specific ones on demand.

--
Srdja

dangi12012 · Post by **dangi12012** » Sun Sep 18, 2022 5:15 pm

All good ideas.

I will also add regression tests:
a large number of stored chess trees (multiple GB), comparing the number of nodes searched until the correct PV is found.

I haven't solved the issue of generating trees (domain-free) yet.
I would like trees of high testing quality, like poisoned chess positions, where moves look very good in the short term but turn out to be bad.

clayt · Post by **clayt** » Sun Sep 18, 2022 6:15 pm

I do both: unit testing, and running games while observing. I usually write a unit test every time I catch a bug, to ensure that I don't reintroduce it.

hgm · Post by **hgm** » Sun Sep 18, 2022 6:17 pm

dangi12012 wrote: ↑Sun Sep 18, 2022 5:15 pm I haven't solved the issue of generating trees (domain-free) yet.
I would like trees of high testing quality, like poisoned chess positions, where moves look very good in the short term but turn out to be bad.

You did not like the method I gave you?

dangi12012 · Post by **dangi12012** » Sun Sep 18, 2022 7:19 pm

Description:

For a given depth you can start with a PV, and assign it the score you want in the end leaf. 
You can then build a refutation tree (of alternating cut-nodes and all-nodes) by generating only a single move in the cut-nodes 
(the cut-move), and assign all the leaves a score better 
(for the side playing the cut-moves) than the PV score. 
You can then go back to the cut-nodes, and generate trees from them 
through their other moves with totally random leaf evaluation.

In my words (Is this an apt description?)
Known PV with a score in the end. (single line with score 8)
Then add a cut move leading to leaves which are all better than the PV. (all leaves score 9 or above)
Then fill other nodes with random leaf evals.

Code:

Tree(depth, score, bound)
{
  if(depth == 0) {
    switch(bound) {
      case EXACT: return score;
      case UPPER: return score - 1;
      case LOWER: return score + 1;
      case ANY: return randomScore();
    }
  } else {
    n = N; // desired number of children
    switch(bound) {
      case EXACT: // PV node
        Tree(depth-1, -score, EXACT); n--; // one PV child
        children = LOWER; // other moves are refuted, i.e. lead to cut-nodes
        break;
      case LOWER: // cut-node
        Tree(depth-1, -score, UPPER); n--; // one cut-move
        children = ANY; // the other children are irrelevant
        break;
      case UPPER: // all-node
        children = LOWER; // all children are cut-nodes
        break;
      case ANY: // unspecified node type
        children = ANY;
    }
    while(n-- > 0) {
      Tree(depth-1, -score, children);
    }
    
  }
}

I think for tests I would need to expand your code to store the actual tree in a typical Node* children, Node* parent structure. I think it's good scaffolding, except I don't understand how this can be used to generate a poisoned tree, or how adding random leaf nodes will keep the eval intact.

Also, the non-leaf nodes don't have a score here, so how could Alphabeta(depth - 1) even return a value?
What would be needed is that Alphabeta(5) strongly suggests a move but Alphabeta(8) proves that it was a big mistake.
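
One possible way to expand the pseudocode into a stored tree, as a sketch only. It keeps the same four bound types and builds `Node` objects with a child list. A plain negamax over the leaves then shows that the root of an EXACT tree evaluates to the requested score, even with random leaves in the irrelevant subtrees. `Node`, `build_tree`, `negamax`, `kBranching` and the random score range are assumptions made for the example. It assumes C++17 for the recursive `std::vector<Node>` member.

```cpp
#include <algorithm>
#include <limits>
#include <random>
#include <vector>

enum Bound { EXACT, UPPER, LOWER, ANY };

struct Node {
    int leaf_value = 0;          // only meaningful when children is empty
    std::vector<Node> children;  // empty for leaves
};

constexpr int kBranching = 3;    // "N" in the pseudocode

// Build a tree whose negamax value is == score (EXACT), < score (UPPER),
// > score (LOWER) or unconstrained (ANY), from the side to move's view.
Node build_tree(int depth, int score, Bound bound, std::mt19937& rng) {
    Node node;
    if (depth == 0) {
        std::uniform_int_distribution<int> random_score(-100, 100);
        switch (bound) {
            case EXACT: node.leaf_value = score;             break;
            case UPPER: node.leaf_value = score - 1;         break;
            case LOWER: node.leaf_value = score + 1;         break;
            case ANY:   node.leaf_value = random_score(rng); break;
        }
        return node;
    }
    int n = kBranching;
    Bound rest = ANY;
    switch (bound) {
        case EXACT:  // PV node: one PV child, the rest are cut-nodes
            node.children.push_back(build_tree(depth - 1, -score, EXACT, rng));
            --n;
            rest = LOWER;
            break;
        case LOWER:  // cut-node: one cut-move, the rest are irrelevant
            node.children.push_back(build_tree(depth - 1, -score, UPPER, rng));
            --n;
            rest = ANY;
            break;
        case UPPER:  // all-node: every child is a cut-node
            rest = LOWER;
            break;
        case ANY:    // unspecified node type
            rest = ANY;
            break;
    }
    while (n-- > 0)
        node.children.push_back(build_tree(depth - 1, -score, rest, rng));
    return node;
}

// Full-width negamax: inner nodes get their value from the leaves.
int negamax(const Node& node) {
    if (node.children.empty()) return node.leaf_value;
    int best = std::numeric_limits<int>::min();
    for (const Node& child : node.children)
        best = std::max(best, -negamax(child));
    return best;
}
```

Note that in this sketch the PV move or cut-move is always the first child, so a search test would probably want to shuffle each child list. Inner nodes carry no static evaluation here, so a shallower search has nothing to return at its horizon. A poisoned tree would need such an evaluation assigned to inner nodes as well.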

Is this a poisoned position?

Code:

Tree(9, 8, EXACT)
