Future of Gull

THyer · Post by **THyer** » Tue Aug 23, 2016 3:38 am

As of this writing, Gull is becoming obsolete.

I have released into the public domain an engine, provisionally titled "Slizzard", which can serve as a platform for ongoing development of Gull. It is closely based on the Gull 3 source from SourceForge. It incorporates the following changes:

-- code has been ported to C++11, with far fewer macros and gotos
-- minimum increment of score is now 1/4 centipawn
-- Gull's two-phase eval (opening and endgame scores, as in Glaurung) is optionally replaced with three-phase (opening, middle, endgame) or four-phase (opening, middle, endgame, closed). The only performance loss is from the computation of "closed-ness". This introduces many new parameters which have not been tuned.
-- A Houdini-style contempt (proportional to phase) has been added.

The code is not perfect:
-- In my tests using Arena, Slizzard (in two- or three-phase mode) seems to be about 20 Elo weaker than Gull. In regression tests, I can find no difference. I do not know the reason for this weakness.
-- There is still some scar tissue from earlier experiments on the code. I will be gradually removing this.
-- I have not changed the Tuner code, because I do not understand it. It does not compile cleanly in Tuner mode, because the Gull 3 source does not.

I hope that the Gull developers will take this opportunity to unify our efforts. Gull can become much stronger.

The code can be found at: https://bitbucket.org/hyer/sonsofthebird

Tom Hyer

Dann Corbit · Post by **Dann Corbit** » Tue Aug 23, 2016 10:01 am

slizzard.cpp:395:24: fatal error: TheGenome.gg: No such file or directory
#include "TheGenome.gg"
^

Sven · Post by **Sven** » Tue Aug 23, 2016 1:10 pm

and the following 47 include statements for "Chrom*.gc" would fail as well

velmarin · Post by **velmarin** » Tue Aug 23, 2016 1:47 pm

Sven Schüle wrote: and the following 47 include statements for "Chrom*.gc" would fail as well

I guess you can just delete them and compile.

THyer · Post by **THyer** » Wed Aug 24, 2016 1:33 am

Apologies. Those are vestigial traces from my (failed) attempt to use a genetic algorithm to evolve a better Slizzard.

I've removed them and pushed a version which should compile in isolation.

Tom

basil00 · Post by **basil00** » Wed Aug 24, 2016 4:20 pm

THyer wrote: The code can be found at: https://bitbucket.org/hyer/sonsofthebird

One of the main obstacles to improving Gull is its opaque source code. The new code is much cleaner, which will hopefully inspire further development.

THyer · Post by **THyer** » Wed Aug 24, 2016 6:11 pm

Thanks for that, Basil.

Any chance of a TB-support pull request?

Tom

THyer · Post by **THyer** » Wed Aug 24, 2016 7:16 pm

(Some comments regarding phased evaluation. Includes an introduction for those unfamiliar with the idea.)

Many engines have "two-phase" evaluation. During the evaluation process, two separate values are accumulated: one appropriate for the opening, and a different one for the endgame. Once accumulation is complete, the final output eval is interpolated between these two values based on the "phase" (a measure of the remaining material).

This is very efficient because the two evals can be "packed" as 16-bit fields within a single 32-bit integer. I first encountered this idea in Glaurung, the precursor of Stockfish, but its origin might be older than that. Gull also uses 2-phase evaluation.
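For readers who have not seen the packing trick, here is a minimal C++11 sketch. It is not taken from Gull or Slizzard: the names `Pack`, `Opening`, `Endgame`, `Interpolate` and `kMaxPhase`, the field layout (opening in the low half), and the phase range of 0 to 128 are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: phase runs from 0 (bare endgame) to 128 (full opening material).
constexpr int kMaxPhase = 128;

// Two signed 16-bit scores share one 32-bit word (illustrative layout:
// opening in the low half, endgame in the high half).
typedef int32_t Packed;

constexpr Packed Pack(int opening, int endgame) {
    return static_cast<Packed>(static_cast<uint32_t>(endgame) << 16) + opening;
}

// The low 16 bits, reinterpreted as a signed value.
inline int Opening(Packed s) {
    return static_cast<int16_t>(static_cast<uint32_t>(s) & 0xFFFFu);
}

// Adding 0x8000 first undoes the borrow that a negative opening field
// takes from the high half.
inline int Endgame(Packed s) {
    return static_cast<int16_t>((static_cast<uint32_t>(s) + 0x8000u) >> 16);
}

// Straight-line interpolation between the two accumulated scores.
inline int Interpolate(Packed s, int phase) {
    assert(phase >= 0 && phase <= kMaxPhase);
    return (Opening(s) * phase + Endgame(s) * (kMaxPhase - phase)) / kMaxPhase;
}
```

Because both fields live in one integer, a single addition such as `Pack(30, -20) + Pack(-5, 7)` accumulates the opening and endgame terms at once, provided each field stays within the 16-bit range.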

Since most computers today have native 64-bit arithmetic, there is an opportunity to expand to more than two phases. From the tuned Gull weights, there is some reason to expect this can be beneficial. For instance, consider the logarithmic contributions to N, R and Q mobility, which all change sign from opening to endgame. Probably straight-line interpolation does not accurately capture this sort of phase change.

Adding a third phase is not extremely difficult. In fact, Slizzard presently supports the compiler flags TWO_PHASE and THREE_PHASE, which choose the desired behavior using nearly the same source code. So we can proceed from (opening, endgame) to (opening, middle, endgame). For the present I have chosen "middle" to be halfway between opening and endgame (phase=64 in Gull parlance), but I do not regard this choice as definitive.
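A rough sketch of what three-phase interpolation with the middle anchored at phase 64 could look like. This is an illustration only: `Score3`, `Interpolate3` and the constants are hypothetical names, a plain struct is used instead of whatever packed representation Slizzard has, and the phase range of 0 to 128 is an assumption.

```cpp
#include <cassert>

// Assumptions: phase 128 = opening, phase 0 = endgame, middle anchored at 64.
constexpr int kMaxPhase = 128;
constexpr int kMidPhase = 64;

// Hypothetical unpacked score, one value per phase.
struct Score3 {
    int opening;
    int middle;
    int endgame;
};

// Piecewise-linear interpolation: opening..middle above the midpoint,
// middle..endgame below it.
inline int Interpolate3(const Score3& s, int phase) {
    assert(phase >= 0 && phase <= kMaxPhase);
    if (phase >= kMidPhase) {
        return (s.opening * (phase - kMidPhase) + s.middle * (kMaxPhase - phase))
               / (kMaxPhase - kMidPhase);
    }
    return (s.middle * phase + s.endgame * (kMidPhase - phase)) / kMidPhase;
}
```

If `middle` is set to the average of `opening` and `endgame`, this reduces to the two-phase straight line; any other value bends the line at the midpoint.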

Adding a fourth phase in this manner seems brain-dead. Surely there is a better role for this extra information than just doing a finer-grained interpolation. At first I experimented with material asymmetry as the fourth dimension, but it seems clear that closure (closed-ness) of a position is the better choice.

Slizzard contains a crude measure of closure, a measure of the presence of pawns without room to advance. Unfortunately a better measure might be too slow to evaluate. This is a research area.
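To make "pawns without room to advance" concrete, here is one very crude possibility, sketched in C++11. It is not Slizzard's actual measure: it only counts pawns whose next square is occupied by another pawn, and it assumes a bitboard layout with a1 as bit 0 and White advancing toward higher bits. `BlockedPawns` and `PopCount` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Portable bit count (clears the lowest set bit on each pass).
inline int PopCount(uint64_t b) {
    int n = 0;
    while (b) {
        b &= b - 1;
        ++n;
    }
    return n;
}

// Counts pawns of both colours whose advance square holds a pawn.
// Assumed layout: a1 = bit 0, h8 = bit 63, White moves toward higher bits.
inline int BlockedPawns(uint64_t whitePawns, uint64_t blackPawns) {
    assert((whitePawns & blackPawns) == 0);  // a square holds at most one pawn
    const uint64_t allPawns = whitePawns | blackPawns;
    const uint64_t whiteBlocked = whitePawns & (allPawns >> 8);
    const uint64_t blackBlocked = blackPawns & (allPawns << 8);
    return PopCount(whiteBlocked) + PopCount(blackBlocked);
}
```

A count like this costs only a few shifts and masks, which is the kind of budget a per-node closure estimate has to fit into.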

The presence of these new phases expands (basically doubles) the number of degrees of freedom for the tuner. Right now I am just using the two-phase weights built into Gull; thus the middle value is always halfway between the opening and endgame values, and the closure value is always zero.

Unfortunately, successfully tuning these additional parameters will lead to evaluations which the two-phase engine cannot reproduce; thus a shared parameter set will no longer be desirable.

-- Tom Hyer

Footnote: The negative weights for R/Q log mobility in opening are best understood as tuning artifacts. It's not that limited R/Q mobility is truly desirable; but that pretending to desire it encourages B/N to be developed first. Misevaluation is baked into the weights to make them serve a higher purpose.

THyer · Post by **THyer** » Wed Aug 24, 2016 10:16 pm

(Some comments regarding King Tropism and board inhomogeneity.)

The simplest mobility measure is based on a count of squares, excluding those where the piece in question can be captured by a less-valuable opposing piece. In Slizzard this is written (e.g., for rooks):

uint64 control = att & EI.free[me];
IncV(EI.score, Mobility[PieceType[WhiteRook] - 1][pop(control)]);

The precomputed Mobility scores are a function of one variable, which is the number of (non-suicidally) attainable squares. Gull uses a sum of linear and logarithmic parts for this function.
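As an illustration of such a precomputed table, the sketch below fills one row from a linear and a logarithmic coefficient. The exact functional form and coefficients Gull uses are not given here, so `log(1 + n)` (chosen to avoid `log(0)`), the rounding, and the name `BuildMobilityTable` are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Builds score[n] = linear * n + logarithmic * log(1 + n) for n = 0..maxSquares.
// The coefficients are placeholders, not tuned Gull values.
std::vector<int> BuildMobilityTable(double linear, double logarithmic, int maxSquares) {
    assert(maxSquares >= 0);
    std::vector<int> table(static_cast<std::size_t>(maxSquares) + 1);
    for (int n = 0; n <= maxSquares; ++n) {
        const double value = linear * n + logarithmic * std::log(1.0 + n);
        table[static_cast<std::size_t>(n)] = static_cast<int>(std::lround(value));
    }
    return table;
}
```

With a positive logarithmic part the first few squares of mobility are worth much more than the last few; with a negative one the opposite holds, which is how a sign change between phases can arise.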

But some squares are probably more important than others. Probably control of e5 is more important than control of a1. And if the opposing king is at g8, then control of f8 should count for more.

Slizzard attempts to capture both of these effects with a single extra evaluator:

IncV(EI.score, MobLocus[PieceType[WhiteRook] - 1][pop(control & KingLocus[EI.king[opp]])]);

This is another function of one variable, which is the number of attainable squares in some desired locus depending on the opposing King location. My present attempt at designing this locus is to take the interior of an ellipse with foci at the board center and at the King location. This should encourage Slizzard to get its pieces to the most crucial part of the board.
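One way such an elliptical locus could be precomputed is sketched below. This is a guess at the construction, not the Slizzard source: the function name `KingLocusFor`, the `slack` parameter (how much longer than the focal distance the summed distances may be), the geometric board centre at (3.5, 3.5), and the a1 = bit 0 layout are all assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Returns a bitboard of squares inside an ellipse whose foci are the board
// centre and the given king square. A square is inside when the sum of its
// distances to both foci is at most (distance between foci + slack).
// Assumed layout: a1 = bit 0, file = sq % 8, rank = sq / 8.
uint64_t KingLocusFor(int kingSquare, double slack) {
    assert(kingSquare >= 0 && kingSquare < 64);
    assert(slack >= 0.0);
    const double cx = 3.5, cy = 3.5;  // geometric centre of the board
    const double kx = kingSquare % 8;
    const double ky = kingSquare / 8;
    const double focal = std::hypot(kx - cx, ky - cy);
    uint64_t locus = 0;
    for (int sq = 0; sq < 64; ++sq) {
        const double x = sq % 8;
        const double y = sq / 8;
        const double sum = std::hypot(x - cx, y - cy) + std::hypot(x - kx, y - ky);
        if (sum <= focal + slack) {
            locus |= 1ULL << sq;
        }
    }
    return locus;
}
```

Calling this once per king square at startup would give a 64-entry table that can be indexed the way `KingLocus[EI.king[opp]]` is above; larger `slack` values widen the ellipse.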

I am not familiar with the Stockfish implementation of "King Tropism". I'd be happy to consider suggestions for improving my method, especially if people consider it inferior to the Stockfish approach.

Inevitably, this choice introduces yet another set of parameters to be tuned (4 pieces * 4 phases = 16 weights, size of ellipse, and a relative importance of King vs center). I have not yet mastered the tuner, so the current release has all weights set to zero.

THyer · Post by **THyer** » Tue Sep 06, 2016 4:25 pm

I have added to the Slizzard repo (https://bitbucket.org/hyer/sonsofthebird/src) a tuner based on Joona Kiiski's method for Stockfish: https://chessprogramming.wikispaces.com ... ing+Method

My implementation is based on this description, which may be significantly less sophisticated than the current SF implementation. I added a couple of refinements myself. Note that this is an external tuner (a Python script which calls the compiler and cutechess-cli). I still do not understand Gull's internal tuner well enough to use it. Rather than adding parameters to the UCI interface and working out the recalculation order, I simply change the source and recompile. Thus some manual labor is needed to prepare each tuning run.

First, there is no need to change the parameters every game. A parameter in the script, NumPairs, determines how many pairs of games (one with each engine as white) are to be played with a given pair of parameter sets. I am currently using NumPairs=5; thus a match between engines is 10 games.

Second, because of this, I can monitor the 'Delta' parameter (the size of the parameter changes to experiment with). I maintain a scale factor Lambda, which is increased if a match ends in a tie and decreased if it is a blowout (which I define as one engine scoring over 75%). Thus, as the tuner runs, the scale factor will ensure that parameter differences are on average meaningful but not too extreme.
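The Lambda rule can be sketched as a small function. The real tuner is a Python script; this C++ version only illustrates the rule as described, and the multiplicative step of 1.1 and the name `UpdateLambda` are assumptions.

```cpp
#include <cassert>

// Adjusts the scale factor after one match.
// scoreA is engine A's points (win = 1, draw = 0.5) out of `games` games.
// Tie -> perturbations were too small to matter, so grow Lambda.
// Blowout (either side above 75%) -> too large, so shrink Lambda.
// The step size is a placeholder, not the tuner's actual value.
double UpdateLambda(double lambda, double scoreA, int games, double step = 1.1) {
    assert(games > 0);
    assert(scoreA >= 0.0 && scoreA <= games);
    assert(step > 1.0);
    const double fraction = scoreA / games;
    if (fraction == 0.5) {
        return lambda * step;
    }
    if (fraction > 0.75 || fraction < 0.25) {
        return lambda / step;
    }
    return lambda;
}
```

With NumPairs=5 a match has 10 games, so a 5-5 result raises Lambda and a result of 8-2 or more lopsided lowers it.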

Third, note that the tuner updates a current "best" point in parameter space. I retain knowledge of the initial point and choose parameters centered at Initial and at (2*Current-Initial). Thus, while Current is evolving, the tuner favors linesearch along the direction of improvement; and, as convergence is reached, the distribution of test parameters is symmetric around the optimum point.
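The second centre is just the reflection of Initial through Current. A minimal sketch, with `ReflectedCenter` as a hypothetical name and parameters held as plain doubles:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns 2 * current - initial, component by component.
// Together with `initial` itself this gives two sampling centres whose
// midpoint is `current`.
std::vector<double> ReflectedCenter(const std::vector<double>& initial,
                                    const std::vector<double>& current) {
    assert(initial.size() == current.size());
    std::vector<double> reflected(initial.size());
    for (std::size_t i = 0; i < initial.size(); ++i) {
        reflected[i] = 2.0 * current[i] - initial[i];
    }
    return reflected;
}
```

While Current is still moving, the two centres lie on opposite sides of it along the line from Initial through Current, which is what gives the line-search flavour.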

I have tested the tuning script by tuning the parameters MatLinear and MatClosed, in the four-phase version of Slizzard. The latest source reflects the results of tuning.

Slizzard is now within 5 Elo of Gull 3, to the precision of my measurement. I anticipate that further tuning, taking advantage of the additional phases and king/center tropism, will increase Slizzard's strength further.
