Xiangqi evaluation

hgm · Post by **hgm** » Sun Jul 01, 2018 11:05 pm

Since my engine HaQiKi D in the past quite often made wrong decisions, declining an easy draw and going on a doomed path instead, I am completely rethinking my evaluation. The main problem is how to evaluate the purely defensive pieces, Elephant and Advisor. I used to evaluate these depending on the amount of attacking material the opponent posessed. But this is wrong, and you can also use your attacking material in defense. E.g. against a Rook you need all 4 defenders to draw, if you have no other material yourself. But if it is Rook against Rook, it is a draw even without any defenders, so that the defenders are basically useless. If you are behind in Pawns, it would be better to go for a single Pawn than to gain all 4 defenders. But HaQiKi D thought the defenders worth their normal share (~2 Pawns) against a Rook, and would prefer to capture two worthless Elephants rather than the Pawn that would make the opponent win, if it had the choice.

To make it realize that in an R vs R situation the defenders are worthless, I now devised the following algorithm:

In the material hash table I calculate the 'defense surplus' for each side. This is the value of the defenders (0-8) minus the 'defender need'. The latter is calculated from the material imbalance of R,C,H in your disadvantage (R=10, H=6, C=4). to which is added 0.2 times the total attacking material including Pawns. (Where P is counted for 3, as it will eventually be when it crosses the River and approaches the Palace.) The latter term represents that defending is more difficult than attacking, so that when both sides have, say, 2 Rooks, (worth 20), you get quickly checkmated if you haven't at least 2 defenders. So initially, with all material on the board, the defender need is larger than what you can muster, and there will be no surpluss defenders. As long as it stays equal, the defender need will slowly taper towards zero as material gets exchanged.

The idea is that a positive defender surplus means you can afford to lose some of your defenders without ill effects, so that you have to strongly discount the value of these defenders. The score discount can be tabulated as a function of the defender surplus; I was thinking to count the first surplus defender as 0.5 to 1 Pawn (after all, it is an insurance against losing material, which would make your defender need go up), and additional surplus defenders for ~0.1 (to prevent it gives them away for nothing). A negative surplus would be clipped to 0 (no discount).

Because tha Pawns are counted as worst case, the defender need is strongly overestimated. Pawns that have not crossed the River and are opposed by enemy Pawns can normally not be promoted in an equal-material situation. When behind in piece material, defending Pawns usually will be lost, and so the opposing Pawns can promote. But c- and g-Pawns will have to pass through a square that can be protected by an elephant for that, and then usually will be traded for that Elephant (worth 2) rather than being able to reach their maximum value of 3. And an opposing Pawn that can be protected by an Elephant is usually able to stop an opposing Pawn when you are behind in pieces. So in a sense Pawns do have defensive quality, against other (uncrossed) Pawns. It is therefore important to keep track of the defensive power of your uncrossed Pawns, to add that to the defender surplus. Also, opponent Pawns that reach the last rank lose most of their attacking value, and should certainly not be counted as 3. (More like 1.)

All this depends on Pawn location, however, so it cannot be calculated from the material key, or stored in the material hash. So I keep a 'pawn key' to index a pawn hash table, similar to the Pawn hash in Chess. Unlike Chess, I am not interested in uniquely identifying the Pawn chains, however, as Pawns across the River are mobile pieces, and any structure there is transient. I just want to distinguish uncrossed and last-rank Pawns. This makes the number of Pawn classes so small that perfect hashing is possible: I reserve 1 bit in the key for each uncrossed Pawn (so 10 bits in total). Because some Pawns can be stopped by Elephants, I work the number of Elephants of each sides into the Key as well (two times 3 possibilities, so 9K possibilities in total). To also account the number of last-rank Pawns of each side the table will get 6x6 = 36 times larger. But the search will alsmost never access positions with multiple last-rank Pawns, so this will not produce any cache pressure. The frequently accessed 9K part of the table can be easily cached. The table would contain the number of attacking Pawn points that are neutralized by the defending Pawns or Elephants, for each side. And it would have to contain that for the case with equal piece material, and when behind (so that defending Pawns are generally doomed).

The evaluation would then find the 'raw defender surplus' in the material table, as well as flags that indicate which side is ahead in piece material (according to R=2, H=C=1). It would then use the pawn key to get the pawn defender value (using mentioned flags to decide which element to use from it), and add that to the raw defender surplusses to get the actual defender surpluses. These would be used (when positive) to look up score corrections for discounting the value of the surplus defenders.

phhnguyen · Post by **phhnguyen** » Wed Jul 04, 2018 6:19 pm

hgm wrote: ↑Sun Jul 01, 2018 11:05 pm ...
But HaQiKi D thought the defenders worth their normal share (~2 Pawns) against a Rook, and would prefer to capture two worthless Elephants rather than the Pawn that would make the opponent win, if it had the choice.

You’re right, you’re wrong

hgm wrote: ...
The latter is calculated from the material imbalance of R,C,H in your disadvantage (R=10, H=6, C=4). to which is added 0.2 times the total attacking material including Pawns.

You’re using amazing values. Typically Cannon’s value is higher than Horse’s one.

hgm wrote: (Where P is counted for 3, as it will eventually be when it crosses the River and approaches the Palace.) The latter term represents that defending is more difficult than attacking, so that when both sides have, say, 2 Rooks, (worth 20), you get quickly checkmated if you haven't at least 2 defenders. So initially, with all material on the board, the defender need is larger than what you can muster, and there will be no surpluss defenders. As long as it stays equal, the defender need will slowly taper towards zero as material gets exchanged.

The idea is that a positive defender surplus means you can afford to lose some of your defenders without ill effects, so that you have to strongly discount the value of these defenders. The score discount can be tabulated as a function of the defender surplus; I was thinking to count the first surplus defender as 0.5 to 1 Pawn (after all, it is an insurance against losing material, which would make your defender need go up), and additional surplus defenders for ~0.1 (to prevent it gives them away for nothing). A negative surplus would be clipped to 0 (no discount).

Because tha Pawns are counted as worst case, the defender need is strongly overestimated. Pawns that have not crossed the River and are opposed by enemy Pawns can normally not be promoted in an equal-material situation. When behind in piece material, defending Pawns usually will be lost, and so the opposing Pawns can promote. But c- and g-Pawns will have to pass through a square that can be protected by an elephant for that, and then usually will be traded for that Elephant (worth 2) rather than being able to reach their maximum value of 3. And an opposing Pawn that can be protected by an Elephant is usually able to stop an opposing Pawn when you are behind in pieces. So in a sense Pawns do have defensive quality, against other (uncrossed) Pawns. It is therefore important to keep track of the defensive power of your uncrossed Pawns, to add that to the defender surplus. Also, opponent Pawns that reach the last rank lose most of their attacking value, and should certainly not be counted as 3. (More like 1.)

All this depends on Pawn location, however, so it cannot be calculated from the material key, or stored in the material hash. So I keep a 'pawn key' to index a pawn hash table, similar to the Pawn hash in Chess. Unlike Chess, I am not interested in uniquely identifying the Pawn chains, however, as Pawns across the River are mobile pieces, and any structure there is transient. I just want to distinguish uncrossed and last-rank Pawns. This makes the number of Pawn classes so small that perfect hashing is possible: I reserve 1 bit in the key for each uncrossed Pawn (so 10 bits in total). Because some Pawns can be stopped by Elephants, I work the number of Elephants of each sides into the Key as well (two times 3 possibilities, so 9K possibilities in total). To also account the number of last-rank Pawns of each side the table will get 6x6 = 36 times larger. But the search will alsmost never access positions with multiple last-rank Pawns, so this will not produce any cache pressure. The frequently accessed 9K part of the table can be easily cached. The table would contain the number of attacking Pawn points that are neutralized by the defending Pawns or Elephants, for each side. And it would have to contain that for the case with equal piece material, and when behind (so that defending Pawns are generally doomed).

The evaluation would then find the 'raw defender surplus' in the material table, as well as flags that indicate which side is ahead in piece material (according to R=2, H=C=1). It would then use the pawn key to get the pawn defender value (using mentioned flags to decide which element to use from it), and add that to the raw defender surplusses to get the actual defender surpluses. These would be used (when positive) to look up score corrections for discounting the value of the surplus defenders.

IMO, if you still taper / scale defenders based on whatever-materials, you may not completely solve your problem.

Differ from chess as well as midgames, Xiangqi endgames are not always smoothly and sometimes results may not be decided by pure materials only. For example, in the endgame KCKPP, the “strong” side Cannon has much higher material value than the side of two Pawns. However the Cannon is harmless since it has no mouth to attack the opposite King and its side cannot win. In contrast, the Pawns’ side may win even they have against that Cannon as an extra defender.

For me, a set of defenders still has same values / importance even they are against 11 attackers in the beginning or only 2 Pawns at the end. Thus I don’t taper their values. Instead I change value of King safety or simply add some bonuses / punishments based on what the opposite can do or how dangerous they are (not their simple material value).

I have a function: bool canWin(Side side). It’s used to check if one side can win the rival. For KC, it cannot win since Cannon has no mouth, regardless what materials the opposite has. In other hand KPP can win when the opposite is K+C/H/P. A Rook cannot win if opposite has a Rook or full defenders... Sometimes this function consideres to locations (such as old Pawns, Rook in middle file) and / or shapes of some pieces (up level King, Advisors cover King).

If both sides cannot win, it’s a draw. Otherwise we can add a bonus to winnable side. Based one that the search function will know which pieces should be kept, which pieces should be exchanged or sacrificed, regardless those pieces are attackers or defenders nor their values, for getting a win/draw.

It’s actually an “open” version or same logic with the way you write codes for known/clear endgames (such as KPK, KRKD).

Good luck to your program HaQiKi in this Computer Olympiad

/Pham

hgm · Post by **hgm** » Wed Jul 04, 2018 8:48 pm

I thought that in the end-game a Horse becomes stronger than a Cannon. The values I mention are not those used as piece value in the evaluation, but just to calculate when a defending piece is superfluous. For this I take end-game values, because in the middle game it is usually too difficult for Pawns to cross the River and advance to the Palace, where they can be traded for defenders. So the accounting of how many defenders will be lost to Pawns, and whether you then still have enough left to defend against the pieces, must be based on end-game values. Anyway, I switch to using R=10, C=H=5 now to determine who is the stronger side.

It seems I have been over-designing this anyway. Now that I am play-testing, I already get significantly positive result by a very simple evaluation term based on material only. (And thus not on whether Pawns have crossed or not, or are uncrossed passers or obstructed by Elephants.) My first try simply assumed every Pawn would cross the River, and then count as 3 attack points, while a defender counts as 2. So the defender surplus of side A is calculated as

RCH = 10*nrOfRooks + 5*nrOfCannons + 5*nrOfHorses;
surplus = rnOfAdvisorsAndElephants(A)*2 + min(2, RCH(A) - RCH(B)) - 3*nrOfPawns(B) - RCH(B)/4;

If surplus >= 0 I then add 'virtual defenders' to the score of that player to fake that he has full defense (well, just slightly lower). So if in reality he has 3 defenders (accounted for in the piece-square score), and if 2 would already suffice, I add the value of ~ 0.75 defender to his score). If he would lose one of the defenders, the two he has left would still be sufficient, so that he then gets 2*0.75 added, but the lost defender would of course have disappeared from the PST score, so that his score only drops by 0.25 times the value of a defender. In addition I add a ponus for each Pawn of a player that has the larger RCH.

This scored ~55% (in ~200 games) against a version that doesn't have this. It didn't solve the problem in the position I set out to solve, however:

[d]3fk4/4f4/4e4/8p/2e6/P3r1O1P/9/4E4/4F4/1R2K4 b 0 47

It still takes the Elephant here, rather than the Pawn on a4! The problem is that without the Elephant it thinks white has no sufficient defense left: with Elephant surplus = 2*2 + 2 - 3 - 10/4 = 1, so white gets a bonus of 1.5 defender, but without it surplus = -1. So capturing the Elephant actually gains black 2.5 defenders!

The problem seemed that it hugely overestimates the danger of the black Pawn. In real life this will never make it to the white Palace, against an opposing Pawn and an extra Rook. So I decided to discount Pawns of the side that is behind in RCH material by about 50%. So the formula becomes:

surplus = rnOfAdvisorsAndElephants(A)*2 + min(2 + 3*nrOfPawns(B)/2, RCH(A) - RCH(B)) - 3*nrOfPawns(B) - RCH(B)/4;

Which is again very course; it would of course be better to actually decide which Pawns do stand a chance to threaten the Palace and which not. After all, the Pawn in question might have already be in the Palace. But statistically it apparently works: the version that uses formula (2) again beats the version that uses formula (1) by ~55%.

I look forward to seeing you in TaiPei! And HaQiKi D will need a lot of luck; it seems the opposition will be quite stiff!

hgm · Post by **hgm** » Thu Jul 05, 2018 12:10 am

I am still working on an attempt to put in mobility evaluation for Rooks and Cannons. To avoid slowing the engine down too much I want to keep track of that through incremental update. So I store the mobility of individual pieces in the piece list. When the piece moves, its old mobility is saved on a stack, and the mobility in its new location is calculated. The difference is then added to the total mobility. Then you have to do it only for a single piece, and only when that piece is R or C. On UnMake() the saved values are put back. The mobility of a captured piece is simply subtracted from the total mobility; the piece mobility can stay in the piece list, as captured pieces are not touched anymore.

The mobility is calculated with the aid of a neighbor table, which stores the nearest occupied square in all directions in an 11x12 board, where the edges are always occupied, (leaving a 9x10 playing area). Because Xiangqi has no diagonal sliders, there are only 4 directions, and they can be packed as bytes in a single int. Rook mobility is calculated by taking the 4 neighbor squares from this table, lookup their distance to the square the Rook is on (length[target-origin]), and adding those up. For a Cannon it is a bit more complex, as you also have to look at the second neighbor (which is specified by the first neigbor) to get the potential number of captures; I let both captures and attacked squares contribute to the Cannon mobility.

Basically each file and rank is a doubly linked list, each occupied square appearing in two lists. When a square gets evacuated, it is just unlinked from the two lists it is in; because it keeps its own links, it can be easily linked back into the lists on UnMake(). When an empty square gets occupied it is more complex, and you have to scan along files or ranks to find the neighbors (and save the old links of that square, not to disturb UnMake() closer to the root in nodes where the square was last evacuated). But this happens only in non-captures, so not in QS, and is thus rare.

This leaves the problem of discovered attacks. This requires updating of the mobility of pieces that are not moved. To make that easy I keep an attack map, which for each square packs a number of bit flags corresponding to individual sliders. Where flags for Cannons come in two flavors: mount and target. If a square gets evacuated, the attack map (through bit extraction) tells which sliders have their move discovered. By looking up the slider, then its location you can deduce the direction of the attack (again a lookup indexed by the relative position). And then look up the next neighbor in that direction where the new attack goes. If the piece that moved away was a Cannon mount, the mount is displaced downstream to the old target, and one step further downstream you find the new target. All that through lookups in the neighbor table. The distances between old and new targets define the mobility change. This sounds rather cumbersome, but the idea is that only a small fraction of the moves discover any slider moves.

Of course you will also have to update the attack map. This can be done in combination with calculation of the mobility change. When you discover a move, and look up where the new target is, you can write the attack on that in the map there, as well as calculate the distance of the target for the mobility change. The same holds for calculation of the mobility from the new location of the moved piece. For captured pieces you can leave their attacks in the map, if each time when you use a map element you mask these out with the aid of a 'presence' mask. For UnMake() you basically do the same, except there is no need to calculate the mobility then (which can be simple restored from a saved value). The moves of a piece, as well as discovered moves, are scattered all over the attack map, though. So saving those is not really competitive, as you would have to save the old value as well as what was changed, and regenerating the attacked squares with the aid of the neighbor table is faster.

phhnguyen · Post by **phhnguyen** » Thu Jul 05, 2018 7:03 am

hgm wrote: ↑Thu Jul 05, 2018 12:10 am I am still working on an attempt to put in mobility evaluation for Rooks and Cannons. To avoid slowing the engine down too much I want to keep track of that through incremental update. So I store the mobility of individual pieces in the piece list. When the piece moves, its old mobility is saved on a stack, and the mobility in its new location is calculated. The difference is then added to the total mobility. Then you have to do it only for a single piece, and only when that piece is R or C. On UnMake() the saved values are put back. The mobility of a captured piece is simply subtracted from the total mobility; the piece mobility can stay in the piece list, as captured pieces are not touched anymore.

I may help you to speed up your engine a bit: don't count mobility for Cannons. It is almost no meaning. You may observe that Cannons can stay in their positions for very long time, with many pieces surround them. Instead, count what they are pinning (the computing for pinning is cheaper than the one for mobility).

In Xiangqi, IMO, only Rooks and Horses should be considered their mobilities. It is cheap computing for Horses. For Rooks, in my very old Xiangqi engine Saola, I used some simple for-loops to count their mobilities and felt OK for their performance since that computing was not so heavy nor required for all cases.

I was a huge disappointed fan of using attack-map / updating-on-fly. After spending a lot of time and labor (Cannons and Horse blocks were nightmare problems) I could make those functions work correctly. Unfortunately, they did not improve speed (they made my engine be slower in some tests) when making my code bigger and much harder to maintain. Thus I have totally removed them all.

/Pham

hgm · Post by **hgm** » Thu Jul 05, 2018 8:55 am

Thanks for your advice on the Cannons; I will try that out. I figured Cannon non-capture mobility must be of some importance, as HaQiKi D often loses Cannons against stronger opponents. A cannon that is trapped in one dimension is quite vulnarable to Rook attacks along the other dimension. But perhaps this situation can be recognized (and penalized) in a much cheaper way as through full Cannon mobility. I had also planned to keep track of the number of squares attacked by the Cannon (i.e. between the mount and the actually attacked/protected piece). For normal pieces this is the same as their number of non-capture moves, but for Cannons it differs. I already evaluate pins by a Cannon as part of the King safety. This is now done 'on the fly', by scanning the board in 3 directions from the King, until you reach the 3rd obstacle.

Updating the mobility incrementally doesn't seem very expensive, though. Most moves in the tree are captures, becase these are the only moves I consider in QS. And in Xiangqi you have only to keep track of orthogonal moves. For those the update of the neighbor table looks like:

Code: Select all

typedef union {
 int all;
 unsigned char d[4]; // number of square we see in each of 4 orthogonal directions.
} View;

View view[229];

void Evacuate(int sqr)
{ // make neighbors of given square view each other
  int up = view[sqr+20].d[0];
  int dn = view[sqr+20].d[2];
  view[up].d[2] = dn;
  view[dn].d[0] = up;
  up = view[sqr+20].d[1];
  dn = view[sqr+20].d[3];
  view[up].d[3] = dn;
  view[dn].d[1] = up;
}

void Reoccupy(int sqr)
{ // unmake of Evacuate()
  sqr += 20;
  int up = view[sqr].d[0];
  int dn = view[sqr].d[2];
  view[up].d[2] = sqr;
  view[dn].d[0] = sqr;
  up = view[sqr].d[1];
  dn = view[sqr].d[3];
  view[up].d[3] = sqr;
  view[dn].d[1] = sqr;
}

And you easily gain that back by using the neighbor table for generating the Rook and Cannon captures, so that you get their target squares with one or two lookups, respectively, instead of having to scan the board until you hit something. And in evaluation you can simply add the four distances to the targets given by the neighbor table, when you calculate mobility on the fly, instead of incrementally. So this part of the scheme seems to 'pay for itself'.

The partial attack map I intended to maintain is just intended as an aid for cheaply calculating Rook (and perhaps Cannon) mobility chnages, in particular due to unblocking their attacks; when in a single lookup you can conclude there were no Rook and Cannons aiming at the evacuated square (which is the most common case) you know without further work that there is no change in the mobility of other Rooks/Cannons than those that were actually moved. The price is that you have to remove and apply the attacks by the piece that was moved, from its old and new position.

You have a point, though, as this is equivalent to generating the moves of 4 pieces, two during MakeMove() and two during UnMake(). And there only are 4 sliders in Xiangqi, maximally. So you could have generated their moves (to calculate their mobility) on the fly in every node, for the same cost. In the incremental update you save work because not all moves are Rook/Cannon moves, and for other pieces you don't have to do anything. But when calculating on the fly you would save work if some of the sliders have been captured. So perhaps the difference is not very big, and not worth the complexity. Especially if I don't do Cannons; then it is only a matter of calculating up to two Rook mobilities, as:

Code: Select all

for(d=0; d<4; d++) {
  int s = view[sqr+20].d[d];
  int dist = length[s-sqr-20];
  mobility += dist;
}

To make the update of an attack map worth the investment, it should really be used for capture generation, so that you can generate these in MVV order, and consequently stop generating when you have found the cut-move. But that would require a full attack map of all pieces (which again raises the complexity of discovering Horse and Elephant moves). And a complete rewrite of the search routine, to use staged move generation.

phhnguyen · Post by **phhnguyen** » Thu Jul 05, 2018 1:34 pm

hgm wrote: ↑Thu Jul 05, 2018 8:55 am Thanks for your advice on the Cannons; I will try that out. I figured Cannon non-capture mobility must be of some importance, as HaQiKi D often loses Cannons against stronger opponents. A cannon that is trapped in one dimension is quite vulnarable to Rook attacks along the other dimension.

Something's weird here. As my experience, Pawns and Horses are the easiest to be trapped and captured but Cannons are much harder or even the hardest ones. Typically you cannot trap a Cannon since it could easily jump out and you need to exchange for them. Rooks almost cannot capture a free (not being pinned) Cannon if they don't want to lose the game because of permanent chases.

hgm wrote: ↑Thu Jul 05, 2018 8:55 am But perhaps this situation can be recognized (and penalized) in a much cheaper way as through full Cannon mobility. I had also planned to keep track of the number of squares attacked by the Cannon (i.e. between the mount and the actually attacked/protected piece). For normal pieces this is the same as their number of non-capture moves, but for Cannons it differs. I already evaluate pins by a Cannon as part of the King safety. This is now done 'on the fly', by scanning the board in 3 directions from the King, until you reach the 3rd obstacle.

I know you are not a fan of bitboards, but for me, using bitboards to do all those tasks on-fly is more natural and easier

hgm · Post by **hgm** » Thu Jul 05, 2018 2:35 pm

Yes, but it is so much slower. Especially for large boards. My Tenjiku Shogi program (mailbox) does 500knps. The bitboard competition only manages 5knps...

phhnguyen · Post by **phhnguyen** » Thu Jul 05, 2018 4:02 pm

lol, I have so different result compared with yours, at least for Xiangqi!!!

I have created a simple chess engine, using templates, thus for same search, eval... functions I can switch between the mailbox and the bitboard to test them fairly. Frankly speaking, I wanted to see the mailbox be faster than the bitboard when I started testing since my old engine was a mailbox one.

However the result surprised me much: for pertf tests, the mailbox won, but for searching, the bitboard won with a significant margin: almost double speed. I know the result may be changed if I add more things for evaluation, search functions, do some twists, change the test sets... but it's quite enough for me to give up mailboxes (I have written already a report and I can send you if you want).

Xiangqi evaluation

Xiangqi evaluation

Re: Xiangqi evaluation

Re: Xiangqi evaluation

Re: Xiangqi evaluation

Re: Xiangqi evaluation

Re: Xiangqi evaluation

Re: Xiangqi evaluation

Re: Xiangqi evaluation

Re: Xiangqi evaluation