Looking for advices

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Looking for advices

Post by xr_a_y »

Here are some more details:

Code: Select all

Rank Name                          Elo     +/-   Games   Score   Draws
   1 fruit_21                      752     105     500   98.7%    2.6%
   2 weini4                        -18      24     500   47.4%   38.0%
   3 weini3                        -38      24     500   44.5%   38.2%
   4 weini2                        -62      24     500   41.2%   37.2%
   5 fairymax                     -101      27     500   35.8%   24.0%
   6 weini1                       -128      26     500   32.4%   31.6%
This is still with ponder ON and LMR active; in weiniX, X is the number of threads.
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Looking for advices

Post by hgm »

Testing against Fruit makes little sense at this stage. Better use some engines around 2200 Elo as the next milestone.

Note that Fairy-Max is single core, and does not ponder. With a bug-free search and a reasonable evaluation a single-core engine should be able to convincingly beat it.
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Looking for advices

Post by xr_a_y »

Yes, there must be some other issue. But fairymax (1.2Mnps) is also nearly twice as fast as Weini (700Knps) on my hardware, so I think pondering just equalizes things.
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Looking for advices

Post by hgm »

Well, part of the reason it is so fast is that it counts nodes in an unusual way, namely inside its IID loop. And even in a QS node it goes through that loop twice, first to find the 'best' MVV/LVA move, and then to actually search the captures starting with that best move. Correcting for that would bring the speed down to below 600Knps.

The other reason it is fast is that it has no evaluation (other than what it does incrementally, and which thus doesn't take measurable time). And it is a bit unfair to count the slowdown you get from spending time on evaluation as a speed disadvantage. It would mean the evaluation doesn't even make up in strength for the time it uses, and is thus more a waste of time than anything else.
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Looking for advices

Post by xr_a_y »

OK, I get your point, but surprisingly, when I profile Weini, the evaluation is not that much of a bottleneck. Move generation and threat detection take a lot of time (together with filling and sorting the move lists).

Code: Select all

  14,54%  weini                         [.] EvalOptimDriver
  13,83%  weini                         [.] GeneratorHelperCacheSquareNoStat
   7,99%  weini                         [.] GetThreadsHelperFast
   5,90%  weini                         [.] SortMovesFunctor
   5,55%  weini                         [.] Negamax
   5,19%  weini                         [.] NegaQuiesce(
   5,18%  weini                         [.] Run
   3,33%  libc-2.19.so                  [.] malloc
   2,80%  libc-2.19.so                  [.] _int_free
   2,51%  weini                         [.] Generator
   2,43%  weini                         [.] vector<>::emplace_back<Move>
   2,40%  weini                         [.] ValidatePawn
   2,19%  weini                         [.] SetPiece
   2,09%  weini                         [.] ValidateIsCheck
   2,02%  weini                         [.] ValidateCheckStatus
   2,00%  weini                         [.] GetTTQ
My code for move generation (pseudo-legal moves) is quite classic. What seems to take time is validating during search that a move is not illegal, i.e. that it does not put my own king in check (GetThreadsHelperFast). For move management, I've tried using a memory pool or a fixed-size container, but I found nothing faster than vector or deque.

Still looking for improvements here. I suspect the code is globally too complex and not flat enough.
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Looking for advices

Post by hgm »

You might wonder whether it pays at all to check if moves are legal. Fairy-Max doesn't do that at all. Most moves are legal, and on those the check will be a waste of time. On the occasional illegal move, the reply node will find it can capture the King, and simply return a +INF score for that.

When you are in check, you know beforehand that most moves will leave you in check, and then it might pay to explicitly test them for resolving the check, before recursing. But better yet would be to selectively generate moves that resolve the existing check.
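
To illustrate the king-capture idea, here is a minimal sketch on a toy one-dimensional board (kings step one square, rooks slide). The board encoding and the names `pseudoLegal`, `evaluate` and `negamax` are made up for this example and are not taken from Fairy-Max or Weini; a real engine would do the same thing on its own move generator.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

constexpr int INF = 100000;

// Toy 1-D board: uppercase = white, lowercase = black, '.' = empty.
// 'K'/'k' step one square, 'R'/'r' slide. Purely illustrative.
using Board = std::string;
struct Move { int from, to; };

bool isOwn(char p, bool white) {
    return p != '.' && (std::isupper(static_cast<unsigned char>(p)) != 0) == white;
}
bool isKing(char p) { return p == 'K' || p == 'k'; }
bool isRook(char p) { return p == 'R' || p == 'r'; }

// Pseudo-legal generation: no test for leaving the own king in check.
// (std::vector is used for brevity only.)
std::vector<Move> pseudoLegal(const Board& b, bool white) {
    std::vector<Move> moves;
    const int n = static_cast<int>(b.size());
    for (int from = 0; from < n; ++from) {
        if (!isOwn(b[from], white)) continue;
        for (int dir : {-1, 1}) {
            for (int to = from + dir; to >= 0 && to < n; to += dir) {
                if (isOwn(b[to], white)) break;               // blocked by own piece
                moves.push_back({from, to});
                if (b[to] != '.' || !isRook(b[from])) break;  // capture, or non-slider
            }
        }
    }
    return moves;
}

// Material from the side to move's point of view (rook = 5).
int evaluate(const Board& b, bool white) {
    int score = 0;
    for (char p : b)
        if (isRook(p)) score += isOwn(p, white) ? 5 : -5;
    return score;
}

int negamax(const Board& b, bool white, int depth) {
    assert(depth >= 0);
    const std::vector<Move> moves = pseudoLegal(b, white);
    // If the king can be taken, the opponent's last move was illegal.
    for (const Move& m : moves)
        if (isKing(b[m.to])) return INF;
    if (depth == 0) return evaluate(b, white);
    int best = -INF;  // stays -INF when every move is refuted by king capture
    for (const Move& m : moves) {
        Board next = b;
        next[m.to] = next[m.from];
        next[m.from] = '.';
        best = std::max(best, -negamax(next, !white, depth - 1));
    }
    return best;
}
```

For example, `negamax("K.r..k", true, 1)` returns `-INF`: the only white move leaves the king attacked by the rook, and the reply node reports the capture. When `best` stays at `-INF`, a single in-check test at that node is enough to tell mate from stalemate.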
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Looking for advices

Post by xr_a_y »

Thanks ! I'll try that soon.
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Looking for advices

Post by xr_a_y »

Skipping the IsInCheck test in QSearch indeed gains about 100Knps. I'll check today whether strength goes the same way. Thanks!
Ras
Posts: 2487
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Looking for advices

Post by Ras »

That malloc/free stuff looks suspicious. It's not only that 10% of CPU time: spraying objects throughout the heap will also trash the CPU cache and slow down the rest of the program. I'd consider getting rid of dynamic memory allocation, except for the hash tables (they won't fit otherwise).
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Looking for advices

Post by xr_a_y »

Thanks for the advice.

But to do so, I need a fixed-size container for move generation. In Weini, with a move container of size 64 the engine is faster than with vectors, BUT of course 64 moves is not enough. With a container of 128 moves, speed is the same as with vector, BUT in some cases 128 moves is still not enough. Any bigger size, even around 150 or 160, is slower than vector. And I found that in a lazy SMP implementation things become harder with this kind of fixed-size container...
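
For reference, a fixed-capacity move list is often sketched along the following lines. This is an illustration under assumptions, not Weini's code: the `Move` layout, the `MoveList` and `sortByScore` names and the capacity of 256 are all hypothetical (256 is a common choice because the largest commonly cited number of legal moves in a chess position is 218). The point of the sketch is that the array is left uninitialized, so creating a list costs the same whatever the capacity. Whether slot initialization explains the slowdown measured above is only a guess.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical move layout: small and trivially constructible.
struct Move {
    std::uint8_t from;
    std::uint8_t to;
    std::int16_t score;
};

template <std::size_t Capacity = 256>
class MoveList {
public:
    // User-provided constructor: only size_ is set, the array is left
    // uninitialized, so construction cost does not depend on Capacity.
    MoveList() : size_(0) {}

    void push(Move m) {
        assert(size_ < Capacity);  // overflow is a bug, caught in debug builds
        moves_[size_++] = m;
    }
    std::size_t size() const { return size_; }
    Move& operator[](std::size_t i) { return moves_[i]; }
    Move* begin() { return moves_.data(); }
    Move* end() { return moves_.data() + size_; }

private:
    std::array<Move, Capacity> moves_;
    std::size_t size_;
};

// Sort best-first; only the used part of the array is touched.
template <std::size_t Capacity>
void sortByScore(MoveList<Capacity>& list) {
    std::sort(list.begin(), list.end(),
              [](const Move& a, const Move& b) { return a.score > b.score; });
}
```

Declared as a local variable (`MoveList<> list;`) in each search node, such a list lives on the stack of the searching thread, so nothing is shared between threads and no malloc/free is involved.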

Another malloc/free source I tried to optimize is the position handling, copy/make versus make/unmake. I was never able to get make/unmake working properly with null move and en passant (some programming errors), so I tried using a memory pool for the position copies. Here again, the memory pool is no faster than the standard implementation.
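
One alternative to a memory pool is to keep each copied position as a plain local variable of the search function, so the copy lives on the stack and never goes through malloc. The sketch below only illustrates that idea: the `Position` and `Move` layouts and the names `makeCopy` and `makeNullCopy` are hypothetical, and the en-passant capture itself, castling and promotion are left out.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical tiny position: trivially copyable, no pointers, no heap.
struct Position {
    std::array<std::int8_t, 64> board{};  // 0 = empty, >0 white, <0 black
    int epSquare = -1;                    // -1 = no en-passant target
    bool whiteToMove = true;
};

// newEpSquare is -1 unless the move is a double pawn push.
struct Move {
    int from;
    int to;
    int newEpSquare;
};

// Copy-make: the child is a modified copy, the parent is never touched,
// so there is nothing to undo when the recursion returns.
Position makeCopy(const Position& parent, Move m) {
    Position child = parent;
    child.board[m.to] = child.board[m.from];
    child.board[m.from] = 0;
    child.epSquare = m.newEpSquare;
    child.whiteToMove = !parent.whiteToMove;
    return child;
}

// Null move: only the side to move flips and the en-passant right is lost.
Position makeNullCopy(const Position& parent) {
    Position child = parent;
    child.epSquare = -1;
    child.whiteToMove = !parent.whiteToMove;
    return child;
}
```

In a search function this would be used as `const Position child = makeCopy(pos, m);` followed by the recursive call on `child`. Since the parent keeps its own `epSquare`, null move and en passant need no special undo logic.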

So I'm stuck here, without ideas about the malloc/free problem, which indeed seems to cost a lot...