--- progress update ---
I'm in the process of recreating the bits and pieces that work together in MinimalChess in the new framework one by one.
After pure minmax the next logical step was to add alpha-beta pruning and the difference is massive. Searching each position of the testset to depth 7 took almost an hour before and less then 2 seconds after. The search tree lost 99,95% of it's nodes!?  
 
 
So, I was excited to implement MVV-LVA move ordering next expecting another huge step forward... instead the tree size shrunk only a little retaining 75% of the nodes. For example on depth 9 alpha-beta visited 94B nodes and still visited 69B nodes. Apparently the fact that I already generated and played captures before quiets made the move ordering almost obsolete. Now, if you consider that the move ordering isn't for free and takes *time* so maybe you visit 25% less nodes but if the speed of the search (nodes per second) drops by 25% you have gained nothing.
I thought Mvv-Lva was such a crucial component in any chess engine but my results didn't confirm it.I double and triple-checked the implementation but it seems like the moves are played in the right order... it just doesn't matter much. What is going on?
I postponed the question, shelved the code, and moved on to the next thing considered rather crucial: quiescence search. And guess what... adding it completely killed the performance of my search. So much so that again I thought there must be a bug. With qsearch enabled (but without move-ordering of the captures) searching to a specific depth of 7 took 40x longer. The time a search took *with* quiescence enabled exploded so much that in a runtime- or even node-limited scenario I could get much more solved positions by not doing qsearch at all but searching ~2 plies deeper.
So both individual results had me stumped and caused me to question my abilities to code straight. Only when I combined the two and used move-ordering in the qsearch my results finally started to make *some* sense. By playing something like PxQ first the qsearch part of the tree loses 90% of it's nodes. That's significant. And now the version with quiescence took "only" 5x longer. Still... that's a lot of time spent in qsearch and it's not only that the number of positions visited increases dramatically, the raw speed (visited nodes per second) really doesn't benefit from that kind of deep but narrow search where you invoke the move generator a lot to generate all the captures only to then scan for the one capture that refutes the previous "stupid" move and the rest was generated for nothing.
Considering the 60M+ nps I was getting in perft my current performance metrics of 14M nps with q-search enabled are a little depressing.
Maybe I'm doing too much in my quiescence search?
Code: Select all
        private static int EvaluateQuiet(int depth, int alpha, int beta, MoveGen moveGen)
        {
            NodesVisited++;
            BoardState current = Positions[depth];
            BoardState next = Positions[depth + 1];
            bool inCheck = current.IsChecked(current.SideToMove);
            //if inCheck we can't use standPat, need to escape check!
            if (!inCheck)
            {
                int standPatScore = (int)current.SideToMove * current.Eval.Score;
                if (standPatScore >= beta)
                    return beta;
                if (standPatScore > alpha)
                    alpha = standPatScore;
            }
            //play the possible captures in mvv-lva order
            bool movesPlayed = false;
            for (int i = moveGen.CollectCaptures(current); i < moveGen.Next; i++)
            {
                PickBestMove(i, moveGen.Next);
                if (next.PlayAndUpdate(current, ref Moves[i]))
                {
                    movesPlayed = true;
                    int score = -EvaluateQuiet(depth + 1, -beta, -alpha, moveGen);
                    if (score >= beta)
                        return beta;
                    if (score > alpha)
                        alpha = score;
                }
            }
            if(inCheck)
            {
                //because we need to escape check we have to search all the quiet moves
                for (int i = moveGen.CollectQuiets(current); i < moveGen.Next; i++)
                {
                    if (next.PlayAndUpdate(current, ref Moves[i]))
                    {
                        movesPlayed = true;
                        int score = -EvaluateQuiet(depth + 1, -beta, -alpha, moveGen);
                        if (score >= beta)
                            return beta;
                        if (score > alpha)
                            alpha = score;
                    }
                }
                //no moves escape check? guess it's game over
                if (!movesPlayed)
                    return Evaluation.Checkmate(current.SideToMove, depth);
            }
            //stalemate?
            //if (expandedNodes == 0 && !LegalMoves.HasMoves(position))
            //    return 0;
            return alpha;
        }
...or maybe chess puzzles (
https://www.chessprogramming.org/Win_at_Chess) are not the most representative?
But anyway, after a lot of confusion and trying to make sense of my measurements I think there's nothing seriously wrong and everything is more or less as you should expect it. Just that I really didn't have a very accurate gut-feel about it all. I just hope that when I'm all done there's some significant performance advantage left (compared to MinimalChess) or else this whole endavour would feel rather pointless in retrospect.