Why is MultiPV so slow?

Sesse · Post by **Sesse** » Sat Jan 25, 2020 3:28 pm

So, of course I get why MultiPV is much slower than a regular single-PV search; you can't use the cutoffs and have to research suboptimal moves to score them.

But still, the slowdown seems extreme. Right now, I'm searching the same position on Stockfish 11 on very similar hardware (basically two-socket Haswell/Broadwell servers). Single-PV is at depth 52. Multi-PV, with 29 candidate moves, is still working on depth 31!

A more controlled test: I took the same position and ran to depth 20 with single-threaded Stockfish twice, once with single-PV and once with multi-PV 100 (which then should become 29 in practice, I suppose), with restarts in-between. Single-PV finished in 1086 ms, multi-PV finished in roughly 58 seconds—so not only do I get zero benefit from sharing analysis between the lines, I actually get _more_ than even the 29x slowdown I would assume as worst case?

Another test; I took the same position, added all 29 moves in turn and ran each of those positions at depth 20 (so, overall a deeper search than what I'd expect full multi-PV would do, which would be 19). That completed in roughly 48 seconds, so still faster than automating essentially the same thing with multi-PV. I can't figure out what the story is. Is there something I'm missing? Does multi-PV also make the tree wider deeper in the search, not just at the root?

MikeB · Post by **MikeB** » Sat Jan 25, 2020 4:40 pm

Sesse wrote: ↑Sat Jan 25, 2020 3:28 pm So, of course I get why MultiPV is much slower than a regular single-PV search; you can't use the cutoffs and have to research suboptimal moves to score them.

But still, the slowdown seems extreme. Right now, I'm searching the same position on Stockfish 11 on very similar hardware (basically two-socket Haswell/Broadwell servers). Single-PV is at depth 52. Multi-PV, with 29 candidate moves, is still working on depth 31!

A more controlled test: I took the same position and ran to depth 20 with single-threaded Stockfish twice, once with single-PV and once with multi-PV 100 (which then should become 29 in practice, I suppose), with restarts in-between. Single-PV finished in 1086 ms, multi-PV finished in roughly 58 seconds—so not only do I get zero benefit from sharing analysis between the lines, I actually get _more_ than even the 29x slowdown I would assume as worst case?

Another test; I took the same position, added all 29 moves in turn and ran each of those positions at depth 20 (so, overall a deeper search than what I'd expect full multi-PV would do, which would be 19). That completed in roughly 48 seconds, so still faster than automating essentially the same thing with multi-PV. I can't figure out what the story is. Is there something I'm missing? Does multi-PV also make the tree wider deeper in the search, not just at the root?

simple answer - yes, the mutilipv loop is inside the iterative deepening loop, which would make the search tree wider,(fwiw, a mulitpv setting of 100 , exceeds the number of possible moves found in a normal chess games , which is around 80 or so.)

Stockfish code:

Code: Select all

 // Iterative deepening loop until requested to stop or the target depth is reached
  while (   ++rootDepth < MAX_PLY
         && !Threads.stop
         && !(Limits.depth && mainThread && rootDepth > Limits.depth))
  {
      // Age out PV variability metric
      if (mainThread)
          totBestMoveChanges /= 2;

      // Save the last iteration's scores before first PV line is searched and
      // all the move scores except the (new) PV are set to -VALUE_INFINITE.
      for (RootMove& rm : rootMoves)
          rm.previousScore = rm.score;

      size_t pvFirst = 0;
      pvLast = 0;

      if (!Threads.increaseDepth)
         searchAgainCounter++;

      // MultiPV loop. We perform a full root search for each PV line
      for (pvIdx = 0; pvIdx < multiPV && !Threads.stop; ++pvIdx)
      {
          if (pvIdx == pvLast)
          {
              pvFirst = pvLast;
              for (pvLast++; pvLast < rootMoves.size(); pvLast++)
                  if (rootMoves[pvLast].tbRank != rootMoves[pvFirst].tbRank)
                      break;
          }

          // Reset UCI info selDepth for each depth and each PV line
          selDepth = 0;

          // Reset aspiration window starting size
          if (rootDepth >= 4)
          {
              Value previousScore = rootMoves[pvIdx].previousScore;
              delta = Value(21 + abs(previousScore) / 256);
              alpha = std::max(previousScore - delta,-VALUE_INFINITE);
              beta  = std::min(previousScore + delta, VALUE_INFINITE);

              // Adjust contempt based on root move's previousScore (dynamic contempt)
              int dct = ct + (102 - ct / 2) * previousScore / (abs(previousScore) + 157);

              contempt = (us == WHITE ?  make_score(dct, dct / 2)
                                      : -make_score(dct, dct / 2));
          }

          // Start with a small aspiration window and, in the case of a fail
          // high/low, re-search with a bigger window until we don't fail
          // high/low anymore.
          int failedHighCnt = 0;
          while (true)
          {
              Depth adjustedDepth = std::max(1, rootDepth - failedHighCnt - searchAgainCounter);
              bestValue = ::search<PV>(rootPos, ss, alpha, beta, adjustedDepth, false);

              // Bring the best move to the front. It is critical that sorting
              // is done with a stable algorithm because all the values but the
              // first and eventually the new best one are set to -VALUE_INFINITE
              // and we want to keep the same order for all the moves except the
              // new PV that goes to the front. Note that in case of MultiPV
              // search the already searched PV lines are preserved.
              std::stable_sort(rootMoves.begin() + pvIdx, rootMoves.begin() + pvLast);

              // If search has been stopped, we break immediately. Sorting is
              // safe because RootMoves is still valid, although it refers to
              // the previous iteration.
              if (Threads.stop)
                  break;

              // When failing high/low give some update (without cluttering
              // the UI) before a re-search.
              if (   mainThread
                  && multiPV == 1
                  && (bestValue <= alpha || bestValue >= beta)
                  && Time.elapsed() > 3000)
                  sync_cout << UCI::pv(rootPos, rootDepth, alpha, beta) << sync_endl;

              // In case of failing low/high increase aspiration window and
              // re-search, otherwise exit the loop.
              if (bestValue <= alpha)
              {
                  beta = (alpha + beta) / 2;
                  alpha = std::max(bestValue - delta, -VALUE_INFINITE);

                  failedHighCnt = 0;
                  if (mainThread)
                      mainThread->stopOnPonderhit = false;
              }
              else if (bestValue >= beta)
              {
                  beta = std::min(bestValue + delta, VALUE_INFINITE);
                  ++failedHighCnt;
              }
              else
              {
                  ++rootMoves[pvIdx].bestMoveCount;
                  break;
              }

              delta += delta / 4 + 5;

              assert(alpha >= -VALUE_INFINITE && beta <= VALUE_INFINITE);
          }

Sesse · Post by **Sesse** » Sat Jan 25, 2020 5:27 pm

Yes, the point of setting MultiPV to 100 is to get all moves, so it's intentional. I'm a bit confused at the importance of it being inside iterative deepening, though; I assumed iterative deepening was only run at the root?

elcabesa · Post by **elcabesa** » Sat Jan 25, 2020 5:31 pm

iterative deepening search PV deeper and deeper.

the standard way of doing multiPV (to have all the PV at the same depth) is to do multiPV inside iterative deepening, so you search:
1) all the PV at depth 1,
2) all the PV at depth 2,
...
..
n) all the PV at depth N

this will probably decrease the efficiency of search helpers ( transposition table, and other move ordering algorithms)

hgm · Post by **hgm** » Sat Jan 25, 2020 5:42 pm

Multi-PV with N moves should not take more than N times as much time as a single-move search. If it does, it is not properly implemented.

Sesse · Post by **Sesse** » Sat Jan 25, 2020 6:44 pm

OK, so potentially we're either looking at something wrong in my timing, or a bona fide Stockfish bug?

Joerg Oster · Post by **Joerg Oster** » Sat Jan 25, 2020 6:55 pm

Sesse wrote: ↑Sat Jan 25, 2020 6:44 pm OK, so potentially we're either looking at something wrong in my timing, or a bona fide Stockfish bug?

No bug, it's intentional.
When the 1st move is searched as PV move, all remaining moves are being searched as nonPV moves, too!
Even those which will be searched as PV moves later anyways!
This is also done for the 2nd PV move except the 1st best move is excluded, and so on.

Edit: You can try this https://github.com/joergoster/Moonfish SF fork where I changed the multiPV behavior.

Sesse · Post by **Sesse** » Sat Jan 25, 2020 6:59 pm

That's interesting! But shouldn't it still be sub-linear, since the later moves should be cheaper to search (better move ordering, fewer moves to search) than the main PV?

Joerg Oster · Post by **Joerg Oster** » Sat Jan 25, 2020 7:37 pm

Sesse wrote: ↑Sat Jan 25, 2020 6:59 pm That's interesting! But shouldn't it still be sub-linear, since the later moves should be cheaper to search (better move ordering, fewer moves to search) than the main PV?

Maybe yes, maybe no. I don't know.

Alayan · Post by **Alayan** » Sat Jan 25, 2020 7:44 pm

Later moves tend to be worse. Worse, means more fail-lows. More fail-lows means more time spent to resolve them. So it's not unusual to have the later moves being more costly to search and absorbing a disproportionate part of the "nodes budget".

If you want a quick approximation for all moves, and the bad moves not hurting too much the search of the better ones, you need to reduce the searched depth of bad moves, according to eval and/or move number.

Why is MultiPV so slow?

Why is MultiPV so slow?

Re: Why is MultiPV so slow?

Re: Why is MultiPV so slow?

Re: Why is MultiPV so slow?

Re: Why is MultiPV so slow?

Re: Why is MultiPV so slow?

Re: Why is MultiPV so slow?

Re: Why is MultiPV so slow?

Re: Why is MultiPV so slow?

Re: Why is MultiPV so slow?