Debugging regression tests

Onno Garms
Posts: 224
Joined: Mon Mar 12, 2007 6:31 pm
Location: Bonn, Germany

Post by Onno Garms » Thu Jun 16, 2011 8:37 pm

Marika's post made me aware that not everybody knows how to debug regression tests. In my opinion it is important to have a standard way to find what caused a difference. Otherwise you will end up in situations where you hunt for the cause for weeks, again and again.

I have a class EventLogger:

Code:

#ifndef EVENT_LOGGER_HPP
#define EVENT_LOGGER_HPP

#include <cstddef>

class EventLogger
{
private:
  // index of the interval in EventLoggerRc::threshold that s_number is in
  static size_t s_stage;
  // counter for number of events (calls of next while enabled)
  static int    s_number;

public:
  // notifies logger about new event
  // @return should event be logged?
  static bool next    ();
  // number of last event
  static int  number  ();
};

#endif

Code:

#include "event_logger.hpp"
#include "rc_settings.hpp"

// from rc_settings.hpp; provides vector<int>
// with values read from a configuration file
OG_SETTINGS (EventLogger,
             (IntVector) (frequency) ("")
             (IntVector) (threshold) (""))

static void breakpoint ()
{
    // breakpoint in this function will not be "optimized" away
}

size_t EventLogger::s_stage    = 0;
int    EventLogger::s_number   = 0;


bool EventLogger::next ()
{
  // count event
  ++s_number;
  if (s_number==1234)
    breakpoint();

  // go to next stage
  if (s_stage < EventLoggerRc::threshold.size()
      && s_number > EventLoggerRc::threshold[s_stage])
  {
    ++s_stage;
  }

  // compute if to log
  return s_stage >= EventLoggerRc::frequency.size()
         || s_number % EventLoggerRc::frequency[s_stage] == 0;
}


int EventLogger::number ()
{
  return s_number;
}

This class counts calls to next() in s_number and returns true on every n-th call, where n is EventLoggerRc::frequency[s_stage]. If s_stage is past the end of frequency, every call returns true.

s_stage starts at 0 and is incremented whenever s_number exceeds threshold[s_stage].
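To see the staging rule in isolation, here is a minimal sketch of the same logic with the settings passed in directly. StagedSampler and logged_events are hypothetical names for illustration; my real class is static and reads its vectors from the configuration file.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for EventLogger: same counting and staging rule,
// but frequency and threshold are constructor arguments (an assumption
// made so the sketch needs no configuration file).
class StagedSampler
{
public:
  StagedSampler(std::vector<int> frequency, std::vector<int> threshold)
    : d_frequency(std::move(frequency)), d_threshold(std::move(threshold)) {}

  // counts one event; returns true if this event should be logged
  bool next()
  {
    ++d_number;
    // advance at most one stage per event
    if (d_stage < d_threshold.size() && d_number > d_threshold[d_stage])
      ++d_stage;
    // past the last frequency entry every event is logged
    return d_stage >= d_frequency.size()
           || d_number % d_frequency[d_stage] == 0;
  }

  // number of the last event
  int number() const { return d_number; }

private:
  std::vector<int> d_frequency;
  std::vector<int> d_threshold;
  std::size_t      d_stage  = 0;
  int              d_number = 0;
};

// Feeds `events` events to the sampler and collects the numbers of the
// events for which next() returned true.
std::vector<int> logged_events(StagedSampler sampler, int events)
{
  std::vector<int> result;
  for (int i = 0; i < events; ++i)
    if (sampler.next())
      result.push_back(sampler.number());
  return result;
}
```

With frequency = {10, 1, 10} and threshold = {20, 25}, 40 events give the logged numbers 10, 20, 21, 22, 23, 24, 25, 30, 40: sparse output outside the window, every event inside it.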

When the search to examine is short, I just set frequency to 1 and set no thresholds.

When the search is longer, I start with frequency set to 10000. When I see that the difference arose between calls 47110000 and 47120000, I do another run with frequency = {10000, 1, 10000} and threshold = {47110000, 47120000}. This way I see that the last identical output was for call number 47110815. In a third run, I change 1234 to 47110815 and set a breakpoint in the function breakpoint(). The result is a conditional breakpoint whose condition is compiled in. Normal conditional breakpoints are too slow in most cases.

Once the debugger has stopped at the last identical event in both versions, I step through them in parallel. If the calls to EventLogger are dense enough, I can easily spot where the versions diverge.
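One caveat about the compiled-in breakpoint used to reach that event: in an optimized build an empty function may be removed entirely. A possible variant, sketched here with hypothetical names (g_breakpoint_hits, k_stop_at_event, count_event), gives the function a volatile write, which the compiler has to keep. The call may still be inlined, so it is safest to put the breakpoint on the line of the write.

```cpp
// Hypothetical counter. Accesses to a volatile object are observable
// behavior, so the compiler may not drop the write in breakpoint().
volatile int g_breakpoint_hits = 0;

void breakpoint()
{
  g_breakpoint_hits = g_breakpoint_hits + 1;  // set the debugger breakpoint here
}

// Hypothetical event number to stop at, e.g. the last identical one
// found by comparing the logs. Changing it requires a recompile.
constexpr int k_stop_at_event = 7;

// Counts one event and calls breakpoint() exactly when the target event
// is reached. The comparison is ordinary compiled code, so it costs far
// less than a condition evaluated by the debugger.
void count_event(int& number)
{
  ++number;
  if (number == k_stop_at_event)
    breakpoint();
}
```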

I have calls to EventLogger distributed over my code, disabled by macros. I can add calls at more specific positions in both versions, but in most cases it is sufficient to enable the calls that are already there. Calls look like this:

Code:

#ifdef OG_LOG_BOARD
  if (BoardRc::log && v_undo && EventLogger::next())
  {
    std::cerr << EventLogger::number()
              << " " << g_messenger.rank() << ":" // thread ID
              << " Board::do_move " << p_move
              << std::endl;
  }
#endif

This is a foolproof way to track down differences in any regression test. It involves dull work, but it is much faster than speculating every time.

Btw. 4711 and 0815 are the German placeholder numbers, just as foobar and John Doe. I don't know about placeholder numbers in other languages.

Rein Halbersma
Posts: 685
Joined: Tue May 22, 2007 9:13 am

Re: Debugging regression tests

Post by Rein Halbersma » Thu Aug 11, 2011 8:09 am

Onno Garms wrote: Btw. 4711 and 0815 are the German placeholder numbers, just as foobar and John Doe. I don't know about placeholder numbers in other languages.
It's not exactly the same as the German placeholders, but Hexspeak comes close:
http://en.wikipedia.org/wiki/Hexspeak
