What's the fastest time control you can effectively test at?

jordanbray · Post by **jordanbray** » Sat May 30, 2015 3:28 pm

I've been running more tests to tune my engine, and have now gotten in the habit of playing lots of games with certain features disabled, to see if anything was actually hurting the engine.

Turns out, at 40/.5+.01 time control, many of the evaluation features have performed poorly. However, I'm becoming a bit concerned that the time control is too fast.

However, running at this short of a time control enables me to run many more tests (46500 games overnight, with 31 (!) builds of the engine).

What time control do you most often test at, and what penalties exist for tuning to these fast time controls?
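For a sense of how much those overnight numbers can resolve, here is a rough sketch of the error margin per build. It assumes the 46500 games are split evenly across the 31 builds, that the engines are close in strength, and a draw ratio of 30% (a made-up figure, not from the post). The function name `elo_margin` is just for illustration.

```python
import math

def elo_margin(games, draw_ratio=0.3, z=1.96):
    """Approximate 95% Elo margin for a match between near-equal engines."""
    # Per-game score variance when the expected score is 0.5:
    # wins and losses each deviate by 0.5 from the mean, draws by 0.
    variance = (1.0 - draw_ratio) * 0.25
    std_err = math.sqrt(variance / games)
    # Slope of the logistic Elo curve at a 50% score: 400 / (ln(10) * 0.25).
    elo_per_score = 1600.0 / math.log(10)
    return z * std_err * elo_per_score

# Assumption: games are divided evenly between the builds.
games_per_build = 46500 // 31
print(games_per_build, round(elo_margin(games_per_build), 1))
```

Under these assumptions each build gets 1500 games, which leaves a margin in the mid-teens of Elo, so small evaluation features may well sit inside the noise regardless of time control.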

Ferdy · Post by **Ferdy** » Sat May 30, 2015 4:05 pm

jordanbray wrote:I've been running more tests to tune my engine, and have now gotten in the habit of playing lots of games with certain features disabled, to see if anything was actually hurting the engine.

Turns out, at 40/.5+.01 time control, many of the evaluation features have performed poorly. However, I'm becoming a bit concerned that the time control is too fast.

However, running at this short of a time control enables me to run many more tests (46500 games overnight, with 31 (!) builds of the engine).

What time control do you most often test at, and what penalties exist for tuning to these fast time controls?

I usually use 60s + 100ms. Anything lower than that is just an experiment to see how values would perform. Release candidates are tested at 120s + 100ms. Sometimes I use 90min + 30s just to observe time allocation and EBF (effective branching factor).
On my computer (i7 2600K) I don't get reliable results below a 60s TC.

Bloodbane · Post by **Bloodbane** » Sat May 30, 2015 5:10 pm

I use 5s+0.05s and 15s+0.05s, and I could go as low as 1s+0.08s, but anything lower than that starts causing time losses. I think these time controls are a bit fast and might have caused some quirks, since Hakkapeliitta is strongest at short time controls (the shorter the better). However, this was a conscious choice which allows me to keep a fast development pace. Now that I have far fewer ideas than before, I am thinking of increasing the time controls.

matthewlai · Post by **matthewlai** » Sat May 30, 2015 8:40 pm

jordanbray wrote:I've been running more tests to tune my engine, and have now gotten in the habit of playing lots of games with certain features disabled, to see if anything was actually hurting the engine.

Turns out, at 40/.5+.01 time control, many of the evaluation features have performed poorly. However, I'm becoming a bit concerned that the time control is too fast.

However, running at this short of a time control enables me to run many more tests (46500 games overnight, with 31 (!) builds of the engine).

What time control do you most often test at, and what penalties exist for tuning to these fast time controls?

I find that it's much better to play with low initial time and high increment, since at low increments, moves are made faster and faster towards the end, and long games become lightning games.

I use 1s + 0.5s, and games take about 30 seconds each. My engine can reliably run without time losses at 0.1s+0.01s using my own matching program (similar to cutechess), but most other engines I've tried couldn't.
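The effect of the increment on per-move time can be sketched with a toy clock model. The allocation rule here (spend 1/20 of whatever is on the clock) is purely hypothetical and not any particular engine's time management; `move_times` and its parameters are illustrative names.

```python
def move_times(base, inc, moves, divisor=20):
    """Seconds spent on each move under a toy 'clock / divisor' rule."""
    clock = base
    spent = []
    for _ in range(moves):
        budget = clock / divisor       # hypothetical allocation rule
        spent.append(budget)
        clock = clock - budget + inc   # increment is added after the move
    return spent

for base, inc in [(1.0, 0.5), (30.0, 0.01)]:
    times = move_times(base, inc, moves=60)
    print(f"{base}+{inc}: move 1 takes {times[0]:.3f}s, move 60 takes {times[-1]:.3f}s")
```

With this rule the clock drifts toward `divisor * inc`, so at 30+0.01 the time per move keeps shrinking as the game goes on, while at 1+0.5 it settles near the increment.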

In any case, I don't want to test that fast anyways, since real games aren't usually played that fast... and also because I have access to a cluster with hundreds of CPUs.

If you are not affiliated with a university (many of which have computing clusters), and don't mind spending some money, Amazon EC2 can rent you fairly fast Xeon cores for about $0.01 per core per hour. The trick is to use older generation spot instances that virtually don't fluctuate in price.

jordanbray · Post by **jordanbray** » Sat May 30, 2015 10:16 pm

Is there a program that can read in a pgn file (or, better, multiple pgn files) and report the elo difference? Something similar to cutechess's reporting?

hgm · Post by **hgm** » Sat May 30, 2015 10:24 pm

I always use BayesElo for that.

Adam Hair · Post by **Adam Hair** » Sat May 30, 2015 10:46 pm

hgm wrote:I always use BayesElo for that.

This is where I am supposed to extol the virtues of Ordo.

jdart · Post by **jdart** » Sun May 31, 2015 1:39 am

I am currently using 0:04+0.1 (4 sec + .1 sec increment). At this rate I am getting 10-11 ply searches typically, which seems to be adequate. And most engines do not lose on time at this TC.

--Jon

lucasart · Post by **lucasart** » Sun May 31, 2015 5:27 am

jordanbray wrote:I've been running more tests to tune my engine, and have now gotten in the habit of playing lots of games with certain features disabled, to see if anything was actually hurting the engine.

Turns out, at 40/.5+.01 time control, many of the evaluation features have performed poorly. However, I'm becoming a bit concerned that the time control is too fast.

However, running at this short of a time control enables me to run many more tests (46500 games overnight, with 31 (!) builds of the engine).

What time control do you most often test at, and what penalties exist for tuning to these fast time controls?

Depends on your engine, and on your testing environment. Beware of overheads (read: Windows + bloated chess GUI). If you compile cutechess-cli from source (latest code from GitHub), you get a sub-millisecond timer (QElapsedTimer). Using Linux and cutechess-cli, I can test Stockfish patches at 1"+0.01", even using HT cores (i.e. 7 concurrent games on 4 physical cores, 8 HT cores).

Problem is, results from extremely fast TC are often, but not always, good predictors of long TC results. That's very annoying, because there's no way to know in advance (i.e. without testing at short TC and then long TC).
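A setup along those lines could be scripted roughly as follows. This sketch only builds and prints the command line; the engine paths, names, round count, and output file are placeholders, and the option spelling should be checked against `cutechess-cli -help` for your build.

```python
import shlex

def cutechess_command(engine_a, engine_b, tc="1+0.01", concurrency=7,
                      rounds=1000, pgn_out="games.pgn"):
    """Build an argument list for a two-engine cutechess-cli match."""
    return [
        "cutechess-cli",
        "-engine", f"cmd={engine_a}", "name=base",
        "-engine", f"cmd={engine_b}", "name=patch",
        "-each", "proto=uci", f"tc={tc}",   # same protocol and TC for both
        "-concurrency", str(concurrency),   # games played in parallel
        "-rounds", str(rounds),
        "-games", "2",                      # two games per round, colors swapped
        "-recover",                         # restart an engine that crashes
        "-pgnout", pgn_out,
    ]

# Placeholder paths; pass the list to subprocess.run(cmd) to start the match.
cmd = cutechess_command("./engine_base", "./engine_patch")
print(shlex.join(cmd))
```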

Things to watch out for (read: test at short TC, then verify at long TC):
* king safety related
* passed pawn related
* search related (especially if it kicks in at high depths, obviously)

Other eval patches are generally fine to test at hyper fast tc, and that's even beneficial, because it increases the sensitivity of the measure.

Another possibility for eval patches is to test at fixed depth (obviously not suitable for search patches). It works better than you might imagine. Given a choice between extreme TC (e.g. 1+0.01) or fixed depth, fixed depth behaves better and should be preferred (read: higher draw rate for given throughput).
