Comparing two versions of the same engine

Posted: Sun Oct 26, 2008 7:01 pm
by Kempelen
Hi

I released Rodin v1.14 a few months ago. Now I am writing improvements for a new version and am unsure about running a match between the two versions for testing purposes.

How many games, and what result, do you think a match between the two versions would need to show that the new one is stronger?

Thx

Re: Comparing two versions of the same engine

Posted: Sun Oct 26, 2008 8:33 pm
by Karmazen & Oliver
Kempelen wrote:Hi

I released Rodin v1.14 a few months ago. Now I am writing improvements for a new version and am unsure about running a match between the two versions for testing purposes.

How many games, and what result, do you think a match between the two versions would need to show that the new one is stronger?

Thx
If it is the same engine, play a match at identical ply depth; that can tell you which version has more chess knowledge...

and see how much time each one needs for that, to tell whether the code is faster...

Re: Comparing two versions of the same engine

Posted: Sun Oct 26, 2008 8:48 pm
by swami
Kempelen wrote: How many games, and what result, do you think a match between the two versions would need to show that the new one is stronger?

Thx
For a start, play 100 games against distinct engines (a blitz 1/1 gauntlet using the Nunn or Noomen test) to keep track of progress.

Re: Comparing two versions of the same engine

Posted: Sun Oct 26, 2008 9:35 pm
by bob
Kempelen wrote:Hi

I released Rodin v1.14 a few months ago. Now I am writing improvements for a new version and am unsure about running a match between the two versions for testing purposes.

How many games, and what result, do you think a match between the two versions would need to show that the new one is stronger?

Thx
If you are playing version A against version B, there is no way to determine which is better. You need to play both versions against a group of common opponents instead.

Re: Comparing two versions of the same engine

Posted: Mon Oct 27, 2008 1:30 am
by Karmazen & Oliver
bob wrote:
Kempelen wrote:Hi

I released Rodin v1.14 a few months ago. Now I am writing improvements for a new version and am unsure about running a match between the two versions for testing purposes.

How many games, and what result, do you think a match between the two versions would need to show that the new one is stronger?

Thx
If you are playing version A against version B, there is no way to determine which is better. You need to play both versions against a group of common opponents instead.
There is no way to determine which is better???

It's simple: ENGINE A versus ENGINE B.

One match at equal ply depth.
Another match at equal time.

If A is better than B, A wins.

Re: Comparing two versions of the same engine

Posted: Mon Oct 27, 2008 2:14 am
by geots
Karmazen & Oliver wrote:
bob wrote:
Kempelen wrote:Hi

I released Rodin v1.14 a few months ago. Now I am writing improvements for a new version and am unsure about running a match between the two versions for testing purposes.

How many games, and what result, do you think a match between the two versions would need to show that the new one is stronger?

Thx
If you are playing version A against version B, there is no way to determine which is better. You need to play both versions against a group of common opponents instead.
There is no way to determine which is better???

It's simple: ENGINE A versus ENGINE B.

One match at equal ply depth.
Another match at equal time.

If A is better than B, A wins.

Bob is dead-on right on this one. There is no argument to make. Case closed.


Best,

Re: Comparing two versions of the same engine

Posted: Mon Oct 27, 2008 8:58 am
by Kempelen
bob wrote:If you are playing version A against version B, there is no way to determine which is better. You need to play both versions against a group of common opponents instead.
Then, supposing a gauntlet tournament of engine A versus, for example, 12 engines, and another tournament of engine B versus those same 12 engines, how many games are needed, what time control is appropriate, and what % score difference is enough to say that A is better than B?

Re: Comparing two versions of the same engine

Posted: Mon Oct 27, 2008 9:28 am
by krazyken
Kempelen wrote:
bob wrote:If you are playing version A against version B, there is no way to determine which is better. You need to play both versions against a group of common opponents instead.
Then, supposing a gauntlet tournament of engine A versus, for example, 12 engines, and another tournament of engine B versus those same 12 engines, how many games are needed, what time control is appropriate, and what % score difference is enough to say that A is better than B?
There are a few variables to consider, such as how large an Elo difference you want to count as actually better. The smaller the difference you want to detect, the more games are needed. The quick and easy way is to run some games and put all of them in one PGN file (make sure the two versions have different names). Then use BayesELO on that PGN file to get relative ratings, where you can see the +/- margin of error. You can also use the LOS (likelihood of superiority) function in BayesELO to round out the picture. If the margins aren't small enough, run some more games. An important factor is to avoid repeated games, so use a different starting position for each game.

I'm sure someone may come along with exact math for you. My guess is that with 12 opponents you could start with 8 games against each (a 96-game run). That should give you a decent starting point to decide if you want more games.
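
To make the "margin of error" and "likelihood of superiority" ideas above concrete, here is a rough sketch in Python. It is not BayesELO's algorithm: it treats each gauntlet as a series of independent games, uses a normal approximation, and converts the score to a rating difference against the pool with the standard logistic Elo formula. The function names and the win/draw/loss counts are made up for illustration only.

```python
import math

def score_stats(wins, draws, losses):
    """Mean score per game and its standard error (normal approximation)."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Variance of a single game result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    return mean, math.sqrt(var / n)

def elo_from_score(score):
    """Elo difference versus the opposition implied by a score fraction.
    Undefined for a score of exactly 0 or 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def prob_b_better(stats_a, stats_b):
    """Approximate probability that version B's true score against the pool
    is higher than version A's. Assumes the two gauntlets are independent
    and that at least one of them has a nonzero standard error."""
    (mean_a, se_a), (mean_b, se_b) = stats_a, stats_b
    z = (mean_b - mean_a) / math.sqrt(se_a ** 2 + se_b ** 2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical 96-game gauntlets (12 opponents x 8 games), invented numbers.
old = score_stats(wins=40, draws=20, losses=36)
new = score_stats(wins=46, draws=20, losses=30)

for name, (mean, se) in (("old", old), ("new", new)):
    low, high = elo_from_score(mean - 2 * se), elo_from_score(mean + 2 * se)
    print(f"{name}: score {mean:.3f}, Elo vs pool {elo_from_score(mean):+.0f} "
          f"(about {low:+.0f} to {high:+.0f})")
print(f"P(new better than old) ~ {prob_b_better(old, new):.2f}")
```

With only 96 games per version the Elo intervals are wide, which is the point of the advice to keep adding games until the margins are small enough for the difference you care about.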

Re: Comparing two versions of the same engine

Posted: Mon Oct 27, 2008 4:47 pm
by bob
Karmazen & Oliver wrote:
bob wrote:
Kempelen wrote:Hi

I released Rodin v1.14 a few months ago. Now I am writing improvements for a new version and am unsure about running a match between the two versions for testing purposes.

How many games, and what result, do you think a match between the two versions would need to show that the new one is stronger?

Thx
If you are playing version A against version B, there is no way to determine which is better. You need to play both versions against a group of common opponents instead.
There is no way to determine which is better???

It's simple: ENGINE A versus ENGINE B.

One match at equal ply depth.
Another match at equal time.

If A is better than B, A wins.
It isn't quite that simple. Your new change might have a side-effect of weakening some other part of your game, but your program doesn't understand (say) the finer points of king-side attack, so you won't notice that this new change has actually made your program worse, because the only opponent you test against can't exploit the weakness...

This is why "inbreeding" is bad for biological reproduction.

Re: Comparing two versions of the same engine

Posted: Mon Oct 27, 2008 4:49 pm
by bob
Kempelen wrote:
bob wrote:If you are playing version A against version B, there is no way to determine which is better. You need to play both versions against a group of common opponents instead.
Then, supposing a gauntlet tournament of engine A versus, for example, 12 engines, and another tournament of engine B versus those same 12 engines, how many games are needed, what time control is appropriate, and what % score difference is enough to say that A is better than B?
Depends on how much better. If the new version is 200 Elo better, you can figure that out in 50 games. If it is 2 Elo better, you will need almost 100,000 games...
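
A back-of-the-envelope sketch of why the game count grows so quickly as the Elo difference shrinks. This is a simplified model, not the method used in the post above: games are assumed independent, the per-game standard deviation is taken as 0.5 (the no-draws worst case; draws make it smaller), and "detected" is taken to mean the expected score sits two standard errors away from 50%. The function names and the thresholds `z` and `per_game_sd` are assumptions chosen for illustration.

```python
import math

def expected_score(elo_diff):
    """Expected score fraction for a given Elo advantage (logistic formula)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def games_needed(elo_diff, z=2.0, per_game_sd=0.5):
    """Games needed so that the expected edge over 50% equals z standard errors.
    Standard error of the mean score after n games is per_game_sd / sqrt(n).
    Requires elo_diff > 0."""
    edge = expected_score(elo_diff) - 0.5
    return math.ceil((z * per_game_sd / edge) ** 2)

for diff in (200, 50, 20, 5, 2):
    print(f"{diff:>3} Elo: roughly {games_needed(diff)} games")
```

Because the required number of games scales with the inverse square of the score edge, halving the Elo difference you want to detect roughly quadruples the games. Under these assumptions a 200 Elo gap needs well under 50 games, while a 2 Elo gap comes out on the order of 100,000.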