Measuring elo rating engine outside tournament

ydebilloez
Posts: 163
Joined: Tue Jun 27, 2017 11:01 pm
Location: Lubumbashi
Full name: Yves De Billoëz

Measuring elo rating engine outside tournament

Post by ydebilloez »

This must be an old topic, but I cannot find the answer I am looking for on the forum or on Google.

I would like to know how to estimate the Elo rating of an engine from test sets rather than from tournament results.

In tournaments, the Elo rating is calculated from the initial Elo ratings of the participating players. This has two drawbacks. First, if the initial ratings are not accurate, the results are not accurate either. Second, you need a lot of games, and many of them between players of almost equal strength, to get a really accurate result. A player that wins or loses all its games contributes little to calculating its own real Elo or that of its opponents.
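
To make the second drawback concrete, here is a minimal Python sketch of the standard logistic Elo expected-score formula and its inverse. The function names are only illustrative. It shows that the rating difference implied by a score grows without bound as the score approaches 0% or 100%, which is why lopsided results say little about either player.

```python
import math

def expected_score(rating_a, rating_b):
    """Expected score of A against B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def implied_rating_diff(score):
    """Rating difference implied by a score fraction strictly between 0 and 1."""
    if not 0.0 < score < 1.0:
        # A clean sweep (or a whitewash) has no finite rating difference.
        raise ValueError("a 0% or 100% score implies an unbounded rating difference")
    return 400.0 * math.log10(score / (1.0 - score))

# The implied difference rises steeply as the score nears 100%.
for s in (0.50, 0.75, 0.90, 0.99):
    print(f"score {s:.0%} -> about {implied_rating_diff(s):+.0f} Elo")
```

The closer the score gets to 100%, the more a single extra win or loss moves the estimate, so such pairings would need far more games for the same accuracy.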

To illustrate: my engine shows a gap of about 600 Elo between its lowest and highest calculated ratings. As a human player working through test positions, the Elo I get is at most 100 off my real Elo.

Are there any EPD test files for different Elo ranges that give an estimated Elo based on the score? For example: with a thinking time of 60 seconds, a score of 30% means this estimated Elo, a score of 40% means that one, and so on.

Given the large spread in engine strength, the engines that interest me most are those below master level. The top programs are already rated through tournaments. As an additional requirement, the engines need to support WB or UCI so that we can automate the tests. Any suggestions for good Linux engines in the 1200-2200 range?

Does anyone have usable test sets?

If we have enough engines whose calculated rating is fairly correct, we can use their test results to weight the test sets. Has anyone done this?

If you have tried to answer the same question, how did you organise the testing?
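
As a rough illustration of that idea, the sketch below fits a straight line through (test score, known Elo) pairs and uses it to estimate the Elo of an unrated engine. This is only one possible approach, a plain least-squares fit, and nothing here shows that the relation is actually linear. The calibration numbers are invented purely for the example and are not measurements of any real engine.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching (x, y) pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all scores are identical, the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# HYPOTHETICAL calibration data: (fraction of positions solved, known Elo).
calibration = [(0.20, 1300), (0.35, 1600), (0.50, 1900), (0.65, 2200)]

scores = [score for score, _ in calibration]
ratings = [elo for _, elo in calibration]
slope, intercept = fit_line(scores, ratings)

def estimate_elo(score):
    """Estimated Elo for a test-set score, valid only inside the calibrated range."""
    return slope * score + intercept

print(f"an engine solving 40% is estimated at about {estimate_elo(0.40):.0f} Elo")
```

An estimate like this should only be trusted inside the score range covered by the reference engines, and only for the same thinking time and hardware used during calibration.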

Regards,
Yves De Billoëz @ macchess belofte chess
Once owner of a Mephisto I, II, challenger, ... chess computer.
smatovic
Posts: 2642
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Measuring elo rating engine outside tournament...STS

Post by smatovic »

ydebilloez
Posts: 163
Joined: Tue Jun 27, 2017 11:01 pm
Location: Lubumbashi
Full name: Yves De Billoëz

Re: Measuring elo rating engine outside tournament...STS

Post by ydebilloez »

Thanks.

This is a Windows binary... I will give it a try under Wine. It would be nice to have the source code so it could be ported to Linux/Mac.

In the meantime, I found the following information: BT 2450 and BT 2630

http://eric.terrien.pagesperso-orange.f ... age13.html

I will investigate how to automate this. Any further clues are still welcome.
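
As a possible starting point for the automation, here is a minimal Python sketch that reads one EPD record and checks a move against its "bm" (best move) list. The function names and the sample record are made up for the example. The engine's move is assumed to be already converted to SAN, since UCI engines report moves in coordinate notation, and talking to the engine itself is left out.

```python
def parse_epd(line):
    """Split one EPD record into its position and an {opcode: operand} dict."""
    fields = line.strip().split(None, 4)
    if len(fields) < 4:
        raise ValueError("an EPD record needs four position fields")
    position = " ".join(fields[:4])
    rest = fields[4] if len(fields) > 4 else ""
    operations = {}
    # Simplification: assumes no ';' appears inside a quoted operand.
    for chunk in rest.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        opcode, _, operand = chunk.partition(" ")
        operations[opcode] = operand.strip().strip('"')
    return position, operations

def is_solved(operations, engine_move_san):
    """True if the engine's move (in SAN) is one of the record's best moves."""
    return engine_move_san in operations.get("bm", "").split()

# Made-up sample record, not taken from any published test suite.
sample = 'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - bm e4 d4; id "demo.001";'
position, ops = parse_epd(sample)
print(position)
print(ops["id"], "solved:", is_solved(ops, "d4"))
```

Counting the solved records over a whole file gives the percentage score asked about earlier in the thread.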