Measuring elo rating engine outside tournament

ydebilloez
Posts: 49
Joined: Tue Jun 27, 2017 9:01 pm
Location: Lubumbashi
Full name: Yves De Billoëz

Post by ydebilloez » Tue Jul 04, 2017 12:58 pm

This must be an old topic, but I cannot find the answer I am looking for, either on the forum or on Google.

I would like to know how to estimate the Elo rating of an engine from test sets rather than from tournament results.

In tournaments, an Elo rating is calculated from the initial Elo ratings of the participating players. This has two drawbacks. First, if the initial ratings are not accurate, the results are not accurate either. Second, you need a lot of games, and many of them between players of almost equal strength, to get a really accurate result. A player that wins or loses all its games contributes little to calculating its own real Elo or that of its opponents.
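
Both drawbacks can be seen in a minimal sketch, assuming the standard logistic Elo formula. The function names and the K-factor of 20 are illustrative assumptions, not what any particular rating tool uses, and draws are ignored.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating, opponent, score, k=20.0):
    """One-game Elo update; k=20 is an assumed K-factor."""
    return rating + k * (score - expected_score(rating, opponent))


if __name__ == "__main__":
    # Drawback 1: the same win is worth a different amount depending on
    # the rating the opponent is *listed* at, accurate or not.
    for listed in (1500, 1700):
        print(listed, round(updated_rating(1600, listed, 1.0), 1))

    # Drawback 2: as the gap grows, the outcome becomes almost certain.
    # p * (1 - p) is the variance of a single win/loss result and serves
    # here as a rough indicator of how informative one game can be.
    for gap in (0, 200, 600):
        p = expected_score(1500 + gap, 1500)
        print(gap, round(p, 3), round(p * (1 - p), 3))
```

The second loop is only a rough indicator: the closer the expected score is to 0 or 1, the less a single game can move either rating.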

To illustrate, my engine shows a difference of about 600 Elo between its lowest and its highest calculated rating. As a human player working through test positions, my estimated Elo is at most 100 off my real Elo.

Are there any EPD test files for different Elo ranges that give an estimated Elo based on the engine's score? For example: with a thinking time of 60 seconds, a score of 30% means this estimated Elo, a score of 40% means that one, and so on.
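
Scoring an engine against such a file could look roughly like the sketch below. It assumes each EPD line carries a `bm` (best move) and an `id` operation, and that the engine's chosen moves have already been collected in SAN. The two positions, the `engine_moves` dictionary and the function names are made up for illustration, and the parser does not handle semicolons inside quoted strings.

```python
def parse_epd(line):
    """Split an EPD line into its position part and a dict of operations."""
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])
    ops = {}
    if len(fields) == 5:
        for op in fields[4].split(";"):
            op = op.strip()
            if not op:
                continue
            opcode, _, operand = op.partition(" ")
            ops[opcode] = operand.strip().strip('"')
    return position, ops


def solved_fraction(epd_lines, engine_moves):
    """Fraction of positions where the engine's move is listed under bm.

    engine_moves maps an EPD id to the move the engine played (SAN).
    """
    solved = total = 0
    for line in epd_lines:
        _, ops = parse_epd(line)
        if "bm" not in ops:
            continue  # no best move given, so the position cannot be scored
        total += 1
        if engine_moves.get(ops.get("id")) in ops["bm"].split():
            solved += 1
    return solved / total if total else 0.0


# Toy suite and answers, invented for this example only.
SUITE = [
    'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - bm e4 d4; id "demo.1";',
    '6k1/5ppp/8/8/8/8/5PPP/R5K1 w - - bm Ra8#; id "demo.2";',
]
ANSWERS = {"demo.1": "e4", "demo.2": "Kf1"}

if __name__ == "__main__":
    print(solved_fraction(SUITE, ANSWERS))
```

The resulting fraction is only a raw score; turning it into an Elo estimate still requires a calibration table for the specific test set and thinking time.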

Given the large span in strength between engines, the ones that interest me most are those at sub-master level. The ratings of the top programs are already well established through tournaments. As an additional requirement, the engines need to support WinBoard (WB) or UCI so that the tests can be automated. Any suggestions for good Linux engines in the 1200-2200 range?

Does anyone have usable test sets?

If we have enough engines whose calculated rating is fairly accurate, we can use their test results to weight (calibrate) the test sets. Has anyone done this?
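
One simple way to do such a calibration, as a sketch only: fit a straight line through (test score, known Elo) pairs of reference engines and read off an estimate for a new engine. The linear relationship is an assumption, and the three reference points below are invented placeholders, not measurements.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_elo(score, slope, intercept):
    """Estimated Elo for a test-set score between 0.0 and 1.0."""
    return slope * score + intercept


# (fraction of positions solved, known Elo): placeholder values only.
REFERENCE = [(0.20, 1400.0), (0.40, 1800.0), (0.60, 2200.0)]

if __name__ == "__main__":
    slope, intercept = fit_line(REFERENCE)
    print(round(estimate_elo(0.50, slope, intercept)))
```

With real data the points will not lie on a line, so the spread of the reference engines around the fit gives a first idea of how far off an estimate can be.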

If you have tried to answer the same question, how did you organise the testing?

Regards,
--
Yves De Billoëz
Yves at macchess.internetcontact.be

smatovic
Posts: 1142
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Measuring elo rating engine outside tournament...STS

Post by smatovic » Tue Jul 04, 2017 1:50 pm


ydebilloez

Re: Measuring elo rating engine outside tournament...STS

Post by ydebilloez » Wed Jul 05, 2017 1:48 pm

Thanks.

This is a Windows binary, so I will give it a try under Wine. It would be nice to have the source code so that it could be ported to Linux/Mac.

At the same time, I found the following information on BT 2450 and BT 2630:

http://eric.terrien.pagesperso-orange.f ... age13.html

I will investigate how to automate this. Any further clues are still welcome.
--
Yves De Billoëz
Yves at macchess.internetcontact.be
