Rybka odds matches and the strength of engines

Laskos
Posts: 8977
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Rybka odds matches and the strength of engines

Post by Laskos » Sat Jun 09, 2012 5:16 pm

I ran a Rybka self-play randomizer match (search window 3cp) with the "pawn and move" handicap (f7 removed). I don't know whether this has been done before; maybe some Rybka Forum members would know.

[D]rnbqkbnr/ppppp1pp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq[/D]


200 games at 1s/move

169:31
+145 =48 -7

That corresponds to a handicap of some ~300 computer Elo points.
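
For reference, the conversion from the match score to an Elo difference can be sketched as below. This is a minimal sketch assuming the standard logistic Elo model; the function names are made up for illustration, and the interval is a rough normal approximation rather than anything Rybka or a rating tool reports. With +145 =48 -7 the score is 84.5%, which works out to roughly 295 Elo.

```python
import math

def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic Elo model."""
    return 400.0 * math.log10(score / (1.0 - score))

def match_summary(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (score, elo, elo_low, elo_high) for a match result.

    The interval is a normal approximation on the score fraction,
    mapped through the Elo formula; treat it as a rough guide only.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where each game scores 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (score,
            elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))

if __name__ == "__main__":
    score, elo, lo, hi = match_summary(wins=145, draws=48, losses=7)
    print(f"score {score:.3f}, Elo {elo:.0f} (approx. 95% interval {lo:.0f} to {hi:.0f})")
```

With only 200 games the interval is wide, so a result somewhat below or above 300 would not be surprising.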

In 2008 Rybka played 8 games with this handicap against GM Roman Dzindzichashvili, and 2 games against GM Vadim Milov, performing at Elo 2550 FIDE level. This is AFAIK the last series of computer-GM games in stable conditions.

Now a bit of speculation: engines have improved since 2008 by some ~150 Elo points, for a total difference of ~450 computer Elo points, meaning some ~350 human Elo points (could someone confirm that computer ratings exaggerate the differences?). Therefore a recent top engine on a quad (at tournament TC) would be ~2550+350 ~ 2900 Elo points on the FIDE scale. That seems a bit low compared to the Elo 3200 assumed by many for these engines.
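
The arithmetic of this estimate can be written out as a small sketch. The function name and the `human_per_computer` parameter are hypothetical; the 350/450 ratio is just the guess made above about how computer Elo differences shrink on the human scale, not a measured value.

```python
def fide_estimate(base_fide: float, handicap_elo: float,
                  progress_elo: float, human_per_computer: float) -> float:
    """Extrapolate a FIDE-scale rating from an odds-match performance.

    base_fide          : performance at odds against GMs (2550 in the post)
    handicap_elo       : computer Elo cost of the handicap (~300)
    progress_elo       : engine progress since 2008 (~150)
    human_per_computer : assumed human Elo per computer Elo (a guess)
    """
    computer_gap = handicap_elo + progress_elo
    return base_fide + computer_gap * human_per_computer

if __name__ == "__main__":
    guess = 350 / 450  # the compression assumed in the post
    print(round(fide_estimate(2550, 300, 150, guess)))  # the ~2900 figure
    print(round(fide_estimate(2550, 300, 150, 1.0)))    # no compression at all
```

The result is quite sensitive to the assumed ratio: with no compression the same inputs give 3000 rather than 2900.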

Kai

hgm
Posts: 23165
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Rybka odds matches and the strength of engines

Post by hgm » Sat Jun 09, 2012 6:50 pm

This has been done before (indeed on the Rybka forum), and the results were not as extreme as what you report. In a far larger number of games the white advantage was ~72%, IIRC. This is in good agreement with what I found in self-play tests of Fairy-Max or Joker, and seemed almost completely independent of the level of play.

Laskos
Posts: 8977
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Rybka odds matches and the strength of engines

Post by Laskos » Sat Jun 09, 2012 7:14 pm

hgm wrote:This has been done before (indeed on the Rybka forum), and the results were not as extreme as what you report. In a far larger number of games the white advantage was ~72%, IIRC. This is in good agreement with what I found in self-play tests of Fairy-Max or Joker, and seemed almost completely independent of the level of play.
OK, but it seems to depend on the time control (more equal at longer TC, so it depends on the level?). So, does a 2900 Elo FIDE level seem plausible for present top engines on a quad? I am a bit surprised (the commonly accepted level is >3000).

Kai

Uri Blass
Posts: 8368
Joined: Wed Mar 08, 2006 11:37 pm
Location: Tel-Aviv Israel

Re: Rybka odds matches and the strength of engines

Post by Uri Blass » Sat Jun 09, 2012 7:48 pm

I believe that the difference in playing strength against humans between having a pawn advantage and not having one is clearly larger than the difference in comp-comp games.

I also do not think that using a self-play randomizer at 1 second per move is a good way to estimate the computer Elo difference.

A self-play randomizer at 1 second per move means two things:
1) a very fast time control that you do not use against humans
2) weaker playing strength relative to not using a randomizer.

Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 10:58 am
Location: Antalya/Turkey

Re: Rybka odds matches and the strength of engines

Post by Sedat Canbaz » Sat Jun 09, 2012 9:32 pm

Actually, I tested Rybka without a full pawn, and its performance was approx. 240 Elo weaker than Rybka (default, with all pieces).

Rybka 4.1 WP x64 1c:
http://www.sedatcanbaz.com/chess/ratings/scct-auto232/

For more details about Human vs Engine Elo calculations:
http://www.talkchess.com/forum/viewtopi ... 1&start=20
http://www.talkchess.com/forum/viewtopi ... 4&start=50


Best,
Sedat
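
To compare this 240 Elo figure with a raw match score, the inverse conversion (Elo difference to expected score) can be sketched as follows. This is a minimal sketch assuming the usual logistic Elo model; the function name is made up for illustration. Under that model 240 Elo corresponds to a score of about 80%, and 300 Elo to about 85%, close to the 84.5% from the f7 match at the top of the thread.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score fraction for the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

if __name__ == "__main__":
    for diff in (240, 300):
        print(f"{diff} Elo -> {expected_score(diff):.3f}")
```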

Laskos
Posts: 8977
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Rybka odds matches and the strength of engines

Post by Laskos » Sat Jun 09, 2012 11:41 pm

Uri Blass wrote:
I believe that the difference in playing strength against humans between having a pawn advantage and not having one is clearly larger than the difference in comp-comp games.
I don't know, are you sure?
I also do not think that using a self-play randomizer at 1 second per move is a good way to estimate the computer Elo difference.

A self-play randomizer at 1 second per move means two things:
1) a very fast time control that you do not use against humans
2) weaker playing strength relative to not using a randomizer.
Actually, 1s/move is not so fast. The difference at slower TC would be even smaller (Sedat is showing 240 points instead of 300), and I am already surprised that the difference is not larger. The weakening from the 3cp window is not that important (I guess).

Kai

Laskos
Posts: 8977
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Rybka odds matches and the strength of engines

Post by Laskos » Sat Jun 09, 2012 11:43 pm

Is this the same (f7) "move and pawn" handicap? Thanks for the links. I was not trying to define the human rating relative to computers in general, but it occurred to me that a GM rated 2500-something, obscure to me, actually drew an 8-game match against Rybka 3 on a quad (IIRC) at pawn odds, and a stronger GM beat Rybka. Also, I am trying to imagine what "taking back N times" odds could mean.

Kai
Last edited by Laskos on Sat Jun 09, 2012 11:53 pm, edited 2 times in total.

Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 10:58 am
Location: Antalya/Turkey

Re: Rybka odds matches and the strength of engines

Post by Sedat Canbaz » Sat Jun 09, 2012 11:50 pm

Laskos wrote:
Is this the same (f7) "move and pawn" handicap?

Kai
Dear Kai,

Rybka 4.1 x64 1c played at a handicap, without a full pawn (e2 and e7); for more details:
http://rybkaforum.net/cgi-bin/rybkaforu ... 3;hl=sedat

Best,
Sedat

Uri Blass
Posts: 8368
Joined: Wed Mar 08, 2006 11:37 pm
Location: Tel-Aviv Israel

Re: Rybka odds matches and the strength of engines

Post by Uri Blass » Sat Jun 09, 2012 11:52 pm

Laskos wrote:
Actually, 1s/move is not so fast. The difference at slower TC would be even smaller (Sedat is showing 240 points instead of 300), and I am already surprised that the difference is not larger. The weakening from the 3cp window is not that important (I guess).

Kai
I suspect that it is not the same f7 pawn.
It does not make sense to have a smaller difference at a slower time control unless the position is a draw, so that the slower time control helps the weaker side find the right moves to draw.

Laskos
Posts: 8977
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Rybka odds matches and the strength of engines

Post by Laskos » Sat Jun 09, 2012 11:56 pm

Uri Blass wrote:
I suspect that it is not the same f7 pawn.
It does not make sense to have a smaller difference at a slower time control unless the position is a draw, so that the slower time control helps the weaker side find the right moves to draw.
I actually tried faster controls than 1s/move to see what happens, and the difference was larger. At Rybka depth 5 (very fast games), with the 3cp window randomizer, I had to wait some 40 games for a single draw; I thought I had messed something up (still a possibility).

Kai
