Robbolito 0.09 New Edition VS Rybka 3

Damir · Post by **Damir** » Sat Jan 02, 2010 9:41 am

Hi Mike

The programmer who gave me this version to test told me it is 50 Elo stronger. I did not make that claim myself. From the way things look right now, I have no reason not to believe him.
Damir

Uri Blass · Post by **Uri Blass** » Sat Jan 02, 2010 10:46 am

Michael Sherwin wrote:
Damir wrote:I suppose you will just have to wait then until this version is released, so you can see for yourself. Unlike some people here, I don't need 500-1000 games to see whether Robbo is giving Rybka a run for its money... 50 games is enough to convince me that this version is at least 50 Elo stronger than the ones released before it... maybe more
I wish that you were right about this, because then RomiChess would be the engine that everyone would be talking about! I cannot count the number of times that Romi totally dominated the first 50 to 75 games of a 100 game match only to fall flat on her face in the remainder of the games. Once Romi was smashing the old record to pieces only to get 3 points from the last 25 and end up with a bad total result. I think Bob would tell you that you need 2,500 games to prove a 50 Elo improvement. And he ran millions of games on his cluster to prove it. But, in the end, those of us who do not have the testing resources do have to rely on our intuition. When I was making regular new official releases, they were mostly better than the last release. Now I consider that mostly good luck!

I think that the right way is not to test based on games but based on test suites.

You accept a change if the program scores better on a test suite.
Of course you need the right test suite and I do not recommend a tactical test suite.

The test suite can be based on correspondence games from recent years, in which the use of chess programs is allowed.

Sometimes several moves have similar value, so you need to do serious analysis of the positions in order to add alternative solutions alongside the move that the top correspondence players actually played.

Uri
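
A minimal sketch of the test-suite idea described above, under stated assumptions. The three positions and their accepted moves are placeholders, not taken from real correspondence games, and the names `SUITE`, `suite_score` and `accept_change` are hypothetical. An "engine" here is just a callable that maps a FEN string to a move in coordinate notation; each position may list several accepted moves, to cover the case where moves are of similar value.

```python
from typing import Callable, Dict, Set

START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
AFTER_E4 = "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1"
AFTER_D4 = "rnbqkbnr/pppppppp/8/8/3P4/8/PPP1PPPP/RNBQKBNR b KQkq d3 0 1"

# Placeholder suite: FEN -> set of moves counted as correct.
# Real entries would come from analysed correspondence games.
SUITE: Dict[str, Set[str]] = {
    START: {"e2e4", "d2d4"},
    AFTER_E4: {"e7e5", "c7c5"},
    AFTER_D4: {"d7d5", "g8f6"},
}


def suite_score(pick_move: Callable[[str], str], suite: Dict[str, Set[str]]) -> int:
    """Count the positions where the engine's move is among the accepted ones."""
    return sum(1 for fen, accepted in suite.items() if pick_move(fen) in accepted)


def accept_change(old_pick: Callable[[str], str],
                  new_pick: Callable[[str], str],
                  suite: Dict[str, Set[str]]) -> bool:
    """Accept the new version only if it scores strictly better on the suite."""
    return suite_score(new_pick, suite) > suite_score(old_pick, suite)


# Stand-ins for two engine versions: fixed lookup tables instead of real searches.
OLD_MOVES = {START: "a2a3", AFTER_E4: "e7e5", AFTER_D4: "h7h6"}
NEW_MOVES = {START: "d2d4", AFTER_E4: "c7c5", AFTER_D4: "h7h6"}


def old_engine(fen: str) -> str:
    return OLD_MOVES[fen]


def new_engine(fen: str) -> str:
    return NEW_MOVES[fen]


print(suite_score(old_engine, SUITE), suite_score(new_engine, SUITE))
print(accept_change(old_engine, new_engine, SUITE))
```

In practice the callables would wrap a real engine running at a fixed time control per position, and the suite would need far more than three positions before a score difference means anything.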

Damir · Post by **Damir** » Sat Jan 02, 2010 10:57 am

Hi Uri

Can you provide me with some of your test suites? You might even recommend your own games as examples, since I know you play correspondence chess. Please also recommend a time control for Robbo to use in solving them.
Damir

Albert Silver · Post by **Albert Silver** » Sat Jan 02, 2010 6:00 pm

Uri Blass wrote:
I think that the right way is not to test based on games but based on test suites.

You accept a change if the program scores better on a test suite.
Of course you need the right test suite and I do not recommend a tactical test suite.

The test suite can be based on correspondence games from recent years, in which the use of chess programs is allowed.

Sometimes several moves have similar value, so you need to do serious analysis of the positions in order to add alternative solutions alongside the move that the top correspondence players actually played.

Uri

I don't agree, and think the proof is in the pudding: playing. The whole point of test suites, tactical and otherwise, is to test an engine's ability in a specific area and predict its improvement in actual play.

What I do think, is that the testing should be done with the default settings and not debilitating things like contempt=0. FYI, Rybka LOSES to *itself* when using contempt at zero.

That said, I tested Robbo 0.085g3 amply using the full SilverSuite and came up with some odd results, though one is undeniable: Robbo is stronger. The issue is by how much.

I ran it with Robbo x64 against Rybka x64 2CPU, ponder off (I only have 2 cores), and Rybka came out ahead with a slightly subpar 40 Elo lead. Subpar meaning that on equal hardware it would probably be behind by 20 Elo or so, depending on how much gain you expect from the second CPU. CCRL and CEGT both predict a ~50 Elo gain at 20-40 min games.

However, I then tested both with a single CPU and ponder ON, 10min games, and I was quite astonished to see Robbo win with a hefty +78 Elo edge. Note, I used default settings for both engines of course.

I did see many of the games, and will comment this much: Robbo's wins were NOT the result of better endgame play, since most of its losses were precisely in the endgame where it managed to lose a number it should not have.
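
For readers who want to relate match scores to Elo figures like the ones above, here is a minimal sketch assuming the standard logistic Elo model and a normal approximation for the error margin. It is not the method used for the results in this post, and the 20/20/10 result in the usage lines is a made-up 50-game example, not data from the thread. The function names are hypothetical.

```python
import math


def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (estimate, low, high) in Elo for a match result.

    The interval comes from a normal approximation on the score fraction,
    using the observed per-game variance, then mapped through elo_diff.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Variance of a single game result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    # Clamp so the logistic formula stays defined at extreme scores.
    eps = 1e-6
    clamp = lambda s: min(max(s, eps), 1.0 - eps)
    return (elo_diff(clamp(score)),
            elo_diff(clamp(score - z * se)),
            elo_diff(clamp(score + z * se)))


# Hypothetical 50-game match: 20 wins, 20 draws, 10 losses (60% score).
estimate, low, high = elo_with_margin(20, 20, 10)
print(f"{estimate:.0f} Elo, 95% interval roughly {low:.0f} to {high:.0f}")
```

With only 50 games, the interval for this made-up result is wide enough to include zero, which is the usual argument for running longer matches before quoting an Elo gain.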

gerold · Post by **gerold** » Sat Jan 02, 2010 6:21 pm

Albert Silver wrote:
I don't agree, and think the proof is in the pudding: playing. The whole point of test suites, tactical and otherwise, is to test an engine's ability in a specific area and predict its improvement in actual play.

What I do think, is that the testing should be done with the default settings and not debilitating things like contempt=0. FYI, Rybka LOSES to *itself* when using contempt at zero.

That said, I tested Robbo 0.085g3 amply using the full SilverSuite and came up with some odd results, though one is undeniable: Robbo is stronger. The issue is by how much.

I ran it with Robbo x64 against Rybka x64 2CPU, ponder off (I only have 2 cores), and Rybka came out ahead with a slightly subpar 40 Elo lead. Subpar meaning that on equal hardware it would probably be behind by 20 Elo or so, depending on how much gain you expect from the second CPU. CCRL and CEGT both predict a ~50 Elo gain at 20-40 min games.

However, I then tested both with a single CPU and ponder ON, 10min games, and I was quite astonished to see Robbo win with a hefty +78 Elo edge. Note, I used default settings for both engines of course.

I did see many of the games, and will comment this much: Robbo's wins were NOT the result of better endgame play, since most of its losses were precisely in the endgame where it managed to lose a number it should not have.

In the past, time control management has caused Robbo to lose in the endgame. Have you seen this in the endgame in any of your games?

Best to you,

Gerold.

Carlos777 · Post by **Carlos777** » Sat Jan 02, 2010 6:42 pm

Damir wrote:I have just played a match of 50 games, time control 5+0.

Rybka was using 4-core, contempt 0, VS Robbo's single core.

I am proud to announce that Rybka on 4-core is 0 points better than Robbo on 1-core.

Thanks for the games Damir.

Regards,
Carlos

Albert Silver · Post by **Albert Silver** » Sat Jan 02, 2010 6:51 pm

gerold wrote:
In the past, time control management has caused Robbo to lose in the endgame. Have you seen this in the endgame in any of your games?

Best to you,

Gerold.

I am not sure what you mean. It didn't lose any games on time, but it does play slower, and it isn't clear that this is best. I don't recall whether it was Rybka 2 or 3, but I tested it with slower time management, as I thought it might score even better if it didn't play quite as fast. Ample testing showed it didn't improve: it beat some opponents by larger margins and others by smaller ones. After 200 games I thought I was sure it was better slower, but after 400 games and 4 opponents, the performance was actually the same as with its normal faster play.
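
The way an apparent gain after 200 games disappeared after 400 can be put in rough numbers. The sketch below estimates how many games are needed for a given 95% error margin in Elo, assuming two engines of nearly equal strength, the logistic Elo model linearised around a 50% score, and a fixed draw rate. The 40% draw rate and the function name `games_needed` are illustrative assumptions, not figures from this thread.

```python
import math

# Slope of the logistic Elo curve at a 50% score, in Elo per unit of score:
# d/ds [-400 * log10(1/s - 1)] at s = 0.5 equals 1600 / ln(10).
ELO_PER_SCORE_AT_EVEN = 1600.0 / math.log(10.0)


def games_needed(margin_elo: float, draw_rate: float = 0.4, z: float = 1.96) -> int:
    """Approximate games needed for a +/- margin_elo interval at confidence z.

    Assumes evenly matched engines, so the per-game variance of the result
    is 0.25 * (1 - draw_rate).
    """
    sigma = math.sqrt(0.25 * (1.0 - draw_rate))
    return math.ceil((ELO_PER_SCORE_AT_EVEN * z * sigma / margin_elo) ** 2)


for margin in (50, 20, 10):
    print(margin, games_needed(margin))
```

Because the margin shrinks with the square root of the game count, halving the error bar takes about four times as many games.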

solis · Post by **solis** » Sat Jan 02, 2010 9:55 pm

Hi Albert,
I really appreciate that you did the testing of Robbolito. Robbo is not a perfect engine, but it at least deserves to be tested with an open mind.
A lot of work is still to be done to improve the program. If all the energy that was wasted in the fights about this engine had been used to improve it, we would already have a much stronger Robbolito.
Why not take this engine as a challenge to make it stronger? A lot of people have already participated in this project and had a lot of fun doing it.
Maybe this is a new beginning where more people get involved, use its ideas, and improve other engines too. This would be good for the whole chess community.
Just as we all greeted the new SF 16 with praise, there is no reason not to accept Robbolito now and continue with the pleasure of using these chess engines for fun and testing.
An open mind is all that is necessary.

Graham Banks · Post by **Graham Banks** » Sat Jan 02, 2010 10:12 pm

solis wrote:Hi Albert,
I really appreciate that you did the testing of Robbolito. Robbo is not a perfect engine, but it at least deserves to be tested with an open mind.
A lot of work is still to be done to improve the program. If all the energy that was wasted in the fights about this engine had been used to improve it, we would already have a much stronger Robbolito.
Why not take this engine as a challenge to make it stronger? A lot of people have already participated in this project and had a lot of fun doing it.
Maybe this is a new beginning where more people get involved, use its ideas, and improve other engines too. This would be good for the whole chess community.
Just as we all greeted the new SF 16 with praise, there is no reason not to accept Robbolito now and continue with the pleasure of using these chess engines for fun and testing.
An open mind is all that is necessary.

Who is the "author" and why isn't he the one doing any work? Did he ever release a compile, or is he just a coward who distributes someone else's code and leaves the deceived to proclaim him as a hero?

BubbaTough · Post by **BubbaTough** » Sat Jan 02, 2010 10:24 pm

Graham Banks wrote: Who is the "author" and why isn't he the one doing any work? Did he ever release a compile, or is he just a coward who distributes someone else's code and leaves the deceived to proclaim him as a hero?

I seem to remember that there is some rule in this forum that requires the real name of members to be listed in order to participate in discussions. Perhaps there should be a rule requiring the real name of program authors to be listed in order for that program to be discussed/linked to/etc.

If the authors feel the need to be anonymous because of fear of attribution, then the program is not appropriate for discussion here. And if there is no fear of attribution, then why be anonymous?

-Sam
