Houdini 1.5 running for the IPON

Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Houdini 1.5 running for the IPON

Post by Houdini »

IWB wrote:The current result can be found here:

http://www.inwoba.de

Have fun
Ingo
Current IPON results after 1050 games are better than I expected; Houdini 1.5 may well end above the 3000 mark.

Here's some more information about my own single-core testing results without ponder.
At ultra-fast TC 5"+0.05" I measured +35 Elo in a 21,000 game gauntlet against the top 7 engines.
At fast TC 50"+0.5" I measured +45 Elo in a 2,100 game gauntlet against the same opponents.

The IPON results seem to confirm the trend that the strength of Houdini 1.5 increases at longer TC, I like that. :)
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: Houdini 1.5 running for the IPON

Post by Albert Silver »

Houdini wrote:
IWB wrote:The current result can be found here:

http://www.inwoba.de

Have fun
Ingo
Current IPON results after 1050 games are better than I expected; Houdini 1.5 may well end above the 3000 mark.

Here's some more information about my own single-core testing results without ponder.
At ultra-fast TC 5"+0.05" I measured +35 Elo in a 21,000 game gauntlet against the top 7 engines.
At fast TC 50"+0.5" I measured +45 Elo in a 2,100 game gauntlet against the same opponents.

The IPON results seem to confirm the trend that the strength of Houdini 1.5 increases at longer TC, I like that. :)
Did you test with or without Gaviota EGTBs? If both, was there a measurable Elo difference? I imagine it must be worse at ultra-fast time controls due to disk reads, but I could be wrong.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Houdini 1.5 running for the IPON

Post by IWB »

Hello Robert,
Houdini wrote: Current IPON results after 1050 games are better than I expected; Houdini 1.5 may well end above the 3000 mark.
At 1006 games I made a real calculation with bayeselo. The result was 7 Elo lower than the "live rating". At the end that was still 60+ Elo more than 1.03a.

Regarding the strength increase at longer time controls: The question is "what is longer". If I compare my list with the rating lists with longer time controls, there is not a single engine which is really performing significantly better than in my list (basically they always have the same neighbours regardless of the time control).

What I see in your results is that at this very short testing condition the engines did not reach the plateau for the real rating ...

I remember that the Stockfish team were once surprised that in my testing the version was better while their testing indicated nearly no increase. All I see is that ultra-fast testing might not be the optimal method to correctly identify an increase in Elo! (Likewise, I do not believe that you need an ultra-long time control to identify an Elo increase ... :-) )

Bye
and thanks for the version
Ingo
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Houdini 1.5 running for the IPON

Post by Houdini »

Albert Silver wrote:Did you test with or without Gaviota EGTBs? If both, was there a measurable Elo difference? I imagine it must be worse at ultra-fast time controls due to disk reads, but I could be wrong.
These were results without Gaviota EGTB.

I have also run two gauntlets with 5-piece EGTB attached; the results were slightly worse than without EGTB, but well within the statistical uncertainty of the gauntlets.
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: Houdini 1.5 running for the IPON

Post by beram »

IWB wrote:Hello Robert,
Houdini wrote: Current IPON results after 1050 games are better than I expected; Houdini 1.5 may well end above the 3000 mark.
At 1006 games I made a real calculation with bayeselo. The result was 7 Elo lower than the "live rating". At the end that was still 60+ Elo more than 1.03a.

Regarding the strength increase at longer time controls: The question is "what is longer". If I compare my list with the rating lists with longer time controls, there is not a single engine which is really performing significantly better than in my list (basically they always have the same neighbours regardless of the time control).

What I see in your results is that at this very short testing condition the engines did not reach the plateau for the real rating ...


Bye
and thanks for the version
Ingo
So far +12 -7 =11: the first results for Houdini 1.5 in a 30-game match against DeepRybka 4 on my AMD 1090T, time 3m 2s, each engine on 4 cores, private test suite. This is a 58.33% score, which is about 58 Elo above DR4.
Earlier match results under the same conditions with Houdini 1.03 and DR4 gave a 49-51% score.
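For readers who want to check the conversion from a match score to an Elo difference, here is a minimal sketch. It assumes the standard logistic Elo model and treats the games as independent, which is a simplification; the function names are only illustrative and are not taken from any rating tool mentioned in this thread.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction under the logistic Elo model."""
    return 400.0 * math.log10(score / (1.0 - score))

def match_elo(wins, draws, losses, z=1.96):
    """Return (score, elo, elo_low, elo_high) for a single match.

    Assumption: games are independent, so the standard error of the mean
    score is sqrt(per-game variance / number of games). z=1.96 gives an
    approximate 95% interval.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    half_width = z * math.sqrt(var / n)
    return (score,
            elo_from_score(score),
            elo_from_score(score - half_width),
            elo_from_score(score + half_width))

# The +12 -7 =11 result quoted above.
score, elo, low, high = match_elo(wins=12, draws=11, losses=7)
print(f"score {score:.2%}, Elo {elo:+.0f}, approx. 95% interval {low:+.0f} to {high:+.0f}")
```

With only 30 games the point estimate matches the 58 Elo quoted above, but the approximate interval is wide and still includes zero, so a match of this length is an indication rather than a measurement.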

And now, still running and almost finished under the same conditions: after 55 games in the Noomen test suite, a 59% score = approx. 61 Elo for Houdini 1.5.

I partly agree with Ingo that there is not very much difference in testing at LTC, although at very LTC in the TCEC match, Rybka 4 proved to be a little better than Houdini 1.03 (10-20 Elo).

I vote mr Robert Houdart for president !

btw. can you save Belgium also :D
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Houdini 1.5 running for the IPON

Post by Dann Corbit »

Based on the IPON testing so far, I think that there is no doubt that Houdini 1.5 is the strongest chess engine in the world.

It's also an incredible mate solver.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Houdini 1.5 running for the IPON

Post by Don »

Milos wrote:
Dann Corbit wrote:Quick summary:
All the other programs are bleeding out of their ears.
On all the mini-matches played that I examined, Houdini 1.5 won them fairly decisively.
Atm after 170 games it's 90elo better than the previous version. Error margins are around 50elo atm, so in the worst case it's at least 40 elo better than the previous version. Realistically 50 elo is something to expect, which is beyond 35elo that Robert humbly announced and well beyond level of Rybka 4.
Houdini 1.5-Rybka 3 elo difference will now certainly surpass Rybka 1.0 beta-Fruit 2.1 difference.
Let's see now will some vain programmers give credit where credit is due, or they'll continue to apply double standards as usual...
Are you saying that if you take a really strong program such as RobboLito and add 50-60 Elo, it distinguishes you as a programmer?

If so, then I completely agree with you. That is something I can respect because I think that is difficult. Of course I know that you think it's easy, so I'm not sure why you are getting so excited. Anyway, for some reason I was thinking YOU were the one with the double standard. So I guess we both agree that both Vas and Robert Houdart are talented and original thinkers! I think we have made some progress with you today!
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: Houdini 1.5 running for the IPON

Post by Albert Silver »

IWB wrote:Hello Robert,
Houdini wrote: Current IPON results after 1050 games are better than I expected; Houdini 1.5 may well end above the 3000 mark.
At 1006 games I made a real calculation with bayeselo. The result was 7 Elo lower than the "live rating". At the end that was still 60+ Elo more than 1.03a.

Regarding the strength increase at longer time controls: The question is "what is longer". If I compare my list with the rating lists with longer time controls, there is not a single engine which is really performing significantly better than in my list (basically they always have the same neighbours regardless of the time control).

What I see in your results is that at this very short testing condition the engines did not reach the plateau for the real rating ...

I remember that the Stockfish team were once surprised that in my testing the version was better while their testing indicated nearly no increase. All I see is that ultra-fast testing might not be the optimal method to correctly identify an increase in Elo! (Likewise, I do not believe that you need an ultra-long time control to identify an Elo increase ... :-) )

Bye
and thanks for the version
Ingo
I would be quite curious, if you were willing to do the test, to see the same gauntlet but with the EGTBs installed and activated (incredibly easy to do BTW), to see the result.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Houdini 1.5 running for the IPON

Post by Milos »

IWB wrote:I remember that the Stockfish team were once surprised that in my testing the version was better while their testing indicated nearly no increase. All I see is that ultra-fast testing might not be the optimal method to correctly identify an increase in Elo! (Likewise, I do not believe that you need an ultra-long time control to identify an Elo increase ... :-) )
With all due respect, all I see is that 1900 games with many different opponents is not sufficient for claiming any accuracy. You do claim 15 Elo margins. Since we don't know the set of positions you are using, I can only say that 15 Elo margins are overly optimistic (and Bayeselo assumptions overly simplistic).
Realistic margins are 30 Elo, maybe even more.
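As a rough point of reference for the margins being argued about, the sketch below computes the naive error margin that follows from the number of games alone. It assumes independent games, a score near 50%, and a draw rate of 35%, which is a made-up illustrative value and not a figure from the IPON list. It does not model opening-set correlation or anything Bayeselo does internally, so it cannot settle whether 15 or 30 Elo is the more realistic figure.

```python
import math

# Slope of the logistic Elo curve at a 50% score, in Elo per unit of score.
ELO_PER_SCORE_AT_50 = 400.0 / (math.log(10) * 0.25)

def naive_margin(games, draw_rate, z=1.96):
    """Approximate 95% margin in Elo for `games` independent games near a 50% score.

    Assumption: wins and losses are equally likely, so the per-game standard
    deviation of the result is 0.5 * sqrt(1 - draw_rate).
    """
    sigma = 0.5 * math.sqrt(1.0 - draw_rate)
    return z * sigma / math.sqrt(games) * ELO_PER_SCORE_AT_50

def games_needed(margin_elo, draw_rate, z=1.96):
    """Smallest number of games whose naive margin is at most `margin_elo`."""
    sigma = 0.5 * math.sqrt(1.0 - draw_rate)
    return math.ceil((z * sigma * ELO_PER_SCORE_AT_50 / margin_elo) ** 2)

ASSUMED_DRAW_RATE = 0.35  # hypothetical value, for illustration only

for n in (170, 1900, 2100, 21000):
    print(n, "games ->", round(naive_margin(n, ASSUMED_DRAW_RATE), 1), "Elo")
```

Under these assumptions the naive margin shrinks with the square root of the number of games, so halving the margin takes about four times as many games. Any correlation between games, for example from a small set of openings, would make the true margin larger than this estimate.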
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Houdini 1.5 running for the IPON

Post by Milos »

Don wrote:Are you saying that if you take a really strong program such as RobboLito and add 50-60 Elo, it distinguishes you as a programmer?

If so, then I completely agree with you. That is something I can respect because I think that is difficult. Of course I know that you think it's easy, so I'm not sure why you are getting so excited. Anyway, for some reason I was thinking YOU were the one with the double standard. So I guess we both agree that both Vas and Robert Houdart are talented and original thinkers! I think we have made some progress with you today!
I can only say that there is no point discussing anything with ppl that claim to know what others think, assume, agree upon, how they feel, etc.
Sorry to disappoint you, but I do not believe you have psychic powers.