Re: Houdini 1.5 running for the IPON

Posted: Thu Dec 16, 2010 4:55 pm
by Houdini
IWB wrote:The current result can be found here:

http://www.inwoba.de

Have fun
Ingo
Current IPON results after 1050 games are better than I expected; Houdini 1.5 may well end above the 3000 mark.

Here's some more information about my own single-core testing results without ponder.
At ultra-fast TC 5"+0.05" I measured +35 Elo in a 21,000 game gauntlet against the top 7 engines.
At fast TC 50"+0.5" I measured +45 Elo in a 2,100 game gauntlet against the same opponents.

The IPON results seem to confirm the trend that the strength of Houdini 1.5 increases at longer TC. I like that. :)

Re: Houdini 1.5 running for the IPON

Posted: Thu Dec 16, 2010 5:06 pm
by Albert Silver
Houdini wrote:
IWB wrote:The current result can be found here:

http://www.inwoba.de

Have fun
Ingo
Current IPON results after 1050 games are better than I expected; Houdini 1.5 may well end above the 3000 mark.

Here's some more information about my own single-core testing results without ponder.
At ultra-fast TC 5"+0.05" I measured +35 Elo in a 21,000 game gauntlet against the top 7 engines.
At fast TC 50"+0.5" I measured +45 Elo in a 2,100 game gauntlet against the same opponents.

The IPON results seem to confirm the trend that the strength of Houdini 1.5 increases at longer TC. I like that. :)
Did you test with or without Gaviota EGTBs? If both, was there a measurable Elo difference? I imagine it must be worse at ultra-fast TC due to disk reads, but I could be wrong.

Re: Houdini 1.5 running for the IPON

Posted: Thu Dec 16, 2010 5:23 pm
by IWB
Hello Robert,
Houdini wrote: Current IPON results after 1050 games are better than I expected; Houdini 1.5 may well end above the 3000 mark.
At 1006 games I ran a proper calculation with bayeselo. The result was 7 Elo lower than the "live rating". In the end that was still 60+ Elo more than 1.03a.

Regarding the strength increase at longer time controls: the question is what counts as "longer". If I compare my list with the rating lists that use longer time controls, there is not a single engine that performs significantly better there than in my list (basically they always have the same neighbours regardless of the time control).

What I see in your results is that under these very short testing conditions the engines did not reach the plateau of their real rating ...

I remember that the Stockfish team was once surprised that in my testing a new version was better while their testing indicated nearly no increase. All I see is that ultra-fast testing might not be the optimal method to correctly identify an increase in Elo! (Equally, I do not believe that you need an ultra-long time control to identify an Elo increase ... :-) )

Bye
and thanks for the version
Ingo

Re: Houdini 1.5 running for the IPON

Posted: Thu Dec 16, 2010 5:26 pm
by Houdini
Albert Silver wrote: Did you test with or without Gaviota EGTBs? If both, was there a measurable Elo difference? I imagine it must be worse at ultra-fast TC due to disk reads, but I could be wrong.
These were results without Gaviota EGTB.

I have also run two gauntlets with 5-piece EGTB attached; the results were slightly worse than without EGTB, but well within the statistical uncertainty of the gauntlets.

Re: Houdini 1.5 running for the IPON

Posted: Thu Dec 16, 2010 6:07 pm
by beram
IWB wrote:Hello Robert,
Houdini wrote: Current IPON results after 1050 games are better than I expected; Houdini 1.5 may well end above the 3000 mark.
At 1006 games I ran a proper calculation with bayeselo. The result was 7 Elo lower than the "live rating". In the end that was still 60+ Elo more than 1.03a.

Regarding the strength increase at longer time controls: the question is what counts as "longer". If I compare my list with the rating lists that use longer time controls, there is not a single engine that performs significantly better there than in my list (basically they always have the same neighbours regardless of the time control).

What I see in your results is that under these very short testing conditions the engines did not reach the plateau of their real rating ...


Bye
and thanks for the version
Ingo
So far +12 -7 =11: the first results for Houdini 1.5 in a 30-game match against DeepRybka 4 on my AMD 1090T, time control 3m 2s, each engine on 4 cores, private test suite. This is a 58.33% score, which is 58 Elo above DR4.
Earlier match results under the same conditions with Houdini 1.03 and DR4 gave a 49-51% score.
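
For anyone who wants to check the arithmetic, here is a minimal sketch of how a +/-/= result converts into a percentage score and an Elo difference, using the standard logistic Elo formula. The function names are illustrative only and are not taken from any rating tool.

```python
import math


def score_fraction(wins, losses, draws):
    """Fraction of points scored: a win counts 1, a draw 0.5, a loss 0."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games


def elo_difference(score):
    """Elo difference implied by a score fraction (standard logistic model)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# The 30-game result reported above: +12 -7 =11
score = score_fraction(12, 7, 11)
print(f"score = {100 * score:.2f}%")             # 58.33%
print(f"Elo   = {elo_difference(score):+.0f}")   # +58
```

Note that this gives only the point estimate; with 30 games the uncertainty around it is large.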

Still running and almost finished under the same conditions: after 55 games with the Noomen test suite, a 59% score = approx. 61 Elo for Houdini 1.5.

I partly agree with Ingo that there is not very much difference when testing at LTC, although at very long TC in the TCEC match Rybka 4 proved to be a little better than Houdini 1.03 (10 - 20 Elo).

I vote Mr. Robert Houdart for president!

BTW, can you save Belgium as well? :D

Re: Houdini 1.5 running for the IPON

Posted: Thu Dec 16, 2010 8:01 pm
by Dann Corbit
Based on the IPON testing so far, I think that there is no doubt that Houdini 1.5 is the strongest chess engine in the world.

It's also an incredible mate solver.

Re: Houdini 1.5 running for the IPON

Posted: Fri Dec 17, 2010 1:29 am
by Don
Milos wrote:
Dann Corbit wrote:Quick summary:
All the other programs are bleeding out of their ears.
On all the mini-matches played that I examined, Houdini 1.5 won them fairly decisively.
At the moment, after 170 games, it's 90 Elo better than the previous version. Error margins are around 50 Elo right now, so in the worst case it's at least 40 Elo better than the previous version. Realistically, 50 Elo is something to expect, which is beyond the 35 Elo that Robert humbly announced and well beyond the level of Rybka 4.
The Houdini 1.5 vs. Rybka 3 Elo difference will now certainly surpass the Rybka 1.0 beta vs. Fruit 2.1 difference.
Let's see now whether some vain programmers will give credit where credit is due, or whether they'll continue to apply double standards as usual...
Are you saying that if you take a really strong program such as RobboLito and add 50-60 ELO, it distinguishes you as a programmer?

If so, then I completely agree with you. That is something I can respect because I think that is difficult. Of course I know that you think it's easy, so I'm not sure why you are getting so excited. Anyway, for some reason I was thinking YOU were the one with the double standard. So I guess we both agree that both Vas and Robert Houdart are talented and original thinkers! I think we have made some progress with you today!

Re: Houdini 1.5 running for the IPON

Posted: Fri Dec 17, 2010 2:07 am
by Albert Silver
IWB wrote:Hello Robert,
Houdini wrote: Current IPON results after 1050 games are better than I expected; Houdini 1.5 may well end above the 3000 mark.
At 1006 games I ran a proper calculation with bayeselo. The result was 7 Elo lower than the "live rating". In the end that was still 60+ Elo more than 1.03a.

Regarding the strength increase at longer time controls: the question is what counts as "longer". If I compare my list with the rating lists that use longer time controls, there is not a single engine that performs significantly better there than in my list (basically they always have the same neighbours regardless of the time control).

What I see in your results is that under these very short testing conditions the engines did not reach the plateau of their real rating ...

I remember that the Stockfish team was once surprised that in my testing a new version was better while their testing indicated nearly no increase. All I see is that ultra-fast testing might not be the optimal method to correctly identify an increase in Elo! (Equally, I do not believe that you need an ultra-long time control to identify an Elo increase ... :-) )

Bye
and thanks for the version
Ingo
I would be quite curious, if you were willing to do the test, to see the same gauntlet but with the EGTBs installed and activated (incredibly easy to do BTW), to see the result.

Re: Houdini 1.5 running for the IPON

Posted: Fri Dec 17, 2010 3:49 am
by Milos
IWB wrote: I remember that the Stockfish team was once surprised that in my testing a new version was better while their testing indicated nearly no increase. All I see is that ultra-fast testing might not be the optimal method to correctly identify an increase in Elo! (Equally, I do not believe that you need an ultra-long time control to identify an Elo increase ... :-) )
With all due respect, all I see is that 1900 games against many different opponents is not sufficient for claiming any accuracy. You do claim 15 Elo margins. Since we don't know the set of positions you are using, I can only say that 15 Elo margins are overly optimistic (and the Bayeselo assumptions overly simplistic).
Realistic margins are 30 Elo, maybe even more.
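
To give a feel for how wide such margins can be, here is a minimal sketch of a normal-approximation 95% interval for the Elo difference from a single match. It assumes independent games against one opponent, which is simpler than what bayeselo does and ignores any effect of the opening positions; the function names are illustrative only. It is applied to the 30-game +12 -7 =11 result reported earlier in this thread.

```python
import math


def elo_difference(score):
    """Elo difference implied by a score fraction (standard logistic model)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(wins, losses, draws, z=1.96):
    """Return (low, estimate, high) Elo for an approximate 95% interval.

    Assumption: games are independent, so the standard error of the mean
    score is sqrt(per-game variance / number of games).
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    std_err = math.sqrt(variance / games)
    low, high = score - z * std_err, score + z * std_err
    return elo_difference(low), elo_difference(score), elo_difference(high)


low, estimate, high = elo_interval(12, 7, 11)
print(f"{estimate:+.0f} Elo, interval {low:+.0f} .. {high:+.0f}")
```

For a result like this the interval spans well over 100 Elo, and under these assumptions its width shrinks roughly with the square root of the number of games.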

Re: Houdini 1.5 running for the IPON

Posted: Fri Dec 17, 2010 3:54 am
by Milos
Don wrote: Are you saying that if you take a really strong program such as RobboLito and add 50-60 ELO, it distinguishes you as a programmer?

If so, then I completely agree with you. That is something I can respect because I think that is difficult. Of course I know that you think it's easy, so I'm not sure why you are getting so excited. Anyway, for some reason I was thinking YOU were the one with the double standard. So I guess we both agree that both Vas and Robert Houdart are talented and original thinkers! I think we have made some progress with you today!
I can only say that there is no point discussing anything with people who claim to know what others think, assume, agree upon, how they feel, etc.
Sorry to disappoint you, but I do not believe you have psychic powers.