Houdini 4 has been released


lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Houdini 4 has been released

Post by lkaufman »

Houdini wrote:
lkaufman wrote: After 500 direct bullet (30" + 0.3") games against H3 I'm showing +48 elo, close enough to the claimed 50, but normally elo gains diminish with increased time limit and also against unrelated engines, so I'll "predict" that the real gain (say on the CEGT 5' +3" list, which is similar to IPON) will be around 30 elo. We'll see.
In my tests at 10"+0.1" and 120"+1.2" against 9 opponents Houdini 4 without table bases is about 45 Elo better than Houdini 3.
The Syzygy 6-men add another 5 to 10 Elo in my tests at 60"+1" time control, which explains the official number of "50 Elo" for the release.

How much this will produce in rating lists is always the big surprise; inasmuch as the time management of Houdini 4 has changed as well, I'm not even trying to predict these numbers with a precision better than 20 Elo...

Cheers,
Robert
One more question: When you say that the Syzygy 6 men tb adds five to ten elo, that appears to mean compared to no TB. But shouldn't the proper comparison be with the best TB supported by Houdini 3? Or are you saying that Syzygy is that much better than any other supported TB?
By the way I'm now running a recent Komodo version (which was already tested vs H3) against H4 to measure the improvement against an unrelated opponent at 1' +.5". So far it is showing roughly midway between the 27 LS figure so far and your 45 figure. I'll report fully at the end, when I should have 4000 games.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Houdini 4 has been released

Post by lkaufman »

Milos wrote:
lkaufman wrote: Thanks. Since your methodology seems to be very similar to the LS list, and since LS uses a time limit in between the two you use, can you offer any theory other than sample error for the discrepancy between your +45 figure and LS +27 figure (as last reported, subject to change of course)?
It is very easy to explain. RH has +45 (without TBs) with 1 SD of approximately 7 Elo; SP has +27 with 1 SD of approximately 10 Elo. So it is quite probable that the "real" improvement is 37 Elo, which is roughly in the middle.
In addition to that, SP has a higher average rating of opponents, which compresses the ratings more.
The numbers you mention for 1SD look about right for 2SD to me, assuming something close to half draws. For 2800 games with half draws and a 27 elo gap I get margin of error 9.35, which might be about ten with somewhat less than half draws. These are two SD values, not 1 SD. I think you made some simple error. The combined margin of error is much less than the sum of the two errors; for 10 and 7 it is about 12.1, way below 17 or 18.
I think your 37 estimate might be about right, but it is not reasonable to consider 27 and 45 elo with the given numbers of games to be just sample error. There should be some other factor.

Larry
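For anyone who wants to check this kind of margin-of-error arithmetic, here is a minimal Python sketch. It assumes the standard logistic Elo curve and a simple win/draw/loss variance model; the function names are illustrative and not taken from any rating tool.

```python
import math

def expected_score(elo_gap):
    """Logistic Elo model: expected score for the side that is elo_gap stronger."""
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / 400.0))

def elo_margin(games, draw_rate, elo_gap, z=2.0):
    """Approximate z-sigma margin of error, in Elo, for a single match result."""
    s = expected_score(elo_gap)
    win_rate = s - draw_rate / 2.0
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = win_rate + draw_rate / 4.0 - s * s
    se_score = math.sqrt(var / games)
    # Slope of the Elo curve at score s (Elo per unit of score).
    slope = 400.0 / (math.log(10.0) * s * (1.0 - s))
    return z * se_score * slope

margin = elo_margin(games=2800, draw_rate=0.5, elo_gap=27)
combined = math.hypot(10.0, 7.0)  # independent errors add in quadrature

print(f"2-sigma margin: {margin:.2f} Elo")
print(f"combined margin for 10 and 7: {combined:.1f} Elo")
```

Under these assumptions the 2-sigma margin comes out a little above 9 Elo, in the same range as the 9.35 quoted above, and the quadrature sum for 10 and 7 is about 12, well below 17.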
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Houdini 4 has been released

Post by Milos »

lkaufman wrote:The numbers you mention for 1SD look about right for 2SD to me, assuming something close to half draws. For 2800 games with half draws and a 27 elo gap I get margin of error 9.35, which might be about ten with somewhat less than half draws. These are two SD values, not 1 SD. I think you made some simple error. The combined margin of error is much less than the sum of the two errors; for 10 and 7 it is about 12.1, way below 17 or 18.
I think your 37 estimate might be about right, but it is not reasonable to consider 27 and 45 elo with the given numbers of games to be just sample error. There should be some other factor.

Larry
For 2800 games, assuming draw and win rates slightly better than H3 (which has 44% and 62% respectively), i.e. 41% and 66%, 1 SD between 2 opponents would be 0.66%. Since there are many opponents here, the SD is larger and you have to multiply it by at least sqrt(2), which gives 6.5 Elo. On the RH side 1 SD is about 5.8 Elo. Combined, this gives 8.7 Elo for 1 SD. 2 SD is then around 17 Elo, which is already the difference between the two results.
In addition to that, the LS list uses stronger opponents (by 60-80 Elo on average), which translates into 5-10 Elo of rating compression.
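The chain of numbers above can be retraced with a short Python sketch. Two things here are assumptions rather than anything stated in the post: that the 66% figure is the overall scoring rate (wins plus half the draws), and that the percentage is converted to Elo with the slope of the Elo curve near 50%, roughly 7 Elo per percent. All names are illustrative.

```python
import math

def score_sd(games, score, draw_rate):
    """Standard deviation of the mean score over a number of games."""
    win_rate = score - draw_rate / 2.0
    # Per-game variance with win = 1, draw = 0.5, loss = 0.
    var = win_rate + draw_rate / 4.0 - score * score
    return math.sqrt(var / games)

# Assumption: slope of the logistic Elo curve at a 50% score,
# about 695 Elo per unit of score (roughly 7 Elo per percent).
ELO_PER_SCORE_AT_50 = 1600.0 / math.log(10.0)

sd_pair = score_sd(2800, score=0.66, draw_rate=0.41)      # as a fraction
sd_list = sd_pair * math.sqrt(2.0) * ELO_PER_SCORE_AT_50  # in Elo
sd_rh = 5.8                                               # figure quoted in the post
sd_combined = math.hypot(sd_list, sd_rh)                  # quadrature sum
gap = 45 - 27

print(f"1 SD, one pairing: {100 * sd_pair:.2f}%")
print(f"1 SD, list side:   {sd_list:.1f} Elo")
print(f"1 SD, combined:    {sd_combined:.1f} Elo")
print(f"gap in SD units:   {gap / sd_combined:.2f}")
```

With these assumptions the sketch lands on the same 0.66%, 6.5 and 8.7 figures. Note that at a 66% score the Elo curve is steeper than at 50%, so a conversion at the actual score would give a somewhat larger SD in Elo.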
ernest
Posts: 2041
Joined: Wed Mar 08, 2006 8:30 pm

Re: Houdini 4 has been released

Post by ernest »

Hi Robert,

Any possibility for a 32-bit version ???
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Houdini 4 has been released

Post by lkaufman »

Milos wrote:
lkaufman wrote:The numbers you mention for 1SD look about right for 2SD to me, assuming something close to half draws. For 2800 games with half draws and a 27 elo gap I get margin of error 9.35, which might be about ten with somewhat less than half draws. These are two SD values, not 1 SD. I think you made some simple error. The combined margin of error is much less than the sum of the two errors; for 10 and 7 it is about 12.1, way below 17 or 18.
I think your 37 estimate might be about right, but it is not reasonable to consider 27 and 45 elo with the given numbers of games to be just sample error. There should be some other factor.

Larry
For 2800 games, assuming draw and win rates slightly better than H3 (which has 44% and 62% respectively), i.e. 41% and 66%, 1 SD between 2 opponents would be 0.66%. Since there are many opponents here, the SD is larger and you have to multiply it by at least sqrt(2), which gives 6.5 Elo. On the RH side 1 SD is about 5.8 Elo. Combined, this gives 8.7 Elo for 1 SD. 2 SD is then around 17 Elo, which is already the difference between the two results.
In addition to that, the LS list uses stronger opponents (by 60-80 Elo on average), which translates into 5-10 Elo of rating compression.
So the 18 elo gap is just one elo more than the margin of error, so certainly it is possible. But I don't understand your second point. Longer time limits mean rating compression, but the strength of the opposition should not affect properly calculated ratings in general, unless one uses the broken "elostat" which wrongly averages ratings. Why do you claim that stronger opponents make for rating compression in general (not specifically for Houdini)?
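To illustrate why averaging opponent ratings can distort a result, here is a small Python sketch. It is not a reproduction of any particular tool's algorithm; it only contrasts a performance rating taken against the average opponent rating with one solved per opponent. The ratings are hypothetical and the function names are illustrative.

```python
import math

def expected_score(rating, opponent):
    """Logistic Elo expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))

def performance_from_average(opponents, mean_score):
    """Performance rating computed against the average opponent rating."""
    avg = sum(opponents) / len(opponents)
    return avg + 400.0 * math.log10(mean_score / (1.0 - mean_score))

def performance_per_opponent(opponents, mean_score, lo=0.0, hi=5000.0):
    """Rating whose summed expected scores match the result (bisection)."""
    target = mean_score * len(opponents)
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if sum(expected_score(mid, o) for o in opponents) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

opponents = [2600.0, 3000.0]  # hypothetical ratings, one game each
true_rating = 3000.0          # hypothetical engine scoring exactly as expected
mean_score = sum(expected_score(true_rating, o) for o in opponents) / len(opponents)

by_average = performance_from_average(opponents, mean_score)
per_opponent = performance_per_opponent(opponents, mean_score)

print(f"against average rating: {by_average:.0f}")
print(f"solved per opponent:    {per_opponent:.0f}")
```

In this toy case the per-opponent solution recovers the assumed rating, while the averaged version comes out lower, because the expected-score curve is not linear in the rating difference. The size of the effect depends on how spread out the opponents are.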
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Houdini 4 has been released

Post by Milos »

lkaufman wrote:So the 18 elo gap is just one elo more than the margin of error, so certainly it is possible. But I don't understand your second point. Longer time limits mean rating compression, but the strength of the opposition should not affect properly calculated ratings in general, unless one uses the broken "elostat" which wrongly averages ratings. Why do you claim that stronger opponents make for rating compression in general (not specifically for Houdini)?
I am not claiming in general, I am assuming (not claiming again) for Houdini 4. Even though Robert stated somewhere that H dev has 0 contempt, from TCEC games in many positions I realized that there was still some contempt present and I believe the same for H4 (I don't have it myself yet to test).
Werner
Posts: 2871
Joined: Wed Mar 08, 2006 10:09 pm
Location: Germany
Full name: Werner Schüle

Re: Houdini 4 has been released

Post by Werner »

ernest wrote:Hi Robert,

Any possibility for a 32-bit version ???
Hi, the 32-bit version is included. Just use a 32-bit Windows system for installation.
Werner
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Houdini 4 has been released

Post by Milos »

Milos wrote:I am not claiming in general, I am assuming (not claiming again) for Houdini 4. Even though Robert stated somewhere that H dev has 0 contempt, from TCEC games in many positions I realized that there was still some contempt present and I believe the same for H4 (I don't have it myself yet to test).
I've just got confirmation that the default H4 contempt is 1, so my guess of 5-10 Elo less against opponents that are 60-80 Elo stronger is quite correct.
There you go Larry, you now have a complete explanation ;).
P.S. I'm sure if Stefan ran H4 with contempt 0 on his list (top 10 opponents) he would get a better result than with the default one.
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Houdini 4 has been released

Post by jdart »

Is there/will there be an opening book for Houdini 4?

--Jon
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Houdini 4 has been released

Post by Houdini »

lkaufman wrote: One more question: When you say that the Syzygy 6 men tb adds five to ten elo, that appears to mean compared to no TB. But shouldn't the proper comparison be with the best TB supported by Houdini 3? Or are you saying that Syzygy is that much better than any other supported TB?
I've never been able to demonstrate any strength improvement with Nalimov EGTB at fast time controls; the overhead of Nalimov appears to offset any gain. (Note that the balance could be different at long TC.)
On the other hand, with the Syzygy system the improvement is clear even at 1'+1".

Robert
Last edited by Houdini on Wed Nov 27, 2013 2:27 am, edited 1 time in total.