40x(2) Makes A Run! Where Are You Jesus?!

Discussion of computer chess matches and engine tournaments.

Moderators: hgm, Rebel, chrisw

geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

40x(2) Makes A Run! Where Are You Jesus?!

Post by geots »

Houdini 2.0c x64 v Engine 40x(2)


Last update Tuesday: At the 606 game mark, Houdini had a 50 game lead and +29 elo. Then Engine 40x(2) makes a ferocious run, thus today's update......



Intel i5 w/4TCs
Fritz 11 gui
1CPU/64bit
128MB hash
Bases=NONE
Ponder_Learning=OFF
Perfect 12.32 book w/12-move limit

10'+10"
Match=1000 games



Houdini 2.0c x64    +19    +211/-172/=335   52.72%   378.5/718  
Engine 40x(2)       -19    +172/-211/=335   47.28%   339.5/718 

From game 606 thru 718, Engine 40x(2) has cut the lead from 50 games to 39 games & cut Houdini's elo lead from +29 to +19!
At the 606 mark- a lot of engines would have packed up and headed home.

This should be a very interesting next 282 games! Will Houdini go on a run of his own? Obviously it is doubtful any engine can make up 39 games with 282 to play- against Houdini. But he can sure make it interesting.



Stay tuned-

george
ernest
Posts: 2041
Joined: Wed Mar 08, 2006 8:30 pm

Re: 40x(2) Makes A Run! Where Are You Jesus?!

Post by ernest »

Making my point clear:
45.1% for Houdini in the last 112 games...

Doesn't that say a lot about how much credit to give to 100-game matches?

Not Jesus, just statistics!
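
The 45.1% figure can be checked from the numbers in george's post alone. A minimal sketch, assuming "lead" means wins minus losses (50 at game 606, 39 at game 718); the function name is made up for illustration:

```python
def segment_score(lead_before, lead_after, games_in_segment):
    """Leader's score fraction over a stretch of games, given how the
    wins-minus-losses lead changed during that stretch."""
    net = lead_after - lead_before            # wins minus losses inside the stretch
    # points = wins + draws/2, which equals (games + net) / 2
    points = (games_in_segment + net) / 2
    return points / games_in_segment

# Houdini's lead went from 50 to 39 between game 606 and game 718.
houdini_share = segment_score(lead_before=50, lead_after=39,
                              games_in_segment=718 - 606)
print(f"{houdini_share:.1%}")   # prints 45.1%
```

That is 50.5 points out of 112 for Houdini in that stretch, with no need to know how the wins, losses and draws were split.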
Jimmy Huggins
Posts: 98
Joined: Tue Feb 15, 2011 7:00 am
Location: Kansas USA

Re: 40x(2) Makes A Run! Where Are You Jesus?!

Post by Jimmy Huggins »

By looking at the % and a comment by someone on the rybka forum I think I know what Engine40x is.
Ajedrecista
Posts: 1971
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: 40x(2) makes a run! Where are you Jesús?

Post by Ajedrecista »

Hello!
Nice comeback by the mysterious Engine 40x(2)! For 2-sigma confidence (~ 95.45% confidence) I get these error bars:


Elo interval for 2-sigma confidence:

Elo rating difference:     18.89 Elo

Lower rating difference:   -0.02 Elo
Upper rating difference:   37.91 Elo

Lower bound uncertainty:  -18.91 Elo
Upper bound uncertainty:   19.02 Elo
Average error:        +/-  18.96 Elo

K = (average error)*[sqrt(n)] =  508.11

Elo interval: ]  -0.02,   37.91[
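
For anyone who wants to redo this kind of interval themselves, here is a minimal sketch. It assumes a simple trinomial model (win = 1, draw = 0.5, loss = 0), a normal approximation for the mean score, and the usual logistic Elo formula; the thread does not say which model produced the numbers above, so treat the function names and the method as illustrative assumptions.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_interval(wins, losses, draws, z=2.0):
    """Elo estimate with a z-sigma interval from a win/loss/draw record.

    Assumption: games are independent, so the per-game variance of the
    score is E[x^2] - mean^2 with x in {1, 0.5, 0}.
    """
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n
    var = (wins * 1.0 + draws * 0.25) / n - mean ** 2
    se = math.sqrt(var / n)                   # standard error of the mean score
    return (elo_from_score(mean),
            elo_from_score(mean - z * se),
            elo_from_score(mean + z * se))

# Houdini's record after 718 games: +211 / -172 / =335
elo, low, high = elo_interval(211, 172, 335)
print(f"{elo:.2f} Elo, interval ]{low:.2f}, {high:.2f}[")
```

Under these assumptions the sketch gives about 18.89 Elo with an interval of roughly -0.02 to 37.91, which is close to the figures quoted above. Note that the bounds are asymmetric because the Elo formula is nonlinear in the score.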
So it is clear that now (with 718 games played) Houdini is better, judging by the error bars of my model, with a confidence of more or less 95%; in fact:


Minimum_score_for_no_regression, ® 2012.

 Calculation of the minimum score for no regression (i.e. negative Elo gain) in a match between two engines:

 Write down the number of games of the match (it must be a positive integer, up to 1073741823):

718

Write down the draw ratio (in percentage):

46.6573816155

Write down the confidence level (in percentage) between 75% and 99.9%:

95

Calculating...

Theoretical minimum score for no regression: 52.6640 %
Theoretical standard deviation in this case:  2.6640 %

Minimum number of won points for the engine in this match:       378.5 points.

Minimum Elo advantage, which is also the negative part of the error bar:
 18.8904 Elo

End of the calculations.

Thanks for using Minimum_score_for_no_regression. Press Enter to exit.
378.5 points out of 718 games is the exact score that Houdini has now. Things are changing very fast in this match!
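
The program's internals are not shown in this thread, so the following is only a guess at one calculation that lands near its printed figures. Assumptions: a two-sided normal quantile for the confidence level, and a per-game variance of mu*(1 - mu) - D/4, where mu is the threshold score and D the draw ratio. With mu = 0.5 + delta, the condition delta = z*sqrt(variance/n) can be solved in closed form. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def min_score_no_regression(games, draw_ratio, confidence=0.95):
    """Smallest score whose lower error bar just reaches 50% (0 Elo).

    Assumed model: delta = z * sqrt((mu*(1 - mu) - D/4) / n), mu = 0.5 + delta.
    Since mu*(1 - mu) = 0.25 - delta^2, this rearranges to
    delta = z * sqrt((1 - D) / (4 * (n + z^2))).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)   # two-sided quantile
    delta = z * math.sqrt((1.0 - draw_ratio) / (4.0 * (games + z * z)))
    score = 0.5 + delta
    points = math.ceil(2 * score * games) / 2          # round up to a half point
    return score, delta, points

score, delta, points = min_score_no_regression(718, 335 / 718)
print(f"minimum score {score:.4%}, margin {delta:.4%}, points {points}")
```

With 718 games and a draw ratio of 335/718 this gives a minimum score of about 52.664% and 378.5 points. Under this reading, the line labelled "standard deviation" in the output would be the full margin (z times the standard error) rather than one sigma, but that is an inference, not something the program states.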

You are doing a great job, George. Thanks in advance once again.

------------------------
Jimmy Huggins wrote:By looking at the % and a comment by someone on the rybka forum I think I know what Engine40x is.
Could you provide the exact link to the source, please? I have not found anything at Rybka Forum, although I only did some quick searches. By the way, are you referring to Strelka MP (although this match is played in single core) or even to a beta of Rybka 5? Thanks in advance.

Regards from Spain.

Ajedrecista.
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: 40x(2) makes a run! Where are you Jesús?

Post by geots »



Thanks, Jesus. I am getting to where I enjoy my updates much more when you are around to check and comment. Not to point out errors on my part in testing (though if you see any, by all means tell me), but just to check things and give me your "LOS" thoughts and how things stand statistically. I just enjoy your ideas and comments. And like Adam said, you are extremely good at what you do.

Ernest is a good guy, and generally I enjoy his replies- except when he gets on his " ha- I told you so" attitude, like above. That thread accomplished nothing- because it told me nothing I did not already know.

Please PM me and just watch for my reply. I should reply quickly. I have some info that will be very very very interesting to you concerning this match!!!!



Best,

george
Jimmy Huggins
Posts: 98
Joined: Tue Feb 15, 2011 7:00 am
Location: Kansas USA

Re: 40x(2) makes a run! Where are you Jesús?

Post by Jimmy Huggins »

Well thanks for your answer to me. :lol:
ernest
Posts: 2041
Joined: Wed Mar 08, 2006 8:30 pm

Re: 40x(2) makes a run! Where are you Jesús?

Post by ernest »

geots wrote:...generally I enjoy his replies- except when he gets on his " ha- I told you so" attitude, like above.
Well that's part of my preaching spirit... 8-)
Too many people think they know something after 50 or 100 games and dismiss the huge error bars (what I call statistics).

With you, I preach to a believer, since you are undertaking a 1000-game match (but at the end, there also will be an error bar, albeit smaller...)
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: 40x(2) makes a run! Where are you Jesús?

Post by geots »


The run where 40x(2) knocked 10 elo off Houdini's lead could just as easily have come a couple hundred games later, maybe around game 1015, and then we would never have known................. (No argument or point, really- just an observation.)

george