POLL:Man vs Machine ?

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: POLL:Man vs Machine ?

Post by MM »

Don wrote:
Sedat Canbaz wrote:
Don wrote:
MM wrote: Hello,

2600 is generic; anyway, the difference in strength between a 2600 and a 2800 is huge. If we consider any kind of GM, of course the machines are clearly dominant.
I agree with that. What is interesting to me is not so much who is better but why they are better. What makes the computer stronger at normal human-like time controls?

Part of the answer has to do with the time controls and why humans play better and better with more time. There is still the tactical vs positional play issue and we might now ask if computers generally outplay humans positionally. I think in general they seem to but their dominance is not so clear in this respect. Computers have this extremely solid style where they do not seem to overlook anything and I don't necessarily mean what we call tactics.

Really there is no such thing as tactics, it's a word we made up to mean not overlooking big things, like obvious wins of material. Computers are good at that but they don't overlook little things either - provided they understand them. That's where they are really dominant and they are so good at it that it seems to cover over their inferior positional understanding faults. You cannot tell they are inferior when they calculate so well.

Sure, the difference between 2600 and 2800 Elo is huge

However, it seems many chess friends don't care much about the speed of the processors and the strength of the opening books

I'd just like to mention again that hardware speed is a very important factor

For example, a 980X @ 4.33 GHz is approx. 3 times faster than a Quad 2.40 GHz:

Hardware-Processor        Speed      Cores     kN/s
Intel Core i7 980X      @ 4.33 GHz     6      18709
Intel Core 2 Q6600        2.40 GHz     4       6771
* The hardware Elo difference is expected to be approx. 130-150 Elo
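
As a rough check on these figures, the kN/s values in the table give the speed ratio directly, and the quoted 130-150 Elo gap can be turned into an implied gain per doubling of speed. This is only a sketch: the per-doubling figure is derived here from the numbers above and is not stated in the post, it assumes a constant Elo gain per doubling, and the function names are made up for illustration.

```python
import math

# kN/s figures copied from the table above.
KNPS_980X = 18709
KNPS_Q6600 = 6771

def speed_doublings(fast_knps: float, slow_knps: float) -> float:
    """Number of speed doublings separating two machines."""
    return math.log2(fast_knps / slow_knps)

def elo_per_doubling(elo_gap: float, doublings: float) -> float:
    """Elo per doubling implied by a quoted gap (assumes a constant gain per doubling)."""
    return elo_gap / doublings

ratio = KNPS_980X / KNPS_Q6600
doublings = speed_doublings(KNPS_980X, KNPS_Q6600)
print(f"speed ratio: {ratio:.2f}x, doublings: {doublings:.2f}")
for gap in (130, 150):
    print(f"{gap} Elo gap -> about {elo_per_doubling(gap, doublings):.0f} Elo per doubling")
```

With these inputs the ratio is about 2.76x, a little under the "approx. 3 times" quoted, and the 130-150 Elo range corresponds to very roughly 90-100 Elo per doubling in engine vs engine play.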


Another very important point is the power of the opening books

Even under exactly equal conditions (except for the books), we can see hugely different Elo standings

I don't have much free time to post all my book tournament links, but here is the latest one:
http://www.sedatcanbaz.com/chess/scct-super-league/


Hope this helps


Best,
Sedat
The most effective way, in my opinion, to measure the difference in humans and computers can be done at any time control. You simply turn off pondering and play time handicap games. I would recommend that the computer be rigged not to move faster than it normally would anyway. So if the time control is 40/30 minutes and the handicap is that the computer play 40/5 minutes, that is a 6 to 1 handicap. So if the computer has a move ready in 5 seconds it should "wait" for 30 seconds, that is 6x longer, to return the move. Of course the interface or the computer can be rigged to return the move.

The reason for this is that humans can be upset by pace. If the computer is playing instantly the human is still being robbed of time, and most humans can be provoked into playing too fast if their opponent is playing fast.
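
The pacing rule described above can be sketched as follows. This is a minimal illustration, not anyone's actual interface code: the function names are hypothetical, and it assumes the "wait for 30 seconds" in the example means the total time shown to the human, not an extra 30 seconds on top of the engine's own thinking time.

```python
def handicap_ratio(human_minutes: float, engine_minutes: float) -> float:
    """Time-odds ratio, e.g. 40/30 for the human vs 40/5 for the engine gives 6.0."""
    if human_minutes <= 0 or engine_minutes <= 0:
        raise ValueError("time controls must be positive")
    return human_minutes / engine_minutes

def displayed_move_time(engine_seconds: float, ratio: float) -> float:
    """Total time the human sees the engine take for a move."""
    return engine_seconds * ratio

def padding_seconds(engine_seconds: float, ratio: float) -> float:
    """Extra idle time to add after the engine has already chosen its move."""
    return displayed_move_time(engine_seconds, ratio) - engine_seconds

ratio = handicap_ratio(30, 5)            # 6.0
print(displayed_move_time(5, ratio))     # 30.0 seconds shown to the human
print(padding_seconds(5, ratio))         # 25.0 seconds of artificial waiting
```

The engine still searches for only 5 seconds, with pondering off; the remaining 25 seconds are pure delay, so the human's clock situation looks like a normal game.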

Once you have established a baseline of equivalence, you can extrapolate pretty easily. In fact you don't need a top player, just get a strong player who is 200-400 below the top players who is willing to play a lot of games, or better yet get a number of players over 2400 willing to do this but who have well established ratings. We need to do something like this because we are getting to the point where we don't really have reliable data on how strong computers are compared to humans - this would at least give us a good reference point.

Hi Don, your idea is interesting but I don't think it can work.

As you say, we are getting to the point where we don't have reliable data on how strong computers are compared to humans.

Whatever the result of your method, anybody could say "but there was no 2800... Carlsen plays differently... the handicap is in favour of the machine..." and raise many other complaints.

At the time of the match Rybka 3 - Milov (2700), people were generally convinced that, even with Rybka handicapped, Milov would be crushed badly.
As you know, Milov won that match (a handicap match) and nothing changed.

There's a pro-computer party (very large) and a pro-human party (very small).

The error of the Rybka team was, in my view, to make it a handicap match.

Wanna do a match? OK, let's fight, but with no handicap.

I think Milov would have agreed.

Like Fern used to say...only my thoughts...

Best Regards
MM
MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: POLL:Man vs Machine ?

Post by MM »

Sedat Canbaz wrote:
MM wrote:
Hi Sedat,

yes, I know hardware and an opening book add Elo.
But basically, big hardware is devastating against small hardware, and much less devastating against a super-GM.
Why?
Because two engines running on very different HW fight a battle of tactics, and the one with the better HW almost always triumphs.

In engine vs human matches, if the engine has limited chess knowledge, you can give it 20 times stronger hardware and it will probably keep on missing the correct positional manoeuvres, because it has limited sensitivity for positional play and the added calculation power can hardly compensate for that (it depends on the position and its depth anyway).
And this is very important because in a human vs engine match, the human should and must try to steer the game in a positional, strategic direction, where engines have the most difficulty.

So the Elo differences that you showed in your table are obviously right, but they relate to machines vs machines, not to machines vs humans.

Basically: if engine A, with super HW and Elo 3500, is 350 Elo stronger than engine B, with small HW, and this same engine B has the same Elo as Carlsen, we are not allowed to say that engine A is 350 Elo stronger than Carlsen, for the reasons I just explained.

As regards the opening book, it is different. A very well made opening book can win you a game, so I would say that it is very important.

Thank you.

Best Regards

Dear Maurizio,

Not at all... it's my pleasure

Actually I agree with some of your points, but not with all of them

In particular, I don't agree with you about the hardware speed factor

Believe me, hardware speed plays a big role when we are talking about Man vs Machine

I know that in the 1990s the engines were weaker... but remember also that the hardware was much slower too

I think this is one of the main reasons why in the 1990s GMs were stronger than the machines

Some notes about the previously played Man vs Machine matches:

1) In the 1990s = GMs were stronger than chess engines
2) In the early 2000s = GMs were equal to chess engines
3) In 2003/2004/2005/2006 = Chess engines performed approx. 190 Elo better than GMs
4) Now we are in 2012 = We have Houdini, Rybka, Critter and much faster hardware than in the past


Kind Regards,
Sedat
Hello Sedat,

I know HW has a big role. I am just saying that the difference in software and HW between the old engines and machines and ours (Houdini, Critter, Komodo, i7, Xeon E5, etc.) has much more influence in the world of chess engines and much less in a hypothetical human vs machine comparison.

It's like saying:

well, 10 years ago, the best engine on the best HW of that period was 600 Elo weaker than Houdini on an octal right now.

OK, that may be true.

But this calculation may be applicable to engines, not to humans.

I think that in 1990, Kasparov was clearly by far stronger than Deep Thought (he showed it).

In 1996 Kasparov again defeated IBM's Deep Blue, but in 1997 he lost a controversial rematch.

We are in 2012.

Obviously, software has improved and HW has improved hugely.

But HW especially is useful mainly for calculating. Calculating what? Variations, lines, kN/s, millions of positions per second.

A human plays differently. He also calculates, but he relies much more on heuristics. A human probably can't calculate 1/1000 of what a machine calculates in 10 seconds.

So a human is forced to analyze the position looking for some pattern, some scheme, that he knows and can work with, especially over the long range.

That's exactly what a machine can't do and can't see.
A long-range plan.

That's why the huge improvement in the speed of the machines is very useful against other, slower machines, but not so useful against humans.

That's what i think.

Thank you

Best Regards
MM
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: POLL:Man vs Machine ?

Post by Sedat Canbaz »

Hello Maurizio,

I think you are missing one important point

For example, check more carefully the engine versions that played during 2004/2005

You will notice that in those years the versions were Hydra, Deep Fritz 8, Deep Junior 8...

Currently these engine versions (the ones that played vs GMs) are rated around 2900 Elo on the SSDF list (except Hydra, which is private)
http://ssdf.bosjo.net/list.htm

Note also that even on a Quad 2.40 GHz, Rybka 4's performance is 3220 Elo

In other words, if the participants had been only Houdini/Rybka/Critter/Hydra
then I would expect to see at least a 400 Elo difference (I mean for those years 2004, 2005, 2006)

Of course, I don't have much information about the speed of all the hardware used in 2004 and 2005

So I guess the overall Elo performance of the engines that played was around 3000 Elo (due to Hydra's hardware speed advantage in 2004, 2005)

Yes... now we are in 2012, and more than 6 years have gone by without the machines playing vs GMs


And it's time for reality - I can't wait to see the duel: Man vs Machine 2012


Greetings,
Sedat
Last edited by Sedat Canbaz on Tue Feb 21, 2012 6:20 pm, edited 3 times in total.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: POLL:Man vs Machine ?

Post by Don »

MM wrote: Hi Don, your idea is interesting but I don't think it can work.

As you say, we are getting to the point where we don't have reliable data on how strong computers are compared to humans.

Whatever the result of your method, anybody could say "but there was no 2800... Carlsen plays differently... the handicap is in favour of the machine..." and raise many other complaints.
Science is all about making interpolations like this. People will complain that the test should have been like this or like that, but there is NO WAY to perfectly construct a test and most of their concerns are personal superstitions, not based on sound thinking.

You believe that there is something special and different about Carlsen that somehow invalidates any test with weaker players, but he does not play any differently from anyone else, he is just stronger. A 1900 is stronger than a 1700 player too, but there is nothing special that separates them other than ELO. The 1900 player is not made out of special stuff. There is really only ONE thing special about Carlsen, and that is that he is the strongest player in the world. That makes him special only because we attach a degree of significance to it. In fact I believe that Carlsen pretty much sucks. I am a big fan and he is my current favorite player and I admire him for being so strong so young, but he sucks! I am sure that he plays hundreds of ELO below optimal and there is nothing special about that. Not only that but computers suck too and Carlsen is almost certainly at least 200 ELO below the best computer programs. So he cannot beat players that suck!

I say all of this to illustrate that there is nothing special that separates the top players from everyone else other than what separates YOU from players a few hundred ELO weaker.

So I think my test is perfectly valid. If you get players within 400 ELO of the top players in strength and get a large enough sample of games at time odds you can get a very good estimate of how much stronger computer programs are. We can do that pretty easily because we know their ELO and we know how much the ELO increases with programs when time is added. It won't be perfect but it will be much better than what we have now.

MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: POLL:Man vs Machine ?

Post by MM »

Hi Don, I didn't contest your idea. I like it, but, speaking in general, I am a bit skeptical about people's reaction, not about ideas, at least not this idea of yours. Anyway,
please give an example to help me understand better.
If you found that humans averaging 2400 Elo score 50% against Houdini while having 6 times Houdini's time, what does that imply?

I mean, I'm not a mathematician; I need an example of how to translate a score from weaker players to Carlsen or another high-level player.

Thank you

Regards
MM
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: POLL:Man vs Machine ?

Post by Don »

I can make up a simple example. We take 4 FIDE players rated 2600 for example and they all play 100 games so that we have 400 games total. In order to get a lot of games we play matches at a fast time control, but not ridiculous. Let's say Fischer game in 15 minutes plus 10 or 15 second increment, something like that. We take a program such as Komodo and rig it to play at a level where it is roughly winning 50% of the games. I don't know what the level might be, but it might require some trial and error. Let's say it turns out that we need to play 10x faster - but Komodo is rigged to "pretend" to take more time for the benefit of the human player but in reality it's playing 1.5 minutes + 1 second which is 10x faster. No pondering.

When the match is complete, we can easily estimate Komodo's rating at this time control under these conditions against human players.

Let's pretend that Komodo comes out winning 50% of the games and thus gets a rating of 2600 ELO relative to humans at this time control playing 10x faster on this hardware. So now the question is how strong would Komodo be had it played at the SAME time control, which is 10x longer? That can be computed by simply running a series of automated tests. When it's done we have a rating estimate.

Larry Kaufman would probably suggest a rating adjustment based on computer vs computer play as the difference would be slightly overstated. I don't remember his formula but if it turned out that Komodo plays at 3050 level Larry might calculate that it's overstated by 50 ELO (For example.) Since pondering is turned off we also have to ADD back a few ELO for that, perhaps 30 or 40 ELO. After these calculations we have a "reasonable" estimate of the ELO of computers vs humans but only at this exact time control.

I think this estimate could be directly compared to Carlsen or any other player and we would not be off by very much.
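
The arithmetic of the example above can be sketched like this. All of the numbers are the made-up figures from the example (2600, 3050, 50, 30-40), not measurements; the function names are hypothetical, and the performance rating uses the usual logistic Elo formula as an assumption, since no specific formula is given here.

```python
import math

def performance_rating(opponent_avg: float, score_fraction: float) -> float:
    """Rating implied by a score against opponents of a known average rating
    (standard logistic Elo model, assumed here)."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score_fraction must be strictly between 0 and 1")
    return opponent_avg + 400.0 * math.log10(score_fraction / (1.0 - score_fraction))

def full_time_estimate(handicap_rating: float, self_play_gain: float,
                       overstatement: float, ponder_bonus: float) -> float:
    """Handicap-match rating, plus the engine vs engine gain from 10x more time,
    minus the assumed overstatement, plus the Elo given back for pondering."""
    return handicap_rating + self_play_gain - overstatement + ponder_bonus

handicap_rating = performance_rating(2600, 0.5)   # 50% vs 2600 players -> 2600.0
self_play_gain = 3050 - 2600                      # hypothetical automated-test result
for ponder_bonus in (30, 40):                     # "perhaps 30 or 40 ELO"
    estimate = full_time_estimate(handicap_rating, self_play_gain, 50, ponder_bonus)
    print(f"ponder bonus {ponder_bonus}: estimate {estimate:.0f}")
```

With these illustrative inputs the estimate lands at 3030-3040, valid only for the 15 minute + 10 second time control used in the match.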

I think there is one more important element in this test. The human needs to be properly motivated just as he would be in any other rated match situation. The results could be widely advertised and that serves as motivation or he can be given monetary compensation for each win and draw - but the main point is that match conditions cannot be too relaxed or casual or experimental as that is likely to skew the results. It should not be played online with the master sitting at his computer and checking email and such.

Like any other kind of interpolation, it is desirable to minimize how much you have to interpolate! For example it is better to get the strongest players possible, if you have to settle for 2400 players instead of 2600 or 2700 players, the interpolation is stretched and the margin of error is likely to be wider. If the computer has to play too fast to make the match even, there is more error too. And of course the faster the human has to play (in order to get larger samples) the less relevant the result is to "match conditions." In fact the result is not accurate except for the time control chosen - so if you want to estimate how strong the computer is at longer time controls relative to humans you have to do another interpolation calculation that is subject to more error. We don't really have a good way to make that calculation - we just know that humans do better with longer time controls but there is no solid reliable data that will tell us how much to adjust the computer by for that.
MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: POLL:Man vs Machine ?

Post by MM »

Don wrote:
MM wrote:
Don wrote:
MM wrote: Hi Don, your idea is interesting but i don't think it can work.

As u say, we are getting the point where we don't have rielable data on how strong computers are compared to human.

Whatever result could happen with your method, anybody could say'' but there was no 2800...Carlsen plays differently....handicap is in favour of the machine ...and many other complaints.
Science is all about making interpolations like this. People will complain that the test should have been like this or like that, but there is NO WAY to perfectly construct a test and most of their concerns are personal superstitions, not based on sound thinking.

You believe that there is something special and different about Carlsen that somehow invalidates any test with weaker players, but he does not play any differently from anyone else, he is just stronger. A 1900 is stronger than a 1700 player too, but there is nothing special that separates them other than ELO. The 1900 player is not made out of special stuff. There is really only ONE thing special about Carlsen, and that is that he is the strongest player in the world. That makes him special only because we attach a degree of significance to it. In fact I believe that Carlsen pretty much sucks. I am a big fan and he is my current favorite player and I admire him for being so strong so young, but he sucks! I am sure that he plays hundreds of ELO below optimal and there is nothing special about that. No only that but computers suck too and Carlsen is almost certainly at least 200 ELO below the best computer programs. So he cannot beat players that suck!

I say of this to illustrate that there nothing special that separates the top players from everyone else other than what separates YOU from players a few hundred ELO weaker.

So I think my test is perfect valid. If you get players within 400 ELO of the top players in strength and get a large enough sample of games at time odds you can get a very good estimate of how much stronger computer programs are. We can do that pretty easily because we know their ELO and we know how much the ELO increases with programs when time is added. It won't be perfect but it will much better than what we have now.

At the time of the match Rybka 3 - Milov (2700), people were generally convinced that, even with Rybka Handicapped, Milov would have been crushed bad.
As you know Milov won that match (handicap match) and nothing changed.

There's a party pro-computer (very large) and a party pro-human (very little).

The error of Rybka team has been, in my point of view, to make an handicap match.

Wanna do a match? Ok, lets fight but no handicap.

I think Milov would have agreed.

Like Fern used to say...only my thoughts...

Best Regards
Hi Don, i didn't contest your idea, i like it but, speaking in general, i am a bit skeptical about the reaction of people, and not about ideas, at least this idea of yours, anyway..
please give an example to make me understand better.
If you found that on avarage of 2400 elo, humans score 50% against Houdini, having 6 times Houdini's time, what does it involves?

I mean, i'm not a math, i need an example how to translate a score from weaker players to Carlsen or high level player.

Thank you

Regards
I can make up a simple example. We take 4 fide players rated 2600 for example and they all play 100 games so that we have 400 games total. In order to get a lot of games we play matches at a fast time control, but not ridiculous. Let's say Fischer game in 15 minutes plus 10 or 15 second increment, something like that. We take a program such as Komodo and rig it to play at a level where it is roughly winning 50% of the games. I don't know what the level might be, but it might require some trial and error. Let's say it turns out that we need to play 10x faster - but Komodo is rigged to "pretend" to take more time for the benefit of the human player but in reality it's player 1.5 minutes + 1 second which is 10x faster. No pondering.

When the match is complete, we can easily estimate Komodo's rating at this time control under these conditions against human players.

Let's pretend that Komodo comes out winning 50% of the games and thus gets rating at 2600 ELO relatively to humans at this time control playing 10x faster on this hardware. So now the question is how strong would Komodo be had it played at the SAME time control which is 10x longer? That can be computed by simply running a series of automated tests. When it's done we have a rating estimate.

Larry Kaufman would probably suggest a rating adjustment based on computer vs computer play as the difference would be slightly overstated. I don't remember his formula but if it turned that Komodo plays at 3050 level Larry might calculate that it's overstated by 50 ELO (For example.) Since pondering is turned off we also have to ADD back a few ELO for that, perhaps 30 or 40 ELO. After these calculations we have a "reasonable" estimate of the ELO of computers vs humans but only at this exact time control.

I think this estimate could be directly compared to Carlsen or any other player and we would not be off by very much.

I think there is one more important element in this test. The human needs to be properly motivated just as he would be in any other rated match situation. The results could be widely advertised and that serves as motivation or he can be given monetary compensation for each win and draw - but the main point is that match conditions cannot be too relaxed or casual or experimental as that is likely to skew the results. It should not be played online with the master sitting at his computer and checking email and such.

Like any other kind of interpolation, it is desirable to minimize how much you have to interpolate! For example, it is better to get the strongest players possible; if you have to settle for 2400 players instead of 2600 or 2700 players, the interpolation is stretched and the margin of error is likely to be wider. If the computer has to play too fast to make the match even, there is more error too. And of course the faster the human has to play (in order to get larger samples), the less relevant the result is to "match conditions." In fact the result is not accurate except for the time control chosen - so if you want to estimate how strong the computer is at longer time controls relative to humans, you have to do another interpolation calculation that is subject to more error. We don't really have a good way to make that calculation - we just know that humans do better with longer time controls, but there is no solid, reliable data that will tell us how much to adjust the computer by for that.
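
As a rough illustration of why sample size matters here, the sketch below estimates the statistical error of the match result alone, using a simple normal approximation on the score and the standard logistic Elo formula. The win/draw/loss splits are invented for the example, and this covers only the sampling error, not the extra error from the adjustments discussed above.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (standard logistic model)."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Approximate 95% interval for the Elo difference from a match result."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_diff_from_score(score - z * se),
            elo_diff_from_score(score),
            elo_diff_from_score(score + z * se))


# Hypothetical even match over all 400 games (4 players x 100 games).
low_400, mid_400, high_400 = elo_interval(150, 100, 150)
# Same proportions, but only one player's 100 games.
low_100, mid_100, high_100 = elo_interval(38, 24, 38)

print(f"400 games: {low_400:+.0f} .. {high_400:+.0f} Elo")
print(f"100 games: {low_100:+.0f} .. {high_100:+.0f} Elo")
```

With these made-up numbers the 400-game interval comes out at roughly plus or minus 30 Elo, and the 100-game interval is about twice as wide, which is the reason for wanting many games even at the cost of a faster time control.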
Thank you Don, very interesting.

Yes, we would need to be sure that the human is properly motivated.

Best Regards
MM
EroSennin
Posts: 133
Joined: Fri Apr 09, 2010 3:26 am

Re: POLL:Man vs Machine ?

Post by EroSennin »

Maybe I can shed some light on this. I am 2350 elo, and I lose to computers over 90% of the time with a piece handicap at 5 minute blitz. Draws are not counted.
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: POLL:Man vs Machine ?

Post by Sedat Canbaz »

EroSennin wrote:Maybe I can shed some light on this. I am 2350 elo, and I lose to computers over 90% of the time with a piece handicap at 5 minute blitz. Draws are not counted.
Hello dear Jean,

It sounds interesting...

When you have free time, can you play against the engines below at 5 min?

I wonder whether we are using the right starting Elo calculation or not

Code: Select all

#   Name                          Elo    +    - Games Score Oppo. Draws
174 Horizon 4.3.165 Beta         2404   58   56   110   59%  2340   19%
175 Flux 2.2                     2393   60   61   100   46%  2420   23%
176 Amy 0.8.7b t2                2388   54   54   120   45%  2429   23%
177 Gaia 3.5 x64                 2386   45   46   160   42%  2441   29%
178 Averno 0.81                  2383   54   53   116   59%  2320   29%
179 Resp 0.19                    2383   53   53   116   55%  2348   29%
180 Aice 0.99.2                  2383   60   60   100   54%  2354   17%
181 Tytan 9.32 x64 t2            2381   57   57   110   50%  2377   25%
182 Chezzz 1.0.3                 2379   52   52   120   54%  2356   31%
183 The Crazy Bishop 0052        2378   58   58   100   45%  2414   27%
184 BBChess 1.10                 2377   60   59   100   56%  2337   28%
185 Butcher 1.58 x64             2375   57   57   106   49%  2384   25%
186 Esc 1.16                     2373   58   57   110   59%  2309   18%
187 Queen 3.09                   2371   55   57   115   39%  2457   24%
188 Diablo 0.51 JA               2369   56   57   110   41%  2442   24%
189 Alfil 7.6                    2361   52   54   122   37%  2451   29%
190 EXchess 5.01 Beta            2355   59   60   100   45%  2385   26%
191 NanoSzachy 2.9               2351   62   61   100   57%  2298   16%
192 Popochin 3.0                 2350   55   55   110   50%  2352   27%
193 Arion 1.7                    2347   56   57   110   41%  2411   21%
194 Tornado 1.0 Mainz            2344   54   57   120   32%  2476   25%
195 Zeus 1.29                    2341   57   56   110   55%  2311   22%
196 Chispa 4.0.3                 2340   61   62   100   44%  2388   16%
197 RomiChess NG4                2339   54   56   120   34%  2454   23%
198 GreKo 5.4                    2334   53   54   124   42%  2383   23%
199 Asterisk v0.6                2334   57   56   104   55%  2300   30%
200 Ant 2006-F                   2328   49   50   136   40%  2397   29%
201 GES 1.36                     2327   56   57   110   44%  2366   25%
202 Anechka 0.08                 2310   59   60   100   41%  2374   22%
203 KnightX 1.92                 2309   58   59   104   44%  2353   20%
204 Rotor 0.2                    2304   54   55   115   43%  2353   27%
205 Sage 2.2a                    2303   59   59   100   48%  2317   24%
206 Scidlet 3.6                  2303   58   58   110   53%  2282   15%
207 Natwarlal 0.14               2303   57   58   110   45%  2342   16%
208 LittleThought 1.01 x64       2302   59   59   100   44%  2355   25%

Thanks,
Sedat