IPON

mjlef · Post by **mjlef** » Wed May 24, 2017 4:29 pm

IWB wrote:
JJJ wrote: If contempt doesn't change the Elo in the rating list that much, maybe the default contempt should be set to 0 instead of 10; the rating list would be more accurate to me. I'd prefer to see the real score of Komodo against Stockfish and Houdini instead of a lower one due to contempt.

I am with you here, but the developers decided differently ... and as it really doesn't matter that much ... ¯\_(ツ)_/¯

Since users can set Contempt to any value they want, anyone can test however they wish. The value of 10 was selected since most programs are weaker than Komodo. It is meant to mimic what a Grandmaster would do against a weaker opponent. With a positive Contempt, Komodo tries to avoid exchanging pieces, keep the position open, and avoid draws. I feel this is a valuable feature, and testing shows it helps against weaker opponents. A negative Contempt is useful for getting Komodo to seek draws and lock up the board, which helps some correspondence players who "only need a draw". In a match against strong opponents we would use a small number or even zero. If more GUIs supported sending Elo ratings to the engines involved, Komodo could decide for itself what Contempt to use. Since humans normally know their opponents, it seems unfair for the engines not to get this information.
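
As a rough illustration of letting ratings drive Contempt, here is a minimal Python sketch. The UCI `setoption name ... value ...` syntax and the optional `UCI_Opponent` option (title, Elo, computer/human, name) come from the UCI protocol; the option name `Contempt` is taken from the post above. Everything else (`choose_contempt`, the 20-Elo step, the cap of 10) is a hypothetical rule made up for illustration, not Komodo's actual logic.

```python
from typing import Optional


def parse_opponent_elo(value: str) -> Optional[int]:
    """Extract the Elo field from a UCI_Opponent value string.

    Expected layout: <title|none> <elo|none> <computer|human> <name>
    Returns None when the Elo is missing or not a number.
    """
    parts = value.split(maxsplit=3)
    if len(parts) < 3 or parts[1] == "none":
        return None
    try:
        return int(parts[1])
    except ValueError:
        return None


def choose_contempt(own_elo: int, opp_elo: Optional[int],
                    default: int = 10, max_contempt: int = 10) -> int:
    """HYPOTHETICAL rule: zero against equal or stronger opponents,
    one point of contempt per 20 Elo of advantage, capped at max_contempt."""
    if opp_elo is None:
        return default  # no rating supplied: fall back to the default
    diff = own_elo - opp_elo
    if diff <= 0:
        return 0
    return min(max_contempt, diff // 20)


def contempt_command(value: int) -> str:
    """Build the UCI command a GUI or script would send to the engine."""
    return f"setoption name Contempt value {value}"


# Example: an assumed own rating of 3350 against a 3300-rated engine.
opp = parse_opponent_elo("none 3300 computer Stockfish")
print(contempt_command(choose_contempt(3350, opp)))
```

Note that the sketch treats both ratings as if they came from the same rating list, which is itself an assumption.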

IWB · Post by **IWB** » Wed May 24, 2017 4:38 pm

I don't mind the contempt, but I am unsure about the rating. At first I thought it was a good idea, but what do you do if you get 3300 as the Elo and nothing else?

Is that SSDF, CEGT, IPON or CCRL? Without a common standard this is a lot of guesswork.

lkaufman · Post by **lkaufman** » Wed May 24, 2017 10:12 pm

IWB wrote: I don't mind the contempt, but I am unsure about the rating. At first I thought it was a good idea, but what do you do if you get 3300 as the Elo and nothing else?

Is that SSDF, CEGT, IPON or CCRL? Without a common standard this is a lot of guesswork.

Also the number of Threads used makes a big difference to the rating, so I think it is impractical to state a rating for each engine to report.

Regarding using Contempt as default: if the rating agencies switch to "WILO" or something like it, or if they make the pairings close by giving the weaker engines more Threads (as CCRL and CEGT do on occasion) and rating them as such, we would switch to zero contempt as default. But as it is, I think we gain slightly more Elo using Contempt against the weaker engines than we lose against Stockfish and Houdini, though it's no big deal. I think a bigger issue is that we get higher performance ratings against Houdini than against Stockfish, for unknown reasons. I hope to be able to document this soon. It means that direct matches with Stockfish, without also including matches with Houdini, make Komodo look worse than it really is. This is independent of Contempt.

Dann Corbit · Post by **Dann Corbit** » Wed May 24, 2017 10:55 pm

Some of the many variables in testing:
Thread count
Hardware
Time control
Books used/not used
Operating System
SSD or not for tablebase files (or no tablebase files)
Hash size
I have also seen that some engines have an evil adversary that scores against them better than it should. We may call it "the antagonist" engine.

There are so many variables that we can have nothing but educated guesses in the long run.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu May 25, 2017 9:40 am

IWB wrote:
JJJ wrote: If contempt doesn't change the Elo in the rating list that much, maybe the default contempt should be set to 0 instead of 10; the rating list would be more accurate to me. I'd prefer to see the real score of Komodo against Stockfish and Houdini instead of a lower one due to contempt.

I am with you here, but the developers decided differently ... and as it really doesn't matter that much ... ¯\_(ツ)_/¯

I know Larry and Mark are going to hate me for this, but why don't you (as a purely personal experiment, in brackets) do a second Komodo test, this time with contempt set to 0?

I guess no one has done this until now, and it will be very interesting to compare the results in all ranges.

The Komodo team might lose out somewhat, but science will win, you know.

IWB · Post by **IWB** » Thu May 25, 2017 11:12 am

Lyudmil Tsvetkov wrote: I know Larry and Mark are going to hate me for this, but why don't you (as a purely personal experiment, in brackets) do a second Komodo test, this time with contempt set to 0?

I guess no one has done this until now, and it will be very interesting to compare the results in all ranges.

The Komodo team might lose out somewhat, but science will win, you know.

I did exactly that with a known outcome:

http://talkchess.com/forum/viewtopic.ph ... 65&t=64054

I will not do it again just to find a difference of 2 Elo - again!

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu May 25, 2017 1:54 pm

IWB wrote:
Lyudmil Tsvetkov wrote: I know Larry and Mark are going to hate me for this, but why don't you (as a purely personal experiment, in brackets) do a second Komodo test, this time with contempt set to 0?

I guess no one has done this until now, and it will be very interesting to compare the results in all ranges.

The Komodo team might lose out somewhat, but science will win, you know.

I did exactly that with a known outcome:

http://talkchess.com/forum/viewtopic.ph ... 65&t=64054

I will not do it again just to find a difference of 2 Elo - again!

Then contempt 0 is simply best for Komodo, and they do not need contempt at all, except to play against humans.

But I guess your test was not very representative in some way.

How is it possible to lose points against a dozen opponents, gain against only 2, and still break even?

I guess the difference should be at least some 10 points or so.

IWB · Post by **IWB** » Thu May 25, 2017 3:27 pm

Lyudmil Tsvetkov wrote:
Then contempt 0 is simply best for Komodo, and they do not need contempt at all, except to play against humans.

But I guess your test was not very representative in some way.

How is it possible to lose points against a dozen opponents, gain against only 2, and still break even?

I guess the difference should be at least some 10 points or so.

I am not sure if you really read the news on my site!?

Anyhow, I have 3300 games with C0 and 3300 with C-default, and you can guess whatever you like. This way we are all happy.
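
To put a 2 Elo difference over 3300 games per setting in perspective, the sketch below estimates the statistical error margin of a rating measured from a fixed number of games. The 70% score and 40% draw rate are assumed values chosen only for illustration; they are not taken from the IPON results. The logistic Elo formula is the standard one, and the function names are hypothetical.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by an average score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin(score: float, draw_ratio: float, games: int,
               z: float = 1.96) -> float:
    """Approximate half-width (in Elo) of a z-sigma confidence interval.

    The per-game variance is computed from the win/draw/loss split implied
    by the average score and the draw ratio.
    """
    win = score - draw_ratio / 2.0
    loss = 1.0 - win - draw_ratio
    variance = (win * (1.0 - score) ** 2
                + draw_ratio * (0.5 - score) ** 2
                + loss * (0.0 - score) ** 2)
    std_err = math.sqrt(variance / games)
    low = elo_from_score(score - z * std_err)
    high = elo_from_score(score + z * std_err)
    return (high - low) / 2.0


# ASSUMED inputs: 70% score, 40% draws, 3300 games per contempt setting.
margin = elo_margin(score=0.70, draw_ratio=0.40, games=3300)
print(f"95% margin per run: +/- {margin:.1f} Elo")
```

Under these assumed inputs the margin is roughly ±9 Elo for each run, so a 2 Elo gap between the two runs would sit well inside the noise.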
