Rémi Coulom wrote: Very interesting. I am curious to see the results.
I had started to implement alternative models in bayeselo at the time I wrote the unfinished paper I posted here earlier. But I did not try to MM them. My plan was to use Newton's method or Conjugate Gradient. I don't expect it will be possible to apply MM to Glenn-David.
I recommend normalizing elo scales by having the same derivative at zero of the expected gain (p(win) + p(draw)/2). That's how I did it for the original bayeselo.
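The normalization Rémi describes can be sketched numerically: pick the scale factor so that the alternative model's expected-gain curve has the same slope at zero as the classical logistic Elo curve. This is a minimal sketch, assuming a Gaussian (Thurstone-style) curve purely as a stand-in alternative model; it is not the actual Glenn-David model, and all function names are my own.

```python
import math

def elo_expected_gain(d):
    # classical logistic Elo curve: expected score for rating difference d
    return 1.0 / (1.0 + 10.0 ** (-d / 400.0))

def gaussian_expected_gain(d):
    # hypothetical alternative model (Gaussian CDF), purely illustrative --
    # NOT the Glenn-David model
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))

def derivative_at_zero(f, h=1e-6):
    # central-difference approximation of f'(0)
    return (f(h) - f(-h)) / (2.0 * h)

# Rescale the alternative model's rating axis so both curves have the
# same slope at d = 0, i.e. one rating point means the same thing locally.
target = derivative_at_zero(elo_expected_gain)      # ln(10)/1600 for logistic Elo
raw = derivative_at_zero(gaussian_expected_gain)
scale = raw / target
normalized = lambda d: gaussian_expected_gain(d / scale)
```

With this rescaling, a small rating difference predicts (to first order) the same expected gain under either model, which is what makes ratings from different models comparable near equality.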
What seems very popular nowadays is that all sorts of engines do learning in a rather "hard" manner. By hard I mean: difficult to turn off.
Basically, most follow roughly this pattern: if you lose a line, or even draw it, pick another line; if you win a line, repeat it.
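The lose-or-draw-switch, win-repeat rule above can be sketched in a few lines. This is a toy sketch of the described pattern; the class name, data structures, and fallback behavior are my own assumptions, not taken from any particular engine's book learner.

```python
import random

class BookLearner:
    """Naive book learning: avoid lines that lost or drew, repeat winners."""

    def __init__(self, lines):
        self.lines = list(lines)    # candidate opening lines in the book
        self.blacklist = set()      # lines that lost or drew
        self.last_winner = None     # line that won the previous game

    def pick_line(self):
        if self.last_winner is not None:
            return self.last_winner            # win: repeat the same line
        choices = [l for l in self.lines if l not in self.blacklist]
        # if everything is blacklisted, fall back to the whole book
        return random.choice(choices or self.lines)

    def report_result(self, line, score):
        # score: 1 = win, 0.5 = draw, 0 = loss
        if score == 1:
            self.last_winner = line
        else:
            self.last_winner = None
            self.blacklist.add(line)           # loss or draw: pick another line
```

Even this trivial rule makes successive games statistically dependent: the line played in game N is a function of the results of games 1..N-1.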
A small difference in objective Elo strength can already completely dominate the outcome of the match and enlarge the measured difference.
This still assumes the same book for both engines, of course, which isn't realistic either.
One of the reasons for the huge difference in outcome is simply the fact that most books have a very thin "tournament book line". Only a few moves are inside it.
If an opposing engine happens to be stronger in one of those tournament lines, then all lines around it probably have a similar outcome as well (an assumption). That suddenly renders the tournament book much less useful. Of course you can then move to an entirely different opening, which is what most book learners have been doing for 15+ years.
You then often end up in old sidelines.
A problem with old sidelines is that there usually is a refutation, or at least some line that gives high practical chances of beating the old line.
In short, winning the first few games of a match in different openings is really important.
For computer chess it would be important to model this. How would you do that?
It's pretty important in this respect: if you play a 3000-game match, as Ernst A. Heinz did years ago, in a GUI where you cannot turn off learning (in Fritz you could turn it off for one game, but in the second and later games it would be on again), then the whole 3000-game match gets heavily influenced by learning. In short, an objective statistical measurement that assumes 3000 independent games is not even close to what happens there in practice.
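A toy simulation shows how learning breaks the independence assumption and inflates a small objective edge. Everything here is invented for illustration: each book line has a fixed, deterministic result for engine A, and the learner switches lines after a loss and repeats after a win, per the pattern described above.

```python
def play(line_results, n_games, learning):
    """Simulate a match. line_results[i] is engine A's fixed result
    (1 = win, 0 = loss) whenever book line i is played; deterministic
    purely for illustration."""
    score, line = 0, 0
    for _ in range(n_games):
        r = line_results[line]
        score += r
        if learning:
            if r == 0:
                # lost this line: pick another one next game
                line = (line + 1) % len(line_results)
            # won: repeat the same line next game
        else:
            # no learning: just cycle through the book
            line = (line + 1) % len(line_results)
    return score

results = [0, 0, 1, 0, 0]   # A wins only one line out of five (20%)
print(play(results, 100, learning=False))  # -> 20: reflects objective strength
print(play(results, 100, learning=True))   # -> 98: locks onto the winning line
```

With learning on, the "measured" score bears almost no relation to the 20% objective strength: the engine loses two games probing the book, finds its one winning line, and repeats it 98 times. A statistical model that treats the 100 games as independent trials is badly wrong here.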
P.S. Several engines still learn even when you turn learning off in their UCI settings. Note that book learning overlaps here with position learning.
Here on CCC I have seen several people post engine outputs from positions where they all fall for the learning trick.
It's really, really effective at fooling even the most advanced users.
Even in the random book matches people get fooled.
Think of this: you have a score for a position P, with further moves P+1, P+2, etc. You lose that position with the white pieces because of response X from the opponent.
Next game the colors are reversed (matches get played like that a lot nowadays), and the stored score now simply gets used in your own engine's hashtable.
It might not help much, but sometimes it does; engines are world champions at making similar mistakes, especially with today's nearly identical evaluation functions across many engines.
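The position-learning mechanism described above can be sketched as a persistent score table that seeds the engine's hashtable in the next game. This is a deliberate oversimplification; the function names, the centipawn units, and the fixed penalty are all my own assumptions, not any real engine's implementation.

```python
# Learned scores persisted between games: position hash -> score in
# centipawns. All names and numbers here are invented for illustration.
learned = {}

def record_loss(pos_hash, score_cp, penalty_cp=150):
    # After a loss, remember the critical position with a penalized score,
    # so the search steers away from it next time.
    learned[pos_hash] = score_cp - penalty_cp

def probe(pos_hash, static_eval_cp):
    # When seeding the hashtable for the next game, prefer the learned
    # score over the engine's own static evaluation.
    return learned.get(pos_hash, static_eval_cp)
```

Because the table survives between games, the score stored after a loss with White directly shapes the search when the same position is reached with reversed colors in the next game, which is exactly the dependence between games described above.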
So even position learning influences the outcome. It's not just plain book learning; it's a whole range of tricks.
How do you model that?