Martin Thoresen wrote: Swami, reading up on your SF 1.7.1 test, you wrote this:
Stockfish 1.7.1 replaced Critter as new #1 with 796/1000.
So basically 1.8 scored 9 lower than 1.7.1?
On total score, yes. But on individual scores, Stockfish 1.8 does better than Stockfish 1.7.1 on some themes. I hope that those themes count more.
One can't take the total score alone and conclude that one version is better than the other.
If, for example, one version does a lot better in a certain theme (let's call it STS 5) but does worse in total score, it may even play better than the other, because STS 5 may be more important than the other STS themes. We just don't know which themes are important.
It was tested on my old CPU (a Q6600 at 2.4 GHz, 32-bit) at 10 seconds per position.
I hope someone with better hardware tests both versions. Stronger engines obviously should be tested on better hardware and at intermediate time controls (minimum of 30 seconds per position).
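To illustrate the point about total score versus theme importance, here is a minimal sketch. The theme names, per-theme scores, and weights are all made up for illustration; they are not real STS results, and nobody in this thread knows the true weights.

```python
# Hypothetical per-theme scores (NOT real STS data), one entry per theme.
old_version = {"theme_a": 80, "theme_b": 75, "theme_c": 70}
new_version = {"theme_a": 70, "theme_b": 72, "theme_c": 80}

# Hypothetical weights: assume theme_c matters most for practical play.
weights = {"theme_a": 0.5, "theme_b": 1.0, "theme_c": 2.0}


def total_score(scores):
    """Plain sum over all themes, as in the headline STS total."""
    return sum(scores.values())


def weighted_score(scores, theme_weights):
    """Sum of per-theme scores, each scaled by its assumed importance."""
    return sum(theme_weights[theme] * score for theme, score in scores.items())


print(total_score(old_version), total_score(new_version))
print(weighted_score(old_version, weights), weighted_score(new_version, weights))
```

With these invented numbers the old version has the higher plain total, but the new version comes out ahead once the themes are weighted. Whether anything like that holds for Stockfish 1.8 depends on weights we don't know.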
Yes, it seems definitely weaker than 1.7.1.
This is interesting, let's see how it goes in normal games. Anyhow, as I said, we don't foresee an important gain over 1.7.1, just a small one...
Hi Marco,
I can tell you one thing: I have done many tests on updates of other engines. So far, the STS results seem fairly consistent in telling us how much a version has improved over the previous one.
It also predicts the rough rating of the new engine.
It worked for 95% of the engines.
The remaining 5% are tough to predict, especially engines that are well beyond 3000. The total score sometimes tells, sometimes it doesn't, especially at 10 seconds per position and on somewhat slow hardware. So I hope this is the case and that Stockfish 1.8 is indeed a few Elo better than Stockfish 1.7.1.
mcostalba wrote: we don't foresee an important gain from 1.7.1, just a small one...
actually iirc the initial word was no gain in 1.7 but big gains after
btw, idk if you saw on the other forum but my match is finished, +22 elo for 1.8 not so far off from where you think the gain is, but i shd just forget hyperbullet eh?
Hyperbullet gives a (too) big premium to the fastest version, and 1.8 is a bit faster than 1.7.1, but with longer TC this premium greatly decreases.
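For a sense of scale, the +22 Elo figure quoted above can be turned into an expected score per game with the standard logistic Elo formula. This is only a sketch of that conversion; the function name is made up here, and it says nothing about the match itself (game count, time control, and error margins are not given in the thread).

```python
def expected_score(elo_diff):
    """Expected score per game (0..1) for the side that is elo_diff stronger,
    under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# +22 Elo is the difference reported for 1.8 over 1.7.1 in the hyperbullet match.
print(round(expected_score(22), 4))
```

That works out to roughly 53% of the points, i.e. a small edge that takes a lot of games to measure reliably.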