Albert Silver wrote:
BGBlitz is an excellent program; however, it is still a bit behind GNU and Snowie, and even Frank acknowledges this. It is currently believed that its checker play is almost on par with them, but its cube algorithms, especially in match play, aren't quite as developed. Probably not a big deal, you'll say, and it is unlikely anyone below world-class level (as measured by GNU or Snowie) would ever notice any discrepancies.
That was true some years ago, but BGBlitz is constantly evolving. Since mid-2005 BGBlitz has used the same math as GnuBG for cube decisions. The only differences are that BGBlitz uses other heuristics to estimate cube liveness, and that BGBlitz looks at the cube only after the whole game tree has been evaluated, whereas GnuBG looks at the cube at every node of the tree. As a consequence, if the move with the highest equity is not the best move because of a subsequent cube decision, BGBlitz will still choose the move with the highest equity; and when the analysis shows that the equity justifies a double but there are no market losers, BGBlitz will double anyway. Such situations are rare, and the error in equity is usually small.
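The difference between the two approaches can be sketched roughly like this (a minimal illustration only; `evaluate` and `cube_bonus` are hypothetical stand-ins, not the actual code of either bot):

```python
# Sketch of the two move-selection strategies described above.
# evaluate(m) stands for a move's cubeless equity estimate;
# cube_bonus(m) stands for the equity gained from a later cube action.
# Both are hypothetical helpers for illustration.

def best_move_equity_only(moves, evaluate):
    """BGBlitz-style: pick the highest-equity move, look at the cube later."""
    return max(moves, key=evaluate)

def best_move_cube_aware(moves, evaluate, cube_bonus):
    """GnuBG-style: fold a cube adjustment into every candidate, so a
    slightly lower-equity move can win if it sets up a better double."""
    return max(moves, key=lambda m: evaluate(m) + cube_bonus(m))

# Example where the two disagree:
moves = ["a", "b"]
evaluate = {"a": 0.10, "b": 0.08}.get       # "a" has higher raw equity
cube_bonus = {"a": 0.00, "b": 0.05}.get     # but "b" sets up a double

print(best_move_equity_only(moves, evaluate))              # a
print(best_move_cube_aware(moves, evaluate, cube_bonus))   # b
```

As the post notes, positions where the two selections differ are rare, and the equity lost by the simpler strategy is usually small.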
Because all current bots lose much more equity on checker play than on cube decisions, I concentrate on checker play.
With the AI from 2006, BGBlitz was already very strong, albeit probably 10-15 rating points weaker than GnuBG and maybe 5-10 weaker than Snowie (if at all). Both values are, in my opinion, upper bounds. For example, a 4-ply analysis by GnuBG of the Amsterdam matches showed a much smaller difference (2-5 points, IIRC).
The recent AI from BGBlitz version 2.6 is a measurable improvement. I don't believe that anyone can currently seriously judge its strength, let alone quantify it in rating points. To answer the question, one would have to play a long series of matches. Torsten Schoop did this with 2-3 computers over about half a year (with the old AI of BGBlitz): 1000 25-point matches. The only statistically significant result was that Jellyfish plays weaker than the other three, although GnuBG came close to statistical significance as the best. Unfortunately there were some quirks in it (BGBlitz had an error in post-Crawford games, GnuBG used a lousy MET, Snowie beavered sometimes ???).
Albert Silver wrote:
As to measuring them, unless you play hundreds or thousands of matches, AND factor out the variance on every roll, you will never really know since the universe required is simply too big.
I guess one could do it, but it would require 1-4 years of computing time.
5000 7-point matches would give some indication.
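To see why even 5000 matches gives only "some indication", a back-of-the-envelope calculation of the statistical resolution helps (my own rough sketch, assuming independent matches and a win probability near 0.5, which is the worst case for the standard error):

```python
import math

# How precisely do 5000 matches pin down a bot's match-win probability?
n = 5000
p = 0.5                                   # worst case for the standard error
se = math.sqrt(p * (1 - p) / n)           # standard error of the observed win rate
ci95 = 1.96 * se                          # half-width of a 95% confidence interval

print(f"standard error:      {se:.4f}")   # about 0.0071
print(f"95% CI half-width:   {ci95:.4f}") # about 0.0139, i.e. roughly +/- 1.4%
```

So a skill gap worth less than roughly 1.4 percentage points of match-win rate would still be lost in the noise, which is why such experiments separate only clearly weaker bots (like Jellyfish in Torsten Schoop's trial) and not near-equals.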
Albert Silver wrote:
I conducted such an experiment (multi-ply analysis in 7-point matches) between GNU 0.14 and Snowie 4, with analysis by Joern Thyssen (who did the variance analysis), and GNU came out a fraction ahead (think 30 Elo at best),
I doubt the difference is that large.
Albert Silver wrote:
but it isn't so simple. For example, Snowie conducts prime-vs-prime games a bit better and cubes point games better, whereas GNU is much better in races (near perfect) and doesn't suffer from Snowie's bearoff cube issues. A needlessly technical commentary, but the point is that they both have their strengths and weaknesses. For the price and features, there is no comparison, however...
That's the point! Every current bot has blind spots, and taking any bot's analysis as gospel is misleading. Each bot has its strengths and weaknesses. Too bad that we have so few AIs, and only the BGBlitz AI seems to be developing further....