Variance reports for testing engine improvements

nczempin · Post by **nczempin** » Wed Sep 19, 2007 6:21 pm

hgm wrote:Well, you should not really see the different micro-Maxes as versions of the same engine. Some are more like parallel developments, optimized according to different criteria. I made uMax 1.6 after uMax 4.4, in the development line aimed at making the smallest engine (source-code wise), while the uMax 4 series aims at maximal Elo/character. It is very possible that there will be a uMax 1.7, and not a uMax 4.9, so that the far weaker uMax 1.7 would be the latest version...

So far uMax 4.8 leads Eden 17.5-0.5 in the Silver match at 1'+3".

Reminds me of Monty Python's Holy Grail: "Okay, we'll call it a draw."

Oh, and remember that this is not sufficient evidence that uMax 4.8 is stronger than Eden. You need 20,000 more games for that. SCNR
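
To put a rough number on that: 17.5-0.5 over 18 games can only be 17 wins, 1 draw and 0 losses. Below is a minimal sketch of a one-sided sign test on the decisive games. It assumes the games are independent and simply drops the draw, and the function name is only for illustration.

```python
from math import comb  # math.comb needs Python 3.8+


def sign_test_p_value(wins: int, losses: int) -> float:
    """One-sided sign test on decisive games (draws are discarded).

    Returns the probability of seeing at least `wins` wins among
    wins + losses decisive games if both engines were equally strong,
    assuming the games are independent.
    """
    n = wins + losses
    favourable = sum(comb(n, k) for k in range(wins, n + 1))
    return favourable / 2 ** n


if __name__ == "__main__":
    # 17.5-0.5 means 17 wins, 1 draw, 0 losses; the draw is dropped.
    p = sign_test_p_value(wins=17, losses=0)
    print(f"chance of a 17-0 sweep between equal engines: {p:.2e}")
```

Under those assumptions the chance is 1 in 2^17, so a lopsided result like this needs very few games to be convincing. It is the small differences that need the huge samples.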

nczempin · Post by **nczempin** » Wed Sep 19, 2007 6:25 pm

hgm wrote:Well, you should not really see the different micro-Maxes as versions of the same engine. Some are more like parallel developments, optimized according to different criteria. I made uMax 1.6 after uMax 4.4, in the development line aimed at making the smallest engine (source-code wise), while the uMax 4 series aims at maximal Elo/character. It is very possible that there will be a uMax 1.7, and not a uMax 4.9, so that the far weaker uMax 1.7 would be the latest version...

Yes, I am aware of some of your ideas behind the micromaxes. However, until you introduce some kind of name change that would prompt e.g. Olivier to include two separate versions in his tourneys, it would invalidate my policy.

Of course, no-one can keep me from violating my policy, especially since you are so insistent.

hgm · Post by **hgm** » Wed Sep 19, 2007 8:22 pm

If I change the name, it would be considered a clone...

nczempin · Post by **nczempin** » Wed Sep 19, 2007 8:35 pm

hgm wrote:If I change the name, it would be considered a clone...

nczempin · Post by **nczempin** » Wed Sep 19, 2007 9:03 pm

Now that I have had some time to think about it, it is actually very desirable to use engines in my tests that are more likely to show deterministic results, such as the various stable versions of umax, even when those versions are not likely to be opponents in any of the public tournaments.

1. There is some of umax 1.6 in umax 4.8 (even if it is different, there will be a minimal set of common features), so playing against it is good preparation for when I do catch you (and that is inevitable eventually, because your self-imposed code size restrictions will not allow you to go beyond a certain point).
2. As long as umax is not the only opponent, the possible bias of optimizing my program against umax (a real danger because my opening book currently has a small number of "cooked lines"... or whatever it was that Bob called them) and not the general population is minimal. I think a little non-determinism doesn't hurt, even if only because it motivates me to watch more of the games.

nczempin · Post by **nczempin** » Thu Sep 20, 2007 11:00 am

The first match in the 40-positions-test between Alchess and Eden is finished. The result is 67-13. I'll be very curious to see how that'll change in the next match. And of course the two matches after that.
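
For a rough idea of what 67-13 means in Elo terms, here is a minimal sketch using the usual logistic Elo formula. The win/draw/loss split is not given, so the error bar uses p(1-p)/n, which is the largest variance a per-game score with mean p can have (draws only make it smaller). It also assumes the 80 games are independent, which is exactly what is in doubt with near-deterministic engines. The function names are hypothetical.

```python
from math import log10, sqrt


def elo_diff(score: float) -> float:
    """Elo difference implied by an expected score strictly between 0 and 1."""
    return 400.0 * log10(score / (1.0 - score))


def match_estimate(points_for: float, points_against: float, z: float = 1.96):
    """Return (elo, elo_low, elo_high) for a match result.

    Uses a normal approximation with the worst-case per-game variance
    p * (1 - p), since the number of draws is unknown.
    """
    n = points_for + points_against      # total points equals games played
    p = points_for / n                   # score fraction
    se = sqrt(p * (1.0 - p) / n)         # upper bound on the standard error
    eps = 1e-9                           # keep the bounds inside (0, 1)
    low = max(eps, p - z * se)
    high = min(1.0 - eps, p + z * se)
    return elo_diff(p), elo_diff(low), elo_diff(high)


if __name__ == "__main__":
    elo, lo, hi = match_estimate(67, 13)
    print(f"Elo difference: {elo:+.0f} (95% interval {lo:+.0f} to {hi:+.0f})")
```

With these assumptions the 67-13 result comes out at roughly +285 Elo for Alchess, with an interval of roughly 200 to 420. How much the repeat matches deviate from that will say something about whether the independence assumption holds.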

On the other hand, I am running out of stamina. I am getting so fed up with R. H.'s condescending and arrogant comments that I think it'll be better for me to stay away from this forum for a while.

bob · Post by **bob** » Fri Sep 21, 2007 7:12 am

nczempin wrote:The first match in the 40-positions-test between Alchess and Eden is finished. The result is 67-13. I'll be very curious to see how that'll change in the next match. And of course the two matches after that.

On the other hand, I am running out of stamina. I am getting so fed up with R. H.'s condescending and arrogant comments that I think it'll be better for me to stay away from this forum for a while.

I like that. My "arrogance" as opposed to your "paranoia". That somehow every comment I make is directed explicitly toward your program. Again, "grow up" comes to mind... If you don't mention my name explicitly again, I won't respond to another post of yours, period, whether I know the answer or have an idea how to find it.

It does say a lot about your arguments, however, when all you can do is take cheap pot-shots as above, instead of offering interesting or useful observations... You just get defensive, dig in, and refuse to listen or consider that maybe, every now and then, not everything revolves around _your_ program...

The search + piece-square hypothesis is one good example, but there are others...
