bob wrote:
> If all you are doing is dealing with performance issues, then no testing is necessary. Any increase in NPS will either make no difference, or it will improve the program. It can't possibly hurt, assuming you introduced no bugs.

nczempin wrote:
> You made some very valid comments.

bob wrote:
> ...
I guess for now I just have to say: my engine is severely NPS-challenged and time-to-depth-challenged compared to other engines at a similar level. The changes I am making are 99% pure optimizations that should, for the most part, make the engine stronger (except in those rare cases where searching deeper causes you to dismiss the better move you would have found by searching deeper still).
So under this condition it is mainly a question of: have I optimized enough to gain a whole ply more on average (yes, my engine still has a lot of potential in that area)? And when that is the case, my tests are there to find out whether that extra ply was actually significant (which it doesn't have to be).
I am not changing the eval or the move ordering, and I am not introducing any known techniques such as null-move. All I'm doing is finding bottlenecks: changing Java objects into primitives, representing "blackness" with a bit instead of a value < 0, etc.
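To illustrate the kind of optimization meant here, a minimal hypothetical sketch (my own illustration, not Eden's actual code): instead of a heap-allocated piece object, or marking black pieces with a negative value that needs a sign check, pack the piece type and a color bit into a single `int`.

```java
// Hypothetical sketch of the object-to-primitive optimization described above.
public final class PieceEncoding {
    // Instead of: class Piece { int type; boolean black; }  -- one heap object
    // per piece, with allocation, header, and pointer-chasing costs -- pack
    // everything into one primitive int.
    static final int BLACK_BIT = 0b1000; // bit 3 marks "blackness"
    static final int TYPE_MASK = 0b0111; // bits 0-2 hold the piece type

    static int makePiece(int type, boolean black) {
        return black ? (type | BLACK_BIT) : type;
    }

    static boolean isBlack(int piece) {
        // Single AND instruction; no sign check, no object dereference.
        return (piece & BLACK_BIT) != 0;
    }

    static int typeOf(int piece) {
        return piece & TYPE_MASK;
    }

    public static void main(String[] args) {
        int knight = 2; // hypothetical type code
        int blackKnight = makePiece(knight, true);
        System.out.println(isBlack(blackKnight)); // true
        System.out.println(typeOf(blackKnight));  // 2
    }
}
```

The payoff in a move generator is that pieces live in plain `int[]` arrays instead of object arrays, which avoids garbage-collector pressure and keeps the board representation cache-friendly.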
So I guess this factor pretty much makes the previous discussions on Eden meaningless, or at least any conclusions that anyone would like to draw from them.
And yet the question remains: why does my approach still seem to work? Who is willing to test my hypothesis that each version of Eden is stronger than the preceding one, even under the conditions you propose to be necessary?
I would also like to make one thing clear: I am very well aware of statistical issues such as random fluctuations (not solely because I sometimes play poker), and especially of the fact that the human mind seems, by default, unable to deal with them.
I'm normally the guy who says, in that joke I'm sure you've heard: "No, you can't say that all sheep in Scotland are black, nor even that at least one sheep is; the only thing you can say is that at least one sheep in Scotland is black on at least one side."
I'm always the guy who points out that, e.g., salesperson competitions are more or less meaningless because random fluctuations completely dominate whatever skill there might be.
I always shrug off "amazing" events that I easily determine to be very possibly caused simply by randomness.
I may even lean too far toward randomness, being very skeptical even of scientific articles that claim to have found this or that correlation, or even causation.
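The point about fluctuations dominating small samples applies directly to engine matches. A minimal sketch (my own illustration, not from the discussion): the standard error of a match score over n games shrinks only with the square root of n, so short matches say very little.

```java
// Hypothetical sketch: how noisy is an engine-match score?
// For a per-game score probability p, the standard error of the observed
// score fraction over n games is sqrt(p * (1 - p) / n).
public final class MatchNoise {
    static double standardError(double p, int games) {
        return Math.sqrt(p * (1 - p) / games);
    }

    public static void main(String[] args) {
        // With p = 0.5 and 20 games, one standard error is about 11 percentage
        // points, so a 55% result is well within pure noise.
        System.out.printf("SE over 20 games:   %.3f%n", standardError(0.5, 20));
        // With 2000 games it drops to about 1.1 points.
        System.out.printf("SE over 2000 games: %.3f%n", standardError(0.5, 2000));
    }
}
```

This is why a version that scores a few extra points in a short match may be no stronger at all, while a real but small improvement needs thousands of games to show up reliably.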
I take an interest in Statistical Process Control even to the extent of owning (although not yet having worked through) Shewhart, W A (1939) Statistical Method from the Viewpoint of Quality Control, plus Deming and lots of other stuff that precedes Watts Humphrey's work on Software Engineering Processes.
So...
It feels weird when I'm being placed "on the other side".
But for eval/search/etc. changes, much more care is needed, or you will be throwing away good changes and keeping bad ones, all based on random results.