more on fixed nodes
Posted: Tue Nov 10, 2009 11:06 pm
Here is one point to ponder (I started a new thread as the others quickly get too cumbersome to follow).
Has anyone thought about _why_ I raised this issue originally? The issue where a change can push the game toward or away from an area where you lose or gain speed, which distorts the results?
(1) Do you believe that I spent weeks trying to find a reason why this was a bad idea?
(2) Did any of you read my _original_ discussion on cluster testing where we were trying to understand the really wild variability, even when playing the same starting position and same two opponents? Did you read the stuff about how playing the same two opponents, same starting position, 100 times, each time allowing just one opponent to search one more node than in the previous run, generally produces 100 _different_ games? And did anyone notice where I mentioned that for some odd reason, playing a fixed number of nodes, while producing repeatable results, produced results that were _significantly_ different from timed matches?
I actually spent a lot of time trying to figure out why, and I found the speedup/slowdown issue was the culprit. Fixed-node games give a bias to the program that searches at a lower overall NPS than its opponent, because fixed nodes treat all nodes as equal, even though one program expends more effort per node than the other. I tried the "adjustment" approach, which I explained way back when: I did a few test runs and came up with an average NPS for each program in the test, and the fixed node counts were adjusted so that each program took about the same amount of time. That changed the results in unexpected ways, because my "average NPS" ignored endgames, and the program that speeds up the most gets penalized: in fixed-node testing, all nodes are treated equally, and there is no adjustment to the node count as the game progresses.
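The adjustment idea and its flaw can be sketched in a few lines. This is a hypothetical illustration, not code from any real engine or tester: the engine names, NPS figures, and per-phase speedup are made-up numbers chosen to show how scaling a fixed node budget by one average NPS still misallocates wall-clock time once per-phase speed diverges from that average.

```python
def adjusted_node_budget(avg_nps, target_seconds):
    """Fixed-node limit that should cost ~target_seconds at avg_nps."""
    return int(avg_nps * target_seconds)

# Two imaginary engines; engine_B searches fewer nodes/sec on average,
# so its budget is scaled down to equalize (average) thinking time.
avg_nps = {"engine_A": 2_000_000, "engine_B": 1_000_000}
budget = {e: adjusted_node_budget(n, 10.0) for e, n in avg_nps.items()}

# The flaw: a single average hides phase-dependent speed.  Suppose
# engine_A doubles its NPS in the endgame while engine_B stays flat.
phase_nps = {
    "engine_A": {"middlegame": 2_000_000, "endgame": 4_000_000},
    "engine_B": {"middlegame": 1_000_000, "endgame": 1_000_000},
}

def actual_time(engine, phase):
    """Wall-clock seconds the fixed budget costs in a given game phase."""
    return budget[engine] / phase_nps[engine][phase]

# In the middlegame both engines spend ~10 s on their budget, but in
# the endgame engine_A burns its budget in half the time -- the fixed
# node count denies it the extra depth a real clock would grant.
for engine in ("engine_A", "engine_B"):
    for phase in ("middlegame", "endgame"):
        print(f"{engine} {phase}: {actual_time(engine, phase):.1f} s")
```

Under a real clock the faster endgame searcher would convert its speedup into deeper searches; under fixed nodes it simply finishes early, which is exactly the penalty described above.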
None of this is really new, and it wasn't something I dreamed up; it was something that took weeks to figure out. And after seeing the skewing, consider how eval changes can screw this up: just add a trade bonus to reach endgames quicker, and the program that speeds up the most in the endgame gets penalized and drops in overall score, only because it can't take advantage of the endgame speedup it would normally see.
None of this is made up. It was quite apparent. And I didn't like it, because I don't want something affecting the results that can't be easily quantified and factored out after the results are in.
Hope that helps as to where I am coming from on this. Fixed nodes are still nice because of repeatability, which makes debugging much simpler. But that is the _only_ advantage I can see for them when using real-world engines with significant NPS variation over the course of a game.