Don wrote: bob wrote: michiguel wrote: bob wrote:
hgm wrote: Decoupling the measurement of strength and speed is very useful. To know whether a change improves my engine in time-based play, I would be obliged to implement the change in the maximally optimized way, using the cleverest algorithm. That would require a lot of effort, and it might all be wasted, because the idea might turn out to be a bust even in node-based play.
Testing first in node-based play allows me to use the quickest and dirtiest solution I can imagine: I just hack it in without paying any attention to efficiency at all. The node-based play then tells me how much the idea is worth, independent of the quality of the implementation.
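To make that distinction concrete, here is a minimal sketch of the two stopping rules in C, assuming a search that checks its budget at every node. All names here are illustrative, not taken from any particular engine.

/* Minimal sketch of the two stopping rules under discussion. All names
 * (nodes_searched, node_limit, elapsed_ms, time_limit_ms) are illustrative. */
#include <stdint.h>

uint64_t nodes_searched;    /* incremented once per node visited          */
uint64_t node_limit;        /* fixed-node testing: same budget every move */
int      time_limit_ms;     /* timed testing: budget is wall-clock time   */
int      elapsed_ms(void);  /* milliseconds since the search started      */

/* Fixed-node play: a slow, hacked-in prototype gets to search exactly as
 * many nodes as a tuned implementation would, so the result measures the
 * idea itself, not the quality of the code. */
int out_of_budget_nodes(void) {
    return nodes_searched >= node_limit;
}

/* Timed play: the same prototype searches fewer nodes per move the slower
 * it runs, so strength and speed are measured together. */
int out_of_budget_time(void) {
    return elapsed_ms() >= time_limit_ms;
}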
Or not, as I have previously explained, because programs are not constant in their NPS over the course of a game. If your eval change pushes the game toward positions where you are slower, or where your opponent is faster, you get a hidden time advantage under fixed nodes that you never accounted for. That makes the change look good when it might actually be worse in real timed games, and vice versa.
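A small back-of-envelope illustration of that hidden effect, with invented NPS figures (the 3x spread is only an example):

/* At a fixed node budget, the wall-clock cost of a move depends on where
 * in the game you are. The NPS figures below are invented for illustration. */
#include <stdio.h>

int main(void) {
    const double node_budget    = 10e6;   /* fixed nodes per move                      */
    const double nps_middlegame = 1.0e6;  /* assumed: engine slows in complex positions */
    const double nps_endgame    = 3.0e6;  /* assumed: same engine, simpler positions    */

    printf("middlegame move: %.1f s\n", node_budget / nps_middlegame); /* 10.0 s */
    printf("endgame move:    %.1f s\n", node_budget / nps_endgame);    /*  3.3 s */
    /* An eval change that steers games toward the faster phase gets an
     * implicit time bonus that fixed-node results never show. */
    return 0;
}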
In experimental science, many preliminary experiments are performed not to collect data, but to get an idea of what other experiments (if any) should follow.
Testing with nodes may fall into that category. You keep pointing out the defects of preliminary experiments, when they are not supposed to be perfect in the first place (if there is such a thing in experimental science...).
And from that info I can then estimate how large a speed hit an implementation of the idea could afford. That gives me a pretty clear impression of whether it is feasible or not. But most of the time you don't even get to that stage, so it saves tons of time.
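A sketch of that estimate, assuming the commonly quoted rule of thumb that a doubling of speed is worth somewhere around 50-100 Elo; the exact figure is an assumption here, not a measured constant:

/* Back-of-envelope: if an idea gains E Elo at fixed nodes, how much of a
 * slowdown can its real implementation afford? Both numbers below are
 * assumed example values. */
#include <math.h>
#include <stdio.h>

int main(void) {
    const double elo_gain_fixed_nodes = 20.0; /* assumed measured gain  */
    const double elo_per_doubling     = 70.0; /* assumed rule of thumb  */

    /* Break-even: the idea pays for itself as long as the slowdown factor
     * stays below 2^(gain / elo_per_doubling). */
    double max_slowdown = pow(2.0, elo_gain_fixed_nodes / elo_per_doubling);
    printf("affordable slowdown: up to %.0f%%\n", (max_slowdown - 1.0) * 100.0);
    /* ~22% here: if the real implementation costs more than that, the idea
     * is a net loss in timed play despite being a fixed-node win. */
    return 0;
}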
What I am trying to point out is that you learn _more_ from using actual time controls than you learn from fixed-node searches. Except perhaps for _very_ simple-minded programs that have a fairly constant speed throughout the game, and when using a set of opponents with that same characteristic. But if a program varies its speed by 2x or 3x over the course of a game, fixed-node searches _will_ introduce an unexpected (and nearly undetectable) bias, for the reasons I have given.
I don't really care whether an idea is good or bad if it can't be implemented in a way that makes it useful speed-wise. That just doubles the work: the idea looks good under a somewhat defective testing methodology, only to fail miserably in timed matches because it is too slow. Who in the computer chess world really writes code without regard to speed? He who does, continually rewrites. Good design up front addresses both correctness and speed at the same time. There is no point in wasting test time on something you already know can't be done efficiently.
Good design up front is a myth. How many years have you been working on Crafty? Why didn't you just design it correctly in the first place and have it done in a week or two of coding? You've been fooling with that thing for years!
There we disagree. The basic structure of Crafty has not changed in 15 years. It is still a bitboard-based approach. It has evolved from rotated bitboards to magic bitboards, as one very small change. Small because those parts of the program were designed so that bitboard attack generation is encapsulated, making it truly independent of the code that uses that information.
One thing I can tell you I didn't do is design it sloppily from the get-go just to get it working. I spent time on each and every part so that the original implementations were as good as I could make 'em, so that I understood what interacted with what and how I could make those interactions more efficient, and so forth.
Good design certainly doesn't happen by accident, I agree. But it _does_ happen. There is no way one could have predicted the development of magic bitboard operations; rotated bitboards were unheard of at the time and represented a significant jump in bitboard knowledge. But by designing things properly, changing to magic took all of 30 minutes or less. Because of program design.
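A sketch of the kind of encapsulation being described, with illustrative table and function names (not Crafty's actual ones). The point is that callers only ever see the one function signature, so the internals can be swapped without touching them:

/* Callers ask one function for rook attacks and never see how they are
 * computed, so replacing rotated bitboards with magic bitboards touches
 * only this translation unit. Names are illustrative. */
#include <stdint.h>
typedef uint64_t Bitboard;

/* Precomputed tables, filled in at startup (initialization omitted here). */
static Bitboard rook_mask[64];
static Bitboard rook_magic[64];
static int      rook_shift[64];
static Bitboard *rook_attack_table[64];

/* The stable interface: this signature never changed. */
Bitboard RookAttacks(int sq, Bitboard occupied) {
    /* Magic-bitboard version: hash the relevant occupancy bits to an index. */
    Bitboard occ = occupied & rook_mask[sq];
    return rook_attack_table[sq][(occ * rook_magic[sq]) >> rook_shift[sq]];
    /* The rotated-bitboard implementation lived behind this same signature;
     * move generation and evaluation never knew the difference. */
}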
Miguel got it exactly right. If you are a good engineer, you don't just implement and test; you are concerned with actually UNDERSTANDING the thing you are experimenting with, because it has an impact on how you proceed. What do you do when one of your tests fails? Do you move on to something else entirely and just give up? Don't you care why it failed? Or do you just try to fix it without knowing why it failed in the first place?
I am not sure where this is supposed to be heading. I _never_ write code that I don't understand. I never write code without considering speed/performance issues and how they can be addressed. I'm not going to build my first airplane out of concrete just because that is the simplest material to work with. None of your questions above makes any sense to me in the context of testing. Fixed-node testing, IMHO, provides no useful information that is not subject to significant hidden bias. As a result, I am not interested. Time-based testing is easy to manage, easy to understand, and only requires that you not add crappy code to test an idea. And I do not buy the idea that one first writes crappy code to see if an idea is good, then writes the real code to make it efficient. Otherwise you are left with that concrete airplane that won't ever fly, even though the principle that flight relies on is known to be valid.
If a test fails, I do my best to understand why. But fixed nodes serves no useful purpose toward that goal that I can see. I know what the code was supposed to address and why it needed addressing, and then it becomes a matter of understanding why it failed. Quite often the basic idea is flawed in some fundamental way that a little analysis can explain (search extensions, where more is not better, or reductions based on history counters, which don't work well at all).
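For concreteness, a hedged sketch of what "reductions based on history counters" might look like; the counters, threshold, and reduction amount are all invented example values, not anyone's actual code:

/* Reduce late moves whose history success rate (how often the move caused
 * a cutoff elsewhere) is poor. All names and thresholds are illustrative. */
static int history_hits[64][64];   /* cutoffs this move caused elsewhere */
static int history_tried[64][64];  /* times this move was searched       */

static int history_reduction(int from, int to, int moves_searched, int depth) {
    if (moves_searched < 4 || depth < 3)
        return 0;                                   /* never reduce early or shallow */
    int tried = history_tried[from][to];
    if (tried > 0 && 100 * history_hits[from][to] / tried < 20)
        return 1;                                   /* reduce by one ply */
    return 0;                                       /* search at full depth */
}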
I don't get your continual implication that we make changes in a vacuum, with no idea of whether they are good or not, or, if they are bad, why. We often have to iterate on an idea before it works. For the most part, intuition is more than good enough to make us look deeper when something we thought was better fails.
As far as the "do you try to fix it without knowing why it failed?" question, that sounds like a suggestion from a freshman CS student. How can you fix something you can't understand? Put the code in a roomful of monkeys and let 'em make random changes and hope you find something better? We certainly don't develop like that, and never have.
Can you imagine NASA taking this approach to putting a man on the moon?
Yes, I can. In fact, they actually did.
At least they did things very similar to what I have been doing. Nothing happens inside Crafty during testing that we don't understand before moving on. That would defeat the very purpose of our involved testing methodology, and it would make no sense at all.
As far as NASA goes, you ought to watch the "Moon or Bust" series. They didn't iterate over and over on most of their hardware. They designed it with a specific goal in mind from the start, and they designed it so that it would work from the get-go.