Tord Romstad wrote:
mjlef wrote:
I went through the same thing with Zillions and earlier chess programs. What I settled on is to limit search depth (auto play to determine a rating for a 1 ply search, 2 ply, etc.).
That's similar to what Aleks suggested. It should work, but I'd like something more continuous. The difference in strength between a 1 ply search and a 2 ply search is probably huge. Another disadvantage is that the ratings would have to be calibrated again for each new time control. A 1 ply search at blitz will obviously do much better against humans than a 1 ply search at a tournament time control.
For even worse play, randomly toss out moves---do not score them based on how likely you think a human would be to overlook it...just toss x% of moves with a 1 ply search. You can then use autoplay to score that as well. People overlook moves all the time...even strong players miss mate in 1 sometimes.
Glaurung never misses a mate in 1, even at the lowest level. Perhaps that alone is worth a considerable number of Elo points?
Tord
Many years ago my program RexChess had a feature where you could set the ELO rating and it would try to play at that strength level.
The old programs played close to 2000 ELO with about 5-6 ply searches in the middlegame, depending on the particular program. The rating curve is also well understood, so it's simply a matter of setting the level appropriately. One really good way to do it is to set the number of nodes searched, and that's how Rex did it. It's dirt simple, your program already has this feature, and it's smooth, so you can calibrate it easily.
Here is the problem that most people do not understand, and that you need to be aware of if you are not already: scalability with humans is BETTER than scalability with computers. If you double the thinking time, a computer may play 60 or 70 ELO stronger, but the gain is even bigger for humans.
This effect was more easily noticed 10 or 20 years ago because humans lost at speed chess, but did a little better at game in 10 minutes, and better still at game in 30 and so on. At tournament time controls, it was the humans that were superior despite the fact that computers play MUCH stronger at 40 moves in 2 hours than they do at game in 5 minutes.
So you cannot really have a fixed level where you can say the program plays at X strength without taking into account the time control. If you have a level you call 1800, it might easily win at fast chess and lose badly at serious long tournament games.
So in my opinion, what you need is something like this:
1. Determine a formula for converting some number of Nodes searched to a rating based on the assumption that each doubling gives you N ELO points.
2. Make an adjustment based on the time control by assuming the human has a different formula (each doubling is worth N+20 ELO, or something like that). I don't know the actual number, but each doubling is worth MORE to a human.
3. 100 nodes should give you a pretty weak search, but please note that at speed chess, 100 nodes will play MUCH better (relative to the human) than it would at long time control games.
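The three steps above can be sketched roughly as follows. Every constant here is a placeholder for illustration, not a measured value: the 65 ELO per doubling is just the middle of the 60 to 70 range mentioned earlier, the +20 for humans is the guess from step 2, and the anchor (100 nodes = 1000 ELO at a two-hour reference clock) is invented. The sketch also assumes the engine plays a fixed node budget, so only the human's strength moves with the clock.

```python
import math

# All constants below are illustrative assumptions, not calibrated values.
ENGINE_ELO_PER_DOUBLING = 65.0  # "N": assumed engine gain per doubling of nodes
HUMAN_ELO_PER_DOUBLING = ENGINE_ELO_PER_DOUBLING + 20.0  # humans gain N + 20
ANCHOR_NODES = 100              # hypothetical node budget used as the anchor
ANCHOR_ELO = 1000.0             # hypothetical rating of that budget at the reference clock
REFERENCE_SECONDS = 7200.0      # assumed human thinking time at tournament time control


def engine_elo(nodes: int) -> float:
    """Step 1: rating at the reference time control for a fixed node budget."""
    return ANCHOR_ELO + ENGINE_ELO_PER_DOUBLING * math.log2(nodes / ANCHOR_NODES)


def elo_against_human(nodes: int, human_seconds: float) -> float:
    """Step 2: shift the rating by what the human gains or loses from the clock."""
    human_doublings = math.log2(human_seconds / REFERENCE_SECONDS)
    return engine_elo(nodes) - HUMAN_ELO_PER_DOUBLING * human_doublings


if __name__ == "__main__":
    # Step 3: the same 100-node search looks stronger as the human's clock shrinks.
    for seconds in (7200, 1800, 300):
        print(seconds, round(elo_against_human(100, seconds)))
```

With these made-up numbers, the 100-node level gains 85 ELO relative to the human each time the human's clock is halved, which is the qualitative effect described in step 3.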
In Rex, we just used a fixed formula so that 1800 ELO always played the same on any computer (it would just take longer on a slower computer). The ELO was supposed to correspond to how strong it would play IF THIS WAS A NORMAL TOURNAMENT TIME CONTROL GAME. But if the computer can hold its own at speed chess with a 1500 ELO player, and then you give the human 8X more time to think without changing the computer's time control (and assuming you also don't take advantage of pondering), then the human is going to play 2 classes better!
This is often met with disbelief, probably due to human perception. Most humans realize they play better when given 2x or 4x more time on the clock, but they don't realize how much better they are playing because their opponent is playing better too. But I think the evidence is clear, as you progressively increase the time control, humans progressively do better, so you cannot refute the fact that extra time helps humans even more than computers.
I think with a little work you can come up with a formula that returns the number of nodes you need to search to get roughly some ELO rating (against humans) at some TIME CONTROL - so time control needs to be one of the variables.
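A minimal sketch of such a formula, with time control as one of the variables, might look like this. The function name `nodes_for_rating` and all of the constants are hypothetical; the per-doubling values and the anchor point would have to come from autoplay and from games against rated humans. It assumes the engine searches a fixed number of nodes regardless of the clock.

```python
import math

# Hypothetical calibration; replace with values measured from real games.
ENGINE_ELO_PER_DOUBLING = 65.0  # assumed engine gain per doubling of nodes
HUMAN_ELO_PER_DOUBLING = 85.0   # assumed human gain per doubling of thinking time
ANCHOR_NODES = 100              # invented anchor: this many nodes ...
ANCHOR_ELO = 1000.0             # ... plays at this rating at the reference clock
REFERENCE_SECONDS = 7200.0      # assumed human time at tournament time control


def nodes_for_rating(target_elo: float, human_seconds: float) -> int:
    """Node budget expected to perform at target_elo against a human
    who has human_seconds of thinking time for the game."""
    # Convert the target into a rating on the reference (tournament) scale.
    human_doublings = math.log2(human_seconds / REFERENCE_SECONDS)
    reference_elo = target_elo + HUMAN_ELO_PER_DOUBLING * human_doublings
    # Invert the "ELO per doubling of nodes" rule around the anchor point.
    node_doublings = (reference_elo - ANCHOR_ELO) / ENGINE_ELO_PER_DOUBLING
    return max(1, round(ANCHOR_NODES * 2.0 ** node_doublings))


if __name__ == "__main__":
    # The same nominal rating needs far fewer nodes at a fast time control.
    print(nodes_for_rating(1500, 7200), nodes_for_rating(1500, 300))
```

The `max(1, ...)` floor only keeps the search from being asked for zero nodes at very low target ratings; it is not part of the rating model.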
The nice thing about nodes searched, as you yourself pointed out, is that the program needs to play better in the endgame, and it will. For such a thing I would probably set a very small hash table size and perhaps cripple the quiescence search a bit, but it's probably not necessary since you can get very weak play with a 50 node search.
Before I started playing tournament chess, I would play matches with Sargon II on my TRS-80, and it usually would do a 3 and sometimes 4 ply search in tournament games. I played many "serious" match games where I concentrated hard (and forbade myself takebacks, etc.). I didn't use a clock, but I'm guessing that I took about 2 hours or more per game; I recorded each one manually and really simulated tournament conditions. I played Sargon very close to even and had many close 10-game matches, winning some and losing some. I seriously doubt I was better than 1300 USCF ELO at the time, but I can only guess. I can tell you for sure that if I had been constrained to 10 or 15 minute games where Sargon played at the same exact level, I would have been crushed, perhaps rarely winning a game. I think that explains what you describe.