maximum engine elo

Discussion of chess software programming and technical issues.


outAtime
Posts: 226
Joined: Sun Mar 08, 2009 3:08 pm
Location: Canada

maximum engine elo

Post by outAtime »

Would it be fair to say that most engines reach a stage in development where most, if not all, changes/additions to the code lead to little or no noticeable improvement? I ask because I feel I have reached a stage, after LMR and mobility, where the engine appears to no longer improve. I have tried major changes to the eval and ideas in search such as futility pruning and even move-count-based pruning, but what I'm beginning to feel is that any further improvements will only be worth 10-20 Elo over thousands of games, if anything. Have others reached this point with their own engines, and are there any suggestions for further improvements and for not giving up? Thanks.
outAtime
bhlangonijr
Posts: 482
Joined: Thu Oct 16, 2008 4:23 am
Location: Milky Way

Re: maximum engine elo

Post by bhlangonijr »

outAtime wrote:Would it be fair to say that most engines reach a stage in development where most, if not all, changes/additions to the code lead to little or no noticeable improvement? I ask because I feel I have reached a stage, after LMR and mobility, where the engine appears to no longer improve. I have tried major changes to the eval and ideas in search such as futility pruning and even move-count-based pruning, but what I'm beginning to feel is that any further improvements will only be worth 10-20 Elo over thousands of games, if anything. Have others reached this point with their own engines, and are there any suggestions for further improvements and for not giving up? Thanks.
When you reach this situation, it is very likely you have a major bug in your code that is undermining every attempt you make to improve its strength. It happened to me once, due to a nasty bug in the method that verifies whether the move retrieved from the hash table is legal. Nothing helped improve the engine until I fixed the bug.

When this happens, you had better stop adding new things to the engine and try to spot possible bugs. Some common places to find hidden bugs, and some hints:

- SEE function;
- Hash table;
- Move generation (do you have a perft function?);
- Arrays being indexed with uninitialized indices;
- Write a function which swaps the black and white sides of a given position (within your board data structure/class), evaluate some positions both ways and check that the scores are equal (see the sketch below).
etc.... :)
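
To illustrate that last item, here is a minimal sketch of such a colour-flip check. The names (Board, flip_board, evaluate) are placeholders for your own engine's types and functions:

    /* Colour-flip symmetry test: mirroring the position (swap colours,
       mirror the ranks, exchange castling rights, flip the side to move)
       must never change the static evaluation. */
    #include <assert.h>

    typedef struct Board Board;            /* your board type               */
    int    evaluate(const Board *pos);     /* your static evaluation        */
    Board *flip_board(const Board *pos);   /* returns the mirrored position */

    void verify_eval_symmetry(const Board *pos)
    {
        Board *mirrored = flip_board(pos);
        /* A symmetric evaluation gives the same score for both boards. */
        assert(evaluate(pos) == evaluate(mirrored));
    }

Run it over a large set of random positions; the first assert failure points you to an asymmetric evaluation term.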

EDIT: I am assuming you have an adequate procedure for testing your changes. Self-play matches at a fast time control always help. How are you testing your changes and how are you measuring improvements?

Regards,
outAtime
Posts: 226
Joined: Sun Mar 08, 2009 3:08 pm
Location: Canada

Re: maximum engine elo

Post by outAtime »

Yes, I test using long matches, usually 1 0 with 100 games or more, and sometimes 3 0 in shorter matches so I can follow the games and look at the moves, the searches, etc. Yes, I think you are right and something somewhere just isn't right... I think it may be in the search. Thanks.
outAtime
Kempelen
Posts: 620
Joined: Fri Feb 08, 2008 10:44 am
Location: Madrid - Spain

Re: maximum engine elo

Post by Kempelen »

outAtime wrote:Would it be fair to say that most engines reach a stage in development where most, if not all, changes/additions to the code lead to little or no noticeable improvement? I ask because I feel I have reached a stage, after LMR and mobility, where the engine appears to no longer improve. I have tried major changes to the eval and ideas in search such as futility pruning and even move-count-based pruning, but what I'm beginning to feel is that any further improvements will only be worth 10-20 Elo over thousands of games, if anything. Have others reached this point with their own engines, and are there any suggestions for further improvements and for not giving up? Thanks.
I have reached that point where making progress is quite difficult, but not impossible. I suggest the following:

- Try to catch bugs. Using 'assert' macros is very useful. So is a rotated (colour-flipped) eval.
- Try to profile your code. I don't use a profiler, but I use my own profiling system, which works very well. It consists of starting a function-tagged chrono at the beginning of each function and stopping it on every return (see the sketch after this list). Building this system takes a little time but it is worth it. I can give an example if you ask for one.
- Always keep a physical notepad with you. I have my own, and I note down everything that comes to mind, whenever it comes.
- Consider rewriting whole parts of your engine, like the move generator or the hash table system. Always think about where you could do better.
- Read a lot of historic posts on talkchess.com, the old CCC, the winboard forum, the cpwiki and OpenChess. Reading lots of old posts is a source of hundreds of ideas. Use the search feature if you are interested in a particular topic.
- Read other open-source engines' code; it is also a source of inspiration. You will see that you can change the way things are done.
- Read advanced C books. I have learned a few things thanks to them.
- You can also build other useful tools, like one that analyses your results. I have not made mine yet, but it is an idea I will get to in the future. There are also some tools that do something similar, but I don't remember the names... I wrote my own tournament manager to fit my needs; search for FSTM for details.
- To get ideas flowing, I usually try to think about how a human would treat the position. E.g., why extend some positions that are apparently lost? A GM would not spend time on them.
- Also take some mind-mapping software and do a brainstorm. Something always comes out of it.
- You can print the search tree in XML and see how your engine works. I did it, and now I use it for bug catching; it is very handy.
- Always post your C programming doubts, here or at cprogramming, where many people can help you with any advanced topic.
- Take random positions, see how your engine evaluates them, split the score into its parts (opening, bishops, ....) and try to see whether any eval term is exaggerated. This way you will tune your eval. You can also compare with other strong engines, like Crafty, which has the same feature.
- If you don't make progress, always try to rewrite the code cleaner, simpler and faster. That will lead to a better program.
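
About the profiling idea in the second item, a rough sketch of what I mean by a 'function-tagged chrono' could look like this (the tags and function names are only examples; adapt them to your own engine):

    #include <stdio.h>
    #include <time.h>

    enum { PROF_EVAL, PROF_MOVEGEN, PROF_SEE, PROF_TAGS };

    static const char *prof_name[PROF_TAGS] = { "evaluate", "movegen", "see" };
    static double      prof_secs[PROF_TAGS];   /* accumulated CPU seconds */
    static long        prof_calls[PROF_TAGS];  /* number of calls         */

    /* Start the chrono on entry to a tagged function ... */
    static clock_t prof_start(void) { return clock(); }

    /* ... and stop it before every return of that function. */
    static void prof_stop(int tag, clock_t t0)
    {
        prof_secs[tag] += (double)(clock() - t0) / CLOCKS_PER_SEC;
        prof_calls[tag]++;
    }

    /* Example of an instrumented function. */
    int evaluate_example(void)
    {
        clock_t t0 = prof_start();
        int score = 0;
        /* ... the real evaluation work would go here ... */
        prof_stop(PROF_EVAL, t0);
        return score;
    }

    /* Print the totals at the end of a search or a game. */
    void prof_report(void)
    {
        for (int i = 0; i < PROF_TAGS; i++)
            printf("%-10s %10ld calls  %10.3f s\n",
                   prof_name[i], prof_calls[i], prof_secs[i]);
    }

Note that the chrono itself adds a little overhead to very hot functions such as the evaluation, so treat the numbers as relative rather than exact.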

It is true that there is a point where progress becomes almost unnoticeable, but it is still possible. The only drawback is that you may need more time and more games to test it.

My not-yet-released Rodin version is at that barrier, but I like this hobby and always try to improve, or at least to redo things in a better way. You can also set it aside for a while; fresh ideas will come.

Good luck

Fermin
Fermin Serrano
Author of 'Rodin' engine
http://sites.google.com/site/clonfsp/
bhlangonijr
Posts: 482
Joined: Thu Oct 16, 2008 4:23 am
Location: Milky Way

Re: maximum engine elo

Post by bhlangonijr »

outAtime wrote:Yes, I test using long matches, usually 1 0 with 100 games or more, and sometimes 3 0 in shorter matches so I can follow the games and look at the moves, the searches, etc. Yes, I think you are right and something somewhere just isn't right... I think it may be in the search. Thanks.
Testing properly is one of the most important things if you want to make steady progress in terms of Elo. I have never released a new version of my chess engine without a clear improvement in terms of Elo. Not that it is a big deal; the point is that the testing procedure I am currently using is fairly reliable and always agrees with later testing by independent rating lists. It is pretty straightforward and doesn't require a lot of resources or time:

- I use a source code management system (Subversion - Git is even better). After making the change that must be validated, I also change the version of the engine, appending the revision number of the current development "snapshot". You can query the current revision number with the tools of your favourite SCM system. This is _very_ useful for keeping track of your changes and of how each version did in testing. Also, you can easily roll back to the "best" version and continue development from there.
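
For example, something like this (just a sketch; the macro, engine name and revision shown are made up). The build passes the revision obtained from the SCM (e.g. from svnversion or from git rev-parse --short HEAD) to the compiler, and the engine reports it as part of its version string:

    /* build with e.g.:  gcc -DSNAPSHOT_REV=\"1234\" ... */
    #include <stdio.h>

    #ifndef SNAPSHOT_REV
    #define SNAPSHOT_REV "unknown"
    #endif

    void print_id(void)
    {
        /* the engine then identifies itself as e.g. "MyEngine 1.2-r1234" */
        printf("id name MyEngine 1.2-r" SNAPSHOT_REV "\n");
    }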

- OK, after compiling the "snapshot" you should validate it against the testing "wall":
1) The Strategic Test Suite http://sites.google.com/site/strategictestsuite/ is great for quickly checking whether your changes had any big impact - good or bad. I usually run all the positions at 10 seconds per position. Scoring maybe 10-15 positions fewer than the last best "snapshot" is not a problem; you can go on to the next step (save the results and the "snapshot" version used). If it scores worse than that, you should stop testing and double-check your changes;
2) Once the "snapshot" has passed the Strategic Test Suite, you should run some gauntlets against the last best "snapshot". I personally use Linux for development and testing. Cutechess-cli is my favourite tool for running those gauntlets (see the example command after this list) - LittleBlitzer is also good if you are using Windows. I usually run the gauntlets with one of the following time controls, depending on the change I made:
- 4000 games at fixed depth (5-8) if the change is related to evaluation, parameter tuning, etc.;
- 2000 games at 30 seconds + 1 second increment if the change is related to search, etc.;
- 2000 games at 40 moves in 2 minutes if the snapshot is targeted for a new release.
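
As an illustration, a match between a new snapshot and the current best version can be launched with something like the command below (the engine paths, names and output file are made up; check cutechess-cli's help output for the exact options your version supports):

    cutechess-cli \
        -engine cmd=./myengine_r1234 name=snapshot-r1234 \
        -engine cmd=./myengine_best  name=best \
        -each proto=uci tc=30+1 \
        -games 2000 -concurrency 2 \
        -pgnout snapshot_r1234.pgn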

After all the games are finished, I run BayesElo to compute the Elo gain. If I get -2 Elo or less for the "snapshot", I definitely roll back the changes. If it is more than +2 Elo, the current "snapshot" is promoted to the best version. Otherwise I use the criterion of lines added/removed: if the change was substantial (more than 20 lines added), I roll it back; if instead I removed lines, I promote the version to the best one, as it is simpler with no Elo change.
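
The BayesElo step is just a short interactive session over the PGN produced by the gauntlet, roughly like this (the file name is made up):

    bayeselo
    > readpgn snapshot_r1234.pgn
    > elo
    > mm
    > exactdist
    > ratings

which prints a rating table including the Elo difference and its error bounds.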

Fermin's tips are great. I hope you have better luck with the development of your chess engine. :)

Regards,