mar wrote:
> This: http://www.talkchess.com/forum/viewtopi ... 22&t=50823
> From what I understood Miguel did something very similar (as can be seen in the thread).

I have tried multiple times to improve on Gaviota's parameter values with CLOP with no success. No luck with SPSA either. Miguel's method, which is similar to Peter's, seems to work great.
Working with CLOP
-
- Posts: 3226
- Joined: Wed May 06, 2009 10:31 pm
- Location: Fuquay-Varina, North Carolina
Re: Working with CLOP
-
- Posts: 4367
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Working with CLOP
Robert Pope wrote:
> I finally got CLOP up and running on my computer. That was a real headache, since I didn't realize for quite a while that I needed to install python to run the python cutechess-cli script, and then I couldn't get the PATH variable to stick.
> Now that it is running and I am doing my first optimization test (just of standard piece values), I have a few questions:
> 1. How do you know if a term has been optimized "enough"? If it takes 500,000 games, so be it.

One way is to look at the plot: there may be a point where the parameter value converges. See the sample below, clopping one parameter with a range of 100 to 1000. It appears to converge around 800 to 900; x is time, y is my param term.
In my example, one parameter with a range of 100 to 1000 has 1000 - 100 + 1 = 901 unique values. At a minimum of 1000 games per unique value, you might need at least 1000 games * 901 unique values = 901,000 games. But CLOP is smart, so not every unique value will be tested with 1000 games; if it converges early toward some values, you will get an early indication.
I made a little tool that shows which parameter values were tested most and how they performed.
CLOP Data Reader v3.0
Number of parameters: 1
First parameter: OffensivePercent
Param1: Min 110, Max 1000
Total games: 5212
Param1 W / L / D NetW Games Score LOS
110 0 / 1 / 1, -1 2 25.00% 25.00%
[...]
462 0 / 1 / 1, -1 2 25.00% 25.00%
463 1 / 0 / 1, +1 2 75.00% 75.00%
464 0 / 1 / 1, -1 2 25.00% 25.00%
472 2 / 0 / 0, +2 2 100.00% 87.50%
473 2 / 0 / 0, +2 2 100.00% 87.50%
478 4 / 0 / 0, +4 4 100.00% 96.88%
481 1 / 0 / 1, +1 2 75.00% 75.00%
[...]
717 2 / 3 / 1, -1 6 41.67% 34.38%
719 1 / 0 / 1, +1 2 75.00% 75.00%
720 5 / 1 / 0, +4 6 83.33% 93.75%
721 2 / 0 / 0, +2 2 100.00% 87.50%
722 3 / 0 / 1, +3 4 87.50% 93.75%
723 2 / 0 / 0, +2 2 100.00% 87.50%
724 2 / 0 / 0, +2 2 100.00% 87.50%
726 3 / 1 / 0, +2 4 75.00% 81.25%
[...]
839 28 / 1 / 1, +27 30 95.00% 100.00%
840 19 / 0 / 1, +19 20 97.50% 100.00%
841 20 / 4 / 2, +16 26 80.77% 99.95%
842 24 / 2 / 4, +22 30 86.67% 100.00%
843 32 / 0 / 0, +32 32 100.00% 100.00%
844 15 / 2 / 5, +13 22 79.55% 99.93%
845 25 / 2 / 1, +23 28 91.07% 100.00%
846 17 / 1 / 2, +16 20 90.00% 100.00%
847 16 / 0 / 2, +16 18 94.44% 100.00%
848 30 / 0 / 0, +30 30 100.00% 100.00%
849 16 / 4 / 0, +12 20 80.00% 99.64%
850 21 / 1 / 4, +20 26 88.46% 100.00%
851 33 / 1 / 2, +32 36 94.44% 100.00%
852 19 / 2 / 3, +17 24 85.42% 99.99%
853 18 / 4 / 2, +14 24 79.17% 99.87%
854 17 / 1 / 2, +16 20 90.00% 100.00%
855 28 / 4 / 2, +24 34 85.29% 100.00%
856 19 / 1 / 0, +18 20 95.00% 100.00%
857 26 / 0 / 0, +26 26 100.00% 100.00%
858 32 / 2 / 2, +30 36 91.67% 100.00%
859 26 / 1 / 5, +25 32 89.06% 100.00%
[...]
921 22 / 2 / 2, +20 26 88.46% 100.00%
922 15 / 3 / 2, +12 20 80.00% 99.78%
923 24 / 3 / 1, +21 28 87.50% 100.00%
924 27 / 0 / 1, +27 28 98.21% 100.00%
925 10 / 3 / 1, +7 14 75.00% 97.13%
926 27 / 1 / 2, +26 30 93.33% 100.00%
927 11 / 2 / 1, +9 14 82.14% 99.35%
[...]
997 7 / 2 / 1, +5 10 75.00% 94.53%
998 6 / 0 / 2, +6 8 87.50% 99.22%
999 4 / 0 / 0, +4 4 100.00% 96.88%
1000 13 / 0 / 1, +13 14 96.43% 99.99%
Top Parameters: By LOS
[1] par1 838, score 92.65%, LOS 100.000%, Games 34, NetWins +29
[2] par1 839, score 95.00%, LOS 100.000%, Games 30, NetWins +27
[3] par1 843, score 100.00%, LOS 100.000%, Games 32, NetWins +32
[4] par1 848, score 100.00%, LOS 100.000%, Games 30, NetWins +30
[5] par1 851, score 94.44%, LOS 100.000%, Games 36, NetWins +32
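The Score and LOS columns in the listing above can be reproduced from the W / L / D counts. The sketch below is an inference from the printed numbers, not the tool's actual source: it assumes Score counts a draw as half a win, and that LOS is the probability that the win rate exceeds 50% under a uniform prior with draws ignored. The function names `score` and `los` are made up for illustration.

```python
from math import comb

def score(wins, losses, draws):
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games

def los(wins, losses):
    """Likelihood of superiority, draws ignored (assumed formula).

    P(p > 0.5) for a Beta(wins + 1, losses + 1) posterior, written as
    the equivalent binomial tail sum so only integers are needed.
    """
    n = wins + losses + 1
    return sum(comb(n, k) for k in range(wins + 1)) / 2 ** n

# Row "720  5 / 1 / 0" from the listing
print(f"{score(5, 1, 0):.2%} {los(5, 1):.2%}")  # prints "83.33% 93.75%"
```

With only 2 to 6 games per value, an LOS of 87% or 93% says very little, which is why the heavily sampled region around 838 to 859 is the part of the listing worth reading.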
Robert Pope wrote:
> 2. My CLOP file has me running 3 processors on my Quad. But invariably, after a few hundred games, I start to get processors that drop out. A soft pause won't close out the threads, and I manually have to do a hard close and then go into Task Manager and kill an instance of cutechess-cli and the engine I was playing.
> I'm guessing it must be my engine that is terminating prematurely, since it is never the one left idling, but how can I troubleshoot this? I have no idea which game it stuck on to find a corresponding log file.

I use cutechess-cli version 0.5.1 in clop.
-
- Posts: 175
- Joined: Fri Oct 22, 2010 9:47 pm
- Location: Austria
Re: Working with CLOP
mar wrote:
> It has been used successfully in several engines giving significant gains in gameplay

It's true, but of course you understand the difference between a theoretical proof and "several engines" in which it was successful.
Also, maybe there is a proof for that method too, and I don't know about it...
-
- Posts: 2559
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Working with CLOP
MTDf should have been theoretically sound as well but practice has proven otherwise.
I got significant improvement thanks to this method, that's proof enough for me.
Of course if you're going to tune hundreds of eval params with CLOP I say good luck and let us know in a couple of years.
-
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: Working with CLOP
mar wrote:
> My suggestion is: don't waste time on CLOP. If you want to tune eval there are vastly superior and much faster methods that actually converge.

Agreed. I've wasted so much time trying to tune just 2 or 3 variables with CLOP, and rarely saw it converge. In my experience, it's not uncommon that CLOP needs over 100,000 games to tune only 2 variables. As for 3, forget about it!
Admittedly, I've gained a non-trivial amount of Elo in DiscoCheck thanks to CLOP. But anything else would have done a better job than CLOP with the same amount of resources.
Plus the CLOP interface is horrible. CLOP runs a script that plays a single game. That alone is a serious design flaw, because there is a lot of overhead in starting everything (from cutechess-cli to the engines) to play a single game, instead of simply sending ucinewgame to an already running process.
Since CLOP needs to play hundreds of thousands of games to tune only 2 or 3 variables, you need to kill any overhead you can, and you can't even use tc, but have to resort to depth=6 or so to play super fast games. Not to mention that once you have found the optimal values at depth=6, they are likely not optimal in normal tc testing...
In Stockfish, we use SPSA by Joona, which is much superior in practice (perhaps not in theory, but read my signature about theory and practice). I strongly recommend you look at Joona's SPSA script, instead of wasting your time trying to figure out how the CLOP/Python/cutechess-cli plumbing works (I know how it works, it's a real penance, and a waste of time anyway, trust me!).
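The startup overhead complained about above comes from launching a fresh process for every game. A rough sketch of the alternative, keeping one engine process alive and resetting it with `ucinewgame`, might look like this. Everything here is illustrative: `PersistentEngine` is a made-up class, and `FAKE_ENGINE` is a stand-in script that only answers the UCI handshake so the example runs without a real engine. It is not how CLOP or cutechess-cli work internally.

```python
import subprocess
import sys

# Stand-in "engine" (an assumption for this sketch): it answers only the
# UCI commands used below, so no real chess engine is required.
FAKE_ENGINE = """
import sys
for line in sys.stdin:
    cmd = line.strip()
    if cmd == "uci":
        print("uciok", flush=True)
    elif cmd == "isready":
        print("readyok", flush=True)
    elif cmd == "quit":
        break
"""

class PersistentEngine:
    """Keeps one engine process alive across many games (hypothetical helper)."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )
        self.send("uci")
        self.wait_for("uciok")

    def send(self, command):
        self.proc.stdin.write(command + "\n")
        self.proc.stdin.flush()

    def wait_for(self, token):
        # Skip any other output (e.g. "id name ...") until the reply arrives.
        while True:
            line = self.proc.stdout.readline()
            if not line:
                raise RuntimeError("engine exited before sending " + token)
            if line.strip() == token:
                return

    def new_game(self):
        # Reset engine state between games instead of restarting the process.
        self.send("ucinewgame")
        self.send("isready")
        self.wait_for("readyok")

    def close(self):
        self.send("quit")
        self.proc.wait()
        self.proc.stdin.close()
        self.proc.stdout.close()

engine = PersistentEngine([sys.executable, "-c", FAKE_ENGINE])
games_started = 0
for _ in range(3):
    engine.new_game()
    games_started += 1  # a real harness would play the game here
engine.close()
exit_code = engine.proc.returncode
```

With a real engine, the list passed to `PersistentEngine` would be the engine's command line, and the process launch cost is paid once rather than once per game.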
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: Working with CLOP
mar wrote:
> MTDf should have been theoretically sound as well but practice has proven otherwise.
> I got significant improvement thanks to this method, that's proof enough for me.
> Of course if you're going to tune hundreds of eval params with CLOP I say good luck and let us know in a couple of years.

couple of years? Even billions of years won't be enough!
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Working with CLOP
lucasart wrote:
> mar wrote:
>> My suggestion is: don't waste time on CLOP. If you want to tune eval there are vastly superior and much faster methods that actually converge.
> Agreed. I've wasted so much time trying to tune just 2 or 3 variables with CLOP, and rarely saw it converge. In my experience, it's not uncommon that CLOP needs over 100,000 games to tune only 2 variables. As for 3, forget about it!

Pretty much what I have said at CLOP's release:
http://www.talkchess.com/forum/viewtopic.php?t=40987
Quadratic non-iterative regression on noisy data needed 300,000 datapoints to detect the global optimum even in 2 dimensions on a pretty smooth distribution. Pictures are gone, though. With more than 2-3 dimensions, detecting the global optimum is hopeless; even the many local optima will be missed. Basically, CLOP can be useful only in low-dimensional (2-3) perturbative cases, where one can fit the parameters one at a time sequentially anyway.
-
- Posts: 175
- Joined: Fri Oct 22, 2010 9:47 pm
- Location: Austria
Re: Working with CLOP
lucasart wrote:
> mar wrote:
>> MTDf should have been theoretically sound as well but practice has proven otherwise.
>> I got significant improvement thanks to this method, that's proof enough for me.
>> Of course if you're going to tune hundreds of eval params with CLOP I say good luck and let us know in a couple of years.
> couple of years? Even billions of years won't be enough!

Ok, then I will also try SPSA
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Working with CLOP
lucasart wrote:
> In Stockfish, we use SPSA by Joona, which is much superior in practice (perhaps not in theory, but read my signature about theory and practice).

Now where did that come from?? There are convergence proofs for SPSA but not for CLOP. So the theoretical situation for SPSA is considerably better than for CLOP.
More importantly, CLOP invests all its time in finding a true (perhaps local) optimum whereas SPSA does hill climbing. Since you are not really interested in a true optimum (if you are near an optimum there is little elo to be gained anyway), but rather in an improvement, SPSA can give results fast even with a large number of parameters (the parameters that have no elo impact or are already optimal will perform a random walk).
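To make the hill-climbing description above concrete, here is a minimal textbook-style SPSA loop on a toy noisy objective. It is a sketch under stated assumptions, not Joona's script: `spsa_maximize`, `noisy_strength`, and `TARGET` are invented for the example, the objective is a noisy quadratic rather than game results, and the gain exponents 0.602 and 0.101 are the commonly cited defaults from the SPSA literature.

```python
import random

def spsa_maximize(measure, theta, iterations, a=0.1, c=0.1, A=100.0,
                  alpha=0.602, gamma=0.101, rng=None):
    """Hill-climb theta using two noisy measurements per iteration."""
    rng = rng or random.Random()
    theta = list(theta)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # step size, shrinks over time
        c_k = c / (k + 1) ** gamma       # perturbation size, shrinks slowly
        # Perturb every parameter at once by a random +/-1 direction.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = measure(plus) - measure(minus)
        # One difference gives a gradient estimate for all parameters.
        theta = [t + a_k * diff / (2.0 * c_k * d)
                 for t, d in zip(theta, delta)]
    return theta

rng = random.Random(0)
TARGET = [3.0, -2.0]  # hypothetical optimum of the toy objective

def noisy_strength(params):
    # Toy stand-in for "play games with these values and measure the result".
    true_value = -sum((p - t) ** 2 for p, t in zip(params, TARGET))
    return true_value + rng.gauss(0.0, 0.1)

tuned = spsa_maximize(noisy_strength, [0.0, 0.0], iterations=2000, rng=rng)
print(tuned)
```

The cost per iteration is two measurements regardless of how many parameters there are, which is what lets the method scale to many parameters. In an engine-tuning setting the two measurements would presumably be replaced by games between the `plus` and `minus` settings.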
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.