diep wrote: Rebel wrote: hgm wrote:
Micro-Max has very poor move ordering (no killer, no history; it just does 1) null move, 2) hash move, 3) best MVV/LVA capture, 4) other captures in unspecified order, 5) non-captures in unspecified order). Yet LMR worked great. It reduces all non-captures that are not Pawn moves.
LMR will no doubt give something, but the better the move ordering, the better LMR will perform. Much will of course depend on the scheme in use for the 1.0, 1.5, 2.0, 2.5, 3.0 reductions in relation to the move's number in the move list. I am not using the latter myself, but that is what I understand many programs do nowadays.
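To make the two ideas above concrete, here is a minimal Python sketch that combines the Micro-Max rule (never reduce captures or pawn moves) with a reduction that grows in half-ply steps with the move's number in the list. The `Move` class, the function name, and the exact schedule (first three moves unreduced, one extra half ply every four moves, capped at 3.0) are assumptions made for illustration; neither post gives actual numbers.

```python
from dataclasses import dataclass


@dataclass
class Move:
    is_capture: bool
    is_pawn_move: bool


def lmr_reduction(move: Move, move_number: int) -> float:
    """Plies to reduce for the move_number-th move (1-based) in the move list."""
    # Rule described for Micro-Max: captures and pawn moves are never reduced.
    if move.is_capture or move.is_pawn_move:
        return 0.0
    # Assumption: the first few moves are searched at full depth.
    if move_number <= 3:
        return 0.0
    # Hypothetical schedule: 1.0 for moves 4-7, 1.5 for 8-11, ..., capped at 3.0.
    steps = (move_number - 4) // 4
    return min(1.0 + 0.5 * steps, 3.0)
```

With poor move ordering the good quiet moves are more likely to sit late in the list and get the larger reductions, which is why such a schedule gains more from better ordering.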
Ed, if I just look at Rebel and compare it with the derivatives, their trick is the material evaluation.
It's more advanced than Rebel's. This is where they indeed use a few simple concepts to get it done.
If none of your positional factors, other than passed pawn evaluation,
can ever overrule the material evaluation, then you can easily forward prune the last 4 plies. I'm not sure how this has been set in Komodo.
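The argument can be sketched as a pruning test: if the positional terms are bounded and material is far enough below alpha, no positional bonus can rescue the node. The Python below is only an illustration under stated assumptions. The function name, the centipawn units, the `max_positional_swing` bound, and the choice to skip pruning whenever passed pawns are on the board are all hypothetical, and this is not a description of what Komodo or any other engine does.

```python
def can_forward_prune(depth: int, material_score: int, alpha: int,
                      max_positional_swing: int, has_passed_pawns: bool) -> bool:
    """Return True if a node in the last 4 plies can be pruned on material alone.

    All scores are in centipawns from the side to move's point of view.
    """
    # Passed pawns are the one term allowed to overrule material, so be safe.
    if has_passed_pawns:
        return False
    # Only the last 4 plies before the horizon are considered.
    if depth > 4:
        return False
    # Even the best possible positional score cannot lift the node above alpha.
    return material_score + max_positional_swing <= alpha
```

The smaller `max_positional_swing` is, the more often the test fires. An evaluation whose positional terms can exceed a pawn makes this bound too loose to be of much use.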
Our evaluation is pretty aggressive and it can easily go over a pawn even without passed pawns.
A way to test your evaluation is to try a bunch of gambit positions. For example, if you put the Morra Gambit in and do a search, the old programs with weak evaluation consider that you are down almost a pawn, but in reality you have a lot of compensation. I think the current consensus is that you should not play most of these gambits, but they don't lose by force.
I just checked Komodo using the classic main line of the Smith Morra Gambit accepted:
1. e4 c5 2. d4 cxd4 3. c3 dxc3 4. Nxc3 Nc6
Now from the opening position Komodo thinks white has about 0.30 pawn advantage. If you play the above line Komodo thinks white is just slightly weaker - about -0.05 or so. So Komodo thinks by playing the gambit you are giving black a slight advantage that you didn't have to give. I'm sure people's opinions will vary on this but I think most strong players believe you are giving away the advantage of the white pieces by playing like this - so Komodo is pretty much correct.
Komodo is perfectly capable of positional pawn and exchange sacrifices too and they will be sound. A beancounter program cannot do that.
If someone wants to build a good problem set, I think a great one would present opportunities for SOUND positional sacrifices. These are difficult to construct because if a positional sacrifice is possible, it means you already have the advantage, and for a good set the sacrifice needs to be clearly the best way to proceed. I have seen many positions where people show what a great move Komodo or some other program has played when it turns out that the program was winning and many moves worked; it's just more impressive if the program plays a "flashy" move. Komodo was not designed to play flashy just to impress people, but it will play such moves when they are needed.
That's what they do.
They win 3 plies there that i do not win.
Even if we are doing something special here, what is the point? With Komodo and I'm sure ALL the top programs we do what works. If we actually figure out how to win 3 ply it would be stupid not to do so unless it hurts the program. Nothing stops you from doing the same. If it hurts your program then don't do it! If there is something different about your program that makes it impossible to do so, then you must decide whether that is a good or bad thing.
What can be simpler than that?
Then they use null move with a reduction factor that soon goes up: 3, 4, 5, 6, 7, etc.
They combine that with aggressive LMR.
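A depth-dependent null-move reduction of the kind described above might look like the following Python sketch. The formula (base 3, one extra ply of reduction for every 4 plies of remaining depth) is an assumption chosen only to reproduce the 3, 4, 5, 6, 7 progression; the post does not say how any particular engine computes it, and both function names are hypothetical.

```python
def null_move_reduction(depth: int, base: int = 3, step: int = 4) -> int:
    """Reduction R for the null-move search; grows with remaining depth."""
    # Hypothetical schedule: R = 3 at depth 0-3, 4 at depth 4-7, and so on.
    return base + depth // step


def null_move_search_depth(depth: int) -> int:
    """Depth of the reduced search after making the null move."""
    # One ply for the null move itself, then R more; never go below zero.
    return max(depth - 1 - null_move_reduction(depth), 0)
```

At 16 plies of remaining depth this gives R = 7, so the null-move verification search runs at only 8 plies.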
If you'd retune Rebel, all you have to do is improve your material evaluation, have a 2 layer PSQ table and retune every parameter.
So layer 1 of the PSQ table is for opening, layer 2 is for endgame and then you find a formula to know how far in opening or endgame you are.
You cannot tune those parameters playing games.
You'd beat the crap out of Komodo in superbullet with Rebel then, with your search framework, if you managed that.
As for search, their strong point is Rebel's strong point as well, namely seeing the main line deeply and quickly. In addition to that you see more tactics with Rebel, so in that case you'd win big time at superbullet against Komodo 5.
I have huge respect for Rebel, if Ed really got serious about computer chess again I know that we would have our hands full trying to keep up with him.
But my only response to you is that I think our methods have been fully justified by the strength of Komodo. You can prove us wrong by beating us; otherwise all you are doing is making a fool out of yourself.
You just lose it on the combination of material and piece-square tables. They have copied Fruit's idea, some of them in fact just cut'n pasted it, to have a piece-square table for every piece in both the opening and the endgame. Then they average over that using 32 intervals or so (with minor piece = 1, rook = 3, queen = 6; so in a full-board position it gets 100% from the first PSQ, the opening table, with 2 * (6 + 3 + 3 + 1 + 1 + 1 + 1) = 2 * 16 = 32 points).
Fruit was using 1,2,4 by the way as values instead of 1,3,6...
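The interpolation between the two piece-square layers can be sketched in a few lines of Python. The phase weights below are the 1, 3, 6 values quoted above, which sum to 32 on a full board; the function names, the piece letters used as dictionary keys, and the integer rounding are assumptions for illustration only.

```python
# Phase weights as quoted in the post: minor piece = 1, rook = 3, queen = 6.
PHASE_WEIGHT = {"N": 1, "B": 1, "R": 3, "Q": 6}
MAX_PHASE = 32  # 2 * (6 + 3 + 3 + 1 + 1 + 1 + 1) on a full board


def game_phase(piece_counts: dict) -> int:
    """Phase from 0 (bare endgame) to 32 (full board), counting both sides."""
    phase = sum(PHASE_WEIGHT[p] * n for p, n in piece_counts.items()
                if p in PHASE_WEIGHT)
    # Promotions could push the sum past 32, so clamp it.
    return min(phase, MAX_PHASE)


def tapered_score(opening_score: int, endgame_score: int, phase: int) -> int:
    """Blend the opening-table and endgame-table scores by game phase."""
    blended = opening_score * phase + endgame_score * (MAX_PHASE - phase)
    return blended // MAX_PHASE  # note: floors, also for negative totals
```

For example, with both queens and two rooks traded (`{"N": 4, "B": 4, "R": 2, "Q": 0}`) the phase is 14, so the endgame table already carries a bit more than half the weight.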
This concept is surprisingly strong simply for endgame.
You don't manage to tune that at home that accurately though.
I've been toying with 80 cores, and now at home I have a 64-core cluster with 2 Tesla cards, each having 448 cores. So in total I have nearly 1000 cores for tuning.
In my experiments on how to improve Diep's tuning, i encountered many problems.
The PSQ tables play a totally crucial role in the derivatives. Remove them and they're dead. As for the material evaluation, none of them would have invented it.
It seems to have been tuned by a neural net.
Now I'm not too impressed by that neural net, but the tuning of your values is highly dependent on how much knowledge you have got.
In an engine with no knowledge except material, a bishop will be worth something like 2.85.
Your first goal is to figure out how to mathematically tune.
Playing games won't work.
If you have quite some knowledge, you need a million games to figure out whether something needs to be a 0.013 pawn bonus or 0.014.
Forget it.
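A rough back-of-the-envelope check of the "million games" figure can be done with the usual Elo model and a normal approximation. The sketch below is an assumption-laden estimate: the post does not say how many Elo separate a 0.013 and a 0.014 bonus, so the 1 Elo difference used in the example, the 2-sigma threshold, and the function name are all hypothetical.

```python
import math


def games_needed(elo_diff: float, draw_rate: float = 0.0, z: float = 2.0) -> int:
    """Approximate games needed to see elo_diff at z standard errors."""
    # Slope of the Elo expected-score curve at equal strength (score per Elo).
    slope = math.log(10) / 400 * 0.25
    # Standard deviation of a single game result around a 50% score.
    sigma_game = 0.5 * math.sqrt(1.0 - draw_rate)
    # Require z * sigma_game / sqrt(N) <= elo_diff * slope and solve for N.
    return math.ceil((z * sigma_game / (elo_diff * slope)) ** 2)
```

Under these assumptions `games_needed(1.0)` comes out at a bit under half a million games, and halving the Elo difference quadruples the requirement, so a change worth a fraction of an Elo point does land in the million-game range.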
LMR is not the holy grail. It's the tuning and a simple yet effective setup of the evaluation. Material for the material. PSQ's for the positional evaluation and simple passed pawn knowledge for the passers.
Kingsafety?
Ah, that's the big bummer. In many positions in chess it simply APPEARS that you can get away with relatively simple king safety, provided you get 20 ply. Most positional factors in king safety give a TEMPORARY advantage.
That doesn't stop me from still working on Diep's kingsafety by the way.
Yet nearly all the fixes I do right now have to do with simple things. For example, I'm now fixing Diep's willingness to castle quickly. Sure, it wants to castle, but the derivatives simply give a big penalty for a king in the center.
Tuning fixes it for them everywhere....
In Diep the material alone is already 300+ parameters or so. Diep's material concept is totally different from the Fruit concept.
I really use knowledge rules to decide whether I give a bonus. I don't use a neural-net-tuned entity.
Though this material evaluation is all tuned by hand, I get the impression Diep is doing it better than the derivatives. But that feeling is very recent and I'm not sure it's already OK for most endgames.
As Diep has more parameters, I can more easily decide to tune specific endgames independently. More possibilities also mean more problems for tuning, though.
Tuning Diep is far more complex i'd say - even then the effort needed is similar in terms of expertise.
Tuning is in fact very important. No question about that, but you cannot tune something that is not there. If something important is missing from your program you are not going to "tune" it into the program.