How effective is move ordering from TT?

Desperado · Post by **Desperado** » Sat Aug 11, 2012 10:23 pm

diep wrote:
...

(I already pointed out some things in my last post) It simply belongs together.
Search <-> Heuristic
Any conclusion, doesnt matter how the setup looks like of the experiment,
would be misleading, and would not really represent the evaluation
with the better chess knowledge at all.
I disagree there.

...

no problem

But what you finally are measuring is _NOT_ the "chess knowledge",
_BUT_ the "chess knowledge in its framework". IMO this experiment
will not extract the essence of the chess knowledge.

Here is a simple question for you and for Don.

What offset do you get when your engines play with:

a. material + pst mode
compared to
b. full evaluation mode.

It is a while back i measured that for Nemo, but i think it was
about 300 Elo difference.

Michael

Don · Post by **Don** » Sat Aug 11, 2012 10:30 pm

diep wrote:
plattyaj wrote:
diep wrote: the proof is easy that bad moves search deeper with LMR.

I select in an upperbound node the queen giving away move.
Where i can capture a piece i select the move not taking it.

I'm gonna get 50 plies you know...

It won't be better elowise maybe, but sure it searches DEEPER.

Anyone not understanding that didn't program a chessprogram.
Oh wait - you didn't - you just improved TSCP a tad...

Didn't you do your PHD in statistics by the way, why aren't you commenting at any statistics thread here?
I don't understand. If I search bad moves first, I have to search all of them (until I get to the point where my LMR kicks in). So that means that my search tree is a lot wider. But it also means I won't be able to search so deep because I'll run out of time. What am I missing?
LMR is NOT reducing the first moves, usually first 3. And it's reducing the remainder of the list.

So if we select a horrible move first in an upperbound node, then nullmove will reduce deeper in the tree that searchspace bigtime and the tactical moves that can create trouble that we can try in this position P we reduce an additional ply.

Yes, that is correct. LMR can reduce a critical move that will cause you to miss something important tactically.

One of the leading causes of death in the USA is traffic accidents. Walking is safer, but driving by automobile is much faster and you can go much farther. But if you get into an accident you will not get there at all. We still take automobiles because we can get to places we could not reasonable go without them.

LMR is like taking the automobile. You might get into an accident and miss a critical line but most of the time you will get to places where you could not go - you will see things you could NEVER reasonably expect to see without LMR.

It's not uncommon to find positions where Komodo takes 2 or 3 extra ply with LMR and STILL finds it faster than it would without LMR. So why would I ever want to go back to a program which drops several ply and takes several times longer to find almost every tactic?

[/quote]

Most engines already reduce them nowadays 2 plies or so (on top of the normal reduction of 1 ply).

[/quote]
Komodo reduces progressively, several types of moves are not reduced at all, several are reduced 1 ply and then the ones that appear to be weaker are reduced more. We don't stupidly reduce without reason. Also, we have heuristics in komodo which will alter the reduction schedule, where a move that might normally get reduced 2 ply will only get reduced 1 ply for example. We are very careful to try not to reduce an important move too much.

Reducing a move is not quite the same as forward pruning it either. Just because a move is reduced doesn't mean it won't be re-searched. Human players play very much the same way - they will superficially look at a move and if it seems interesting they will look at it more carefully. I don't see any reason why a program should not do the same.

So mathematical it's very easy to prove selecting the first moves, say 3 moves we don't reduce, to select the most horrible moves there. Of c ourse that's not captures.

Just put your queen back on a square opponent can capture it.

Vincent

lkaufman · Post by **lkaufman** » Sat Aug 11, 2012 11:08 pm

Desperado wrote:
diep wrote:
...

(I already pointed out some things in my last post) It simply belongs together.
Search <-> Heuristic
Any conclusion, doesnt matter how the setup looks like of the experiment,
would be misleading, and would not really represent the evaluation
with the better chess knowledge at all.
I disagree there.

...

no problem

But what you finally are measuring is _NOT_ the "chess knowledge",
_BUT_ the "chess knowledge in its framework". IMO this experiment
will not extract the essence of the chess knowledge.

Here is a simple question for you and for Don.

What offset do you get when your engines play with:

a. material + pst mode
compared to
b. full evaluation mode.

It is a while back i measured that for Nemo, but i think it was
about 300 Elo difference.

Michael

It's been a while for us too since measuring that in Komodo, but it was a lot, probably more than the 300 you mention to my best recollection.

Rebel · Post by **Rebel** » Sat Aug 11, 2012 11:12 pm

diep wrote:
Rebel wrote:
Don wrote: I personally believe that Komodo has the best evaluation function of any chess program in the world.
I see a new form of (fun!) competition arising at the horizon, who has the best eval?

Its basic framework:

1. Root search (1-ply) only with standard QS.
2. QS needs to be defined by mutual agreement.
3. No extensions allowed.

25,000 - 50,000 games (or so) to weed out most of the noise because the lack of search.

Details to be worked out of course.
Great idea Ed. We need an independant tester who also verifies no cheating occurs. Do you volunteer?

If the 2 of you come to an agreement then surely. With a debugger the software will be checked first if the 2 of you play by the agreed rules.

Will be giving spectacular attacking games!

Don't forget the spectacular horizon effects

diep · Post by **diep** » Sat Aug 11, 2012 11:23 pm

Don,
My claim is you CAN search a lot deeper with LMR by sorting bad moves as first to try.

Obviously i never claimed it would play better. In contradiction. As you're native English, i do realize you never made that distinction mistake, unlike Uri who isn't native English.

You carefully formulated it in a manner not excluding that.
You're not admitting this however: "can search deeper".

And all the time i've been posting about that - and you knew it.

By searching deeper you especially notice tactics get weaker.
Komodo needs plies more - and i'm sure you tested it to be better - than other engines to find the same trick.

So what brings obviously most strength for Komodo is the increased search depth - it's better mainline checking.

I call that eloscaling, though that's probably a bad English word.

Then thanks to hashtable and other montecarlo effects you select of course in the end the correct move and a side effect then of the depth is that you also find more than without.

That's not the discussion however. the discussion is that you search DEEPER with LMR than if you select the most stupid move first.

That's my whole big point - namely that the absolute search depth that engines get nowadays is less of a relevant number than it was 15 years ago.

Rebel · Post by **Rebel** » Sat Aug 11, 2012 11:32 pm

Don wrote: I think you are ducking a REAL match. Trying to make a 1 ply match happen is way down on my list of priorities and I would not allow myself to be distracted by such a thing.

Don,

On several occasions you have said Komodo has the best eval in the world. I think you should proof it now that you have a challenger.

In good old Rome tradition we want to see the gladiators blood flow

lkaufman · Post by **lkaufman** » Sun Aug 12, 2012 12:23 am

Rebel wrote:
Don wrote: I think you are ducking a REAL match. Trying to make a 1 ply match happen is way down on my list of priorities and I would not allow myself to be distracted by such a thing.
Don,

On several occasions you have said Komodo has the best eval in the world. I think you should proof it now that you have a challenger.

In good old Rome tradition we want to see the gladiators blood flow

We have a different definition of eval than Vince. He refers to the eval function, while we are talking about eval in positions where search won't help appreciably. Probably Diep has a better eval function, because it gives up search depth for evaluating tactics. We claim that Komodo has the best eval when tactics don't matter, and I don't know of a way to prove this. When tactics do matter, search depth is extremely important, and comparing us on equal depth to Diep has no value.

Don · Post by **Don** » Sun Aug 12, 2012 12:48 am

lkaufman wrote:
Rebel wrote:
Don wrote: I think you are ducking a REAL match. Trying to make a 1 ply match happen is way down on my list of priorities and I would not allow myself to be distracted by such a thing.
Don,

On several occasions you have said Komodo has the best eval in the world. I think you should proof it now that you have a challenger.

In good old Rome tradition we want to see the gladiators blood flow
We have a different definition of eval than Vince. He refers to the eval function, while we are talking about eval in positions where search won't help appreciably. Probably Diep has a better eval function, because it gives up search depth for evaluating tactics. We claim that Komodo has the best eval when tactics don't matter, and I don't know of a way to prove this. When tactics do matter, search depth is extremely important, and comparing us on equal depth to Diep has no value.

Just to illustrate how differently our definition really is, Vincent proposes to "prove" which program has the most sophisticated evaluation function by doing a 1 ply search. As almost anyone in this field knows, a 1 ply search is completely dominated by tactics and fine positional understanding is almost completely irrelevant. Any yet he believes that is a legitimate test of how strong a program is in the positional sense. I can only say that his definition of what is positional is different than ours.

I think the best test is to simply play a long match at long time controls. The program that wins is playing the best chess and we don't have to debate "positional" vs "tactical" play, as has been pointed out it is ALL positional play, right?

Don

Uri Blass · Post by **Uri Blass** » Sun Aug 12, 2012 1:56 am

Don wrote:
lkaufman wrote:
Rebel wrote:
Don wrote: I think you are ducking a REAL match. Trying to make a 1 ply match happen is way down on my list of priorities and I would not allow myself to be distracted by such a thing.
Don,

On several occasions you have said Komodo has the best eval in the world. I think you should proof it now that you have a challenger.

In good old Rome tradition we want to see the gladiators blood flow
We have a different definition of eval than Vince. He refers to the eval function, while we are talking about eval in positions where search won't help appreciably. Probably Diep has a better eval function, because it gives up search depth for evaluating tactics. We claim that Komodo has the best eval when tactics don't matter, and I don't know of a way to prove this. When tactics do matter, search depth is extremely important, and comparing us on equal depth to Diep has no value.
Just to illustrate how differently our definition really is, Vincent proposes to "prove" which program has the most sophisticated evaluation function by doing a 1 ply search. As almost anyone in this field knows, a 1 ply search is completely dominated by tactics and fine positional understanding is almost completely irrelevant. Any yet he believes that is a legitimate test of how strong a program is in the positional sense. I can only say that his definition of what is positional is different than ours.

I think the best test is to simply play a long match at long time controls. The program that wins is playing the best chess and we don't have to debate "positional" vs "tactical" play, as has been pointed out it is ALL positional play, right?

Don

Playing long time control match is not going to answer the question because the faster program or the program that has better search
may win even if both programs have exactly the same evaluation function.

The programs need to have the same search algorithm and to search
the same number of nodes(or have another fair rule)in order to compare quality of the evaluation function and I guess that we cannot get a reply even in this case because one evaluation may be better for one search when the second program is better for another search.

Don · Post by **Don** » Sun Aug 12, 2012 2:19 am

Uri Blass wrote:
Don wrote:
lkaufman wrote:
Rebel wrote:
Don wrote: I think you are ducking a REAL match. Trying to make a 1 ply match happen is way down on my list of priorities and I would not allow myself to be distracted by such a thing.
Don,

On several occasions you have said Komodo has the best eval in the world. I think you should proof it now that you have a challenger.

In good old Rome tradition we want to see the gladiators blood flow
We have a different definition of eval than Vince. He refers to the eval function, while we are talking about eval in positions where search won't help appreciably. Probably Diep has a better eval function, because it gives up search depth for evaluating tactics. We claim that Komodo has the best eval when tactics don't matter, and I don't know of a way to prove this. When tactics do matter, search depth is extremely important, and comparing us on equal depth to Diep has no value.
Just to illustrate how differently our definition really is, Vincent proposes to "prove" which program has the most sophisticated evaluation function by doing a 1 ply search. As almost anyone in this field knows, a 1 ply search is completely dominated by tactics and fine positional understanding is almost completely irrelevant. Any yet he believes that is a legitimate test of how strong a program is in the positional sense. I can only say that his definition of what is positional is different than ours.

I think the best test is to simply play a long match at long time controls. The program that wins is playing the best chess and we don't have to debate "positional" vs "tactical" play, as has been pointed out it is ALL positional play, right?

Don
Playing long time control match is not going to answer the question because the faster program or the program that has better search
may win even if both programs have exactly the same evaluation function.

The programs need to have the same search algorithm and to search
the same number of nodes(or have another fair rule)in order to compare quality of the evaluation function and I guess that we cannot get a reply even in this case because one evaluation may be better for one search when the second program is better for another search.

Yes, this is undoubtedly an imperfect test, but I don't know that we can do better. Certainly a 1 ply search doesn't prove anything.

The big problem with identical search is that each program evolves WITH it's evaluation function. So to take some program and replace various evaluation functions with it just would not work. Don't forget that for some people the evaluation function is not totally independent from the search as information could be passed back and forth.

However, if such a test were to be done I would suggest that a good old fashioned brute force search with no forward pruning, selectivity or anything else be applied. Perhaps only check extensions. Each person writes their own evaluation function with a simple interface, perhaps:

int eval( char *fen )

You task is to return an integer representing the score from the point of view of the player who's turn to move it is. The "fen" could be replaced with a 64 byte array representation to speed up decoding of the position.

I could probably make a version of Komodo's evaluation function that honors this protocol.

Now the problem is what prevents me (or someone else) from actually doing an N ply search or perhaps a simple quies search built in? Without inspecting the code this is just a black box - it could even have a full blown Komodo search built in. You call eval() and it actually does a 10 ply search!

Then there is mate threat detection and attack bonuses which some programs build in to the evaluation function. I did not design Komodo to have the best possible evaluation for a 1 ply search - I designed it to have the best possible evaluation function I could manage for a practical chess program. If I wanted to compete in a 1 ply competition I would have built in all sorts of tactics (regardless of how slow it might make the program) into the evaluation function. You can design an evaluation functions to find mates and all sort of stuff. In fact we do have some basic things such as pins and up-attacks or attacks of undefended pieces - but nothing much more sophisticate than that - it's just way too expensive compared to a quies search. Vincent says his program is 20x slower than mine - so I can do several full blown quies searches in the same time he calls the evaluation function apparently.

How effective is move ordering from TT?

Re: How effective is move ordering from TT?

Re: How effective is move ordering from TT?

Re: How effective is move ordering from TT?

Re: How effective is move ordering from TT?

Re: How effective is move ordering from TT?

Re: How effective is move ordering from TT?

Re: How effective is move ordering from TT?

Re: How effective is move ordering from TT?

Re: How effective is move ordering from TT?

Re: How effective is move ordering from TT?