How effective is move ordering from TT?

diep · Post by **diep** » Sat Aug 11, 2012 3:14 pm

Rebel wrote:
Don wrote: I personally believe that Komodo has the best evaluation function of any chess program in the world.
I see a new form of (fun!) competition arising on the horizon: who has the best eval?

Its basic framework:

1. Root search (1-ply) only with standard QS.
2. QS needs to be defined by mutual agreement.
3. No extensions allowed.

25,000 - 50,000 games (or so) to weed out most of the noise because of the lack of search.

Details to be worked out of course.

Great idea, Ed. We need an independent tester who also verifies that no cheating occurs. Do you volunteer?

With some luck we'll then see how strong mobility and coordinated piece evaluation play.

Oh, I remember: Diep also knows everything about pins, and has extensive king safety that will directly attack the opponent's king with all pieces, probably with the usual computer bug of not using many pawns to do so. It will give spectacular attacking games!
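For readers who want to picture the proposed framework, here is a minimal sketch of a 1-ply root search that calls a quiescence search (QS) on each root move, with no extensions. It runs on a toy tree rather than a real chess position. The `Node` class, the captures-only QS, and the example scores are all assumptions made for illustration; the actual QS definition would have to be agreed on, as rule 2 says.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

INF = 10**9


@dataclass
class Node:
    """Toy position (hypothetical). static_eval is in centipawns from the
    point of view of the side to move in this node."""
    static_eval: int
    # Each move is (name, is_capture, resulting position).
    moves: List[Tuple[str, bool, "Node"]] = field(default_factory=list)


def qsearch(node: Node, alpha: int, beta: int) -> int:
    """Captures-only quiescence search with stand-pat (an assumed QS definition)."""
    stand_pat = node.static_eval
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)
    for _name, is_capture, child in node.moves:
        if not is_capture:
            continue  # quiet moves are not searched in QS
        score = -qsearch(child, -beta, -alpha)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha


def root_search_1ply(root: Node) -> Tuple[Optional[str], int]:
    """Try every root move once, score it with QS, apply no extensions."""
    best_move, best_score = None, -INF
    for name, _is_capture, child in root.moves:
        score = -qsearch(child, -INF, INF)
        if score > best_score:
            best_move, best_score = name, score
    return best_move, best_score


# Toy example: grabbing a pawn looks good statically (+100 for us),
# but the recapture found by QS costs the queen.
after_recapture = Node(static_eval=-800)
after_grab = Node(static_eval=-100, moves=[("Rxb7", True, after_recapture)])
after_quiet = Node(static_eval=-20)
root = Node(static_eval=0, moves=[("Qxb7", True, after_grab),
                                  ("Nf3", False, after_quiet)])
print(root_search_1ply(root))
```

In this toy tree the quiet move is chosen, because the QS sees the recapture that a pure static evaluation of the root moves would miss. Everything beyond that one capture sequence is left to the evaluation, which is the point of the proposed contest.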

Houdini · Post by **Houdini** » Sat Aug 11, 2012 3:44 pm

For Houdini 2 removing the PST amounts to about 150 Elo.
You can easily test this yourself: just install Houdini 2 with an incorrect license and it will disable the PST.

lkaufman · Post by **lkaufman** » Sat Aug 11, 2012 4:14 pm

diep wrote:
lkaufman wrote:
diep wrote: Suppose we do the following experiment, Diep versus Komodo: we both remove our piece square tables and have the engines play.

Komodo probably loses 1000 Elo points or so then when playing Diep. Diep will lose a lot less Elo than Komodo. Diep is still 2600+ or so then, I bet.

It's not even a contest. It's going to be a 100% butchering of Komodo.

Now you might say this is not a fair comparison, and it isn't, as I have far more chess knowledge in Diep than you have in Komodo, probably a factor of 100+ more. Yet that should give you an indication that ordering Diep's moves doesn't work with a simple heuristic like a history table, which basically reflects your piece square tables and tuning for that given position.
Just for fun I decided to check out your claim. I'll assume for round numbers that Diep on one core is 2600 by IPON standards (correct me if I'm way off) and we'll round Komodo down to 3000. I made a version of Komodo 5 with all the piece square tables zeroed out, and rated it against the normal Komodo at 10" + 0.1". Fast games, but still average depth in the 11 to 12 ply range, probably good enough to earn the IM title in human tournaments, or at least FM. The elo loss was 190, giving it a rating of 2810, still miles above DIEP with piece square tables. I only played a couple hundred games, so the margin of error is large, but I'm not interested in a precise rating, just ballpark. But even this is not fair, because Komodo does depend on the tables somewhat for the king; in particular it may not want to castle without piece square tables. So I made another version that restored piece square tables just for the king, and the rating came out 2920, only 80 below normal and 320 above DIEP. Now I admit that at longer time limits the elo loss might be somewhat greater, but I doubt that it would grow by more than 50% at a time limit such as IPON uses for example (5' +3"). As the time limit grows, the number of draws increases, so this should counteract any tendency of the elo loss to grow with depth. Also the rating loss was probably much exaggerated by the self-testing, as you yourself have often pointed out. If both were run against DIEP, the loss should be much smaller.
So while Komodo may lose far more than DIEP if piece square tables are removed (I'll take your word for this), the loss will still leave Komodo far above DIEP, with or without such tables.
Please email me the version of Komodo 5 without piece square tables. I'll then carefully check whether you have turned off the PSQs, as I simply don't believe your claim here.

Also, you 'test' it by accident within a few minutes of my posting. No one will believe you.

You're kind of wrong about Diep's Elo, by the way. By 500 Elo points or so.

I have plenty of 8 core Xeon machines here to test. Each program can get its own machine of course.

I didn't say anything about testing it "by accident". I did so in response to your post. It only took a few minutes to run the games. As for the tables, we have a "Multiplier" for each term, and I just set them all to zero, very simple. So the tables would be filled with zeroes.
Regarding your statement that I am off by 500 elo for Diep, if I assume you mean that Diep is 2600 + 500 (and not minus 500!) = 3100 IPON scale on one core, it would be a hundred elo ahead of Houdini, and you would be missing out on making a pile of money by not selling it. Are you so wealthy that this is of no interest to you?
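The rating arithmetic in the exchange above, and the caveat that a couple hundred games leave a large margin of error, can be sketched with the standard logistic Elo formula. This is only an illustration: the figure of 200 games is an assumed stand-in for "a couple hundred", and the error estimate ignores draws, which overstates the error somewhat.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score of a player rated elo_diff points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_from_score(score: float) -> float:
    """Inverse of expected_score: Elo difference implied by a score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(score: float, games: int, z: float = 1.96):
    """Rough 95% interval for the Elo difference, treating each game as
    win/loss only (draws ignored, so the interval is on the wide side)."""
    se = math.sqrt(score * (1.0 - score) / games)
    lo = min(max(score - z * se, 1e-9), 1.0 - 1e-9)
    hi = min(max(score + z * se, 1e-9), 1.0 - 1e-9)
    return elo_from_score(lo), elo_from_score(hi)


KOMODO = 3000  # rounded-down figure used in the post
DIEP = 2600    # assumed single-core figure used in the post

no_pst = KOMODO - 190      # all tables zeroed
king_only = KOMODO - 80    # king tables restored
print(no_pst, king_only, king_only - DIEP)

# How wide is the uncertainty for a 190 Elo deficit at different sample sizes?
score = expected_score(-190)
for games in (200, 25_000):  # 200 is an assumption for "a couple hundred"
    low, high = elo_interval(score, games)
    print(games, round(low), round(high))
```

The first line reproduces the 2810, 2920 and 320 figures from the post. The loop shows why the one-ply proposal asks for tens of thousands of games: the interval shrinks roughly with the square root of the number of games.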

ZirconiumX · Post by **ZirconiumX** » Sat Aug 11, 2012 4:19 pm

Houdini wrote: For Houdini 2 removing the PST amounts to about 150 Elo.
You can easily test this yourself: just install Houdini 2 with an incorrect license and it will disable the PST.

Well, if that is all it does, then it's mine. No money from the Peak District for you. I'll probably put it through microwine first, though.

Legal note: I won't get Houdini through any methods, legal or not, mainly because Houdini won't run on a Mac or a Raspberry Pi.

Stockfish is plenty strong enough for me, thanks.

Matthew:out

lkaufman · Post by **lkaufman** » Sat Aug 11, 2012 4:57 pm

diep wrote:
Great idea, Ed. We need an independent tester who also verifies that no cheating occurs. Do you volunteer?

With some luck we'll then see how strong mobility and coordinated piece evaluation play.

Oh, I remember: Diep also knows everything about pins, and has extensive king safety that will directly attack the opponent's king with all pieces, probably with the usual computer bug of not using many pawns to do so. It will give spectacular attacking games!

This is the problem. Knowledge about pins is generally considered tactical, not evaluation, even if you put it in the eval function. So probably Diep would look great on a one ply test due to this pin knowledge, but this has no bearing on which program has the better evaluation. There is no limit to how much tactical knowledge can be put into an eval function, but whether it justifies the slowdown in search is the question.
Regarding your request for a Komodo 5 version without PST, Richard Vida posted a patch to Komodo 5 making all eval terms configurable. Since we don't condone this I won't post the link here, but if you can find his patch all you need do is set the "xtm" terms ("pawn table multiplier" etc.), to zero and you'll have what you want.
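As a rough picture of what "set the multipliers to zero" means, here is a sketch of piece square tables scaled by a per-piece multiplier. Komodo's internals are not public in this thread, so the table values, the 8-entry table shape, and the names `PST` and `scaled_tables` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical, shortened tables (8 entries instead of 64) to keep the sketch small.
PST: Dict[str, List[int]] = {
    "knight": [-10, 0, 5, 10, 10, 5, 0, -10],
    "king": [20, 30, 10, 0, 0, 10, 30, 20],
}


def scaled_tables(tables: Dict[str, List[int]],
                  multipliers: Dict[str, float]) -> Dict[str, List[int]]:
    """Scale each piece's table by its multiplier (missing pieces default to 1.0)."""
    return {
        piece: [round(value * multipliers.get(piece, 1.0)) for value in values]
        for piece, values in tables.items()
    }


# All multipliers zero: every table is filled with zeroes.
no_pst = scaled_tables(PST, {"knight": 0.0, "king": 0.0})

# Second experiment from the earlier post: tables restored for the king only.
king_only = scaled_tables(PST, {"knight": 0.0, "king": 1.0})

print(no_pst)
print(king_only)
```

With every multiplier at zero the tables contribute nothing to the evaluation, which matches the description "the tables would be filled with zeroes".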

diep · Post by **diep** » Sat Aug 11, 2012 5:47 pm

lkaufman wrote:
This is the problem. Knowledge about pins is generally considered tactical, not evaluation, even if you put it in the eval function. So probably Diep would look great on a one ply test due to this pin knowledge, but this has no bearing on which program has the better evaluation. There is no limit to how much tactical knowledge can be put into an eval function, but whether it justifies the slowdown in search is the question.
Regarding your request for a Komodo 5 version without PST, Richard Vida posted a patch to Komodo 5 making all eval terms configurable. Since we don't condone this I won't post the link here, but if you can find his patch all you need do is set the "xtm" terms ("pawn table multiplier" etc.), to zero and you'll have what you want.

Are you trying to talk your way out of the 1-ply match?

King safety is also tactical, mobility is also tactical, and evaluating attacks, which Diep does massively, is also tactical?

Yet evaluating material is suddenly the most important 'positional term' of an evaluation?

Oh come on, we can call everything tactical.

I want a 1-ply match.

Ed?
Make some noise!

lkaufman · Post by **lkaufman** » Sat Aug 11, 2012 6:01 pm

diep wrote:
Are you trying to talk your way out of the 1-ply match?

King safety is also tactical, mobility is also tactical, and evaluating attacks, which Diep does massively, is also tactical?

Yet evaluating material is suddenly the most important 'positional term' of an evaluation?

Oh come on, we can call everything tactical.

I want a 1-ply match.

Ed?
Make some noise!

Certainly "evaluating attacks" is tactical, we do some of that too, but probably much less than you. It's basically an attempt to save one ply of search in specific situations. I didn't say anything about "material" being the most important positional term. Pawn structure, mobility, king safety, and many specific positional terms are also important. But anything that attempts to save a ply of search is in my opinion tactical. I don't object to anyone running one ply search matches. But they tell us nothing about which program is better able to evaluate positions with no tactics in them, which is what I would consider the question here.

Uri Blass · Post by **Uri Blass** » Sat Aug 11, 2012 6:02 pm

lkaufman wrote:
This is the problem. Knowledge about pins is generally considered tactical, not evaluation, even if you put it in the eval function. So probably Diep would look great on a one ply test due to this pin knowledge, but this has no bearing on which program has the better evaluation. There is no limit to how much tactical knowledge can be put into an eval function, but whether it justifies the slowdown in search is the question.
Regarding your request for a Komodo 5 version without PST, Richard Vida posted a patch to Komodo 5 making all eval terms configurable. Since we don't condone this I won't post the link here, but if you can find his patch all you need do is set the "xtm" terms ("pawn table multiplier" etc.), to zero and you'll have what you want.

I think that pins are not only tactical knowledge because it is possible to have a pin for many moves without winning material.

I also think that tactical knowledge is part of the knowledge that humans use in their evaluation function.

For example, if I see a white knight at c7 and a black rook at a8 that cannot move, then I may know that the rook is probably trapped without calculating all of Black's possible moves, and it is clearly fair to say that being able to see this without searching all of Black's moves means a better evaluation function.

In practice, the capture of the a8 rook may stay hidden even with some plies of additional search, because Black can delay the capture with threats against the white queen or with checks. It is also possible that one of these threats saves the a8 rook later. So you cannot be sure from the evaluation alone that you win the rook, but you can give some bonus for the fact that you may be going to win it.

lkaufman · Post by **lkaufman** » Sat Aug 11, 2012 6:06 pm

Uri Blass wrote:
I think that pins are not only tactical knowledge because it is possible to have a pin for many moves without winning material.

I also think that tactical knowledge is part of the knowledge that humans use in their evaluation function.

For example, if I see a white knight at c7 and a black rook at a8 that cannot move, then I may know that the rook is probably trapped without calculating all of Black's possible moves, and it is clearly fair to say that being able to see this without searching all of Black's moves means a better evaluation function.

In practice, the capture of the a8 rook may stay hidden even with some plies of additional search, because Black can delay the capture with threats against the white queen or with checks. It is also possible that one of these threats saves the a8 rook later. So you cannot be sure from the evaluation alone that you win the rook, but you can give some bonus for the fact that you may be going to win it.

Scoring of pins could be considered positional, I agree, but for example scoring pawn attacks on pinned pieces is basically tactical, it's an attempt to save (usually) one ply of search. We could very easily make our program look much better on one ply searches by including various tactical ideas like this in eval, but we have found that doing this sort of thing makes the program weaker on balance due to the slowdown.

Uri Blass · Post by **Uri Blass** » Sat Aug 11, 2012 6:19 pm

lkaufman wrote:
Certainly "evaluating attacks" is tactical, we do some of that too, but probably much less than you. It's basically an attempt to save one ply of search in specific situations. I didn't say anything about "material" being the most important positional term. Pawn structure, mobility, king safety, and many specific positional terms are also important. But anything that attempts to save a ply of search is in my opinion tactical. I don't object to anyone running one ply search matches. But they tell us nothing about which program is better able to evaluate positions with no tactics in them, which is what I would consider the question here.

I think it may be possible to have a 1-ply match in which every move is checked by a deeper Houdini search to make sure it is not a serious mistake.

In 1-ply matches we have 2 types of moves:
1) moves that are at least 0.3 pawns weaker than the best move (based on a Houdini search to depth 12)
2) moves that are not at least 0.3 pawns weaker than the best move, based on the same search.

We can play Houdini's move in place of type 1 moves (which hopefully are going to be a minority of the moves) and accept the engines' own moves for the others.

This is not going to fully answer the question, because even a 0.2 pawn difference may be due to tactics: the best move may not lose a pawn while the move that is 0.2 pawns weaker loses a pawn for some positional compensation. But I guess that if Komodo is better in non-tactical positions, it has bigger chances under the conditions that I suggest.
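The filtering rule proposed above can be sketched as a small function. This is only an illustration under stated assumptions: the reference scores are taken as already computed by some deeper search (for example Houdini at depth 12), given in centipawns from the mover's point of view, and the names `adjudicate` and `reference_scores` are hypothetical. No engine is actually called here.

```python
from typing import Dict, Tuple

THRESHOLD_CP = 30  # 0.3 pawns, as proposed in the post


def adjudicate(engine_move: str,
               reference_scores: Dict[str, int],
               threshold_cp: int = THRESHOLD_CP) -> Tuple[str, bool]:
    """Return (move_to_play, was_overridden).

    reference_scores maps each legal move to the reference search's score
    in centipawns (assumed to come from a deeper search done elsewhere).
    """
    best_move = max(reference_scores, key=reference_scores.get)
    loss = reference_scores[best_move] - reference_scores[engine_move]
    if loss >= threshold_cp:
        # Type 1: a serious mistake, so the reference move is played instead.
        return best_move, True
    # Type 2: close enough to the best move, so the engine's own choice stands.
    return engine_move, False


# Made-up scores for one position.
scores = {"Nf3": 50, "d4": 35, "Qh5": 10}
print(adjudicate("d4", scores))   # 15 cp worse than best: kept
print(adjudicate("Qh5", scores))  # 40 cp worse than best: overridden
```

Counting how often `was_overridden` is true for each engine would also show how large the "hopefully a minority" of type 1 moves really is.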
