How effective is move ordering from TT?


bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How effective is move ordering from TT?

Post by bob »

chrisw wrote:
diep wrote:
lkaufman wrote:
diep wrote:
Rebel wrote:
Don wrote: I personally believe that Komodo has the best evaluation function of any chess program in the world.
I see a new form of (fun!) competition arising on the horizon: who has the best eval?

Its basic framework:

1. Root search (1-ply) only with standard QS.
2. QS needs to be defined by mutual agreement.
3. No extensions allowed.

25,000 - 50,000 games (or so) to weed out most of the noise because of the lack of search.

Details to be worked out of course.
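In code, rule 1 reduces the contest driver to something like the sketch below: a 1-ply negamax over the root moves where every move is scored by the agreed quiescence search alone. The primitives (gen_moves, make_move, unmake_move, qsearch) are hypothetical placeholders, not any particular engine's API.

Code:

/* Sketch of the proposed contest: a 1-ply root search where each move is
 * scored by the mutually agreed QS only. No extensions, no deeper search.
 * All engine primitives here are hypothetical. */
#define MAX_MOVES 256
#define INF 1000000

extern int  gen_moves(int moves[]);        /* fill in legal moves, return count */
extern void make_move(int move);
extern void unmake_move(int move);
extern int  qsearch(int alpha, int beta);  /* the agreed quiescence search */

int search_root_1ply(int *best_move)
{
    int moves[MAX_MOVES];
    int n = gen_moves(moves);
    int best = -INF;

    for (int i = 0; i < n; i++) {
        make_move(moves[i]);
        int score = -qsearch(-INF, -best); /* negamax: flip sign and window */
        unmake_move(moves[i]);
        if (score > best) {
            best = score;
            *best_move = moves[i];
        }
    }
    return best;
}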
Great idea Ed. We need an independent tester who also verifies no cheating occurs. Do you volunteer?

With some luck we'll then see how strongly mobility and coordinated piece evaluation play.

Oh I remember - Diep also knows everything about pins, and has extensive king safety that will directly attack the opponent king with all pieces, probably with the usual computer bug of not using many pawns to do so. It will give spectacular attacking games!
This is the problem. Knowledge about pins is generally considered tactical, not evaluation, even if you put it in the eval function. So probably Diep would look great on a one ply test due to this pin knowledge, but this has no bearing on which program has the better evaluation. There is no limit to how much tactical knowledge can be put into an eval function, but whether it justifies the slowdown in search is the question.
Regarding your request for a Komodo 5 version without PST, Richard Vida posted a patch to Komodo 5 making all eval terms configurable. Since we don't condone this I won't post the link here, but if you can find his patch all you need do is set the "xtm" terms ("pawn table multiplier" etc.), to zero and you'll have what you want.
You are trying to talk your way out of the 1 ply match?

King safety is also tactical, mobility is also tactical, and evaluating attacks, which Diep does massively, is also tactical?

Yet evaluating material is suddenly the most important 'positional term' of an evaluation?

Oh, come on, we can call everything tactical.

I want a 1 ply match :)

Ed?
Make some noise!
Completely agree with Vincent. Only beancounter programmers would oppose Ed's idea, always using the same false dichotomy: search = tactics, eval = positional. Nonsense of course. I'd take it further: ban the QS, which can contain all manner of check-search tricks btw, and force the beancounters to write a SEE. Then we'll see how really crap their evals are ;-)

One way you can also test, btw, is to put the zero-search program onto ICC and test it against rated players. Then shoot any programmer who can't get 2000 Elo out of raw evaluation only.
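For reference, the SEE that the beancounters would be forced to write is essentially the classic swap-list algorithm. A self-contained sketch: it takes each side's attackers of the contested square as value arrays sorted cheapest-first, and it deliberately ignores x-rays and pins, which a real implementation has to handle by recomputing the attacker sets as pieces come off the board.

Code:

/* Static exchange evaluation via the classic swap list. */
int see(int target_value,              /* piece currently on the square    */
        const int own[], int n_own,    /* our attacker values, ascending   */
        const int opp[], int n_opp)    /* their attacker values, ascending */
{
    int gain[64];                      /* speculative score after capture d */
    int d = 0, i = 1, j = 0;

    if (n_own == 0)
        return 0;                      /* we cannot start an exchange */
    gain[0] = target_value;            /* cheapest attacker captures first */
    int on_square = own[0];            /* value now standing on the square */

    for (;;) {
        if (j >= n_opp) break;         /* no recapture available for them */
        d++;
        gain[d] = on_square - gain[d - 1];
        on_square = opp[j++];
        if (i >= n_own) break;         /* no recapture available for us */
        d++;
        gain[d] = on_square - gain[d - 1];
        on_square = own[i++];
    }
    while (d > 0) {                    /* either side may stop recapturing */
        if (-gain[d] < gain[d - 1])
            gain[d - 1] = -gain[d];
        d--;
    }
    return gain[0];                    /* < 0: starting the capture loses */
}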
There is a major flaw in your reasoning. You are going back to the 70's, when the mantra was "you must do chess on a computer like a human does it." Problem was, then, and still is, now, "We don't know HOW a human plays chess." So saying "no search" is a meaningless constraint.

Not to mention the obvious dichotomy where one can write a complex eval, or use search to fill in issues the eval doesn't handle well, and either should eventually reach exactly the same level of skill. But with computers, it is easier to rely on recursive high-speed stuff rather than on overly complex code that contains too many bugs to ever work well.
Rebel
Posts: 7297
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: How effective is move ordering from TT?

Post by Rebel »

chrisw wrote:
Completely agree with Vincent. Only beancounter programmers would oppose Ed's idea, always using the same false dichotomy: search = tactics, eval = positional. Nonsense of course. I'd take it further: ban the QS, which can contain all manner of check-search tricks btw, and force the beancounters to write a SEE. Then we'll see how really crap their evals are ;-)

One way you can also test, btw, is to put the zero-search program onto ICC and test it against rated players. Then shoot any programmer who can't get 2000 Elo out of raw evaluation only.
And another fun experiment: the "best" full static eval. No QS, but right in the middle of the chaos on the board, with multiple hanging pieces and counterattacks, one just evaluates; and checking moves need special code to see that the move is not mate. Cool.

So we have two challenges: the best dynamic eval (with QS) and the best static eval. Now we need participants 8-)
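A sketch of what the static-eval challenge implies: the program picks whichever legal move leads to the statically best position, and a move that delivers mate has to be recognised with special code rather than search, exactly the case mentioned above. The primitives (gen_moves, make_move, unmake_move, evaluate, is_checkmate) are hypothetical.

Code:

/* Static-eval-only player: no QS, no search, just evaluate the position
 * after each legal move. evaluate() scores from the side to move's view,
 * so the score is negated after making our move. */
#define MAX_MOVES 256
#define INF  1000000
#define MATE  100000

extern int  gen_moves(int moves[]);
extern void make_move(int move);
extern void unmake_move(int move);
extern int  evaluate(void);           /* static eval, side-to-move view */
extern int  is_checkmate(void);       /* is the side to move mated?     */

int pick_move_static(void)
{
    int moves[MAX_MOVES];
    int n = gen_moves(moves);
    int best = -INF, best_move = 0;

    for (int i = 0; i < n; i++) {
        make_move(moves[i]);          /* opponent to move: negate eval */
        int score = is_checkmate() ? MATE : -evaluate();
        unmake_move(moves[i]);
        if (score > best) {
            best = score;
            best_move = moves[i];
        }
    }
    return best_move;
}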
Rebel
Posts: 7297
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: How effective is move ordering from TT?

Post by Rebel »

lkaufman wrote:
Rebel wrote:
Don wrote: I think you are ducking a REAL match. Trying to make a 1 ply match happen is way down on my list of priorities and I would not allow myself to be distracted by such a thing.
Don,

On several occasions you have said Komodo has the best eval in the world. I think you should prove it now that you have a challenger.

In good old Roman tradition we want to see the gladiators' blood flow :mrgreen:
We have a different definition of eval than Vince. He refers to the eval function, while we are talking about eval in positions where search won't help appreciably. Probably Diep has a better eval function, because it gives up search depth for evaluating tactics. We claim that Komodo has the best eval when tactics don't matter, and I don't know of a way to prove this. When tactics do matter, search depth is extremely important, and comparing us on equal depth to Diep has no value.
So a redefined claim which you cannot prove? What purpose does that serve?
chrisw
Posts: 4624
Joined: Tue Apr 03, 2012 4:28 pm
Location: Midi-Pyrénées
Full name: Christopher Whittington

Re: How effective is move ordering from TT?

Post by chrisw »

bob wrote:
chrisw wrote:
Completely agree with Vincent. Only beancounter programmers would oppose Ed's idea, always using the same false dichotomy: search = tactics, eval = positional. Nonsense of course. I'd take it further: ban the QS, which can contain all manner of check-search tricks btw, and force the beancounters to write a SEE. Then we'll see how really crap their evals are ;-)

One way you can also test, btw, is to put the zero-search program onto ICC and test it against rated players. Then shoot any programmer who can't get 2000 Elo out of raw evaluation only.
There is a major flaw in your reasoning.
There may be, but your rambling texts do not show it.

You are going back to the 70's,
No, I am not; I was suggesting a static eval comparison.

when the mantra was "you must do chess on a computer like a human does it." Problem was, then, and still is, now, "We don't know HOW a human plays chess."
Well, you may not, but I used to play chess rather well, and I know how I played - well enough to design an evaluation function based on my own playing style.

So saying "no search" is a meaningless constraint.
Which is itself a meaningless non sequitur. What's up with you?

Not to mention the obvious dichotomy where one can write a complex eval, or use search to fill in issues the eval doesn't handle well, and either should eventually reach exactly the same level of skill.
That is not a dichotomy. If you want to copy my language usage, perhaps a dictionary would be useful first?

But with computers, it is easier to rely on recursive high-speed stuff rather than on overly complex code that contains too many bugs to ever work well.
This might be true for an old-style software engineer who generates buggy code, but working with the Japanese taught me that it is possible to produce complex code that works well and does more, reliably. It's all a question of testing and quality control. Ask Sony.
chrisw
Posts: 4624
Joined: Tue Apr 03, 2012 4:28 pm
Location: Midi-Pyrénées
Full name: Christopher Whittington

Re: How effective is move ordering from TT?

Post by chrisw »

Rebel wrote:
lkaufman wrote:
We have a different definition of eval than Vince. He refers to the eval function, while we are talking about eval in positions where search won't help appreciably. Probably Diep has a better eval function, because it gives up search depth for evaluating tactics. We claim that Komodo has the best eval when tactics don't matter, and I don't know of a way to prove this. When tactics do matter, search depth is extremely important, and comparing us on equal depth to Diep has no value.
So a redefined claim which you cannot prove? What purpose does that serve?
Put on your businessman hat and go into bullshit detector mode. Those guys are trying to sell their product.
chrisw
Posts: 4624
Joined: Tue Apr 03, 2012 4:28 pm
Location: Midi-Pyrénées
Full name: Christopher Whittington

Re: How effective is move ordering from TT?

Post by chrisw »

Don wrote:
lkaufman wrote:
We have a different definition of eval than Vince. He refers to the eval function, while we are talking about eval in positions where search won't help appreciably. Probably Diep has a better eval function, because it gives up search depth for evaluating tactics. We claim that Komodo has the best eval when tactics don't matter, and I don't know of a way to prove this. When tactics do matter, search depth is extremely important, and comparing us on equal depth to Diep has no value.
Just to illustrate how different our definitions really are, Vincent proposes to "prove" which program has the most sophisticated evaluation function by doing a 1 ply search.


As almost anyone in this field knows, a 1 ply search is completely dominated by tactics and fine positional understanding is almost completely irrelevant.
I'm trying really hard to parse this sentence.

"as almost anyone in the field knows" is an attempt to democratise and thus legitimise the text which follows. Except that unverifiable woffle can't do that.

"a 1 ply search is completely dominated by tactics", actually what does this mean? A one ply search has no tactics? A one ply search would be overwhelmed by an N ply search? The game would appear to be tactical? No reason why it should be. "Completely" sounds strong, but with what justification? "Dominated"? Again strong word, but what justification? Heavy adjectival use but no backup for them. Are you in marketing mode?

"fine positional understanding is almost completely irrelevant" Is it really? Well you qualified "completely" this time, so you are not too sure it seems. Actually positional understanding seems perfectly relevent, what would you suggest as an alternative? Random understanding? Partially right and partially wrong understanding? Would they be better?

And yet he believes that is a legitimate test of how strong a program is in the positional sense.
False straw man. It's a legitimate test of the evaluation function. It really is intellectual cheating to switch his argument from "eval" to the "whole program", "in a positional sense" (what does that mean btw?) and then attack that. Don't you think?

I can only say that his definition of what is positional is different than ours.
It would be different when your positional definition keeps changing at will: from part (the non-tactical) to ALL (see below), for example.

I think the best test is to simply play a long match at long time controls.
Yes, that finds the strongest program, but this thread is about the strongest eval function. Anyway, you won't change your tune, for why would a marketeer enter a competition to find scientific truth which by its nature runs the risk of his product appearing dumb?

The program that wins is playing the best chess and we don't have to debate "positional" vs "tactical" play; as has been pointed out, it is ALL positional play, right?

Don
Rebel
Posts: 7297
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: How effective is move ordering from TT?

Post by Rebel »

Desperado wrote:
diep wrote:
...

(I already pointed out some things in my last post) It simply belongs together.
Search <-> Heuristic
Any conclusion, no matter what the setup of the experiment looks like, would be misleading, and would not really represent the evaluation with the better chess knowledge at all.
I disagree there.

...
no problem :-)

But what you are finally measuring is _NOT_ the "chess knowledge", _BUT_ the "chess knowledge in its framework". IMO this experiment will not extract the essence of the chess knowledge.
You are right. Modern evals no longer contain stuff that search will do for them; it's simply a speedup, because why do things twice? If Diep and Komodo are much different in that respect then one of them has a clear advantage. Nevertheless it's a lot of fun (I never implied more) and such a competition (or private testing at home) can be useful for improving your eval.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: How effective is move ordering from TT?

Post by diep »

Desperado wrote:
Here is a simple question for you and for Don.

What offset do you get when your engines play with:

a. material + pst mode
compared to
b. full evaluation mode.

It was a while back that I measured that for Nemo, but I think it was about 300 Elo difference.

Michael
Michael, that depends heavily upon search depth.

Today's software searches so deep that passed pawns + material + PSQ together is already a really strong combination, decisive against engines from some years ago (from before Fruit 2.1 / Rybka), as those simply don't have such sophisticated material evaluation. And then there are the big bugs in passed pawns that everyone used to have, etc.

So a better test would be PSQ + material + passed pawns versus full eval.

This also explains why all those copycats do so well: the old school guys simply don't have the same quality of material + PSQ + passed pawn tuning.
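Measuring such an offset is straightforward: play the two configurations against each other and convert the match score with the usual logistic rating model. A self-contained sketch; the 85% score is only an example, chosen because it lands near the ~300 Elo figure mentioned above.

Code:

#include <math.h>
#include <stdio.h>

/* Elo difference implied by a match score, where
 * score = (wins + draws / 2.0) / games, in (0, 1). */
double elo_diff(double score)
{
    return -400.0 * log10(1.0 / score - 1.0);
}

int main(void)
{
    /* e.g. full eval scoring 85% against material + PST mode */
    printf("%.0f Elo\n", elo_diff(0.85));  /* prints "301 Elo" */
    return 0;
}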

I'll give you an example of how I discovered Diep to be losing to Fruit.

Diep used to always give passers a small bonus. Then suddenly, years ago, it started losing the same type of position over and over.

Some normal middlegame position: Diep white with a passed pawn on d4, nothing on the c-file, nothing on the e-file.

Diep gave roughly a 0.15 bonus there, all patterns together.

Fruit said: "oh, it's an isolated pawn, let's penalize it". I'm not sure how much Fruit gives there.

No knowledge in the world will help you if you aren't penalizing that case. I really do not know many programs from before 2005, maybe Hydra/Brutus/Nimzo98 excepted (automatically tuned), that penalize this case.

So they *all* lose.

They grab the passed pawn, give it a bonus, in one or two lines they suffer from the horizon effect (it can even get to the 5th rank), and they lose.
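In bitboard terms the bug is easy to state: a pawn can be passed and isolated at the same time, and the isolation penalty has to be able to outweigh the passer bonus. A self-contained sketch for white pawns; the scores are purely illustrative, not Diep's or Fruit's actual numbers.

Code:

#include <stdint.h>

#define FILE_OF(sq) ((sq) & 7)
#define RANK_OF(sq) ((sq) >> 3)        /* a1 = 0 ... h8 = 63 */

static const uint64_t FILE_A = 0x0101010101010101ULL;

/* files adjacent to the pawn's file */
static uint64_t adjacent_files(int sq)
{
    uint64_t m = 0;
    if (FILE_OF(sq) > 0) m |= FILE_A << (FILE_OF(sq) - 1);
    if (FILE_OF(sq) < 7) m |= FILE_A << (FILE_OF(sq) + 1);
    return m;
}

/* squares that must hold no black pawn for a white pawn on sq to be
 * passed: own and adjacent files, all ranks in front of the pawn
 * (pawns never stand on rank 8, so the shift stays below 64) */
static uint64_t passed_mask_white(int sq)
{
    uint64_t files = (FILE_A << FILE_OF(sq)) | adjacent_files(sq);
    return files & (~0ULL << (8 * (RANK_OF(sq) + 1)));
}

/* term for one white pawn: passer bonus, isolation penalty */
int white_pawn_term(int sq, uint64_t white_pawns, uint64_t black_pawns)
{
    int score = 0;
    if (!(black_pawns & passed_mask_white(sq)))
        score += 15;                   /* ~0.15, like the old Diep bonus */
    if (!(white_pawns & adjacent_files(sq)))
        score -= 20;                   /* isolation can eat the bonus */
    return score;                      /* centipawns */
}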

Another simple case. This is something Diep has known right from the start of the engine, of course.

Fruit 2.1 doesn't know about 3 pieces being stronger than a queen.
It prefers the queen.

Fruit 2.1 prefers a queen over 2 rooks as soon as it can get 1 pawn with it.

Look at Diep - Fruit 2.1 from the 2005 world championship.

Nowadays you just can't afford to misevaluate that material-wise.
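These imbalances don't fall out of flat piece values: with typical beancounter numbers three minors and a queen are nearly equal on paper, and two rooks versus queen-plus-pawn is decided by the single pawn. A hedged sketch of the kind of correction term involved; every number is illustrative and taken from no particular engine.

Code:

/* Material score with explicit imbalance corrections, side to move's
 * view. Counts: pawns, knights, bishops, rooks, queens for us (n*)
 * and the opponent (o*). All values illustrative. */
int material(int nP, int nN, int nB, int nR, int nQ,
             int oP, int oN, int oB, int oR, int oQ)
{
    int score = 100 * (nP - oP) + 325 * (nN - oN) + 335 * (nB - oB)
              + 500 * (nR - oR) + 975 * (nQ - oQ);

    /* three minor pieces versus the queen: favour the minors */
    if (nN + nB - (oN + oB) >= 3 && oQ - nQ == 1) score += 50;
    if (oN + oB - (nN + nB) >= 3 && nQ - oQ == 1) score -= 50;

    /* two rooks versus the queen: one pawn should not decide the trade */
    if (nR - oR == 2 && oQ - nQ == 1) score += 50;
    if (oR - nR == 2 && nQ - oQ == 1) score -= 50;

    return score;                      /* centipawns */
}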

Fruit 2.1 will easily sacrifice a piece just for 2 pawns and a passer on the 7th rank, even when it's totally blocked.

In the 90s many engines could win game after game by grabbing a passer.

On ICC some years ago I had some games of Diep versus Rookie. Now Rookie is a great opponent, as it's hyper-aggressive, so it really will find bugs in your evaluation function pretty easily.

One of the many bugs I discovered this way was that Diep basically lost a position where Rookie had a knight and Diep a bishop.

The Rybka derivatives value a knight HIGHER than a bishop. Only in the case of a bishop pair does the bishop pair get a huge bonus.

In the opening position a knight is roughly 0.6 pawns better than a bishop, and in the endgame that goes back to 0.3 pawns (so it doesn't like to trade into such an endgame, which is a correct decision).
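That 0.6-to-0.3 range maps directly onto the usual tapered-eval interpolation between an opening value and an endgame value. A minimal sketch, assuming a phase counter that runs from 256 (full opening material) down to 0 (bare endgame):

Code:

/* Knight-over-bishop edge tapered by game phase, using the numbers
 * above: ~0.6 pawns in the opening shrinking to ~0.3 in the endgame. */
int knight_vs_bishop(int phase)        /* 256 = opening, 0 = endgame */
{
    const int opening = 60;            /* centipawns */
    const int endgame = 30;
    return (opening * phase + endgame * (256 - phase)) / 256;
}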

In every book I have here about chess, a bishop gets valued HIGHER than a knight.

It's just that in computer chess the hard fact is that it's very easy for a program to see whether a knight is strong. Seeing whether a bishop is weak is, however, a real disaster to get correct. After toying with that code for 20 years I can assure you it still has major bugs - which of course I find out now, as many other engines value a knight higher than a bishop - Rookie included.

No one had such tuning in the 90s.

*no one*

A bishop is simply stronger than a knight, generally speaking.

Of course many engines still assume a bishop to be stronger than a knight (even Stockfish does), but it's really easy to value a knight higher than a bishop and simply take care not to trade in such cases.

This is why Rybka derivatives hardly ever exchange pieces; that directly causes them to search deeper, as there are severe penalties on trading.

Either they have the bishop pair and want to keep it, or they have the knight and don't want to exchange anything. Such a clear plan makes you search deeper, which trivially brings points.

It really takes a big concession to get them to trade in such a case.

Is it objectively true that a knight is stronger than a bishop?
Hell no.

But subjectively it plays easily, of course.

If you look carefully you'll see many, many games where the Rybka derivatives win based upon grabbing the knight themselves and not exchanging.

They just play themselves into a position where the old school engine loses it by ending up with a bad bishop.

You see - this is another example of accurate tuning.

I remember several games of mine against national trainers where I called it a draw and my opponent was totally amazed, as I had the strong bishop and they the knight.

In their thinking a bishop is THAT MUCH STRONGER that, just by having it, they want to play on even with a relatively vulnerable pawn structure.

At home I put it to the engines: score 0.00.

If you analyze Kasparov's games, you will quickly realize this guy already knew the power of a knight really well.

There are several Dragon lines where Kasparov took the knight, where any chess theory will tell you the bishop is stronger.

I showed a number of strong chess players this position, but that didn't help me much, as they simply knew the game :)
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: How effective is move ordering from TT?

Post by diep »

bob wrote:
chrisw wrote:
Completely agree with Vincent. Only beancounter programmers would oppose Ed's idea, always using the same false dichotomy: search = tactics, eval = positional. Nonsense of course. I'd take it further: ban the QS, which can contain all manner of check-search tricks btw, and force the beancounters to write a SEE. Then we'll see how really crap their evals are ;-)

One way you can also test, btw, is to put the zero-search program onto ICC and test it against rated players. Then shoot any programmer who can't get 2000 Elo out of raw evaluation only.
There is a major flaw in your reasoning. You are going back to the 70's, when the mantra was "you must do chess on a computer like a human does it." Problem was, then, and still is, now, "We don't know HOW a human plays chess." So saying "no search" is a meaningless constraint.

Not to mention the obvious dichotomy where one can write a complex eval, or use search to fill in issues the eval doesn't handle well, and either should eventually reach exactly the same level of skill. But with computers, it is easier to rely on recursive high-speed stuff rather than on overly complex code that contains too many bugs to ever work well.
We know very well how a human plays chess.

In fact we have known the most important clue since De Groot's research in 1946.

That clue is simply that it's knowledge-based and not search-based.

That also explains why so many players who are analytically really strong still lose games: they make search mistakes, sometimes even 2-ply ones, simply missing the opponent's move entirely.

Since 1946, the research has usually focused upon the wrong persons.

Everyone always wants to research the world champion. But the world champion, from a scientific viewpoint, is NOT interesting to research.

In computer chess we also know very clearly that if you have a bug somewhere, you ARE going to lose everything based upon it. So avoiding bugs is what you want.

So in such research they always make the same beginner's mistake, a zillion times over: you want to avoid making the same mistakes as the common man who plays chess.

But researching a guy who is 1200 Elo is not sexy, huh?

All the ladies who work for the government always want to research those in society who are weird/more intelligent, right?

What is interesting is researching the common guys who, without knowing anything, can still win a game.

What is interesting is knowing why a correspondence player who is himself 1100 Elo over the board, not even knowing which endgames are a clear win for him, is at the world top in correspondence chess.

No one is researching those cases.

Simply because we have all known the truth since 1946. There is nothing secret there.

It's all knowledge-based with human players.

Of course the really interesting question then is: to what extent is accurate parameter tuning important to humankind?

To me it seems chess engines are far better tuned than the knowledge any human has tuned into his brain.

Yet THAT is an open discussion and an interesting one.

My claim is that the beancounter chess engines have been tuned far beyond what is interesting from a research viewpoint. It is totally useless to even spend money on developing further engines like Komodo, Stockfish, Rybka or any similar engine. It's totally trivial that they play far above the Elo strength that their knowledge supports. Not interesting, in short.

So if someone claims his evaluation function is better, then we can simply do the 1-ply test.

They now back off and claim their eval is OK once they add the 30-ply search they're doing :)

The next attempt will be: "but we must both run on 1 core, as Diep nearly always wins against engines when it has a big hardware advantage, and being SMP is a hardware advantage" (that was the case even when Diep was rated a lot lower than it is now).

Then after that the attempt will be: "but the only thing that matters is superbullet".

And after all that, they're still not better in evaluation, not even a penny, than DeepSjeng 2011 :)

(which definitely has a much better evaluation than Rybka)

In fact it's very similar to it as well.

Yet DeepSjeng 2011 is from spring 2011, and Komodo5 is far more recent...

Who copied who?
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: How effective is move ordering from TT?

Post by Don »

Rebel wrote: And another fun experiment: the "best" full static eval. No QS, but right in the middle of the chaos on the board, with multiple hanging pieces and counterattacks, one just evaluates; and checking moves need special code to see that the move is not mate. Cool.

So we have two challenges: the best dynamic eval (with QS) and the best static eval. Now we need participants 8-)
This is just a silly experiment. Probably all the top programs would lose badly to any program that makes an attempt to resolve tactics in some way. I'm not going to devote the next week to fixing up Komodo to win some competition that has nothing to do with real chess skill.

There is a contest you can have right now: just test several programs doing a 1-ply search. I don't know what it will prove, but it's a lot better than trying to design an experiment that requires everyone except Vincent to dumb down their search (presumably to prove that Vincent really does have the best program of all).