Hi,
Some years ago, someone posted in CCC a list of heuristics with an estimated implementation cost and an estimated benefit for each. The list was ordered so that things like alpha-beta and null move were at the top, because they have the best benefit/investment ratio.
I can't find this post. Does anyone else remember it and could help me find it?
Return on investment of different techniques and heuristics
Moderator: Ras
-
- Posts: 290
- Joined: Mon Mar 13, 2006 5:23 pm
- Location: Québec
- Full name: Mathieu Pagé
Re: Return on investment of different techniques and heuristics
Never mind, I found it here: http://www.stmintz.com/ccc/index.php?id=374313
Mathieu Pagé
mathieu@mathieupage.com
-
- Posts: 1154
- Joined: Fri Jun 23, 2006 5:18 am
Re: Return on investment of different techniques and heuristics
mathmoi wrote: Never mind, I found it here: http://www.stmintz.com/ccc/index.php?id=374313
An interesting, useful, and highly controversial table. I am not sure what the final verdict was. Here is one table I dug out:
Here's the revised table based on recent discussion:
EFFORT RETURN RETURN/EFFORT Feature
1 10 10 capture ordering
3 4 1.33 null move
4 4 1 null move with verification
4 4 1 search pv first
2 1 .5 fractional extensions
4 6 1.5 static exchange evaluator
4 5 1.25 transposition table
4 5 1.25 transposition table with 2-tier repl.
3 3 1 history heuristic, killers, other ordering
2 1 .5 aspiration
4 5 1.25 book learning
3 3 1 position learning
2 2 1 iterative deepening
3 1 .33 internal iterative deepening
2 2 1 pawn hashing w/ complex pawn evaluation
3 3 1 king+pawn hashing w/ king safety evaluation
3 2 .66 capture extension
1 5 5 check extension
1 1 1 pawn to 6th/7th extension
2 3 1.5 futility in qsearch
3 3 1 futility on frontier
3 3 1 futility on pre-frontier
3 2 .66 razoring
5 3 .6 mate-at-a-glance
-Sam
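For the top line of the table: capture ordering is usually implemented with MVV-LVA (most valuable victim, least valuable attacker). Here is a minimal C++ sketch, assuming a hypothetical Move struct that exposes the victim and attacker piece types; real engines pack moves differently.

Code:
#include <algorithm>
#include <vector>

// Hypothetical move representation; real engines pack this into an int.
// Piece types: 0 = none, 1 = pawn, ..., 5 = queen, 6 = king.
struct Move {
    int victim;   // type of the piece being captured (0 for a quiet move)
    int attacker; // type of the piece making the capture
};

// MVV-LVA: most valuable victim first, least valuable attacker as the
// tie-break. Multiplying the victim by 10 keeps it dominant.
int mvvLvaScore(const Move& m) {
    return m.victim * 10 - m.attacker;
}

// Sort captures so that, e.g., PxQ is tried before QxP.
void orderCaptures(std::vector<Move>& captures) {
    std::sort(captures.begin(), captures.end(),
              [](const Move& a, const Move& b) {
                  return mvvLvaScore(a) > mvvLvaScore(b);
              });
}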
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Return on investment of different techniques and heuristics
BubbaTough wrote: An interesting, useful, and highly controversial table. I am not sure what the final verdict was.
Same table as the one you quoted, but ordered first by RETURN/EFFORT and second by RETURN, and slightly formatted:
Code:
EFFORT RETURN RETURN/EFFORT Feature
1 10 10.00 capture ordering
1 5 5.00 check extension
4 6 1.50 static exchange evaluator
2 3 1.50 futility in qsearch
3 4 1.33 null move
4 5 1.25 transposition table
4 5 1.25 transposition table with 2-tier repl.
4 5 1.25 book learning
4 4 1.00 null move with verification
4 4 1.00 search pv first
3 3 1.00 history heuristic, killers, other ordering
3 3 1.00 position learning
3 3 1.00 king+pawn hashing w/ king safety evaluation
3 3 1.00 futility on frontier
3 3 1.00 futility on pre-frontier
2 2 1.00 iterative deepening
2 2 1.00 pawn hashing w/ complex pawn evaluation
1 1 1.00 pawn to 6th/7th extension
3 2 0.67 capture extension
3 2 0.67 razoring
5 3 0.60 mate-at-a-glance
2 1 0.50 fractional extensions
2 1 0.50 aspiration
3 1 0.33 internal iterative deepening
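Null move, the cheapest of the big search wins in this table after move ordering, is also small in code. A minimal null-move pruning sketch with the classic R=2 reduction; the search(), makeNullMove(), unmakeNullMove(), and inCheck() helpers are assumed engine plumbing, declared here but defined elsewhere.

Code:
// Assumed engine plumbing; names are illustrative only.
struct Board;
int  search(Board& board, int depth, int alpha, int beta);
void makeNullMove(Board& board);   // pass the turn without moving
void unmakeNullMove(Board& board);
bool inCheck(const Board& board);

const int R = 2; // classic null-move depth reduction

// Null-move pruning: if the side to move can skip a turn and a
// reduced-depth search still reaches beta, prune this node.
int nullMovePrune(Board& board, int depth, int alpha, int beta) {
    if (depth > R && !inCheck(board)) {   // a null move is illegal in check
        makeNullMove(board);
        int score = -search(board, depth - 1 - R, -beta, -beta + 1);
        unmakeNullMove(board);
        if (score >= beta)
            return beta;                  // fail-high: prune this node
    }
    return search(board, depth, alpha, beta); // otherwise search normally
}

The "null move with verification" line in the table roughly corresponds to re-searching at reduced depth, instead of cutting off immediately, when the null-move search fails high.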
-
- Posts: 1822
- Joined: Thu Mar 09, 2006 11:54 pm
- Location: The Netherlands
Re: Return on investment of different techniques and heuristics
A very ugly list.
With some thought I could probably do better, but it would be something like this, if we skip evaluation, all forms of tuning/testing, and repetition detection:
1. quiescence search (not trivial to invent)
2. alpha-beta (note that alpha-beta IS trivial to invent)
3. move ordering
4. hash table (not just as a transposition table, but also for move ordering)
5. null move
6. extensions
Really, things like PVS and all the other stuff are far less relevant. (A minimal sketch of items 1 and 2 follows this post.)
How to do the above reasonably well is pretty well known by now. However, the reality is that evaluation is far more important once you have implemented the above bug-free.
Suppose we injected Diep's latest evaluation into the simple search above, then grabbed Rybka's search and paired it with the best evaluation there was in the 90s, and tested them single core versus single core at the 90s time control of 3 minutes a move. Even on 90s hardware if you want.
You get 100% scores. The evaluation wins everything.
Now you'll argue that you need modern hardware. Fine: 3 minutes a move on a single core, a Core 2 at 2.4 GHz or a K8, whatever, with Diep's evaluation on a simple 90s search. Even the improvements I made to that search in the past years we can wash away. That is roughly a 10-ply search against the 20 ply of today's deep-searching engines.
If you pick the best evaluation function from the 90s and let it play against today's evaluation function from Diep, you totally hammer away any search improvement. So none of those search improvements has any value without an up-to-date evaluation function; without it, it is all 0 Elo.
It really is a 100% score for today's evaluation function against the 90s one. 20 ply or 30 ply for the 90s software, it doesn't matter: you are dead against today's eval.
I noticed this effect very clearly in a big test we did in the year 2000, after I had improved Diep's evaluation to what were then modern standards. Jan Louwman did a lot of the testing for that on his 36 computers. You can definitely say I was quite shocked by the big progress that had been made and by how world-top engines from a few years earlier got totally slaughtered.
Please note that evaluations other than Diep's are more tricky, as they rely big-time upon search for king safety. Most have next to zero king-safety knowledge; Diep obviously has huge king-safety knowledge, so a small search depth does not hurt its king safety, whereas it would matter really big-time for the Fruit 2.1+ generation of engines such as Rybka, Stockfish, etc.
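For items 1 and 2 of the list above, a minimal sketch of alpha-beta (negamax form) dropping into a capture-only quiescence search at the horizon. The move generation and evaluation plumbing is assumed and only declared; mate and stalemate handling is omitted to keep the sketch short.

Code:
#include <vector>

struct Board;                      // the engine's position type (assumed)
struct Move { int from, to; };     // illustrative move encoding

// Assumed engine plumbing, declared here but defined elsewhere:
std::vector<Move> generateMoves(Board& board);    // all moves
std::vector<Move> generateCaptures(Board& board); // captures only
int  evaluate(const Board& board); // static eval, side to move's view
void makeMove(Board& board, const Move& m);
void unmakeMove(Board& board, const Move& m);

// 1. Quiescence search: at the horizon, keep searching captures until
// the position is quiet, so evaluate() never fires mid-exchange.
int qsearch(Board& board, int alpha, int beta) {
    int standPat = evaluate(board);
    if (standPat >= beta) return beta;    // already good enough to cut
    if (standPat > alpha) alpha = standPat;
    for (const Move& m : generateCaptures(board)) {
        makeMove(board, m);
        int score = -qsearch(board, -beta, -alpha);
        unmakeMove(board, m);
        if (score >= beta) return beta;
        if (score > alpha) alpha = score;
    }
    return alpha;
}

// 2. Alpha-beta in negamax form, dropping into qsearch at depth 0.
int alphaBeta(Board& board, int depth, int alpha, int beta) {
    if (depth == 0) return qsearch(board, alpha, beta);
    for (const Move& m : generateMoves(board)) {
        makeMove(board, m);
        int score = -alphaBeta(board, depth - 1, -beta, -alpha);
        unmakeMove(board, m);
        if (score >= beta) return beta;   // beta cutoff
        if (score > alpha) alpha = score;
    }
    return alpha;
}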
-
- Posts: 10892
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Return on investment of different techniques and heuristics
I do not understand your claim of 100% scores.
Can you give exact conditions that people can test, to see if they get a 100% score?
What book do you use for both programs?
Are you talking about 10 plies against 20 plies?
I may try to test 20 plies of Strelka with a primitive evaluation against 10 plies of Rybka 3 if you are interested (I hope that 20 plies of Strelka with a primitive evaluation does not take too much time, and note that 10 plies of Rybka 3 are practically 12 plies, not 10).
Edit: Note that by primitive evaluation I do not mean Strelka 2's evaluation, but a modified version of Strelka 2 that has only the average of the opening and endgame piece-square tables, with no pawn-structure evaluation, no mobility evaluation, no king-safety evaluation, and no material-imbalance tables. (A sketch of such an evaluation follows below.)
Uri
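As a sketch of what such a primitive evaluation might look like: nothing but the average of opening and endgame piece-square tables, as described above. The board encoding and the (empty) table contents below are placeholders, not Strelka's actual code.

Code:
// Piece-square tables from white's point of view; contents are
// placeholders here. Real tables would also carry material value,
// or material would be added separately.
int pstOpening[6][64];
int pstEndgame[6][64];

// board[sq]: 0 = empty, 1..6 = white P,N,B,R,Q,K, -1..-6 = black.
// Returns centipawns from white's point of view, using only the
// average of the opening and endgame piece-square tables.
int evaluatePstOnly(const int board[64]) {
    int score = 0;
    for (int sq = 0; sq < 64; ++sq) {
        int p = board[sq];
        if (p > 0)
            score += (pstOpening[p - 1][sq] + pstEndgame[p - 1][sq]) / 2;
        else if (p < 0)
            // sq ^ 56 mirrors the square vertically for black
            score -= (pstOpening[-p - 1][sq ^ 56]
                    + pstEndgame[-p - 1][sq ^ 56]) / 2;
    }
    return score;
}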