Is search irrelevant when computing ahead of very big trees?

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
Kempelen
Posts: 620
Joined: Fri Feb 08, 2008 10:44 am
Location: Madrid - Spain

Is search irrelevant when computing ahead of very big trees?

Post by Kempelen »

Hello,

I have had a thought in my mind for some time that I want to share and ask for opinions on. It is about the eternal debate of reaching more search depth vs. knowledge. The open question is: what is more important today, depth or knowledge? I will center the debate on long (very long) time controls.

My thought is as follows: in a tree calculated with a lot of time, from an equal position where two strong engines play without obvious hanging pieces in the short or middle term, there are always nodes on every path where the position is not tactical but 'calm', so we could draw another tree, with less depth, where all the leaves are important positional points. My theory is that those positions form a kind of style-choosing set, and looking further ahead there is not futile, but maybe a waste of time. In the end, the engine will think that only one of those paths is the 'winning' path, but the truth is that more than one path is valid. In those cases an accurate eval is more important than a good search, as the extra time is covering up the search's weaknesses.
In the past, when there was not enough CPU power to complete a "tactics-free" tree, this did not apply.

I can summarize my thought in two points:
1) In a match between engines of equal strength, where one plays with, for example, 15 min/move and the other with 30 min/move, both will end up with similarly strong results. I have not tested this, but I suspect it could be true, as it is the quality of the eval function that determines which engine is stronger.
2) I have personal, subjective evidence that investing in knowledge pays off more in the long run than investing in search. With Rodin, I get more results from eval-function work than from attempted search improvements.

So the open question is: with enough time, is there a point in the search beyond which computing further could be irrelevant?

Regards
Fermin

P.S. Maybe a complicated explanation from me, but I hope you get the point.
Fermin Serrano
Author of 'Rodin' engine
http://sites.google.com/site/clonfsp/
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Is search irrelevant when computing ahead of very big tr

Post by Don »

Your question is poorly defined, but I'll take a crack at it anyway.

I think evaluation is and will continue to become more important as chess programs and hardware continue to advance. I'm not going to say which is more important because that is pretty ambiguous - they are both critically important.

In fact it's because of great searching that evaluation will become more important. In the past, programs played to survive tactics, but now positional decisions are becoming a lot more important. So programs make decisions that have very long-term consequences. This has always been the case of course, but programs didn't have much control over that while tactics dominated; now they do.

A program is not going to be beaten because it overlooked a fork that would have been seen with one more ply of searching (although that problem never goes away, it is increasingly rare). Instead, programs are being beaten because they didn't understand which of two "plausible" ways to proceed was best and chose the wrong one.

Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Uri Blass
Posts: 10267
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Is search irrelevant when computing ahead of very big tr

Post by Uri Blass »

Search is important at long time controls too.

In theory, if you search deep enough, you can avoid losing mistakes in chess, because every positional losing mistake is a tactical mistake if you search deep enough. And there are certainly mistakes that the program can avoid with 30 minutes per move that it cannot avoid with 15 minutes.

It is known that there are diminishing returns, and the difference between 30 minutes per move and 15 minutes per move is certainly smaller than the difference between 30 seconds per move and 15 seconds per move, but it is never close to 0. There is a separate thread about Houdini and the value of doubling the number of nodes;

see http://www.talkchess.com/forum/viewtopic.php?t=48733
tpetzke
Posts: 686
Joined: Thu Mar 03, 2011 4:57 pm
Location: Germany

Re: Is search irrelevant when computing ahead of very big tr

Post by tpetzke »

Hi,

I think you can get away with a good eval and an average search, but probably not the other way around.

Your eval determines which part of the tree you search. So if you manage to tune your search so that it gets one ply deeper, it does not help much if you do so in the wrong part of the tree.
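
A toy experiment to see this effect (my own sketch, not from Thomas's post; random tree, invented sizes, and move ordering driven by a deliberately noisy "static eval"):

Code: Select all

// Hedged toy (illustration only): alpha-beta on a random uniform tree where
// move ordering comes from a "static eval" of each child. A good eval visits
// the strongest subtrees first and prunes the rest; a noisy eval spends its
// nodes in the wrong parts of the tree, at the same nominal depth.
#include <algorithm>
#include <cstdio>
#include <random>
#include <utility>
#include <vector>

constexpr int B = 6, D = 6;   // branching factor and depth (arbitrary)
static std::mt19937 rng(12345);
static long long nodes = 0;

struct Node {
    int exact = 0;            // true negamax value, filled in by build()
    std::vector<Node> kids;
};

Node build(int depth) {
    Node n;
    if (depth == 0) { n.exact = (int)(rng() % 2001) - 1000; return n; }
    int best = -100000;
    for (int i = 0; i < B; ++i) {
        n.kids.push_back(build(depth - 1));
        best = std::max(best, -n.kids.back().exact);
    }
    n.exact = best;
    return n;
}

// "static eval" = true value plus uniform noise in [-amp, +amp]
int eval(const Node& n, int amp) {
    if (amp == 0) return n.exact;
    return n.exact + (int)(rng() % (2 * amp + 1)) - amp;
}

int alphabeta(const Node& n, int alpha, int beta, int amp) {
    ++nodes;
    if (n.kids.empty()) return n.exact;
    // sort children best-first for the mover, according to the noisy eval
    std::vector<std::pair<int, const Node*>> order;
    for (const Node& k : n.kids) order.push_back({-eval(k, amp), &k});
    std::sort(order.begin(), order.end(),
              [](const auto& a, const auto& b) { return a.first > b.first; });
    int best = -100000;
    for (const auto& e : order) {
        best = std::max(best, -alphabeta(*e.second, -beta, -alpha, amp));
        alpha = std::max(alpha, best);
        if (alpha >= beta) break;   // cutoff: the rest is never searched
    }
    return best;
}

int main() {
    Node root = build(D);
    for (int amp : {0, 50, 2000}) { // near-perfect, decent, awful eval
        nodes = 0;
        int v = alphabeta(root, -100000, 100000, amp);
        std::printf("noise=%4d  root value=%5d  nodes=%lld\n", amp, v, nodes);
    }
    return 0;
}

The root value comes out the same in all three runs; only the node count changes, which is exactly the cost of searching the wrong part of the tree first.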

For a strong chess program both need to be very good of course.

Thomas...
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Is search irrelevant when computing ahead of very big tr

Post by Don »

Uri Blass wrote:I totally disagree that search is not important at long time control.
Where did you read that anyone said it was not important? I don't see that anywhere. In fact, by definition, longer time controls mean more searching.

This is why I said the question is poorly defined.

In theory, if you search deep enough, you can avoid losing mistakes in chess, because every positional losing mistake is a tactical mistake if you search deep enough. And there are certainly mistakes that the program can avoid with 30 minutes per move that it cannot avoid with 15 minutes.
What you are saying is correct, but it does not address the question. The reality is that programs with better evaluation scale better. So what Fermin is really asking, even if he doesn't realize it, is what the relationship is between evaluation, search, and scaling - all of them have an impact on the performance you can expect with ever more CPU power.

We can do a thought experiment. Suppose you take a program such as Houdini or Komodo and find a way to improve the program by 100 ELO at a specific time control on a specific machine. But you do this with evaluation work only, without any impact on the nodes per second and without changing the search.

And then you are able to produce another version that keeps the old evaluation but somehow increases the nodes per second drastically, enough to give exactly the same 100 ELO improvement. So now you have two programs that play at exactly the same strength at this time control and on this hardware. Which program is better? I argue that the program with the new evaluation is intrinsically better because it will actually scale better. In 10 years, with computers that are 5x faster, the program with the better evaluation will be superior.
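
To make the thought experiment concrete, here is a toy model (numbers invented for illustration, nothing measured from Komodo or Houdini): assume each successive doubling of thinking time is worth fewer ELO than the last. The speed-doubled version is then always exactly one doubling ahead of the base program, so its edge shrinks as the per-doubling gain shrinks, while by stipulation the eval version keeps its 100 ELO.

Code: Select all

// Toy model only: the 0.85 decay and 100 ELO base are invented numbers.
// gain(k) = value in ELO of the k-th doubling of thinking time.
// The 2x-nps version is always "one doubling ahead", so its edge at
// level k is gain(k); the eval version's +100 is assumed to persist.
#include <cmath>
#include <cstdio>

int main() {
    for (int k = 0; k <= 8; ++k) {
        double speed_edge = 100.0 * std::pow(0.85, k); // shrinks with level
        double eval_edge  = 100.0;                     // holds by assumption
        std::printf("doublings=%d  speed version +%5.1f  eval version +%5.1f\n",
                    k, speed_edge, eval_edge);
    }
    return 0;
}

At k = 0 both versions are +100, as stipulated; as the time control grows, the speed version's edge decays toward zero while the eval version keeps its lead. That is the sense in which the eval version "scales better".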

For anyone here who understands Big O notation, I think of computer chess exactly the same way except that I don't think of performance in terms of speed but in terms of ELO.

Here is how it works:

In Big O notation you are concerned with the scaling characteristics of a given algorithm. For example, a sorting algorithm such as quicksort is considered better than insertion sort. Calling it better is meant only in a theoretical sense. If you sort 100 records using insertion sort implemented in C, it will be orders of magnitude faster than sorting the same 100 records using quicksort in Ruby or Tcl. But Big O doesn't care about the exact implementation; we still say quicksort is superior in an abstract sense. In fact, quicksort in Tcl will probably still outperform insertion sort given a huge number of records.
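
A small comparison-counting sketch of that point (my illustration, with invented sizes):

Code: Select all

// Hedged illustration: count comparisons for insertion sort, O(n^2), vs.
// std::sort, O(n log n). Even if each std::sort comparison were many times
// more expensive (the "Tcl vs C" handicap), the growing gap in counts
// eventually dominates.
#include <algorithm>
#include <cstdio>
#include <random>
#include <vector>

static long long cmps = 0;
static bool less_cnt(int a, int b) { ++cmps; return a < b; }

static void insertion_sort(std::vector<int> v) { // sorts a copy
    for (size_t i = 1; i < v.size(); ++i)
        for (size_t j = i; j > 0 && less_cnt(v[j], v[j - 1]); --j)
            std::swap(v[j], v[j - 1]);
}

int main() {
    std::mt19937 rng(42);
    for (int n : {100, 1000, 10000}) {
        std::vector<int> v(n);
        for (int& x : v) x = (int)(rng() % 1000000);
        cmps = 0; insertion_sort(v);
        long long ins = cmps;
        cmps = 0; std::sort(v.begin(), v.end(), less_cnt);
        std::printf("n=%5d  insertion=%10lld  std::sort=%9lld\n", n, ins, cmps);
    }
    return 0;
}

At n = 100 a big constant-factor handicap could still leave insertion sort ahead; by n = 10000 it cannot.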

In computer chess we use algorithms to produce strong play, and no two programs are the same. Let's say I could find a way to optimize Komodo in such a way as to double the nodes per second without any other changes. Komodo would suddenly be hailed as the strongest program in the world. Although I would be ecstatic with such a change, I take the viewpoint that I did not change the "Chess Big O" characteristics of Komodo at all. You could get exactly the same performance by running Komodo on a computer twice as fast, or by running it twice as long.

To me, the program that scales the best IS the best - at least in some theoretical sense because on a fast enough computer it would PLAY the best. It means it has the best chess playing algorithm.

Anyway, that is how I think about chess programming. Of course I am always interested in boosting the general performance too. Whether you are using bubble sort or quicksort you can still do many optimizations, such as writing it in C and using the best compilers, and such details can have a nice impact on performance that is useful to you. I do the same with Komodo because I know a few percent speedup gives me an ELO boost. However, such things have no impact on the basic characteristics of the chess-playing algorithm, so their potential usefulness is bounded.

It is known that there are diminishing returns, and the difference between 30 minutes per move and 15 minutes per move is certainly smaller than the difference between 30 seconds per move and 15 seconds per move, but it is never close to 0. There is a separate thread about Houdini and the value of doubling the number of nodes.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Is search irrelevant when computing ahead of very big tr

Post by jdart »

is there a point in the search beyond which computing further could be irrelevant
There have been some studies on this - basically looking at how many PV changes you get as the search depth continues to increase. And IIRC there continue to be a lot of significant changes even at high search depths. Chess is very complex. At least in the middlegame, the search does not settle down in all cases, but instead continues to bring up new moves that involve deep plans or tactics.
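
One informal way to repeat that kind of measurement (a sketch of mine, assuming any UCI engine; this is not the methodology of the studies I mentioned): pipe the engine's analysis output through a filter that counts how often the first move of the PV changes.

Code: Select all

// Reads UCI "info ... depth N ... pv <move> ..." lines from stdin and
// counts how often the first PV move differs from the previous PV update.
// Usage sketch: run the engine, send "position ..." and "go depth 30",
// and pipe its output into this program.
#include <iostream>
#include <sstream>
#include <string>

int main() {
    std::string line, prev;
    int changes = 0, updates = 0;
    while (std::getline(std::cin, line)) {
        std::istringstream in(line);
        std::string tok, move;
        int depth = -1;
        while (in >> tok) {
            if (tok == "depth") in >> depth;
            else if (tok == "pv") { in >> move; break; } // first PV move
        }
        if (depth < 0 || move.empty()) continue;
        ++updates;
        if (!prev.empty() && move != prev) ++changes;
        prev = move;
    }
    std::cout << changes << " best-move changes over "
              << updates << " PV updates\n";
    return 0;
}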

That said, I think you are right that eval changes can bring more benefit than search changes. But that is a different issue.

--Jon
Uri Blass
Posts: 10267
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Is search irrelevant when computing ahead of very big tr

Post by Uri Blass »

Don,
1) I disagree that evaluation changes always scale better; it depends on the specific evaluation change.

There are evaluation changes that help only at blitz, and I can imagine that some knowledge about winning certain endgames (assuming that the program knows that the position is winning) may be productive only at blitz, because at a long time control you can find the win by search.

2) A search change is not equivalent to a constant speed improvement, and there may be search changes that scale well.

Better move ordering may be equivalent to more than a constant speed improvement.
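
To put a rough number on that, a sketch of mine using the classic Knuth/Moore minimal-tree result (the branching factor below is an assumption): with perfect move ordering, alpha-beta searches about b^ceil(d/2) + b^floor(d/2) - 1 leaves instead of b^d, so the saving factor grows with depth rather than staying constant.

Code: Select all

// Compares the full minimax leaf count b^d with the minimal alpha-beta
// tree b^ceil(d/2) + b^floor(d/2) - 1 (Knuth & Moore). The ratio grows
// with depth, i.e. perfect ordering is worth more than any fixed speedup.
#include <cmath>
#include <cstdio>

int main() {
    const double b = 35.0; // rough branching factor for chess (assumed)
    for (int d = 2; d <= 12; d += 2) {
        double full    = std::pow(b, d);
        double minimal = std::pow(b, (d + 1) / 2) + std::pow(b, d / 2) - 1.0;
        std::printf("d=%2d  full=%9.3g  minimal=%9.3g  ratio=%9.3g\n",
                    d, full, minimal, full / minimal);
    }
    return 0;
}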
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Is search irrelevant when computing ahead of very big tr

Post by Don »

Uri Blass wrote:Don,
1) I disagree that evaluation changes always scale better; it depends on the specific evaluation change.

There are evaluation changes that help only at blitz, and I can imagine that some knowledge about winning certain endgames (assuming that the program knows that the position is winning) may be productive only at blitz, because at a long time control you can find the win by search.

2) A search change is not equivalent to a constant speed improvement, and there may be search changes that scale well.

Better move ordering may be equivalent to more than a constant speed improvement.
We have some evidence that the best evaluation function is somewhat dependent on the level, just as you say.

But I have done many experiments that prove that evaluation scales. Turn off a bunch of evaluation terms and run matches at various time controls with and without them, and you will see a very clear trend: the percentage of time you need to equalize will grow with the level.

Don't just have an opinion or say you disagree, try it for yourself.

Berliner also did a classic experiment showing exactly the same thing. I personally believe his experiment was flawed, but he did come to the same conclusion. The flaw was that his sample count was way too low, but it did show this trend. It was the Hitech/Lotech experiments.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Uri Blass
Posts: 10267
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Is search irrelevant when computing ahead of very big tr

Post by Uri Blass »

If you turn off part of the evaluation that is relevant for the whole game then I agree, but it may be different when adding some complicated evaluation that is about winning a won endgame.

I can imagine that removing knowledge of how to win won endgames can scale well, because at a fast time control removing the knowledge may prevent the program from winning a won endgame, while at a slow time control there is no damage because the program wins the endgame by search after it gets it.

For example, it may be possible that, without tablebases, specific knowledge of how to win endgames like KBN vs K is productive at a fast time control, while this knowledge does not help at a slow time control because the program can find the win by search.

A more general case is simply replacing every evaluation that is bigger than n pawns, and is not a mate score, by n pawns, for some big n.

You can describe it as removing knowledge, and it may cause big problems at a fast time control because the program will not be able to win some won endgames, but cause no problem at a slow time control because the program is going to find the mate by search.
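
A minimal sketch of that clamping idea (illustrative only; the centipawn units, the bound, and the names are my assumptions):

Code: Select all

// Cap every non-mate score at +/- CAP so that all "clearly winning"
// positions look the same to the search; mate scores pass through.
#include <algorithm>

constexpr int MATE_BOUND = 30000; // assumed threshold for mate scores
constexpr int CAP = 800;          // n = 8 pawns, in centipawns (assumed)

int clamp_eval(int score) {
    if (score >= MATE_BOUND || score <= -MATE_BOUND)
        return score;                        // leave mate scores alone
    return std::clamp(score, -CAP, CAP);     // flatten big advantages
}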

Uri
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Is search irrelevant when computing ahead of very big tr

Post by Don »

Uri Blass wrote:If you turn off part of the evaluation that is relevant for the whole game then I agree, but it may be different when adding some complicated evaluation that is about winning a won endgame.

I can imagine that removing knowledge of how to win won endgames can scale well, because at a fast time control removing the knowledge may prevent the program from winning a won endgame, while at a slow time control there is no damage because the program wins the endgame by search after it gets it.
Intuition is misleading, and you are basing all of this on your intuition.

In fact I have the opposite intuition: that if you don't know which ending is a win, a loss or a draw, it is too late to figure it out once you are in it. So my guess here is that knowledge of endings, and whether they are won or drawn, is more important for deep searchers.

However, my intuition may be wrong too. What matters is reality and the actual studies done on this and it's pretty clear that knowledge is scalable as a general rule.

I do think that certain knowledge may not be scalable and may work especially well on short searches.

For example, it may be possible that, without tablebases, specific knowledge of how to win endgames like KBN vs K is productive at a fast time control, while this knowledge does not help at a slow time control because the program can find the win by search.
The kind of knowledge you are talking about here is not particularly relevant; the most important knowledge is about evaluating a position correctly, so that when you make decisions you go into the right sort of position - the kind that makes the difference between a win, a draw and a loss. Knowing how to win a won game is useful for very shallow searches, but this is not the kind of knowledge that is scalable. It's technique knowledge, not deep knowledge.

Komodo actually has specific knowledge of how to win this ending. Not because it really helps, but because all my programs have had it; it's almost like something inherited. As you say, Komodo can win this without the knowledge, but at a hyper-blitz level it is possible that it will make a difference. But that is not scalable knowledge.

A more general case is simply replacing every evaluation that is bigger than n pawns, and is not a mate score, by n pawns, for some big n.

You can describe it as removing knowledge, and it may cause big problems at a fast time control because the program will not be able to win some won endgames, but cause no problem at a slow time control because the program is going to find the mate by search.

Uri
I have related this story on the forum before so I will keep this short - but it did remind me of something I already know.

In the early days of Doch I was using Glaurung as a sparring partner and trying to "catch" it. Doch was getting stronger every day, and at one point I announced to Larry that we had finally caught and passed Glaurung. I was basing this on the hyper-blitz time control tests we were doing, such as 3 seconds per game. I quickly discovered that Glaurung still owned us when testing at more realistic time controls, and I even did a graph which showed that Doch was stronger at fast time controls and weaker at longer ones. The lines were not parallel at all; in fact the difference was extreme once you got to 1 minute - something like 150 ELO or more. It was ridiculous.

Larry and I wasted a lot of time on this, I think it was many weeks, trying to determine what the scalability problem was. Clearly there was something wrong with Doch, and we tried every kind of test and modification we could imagine. Were we too selective? Not selective enough? Was it futility pruning or our forward pruning tricks? Was null-move pruning broken?

Eventually, I got frustrated at the lack of progress and told Larry we needed to move on for a while and come back to this later. So we began working on the low-hanging fruit, the weakest part of Doch at the time: king safety. Doch responded enormously to our fixing this problem and was suddenly much stronger - but to our mutual surprise the scalability issue completely went away! We were now beating this version of Glaurung at every level!

Even though I already knew this, it was a reminder of how scalability works. No matter what depth you search, you are going to encounter end nodes that you do not understand. You MUST be able to tell which is better, or else adding even 20 ply to the search may not matter. However, knowing that some configuration of KBN vs K is better than another is not going to make the slightest bit of difference unless you are already in that ending. At best it will tell the program how to get into a version of it that wins a little quicker - not really useful for ELO points. But your king can be in danger for 10, 20 or more moves, and if your program doesn't understand that, it will make mistakes constantly, game-losing mistakes that 10 extra ply won't be able to rectify. THAT is why Doch was not scaling: the extra depth helped Doch tactically and in many other ways, but it was still vulnerable to anything related to king safety - it was as if the extra depth was close to meaningless.

And that is what you will get with a lack of knowledge: big holes in the program that you cannot brute-force your way out of. Each extra ply of search will cause your program to fall a little further behind a program that understands these things even at end nodes.

I think you are guilty of thinking about the root position and what a program can do - can it win this ending, can it do this, can it do that. But what really matters is what it understands at END node evaluation time. If it understands something in the evaluation then every extra ply of search makes it stronger - it can see that thing 1 ply deeper. But if it doesn't understand it then each additional ply has no impact - at least until it is looking so deeply that it can solve it tactically. Sometimes you can survive a king safety attack for 20 or 30 ply before finally losing material or getting mated - so not having king safety makes your program 20 or 30 ply weaker in many positions.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.