Singular Extensions

Don · Post by **Don** » Mon Aug 02, 2010 11:38 pm

bob wrote:
Don wrote:I forgot to mention that these games are hyper blitz speed. I'm running 4 simultaneous games on a 2 core laptop at 60 seconds + 1 second Fischer time control.

And here is an update:
  RANK      ELO     +/-     Tme/Gme  Tot Gms  PLAYER
-------  -------  -----  ----------  -------  ----------------
     1    3000.0   84.6     106.010       68  komodo 1.2
     2    2974.6   84.6     105.078       68  komodo 1.2-noSing
I don't consider 1+1 to be "hyper-blitz". It is actually a pretty even-paced game if I play this as a human using a clock that supports this, such as on ICC. For hyper-blitz I think of things like 10 secs or less on the clock, 0.1 secs or less for an inc...

Is there an official definition of bullet and hyper-bullet?

I set my tester for 4 CPUs even though I'm running on a 2-core system, so this is really more like 20 seconds + 0.3 Fischer on a slow computer. And the laptop is significantly slower than my other machines, so it might be equivalent to something like 10 seconds running on my 980x. It's all relative to the hardware I suppose - in 5 or 10 years, 5-minute chess on today's hardware might be the equivalent of hyper bullet.

We have often used 6 + 0.1 second Fischer time controls - my tester supports time controls that are pretty fast, where I can even play many games per second. But at some point I worry about the resolution of the timers and the I/O overhead and noise. UCI only supports millisecond resolution so it goes fuzzy if you go too fast.

Daniel Shawul · Post by **Daniel Shawul** » Mon Aug 02, 2010 11:46 pm

It was a governmental cover up.

Seriously?
Everyone claimed at least +40 elo in their tests with that form of SE. I bet they used a time control much smaller than 5 + 5 and probably a flawed test,
as you are currently doing yourself. Self test? Divide what you are getting right now (+24 from 300 games) by 2, and you are right there with stockfish's +12 elo.
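
To put "+24 from 300 games" in perspective, here is a minimal sketch of the usual Elo-difference estimate with a normal-approximation error margin. The win/draw/loss counts are hypothetical (the thread does not give them); they were chosen only so that the estimate lands near +24 over 300 games. The function names are likewise just for illustration.

```python
import math

def elo_from_score(score):
    """Convert an expected score fraction (0 < score < 1) to an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high) using a normal approximation on the mean score."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))

# Hypothetical 300-game match, NOT data from this thread.
elo, low, high = elo_with_margin(wins=110, draws=100, losses=90)
print(f"{elo:+.1f} Elo, 95% interval [{low:+.1f}, {high:+.1f}]")
```

With these made-up counts the 95% interval extends below zero, which is the sense in which a few hundred games cannot separate +24 from +12, or from no gain at all.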

Don · Post by **Don** » Mon Aug 02, 2010 11:49 pm

Daniel Shawul wrote:I won't be surprised if it doesn't get any better, or even gets worse, for longer tcs. The result could be simply explained by the fact that depthleft = 8 is a bit too high and, with that constrained form of SE, the chances of extensions are too rare even for an engine averaging 21+ in the middle game. Or it could be just due to luck; also, the improvement is not that significant compared to what has been claimed. CEGT tests at 40/4 and 40/20. So if we _assume_ it improves a bit, then it will most probably fall in the 20-30 elo range max at 40/20.

What do you think is the reason for believing SE would give more improvement with longer tc? Has the test been done before? No. Just a hunch.
Also, why would I believe it when extensions have always been worse with larger depth?

For me it's just a hunch and it could be wrong. But I do have some logic behind my belief. I believe that what you always want to do for scalability is the obvious: throw out more bad moves and include more good moves. An intelligent extension does try to do that. Even though it's an extension, you can recast an extension as a reduction of everything else.

It is likely the case that extensions like check help at shallow depths but not so much at deeper levels because they are not particularly intelligent. A lot of checks are stupid moves so the principle of throwing out the bad is violated.

But SE is dynamic, it's based on a "proven" good move and it adjusts itself with depth - so even when it's wrong it "recovers" and dynamically readjusts itself.

That's why I think it's probably scalable. I don't know if my reasoning is completely correct, but at least my "hunch" is based on something more than just superstition.

This is one of those things that sometimes does not act or behave the way you expect it to, so I'm not married to this belief.
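
The remark that an extension can be recast as a reduction of everything else can be checked with a few lines of arithmetic. This is only an illustration under simple assumptions (a one-ply extension, a one-ply reduction, made-up move names); it is not taken from any engine.

```python
def depths_with_extension(moves, depth, extended):
    """Remaining depth per move at nominal `depth`, extending one move by a ply."""
    return {m: depth - 1 + (1 if m == extended else 0) for m in moves}

def depths_with_reduction(moves, depth, kept):
    """Remaining depth per move at nominal `depth`, reducing all but one move by a ply."""
    return {m: depth - 1 - (0 if m == kept else 1) for m in moves}

moves = ["Qxf7+", "Nc3", "h3"]  # hypothetical move list

# Extending one move at nominal depth 10 allocates the same depths as
# reducing every other move at nominal depth 11.
assert depths_with_extension(moves, 10, "Qxf7+") == depths_with_reduction(moves, 11, "Qxf7+")
print(depths_with_extension(moves, 10, "Qxf7+"))
```

The two schemes differ only in what the iteration counter reports, which is why the question of which moves get the extra ply matters more than the label.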

Uri Blass · Post by **Uri Blass** » Mon Aug 02, 2010 11:53 pm

Daniel Shawul wrote:I won't be surprised if it doesn't get any better, or even gets worse, for longer tcs. The result could be simply explained by the fact that depthleft = 8 is a bit too high and, with that constrained form of SE, the chances of extensions are too rare even for an engine averaging 21+ in the middle game. Or it could be just due to luck; also, the improvement is not that significant compared to what has been claimed. CEGT tests at 40/4 and 40/20. So if we _assume_ it improves a bit, then it will most probably fall in the 20-30 elo range max at 40/20.

What do you think is the reason for believing SE would give more improvement with longer tc? Has the test been done before? No. Just a hunch.
Also, why would I believe it when extensions have always been worse with larger depth?

It seems that with stockfish 10+10 is better for SE than 5+5 so I do not believe in the rule that extensions are always worse with larger depth.

Uri

Ralph Stoesser · Post by **Ralph Stoesser** » Tue Aug 03, 2010 12:23 am

Daniel Shawul wrote: What do you think is the reason for believing SE would give more improvement with longer tc? Has the test been done before? No. Just a hunch.
Also, why would I believe it when extensions have always been worse with larger depth?

Something like a check extension is a stupid thing compared to SE. SE accuracy scales with increasing depth. SE is very much like how experienced human chess players calculate. If it seems you have only one playable move in a position, extend this line as soon as possible to get more information about the issue. This singularity could be game deciding, in one direction or the other. Ok, humans have no random SE moves from TT;), but if even the "defective" ttSE makes the program stronger, then it seems SE is a very potent extension.

I'm not a "believer". I'm interested in the issue, because I want to know if it's worth spending much time to try to make ttSE work better. In that respect, to me, this test was not complete. The 5+5 result was an urban myth debunking point for Bob, but the 10+10 result was somehow swept under the carpet.
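
For readers who have not seen the technique, here is a minimal sketch of the singularity test being argued about: take the move already believed best, search all the alternatives at reduced depth with a null window set a margin below its score, and call the move singular if every alternative fails low. Everything here is an assumption made for illustration: the toy `Node` tree, the function names, the half-depth verification search, and the defaults `margin=80` (one reading of the "0.8 pawn" figure mentioned in this thread) and `min_depth=8` (the depthleft = 8 mentioned in this thread). It is not Komodo's, Stockfish's or Crafty's code.

```python
from dataclasses import dataclass, field

INF = 10**9

@dataclass
class Node:
    static_eval: int                      # centipawns, from the side to move's view
    children: list = field(default_factory=list)

def search(node, depth, alpha, beta, skip=None):
    """Fail-soft negamax over the toy tree; `skip` excludes one child index."""
    if depth <= 0 or not node.children:
        return node.static_eval
    best = -INF
    for i, child in enumerate(node.children):
        if i == skip:
            continue
        score = -search(child, depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:                 # cutoff: some alternative is good enough
            break
    return best

def is_singular(node, best_idx, best_score, depth, margin=80, min_depth=8):
    """True if no other move comes within `margin` of the best move's score."""
    if depth < min_depth:
        return False
    threshold = best_score - margin
    # Null-window search of the alternatives at reduced depth (assumed: half).
    score = search(node, depth // 2, threshold - 1, threshold, skip=best_idx)
    return score < threshold

# Root scores for the three moves are 150, 20, -10 and 150, 100, -10.
lonely = Node(0, [Node(-150), Node(-20), Node(10)])
crowded = Node(0, [Node(-150), Node(-100), Node(10)])
print(is_singular(lonely, 0, 150, depth=2, min_depth=2))   # True
print(is_singular(crowded, 0, 150, depth=2, min_depth=2))  # False
```

A search would then add a ply to the singular move only. Where the test is applied (PV nodes only, or every node with a fail high) and how large the margin is are exactly the knobs the posters disagree about.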

Don · Post by **Don** » Tue Aug 03, 2010 12:24 am

Daniel Shawul wrote:

It was a governmental cover up.
Seriously?
Everyone claimed at least +40 elo in their tests with that form of SE. I bet they used a time control much smaller than 5 + 5 and probably a flawed test,
as you are currently doing yourself. Self test? Divide what you are getting right now (+24 from 300 games) by 2, and you are right there with stockfish's +12 elo.

The best way to contribute is to run some tests of your own.

bob · Post by **bob** » Tue Aug 03, 2010 12:48 am

Ralph Stoesser wrote:
Daniel Shawul wrote: 5 + 5 gives enough depth so why ask for more ??

Because 10+10 was measured (comparatively much) stronger, with an increasing tendency? Don't you believe in holy cluster test results??

But suddenly the test was stopped ... surprise, surprise.

Where is this "much" stronger coming from? I got roughly +5 at one time control, +17 at another. That is not "much stronger".

bob · Post by **bob** » Tue Aug 03, 2010 12:49 am

Uri Blass wrote:
bob wrote:
Don wrote:
Daniel Shawul wrote:Duh, I am _not_ comparing 1.7.1 and 1.8.
The point is that Stockfish 1.7 and 1.8 both have SE, and yet their blitz and long time control ratings remain the same. If it gave it a push we should see its benefits there too, no?
No. If SF gets the same rating at both short and long time controls, why is it you think you can pick out one thing (such as SE) and claim that this is proof that SE does not help or hurt it at long time controls?
I go with Daniel here. Most of those programs in the lists are not SE-based. If SE "picks up Elo" as the depth increases should it not widen the gap between itself and other programs below it that won't pick up that same boost since they don't have SE?

It could be (and almost certainly is the case) that some things in SF scale better than others. They have the same trouble everyone else does, it's very difficult to get a lot of games in at long time controls.

So some of the things in SF probably help the program even more at longer time controls and some things help less, or even hurt it at longer time controls.

The fact that it does not get weaker or stronger at long time controls means that on average they balance out. It doesn't mean you can pick a feature at random and say this proves that feature does not help or hurt at long time controls.
No, but if you have features that do worse and offset any potential gain from something else, just because you go deeper, should not those kinds of features be removed a.s.a.p.? Why have something that does _worse_ as hardware improves?
The reason to have something that does worse as hardware improves is that it simply does better than previous versions at all time controls.

I believe that a simple speed improvement does worse as hardware improves because of small diminishing returns.

Being 10 times faster may give you 220 elo at 5+5 time control and only 200 elo at 10+10 time control, and there may be improvements that are equivalent to speed improvements.

This is simply not possible. If it does better at _all_ time controls, then by definition, it will do better at faster hardware, because increased hardware speed tomorrow is the same as a longer time control today.
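
The exchange above can be made concrete with a toy rating curve. The sketch below assumes, purely for illustration, that each doubling of thinking time adds a fixed fraction of the previous doubling's gain; the curve, its parameters and the function names are hypothetical and are not fitted to any data in this thread. It treats a k-times faster machine as a k-times longer time control, as the post above argues.

```python
import math

def rating(seconds, base=2000.0, first_doubling=70.0, decay=0.95):
    """Hypothetical rating after `seconds` of thinking time per game.

    Each doubling of time is assumed to add `decay` times the gain of the
    previous doubling (a geometric series evaluated at log2(seconds)).
    """
    doublings = math.log2(seconds)
    return base + first_doubling * (1.0 - decay ** doublings) / (1.0 - decay)

def speedup_gain(seconds, factor):
    """Elo gained from a `factor`-times speedup, modelled as `factor` times more time."""
    return rating(seconds * factor) - rating(seconds)

gain_5min = speedup_gain(300, 10)    # 10x faster at a 5-minute base
gain_10min = speedup_gain(600, 10)   # 10x faster at a 10-minute base
print(f"{gain_5min:.1f} vs {gain_10min:.1f}")
assert gain_5min > gain_10min > 0
```

Under this assumed curve the gain from the same speedup shrinks at the longer time control but stays positive, so "worth less as hardware improves" and "better at all time controls" are not contradictory by themselves.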

Daniel Shawul · Post by **Daniel Shawul** » Tue Aug 03, 2010 12:56 am

The best way to contribute is to run some tests of your own.

So now you are directing the blame elsewhere when a flaw in your test is pointed out. What? You did 360 or so games and call it a contribution, while you assumed I did nothing.
I did many games with it in various forms. I have already posted here many times that I didn't get anything out of it in any of those forms. I did more than 9000 or so games at 40/30.

So I am relying on facts, and you are relying on a hunch. For starters, Scorpio, Crafty and Spike got nothing out of it so far.
Send Doch with an SE option that I can enable/disable so I can do the test down to 1 elo point at 1+1 (same as what you are doing) against many opponents. Send it and we will see what it is made of.

I actually find it _really_ hard to believe that your form of SE, PV nodes only with 0.8 pawn, gives you results better than stockfish's. At least SF does it at every node where there is a fail high. That is vastly more nodes than what you try it on.

Re: Singular Extensions - long games
