Singular Extensions

Don · Post by **Don** » Mon Aug 02, 2010 9:40 pm

Daniel Shawul wrote:For Doch probably without SE (that is first version)

40/4
208 	Doch 09.980 x64 1CPU 	2874 	17 	17 	1000 	47.2% 	2894 	39.6%
40/40
197 	Doch 09.980 x64 	2887 	16 	16 	1064 	53.9% 	2859 	39.8%
Doch is even more amazing at this, in fact. So your previous argument doesn't hold water if this version doesn't have SE.

BTW, I checked that both lists are calculated from the same database.

Let me see if I understand your point. Doch gets a higher rating with a longer time control and Komodo doesn't. Since one of the many things I added to Komodo was SE, then it MUST be SE causing the difference?

How can you be sure that our much more aggressive LMR did not cause this result? We have long suspected that overdoing LMR can cause scalability issues, so what experiments did you do to prove this was not the case and that it must be SE?

Uri Blass · Post by **Uri Blass** » Mon Aug 02, 2010 9:54 pm

Daniel Shawul wrote:Latest version I can get in both lists without SE.

40/4
96 	Stockfish 1.5.1 JA w32 2CPU 	2973 	14 	14 	1600 	60.1% 	2902 	33.3%
40/20
71 	Stockfish 1.5.1 x64 2CPU 	2980 	17 	17 	904 	53.4% 	2956 	42.5%
So with SE or without SE, the trend remains the same... Blitz and long tc ratings are more or less the same. So what is it now ? The SE is gone ...

You compare 64 bits with 32 bits

1.5.1 has a higher rating at blitz, and the same is true for most Stockfish versions,
and it proves nothing.

Note that if SE gives an additional 10 Elo for each doubling of the time control (possible, based on Bob's result), then the difference between 40/4 and 40/20 is only about 23 Elo. I think that other changes may help more at blitz and less at long time controls (I believe that changes that are equivalent to a speed improvement help more at blitz).
Uri
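The 23 Elo figure above follows from counting doublings between the two time controls. A minimal sketch, assuming 40/4 and 40/20 mean 40 moves in 4 and 20 minutes, and assuming a constant gain of 10 Elo per doubling (the value described above only as possible); the function name is made up for illustration:

```python
import math


def extra_elo(short_minutes: float, long_minutes: float,
              elo_per_doubling: float) -> float:
    """Elo gained going from the short to the long time control,
    assuming a fixed gain for every doubling of thinking time."""
    doublings = math.log2(long_minutes / short_minutes)
    return elo_per_doubling * doublings


# 20 / 4 = 5x the time, i.e. log2(5) ~ 2.32 doublings
print(round(extra_elo(4, 20, 10), 1))  # 23.2
```

The same function gives 10 Elo for 5+5 versus 10+10, since that is exactly one doubling under the same assumption.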

Daniel Shawul · Post by **Daniel Shawul** » Mon Aug 02, 2010 10:00 pm

No. Neither of them gets significantly better or worse. The error bars are too big for that. I quote the results here again.

40/4 Komodo 1.2 x64 1CPU 3007 15 15 1198 54.8% 2974 41.7%
40/40 Komodo 1.2 x64 3004 17 17 898 48.7% 3013 46.8%

40/4 208 Doch 09.980 x64 1CPU 2874 17 17 1000 47.2% 2894 39.6%
40/40 197 Doch 09.980 x64 2887 16 16 1064 53.9% 2859 39.8%
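A rough way to check the "error bar is too big" point against the four rows above. This is only a sketch: it assumes the two numbers after each rating are symmetric +/- margins at the same confidence level, and that the blitz and long lists can be treated as independent, which is not strictly true for lists with different opponent pools. The function name is hypothetical.

```python
import math


def compare_ratings(blitz: float, blitz_err: float,
                    long_tc: float, long_err: float):
    """Return (difference, combined margin, significant?) for two ratings.

    The combined margin adds the two error bars in quadrature,
    which assumes the two estimates are independent.
    """
    diff = long_tc - blitz
    margin = math.hypot(blitz_err, long_err)
    return diff, margin, abs(diff) > margin


for name, row in {
    "Komodo 1.2": (3007, 15, 3004, 17),
    "Doch 09.980": (2874, 17, 2887, 16),
}.items():
    diff, margin, significant = compare_ratings(*row)
    print(f"{name}: diff={diff:+d}, margin=+/-{margin:.1f}, "
          f"significant={significant}")
```

Under these assumptions both differences (-3 and +13) fall inside a combined margin of roughly 23 Elo.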

So how did the addition of SE alter the situation? The same goes for Stockfish. If the behaviour of the engine at different time controls is the same with or without SE, how can SE be responsible?

Also, why are long time control tests being requested when no one ever tested at such time controls? The best I can do to disprove such claims is to use rating lists which conduct both kinds of tests.
Do we really have any reason to extend the TC other than one's belief? 5+5 gives enough depth, so why ask for more?

Don · Post by **Don** » Mon Aug 02, 2010 10:02 pm

Uri Blass wrote:
Daniel Shawul wrote:Latest version I can get in both lists without SE.

40/4
96 	Stockfish 1.5.1 JA w32 2CPU 	2973 	14 	14 	1600 	60.1% 	2902 	33.3%
40/20
71 	Stockfish 1.5.1 x64 2CPU 	2980 	17 	17 	904 	53.4% 	2956 	42.5%
So with SE or without SE, the trend remains the same... Blitz and long tc ratings are more or less the same. So what is it now ? The SE is gone ...
You compare 64 bits with 32 bits

1.5.1 has a higher rating at blitz, and the same is true for most Stockfish versions,
and it proves nothing.

Note that if SE gives an additional 10 Elo for each doubling of the time control (possible, based on Bob's result), then the difference between 40/4 and 40/20 is only about 23 Elo. I think that other changes may help more at blitz and less at long time controls (I believe that changes that are equivalent to a speed improvement help more at blitz).
Uri

I think it's especially difficult to maintain a given Elo superiority at high levels because of the draw factor. Chess is gradually becoming more drawish as programs get stronger and stronger.

Daniel Shawul · Post by **Daniel Shawul** » Mon Aug 02, 2010 10:06 pm

You compare 64 bits with 32 bits

I saw that but it is the best I can get by looking at the page ....

1.5.1 has a higher rating at blitz, and the same is true for most Stockfish versions,
and it proves nothing.

Note that if SE gives an additional 10 Elo for each doubling of the time control (possible, based on Bob's result), then the difference between 40/4 and 40/20 is only about 23 Elo. I think that other changes may help more at blitz and less at long time controls (I believe that changes that are equivalent to a speed improvement help more at blitz).
Uri

The question is really simple.
If an engine performs more or less the same with or without SE, then what is the effect of SE? Not on the overall strength of the engine, but on how its performance changes at different time controls.

Uri Blass · Post by **Uri Blass** » Mon Aug 02, 2010 10:13 pm

Daniel Shawul wrote:
You compare 64 bits with 32 bits
I saw that but it is the best I can get by looking at the page ....

1.5.1 has a higher rating at blitz, and the same is true for most Stockfish versions,
and it proves nothing.

Note that if SE gives an additional 10 Elo for each doubling of the time control (possible, based on Bob's result), then the difference between 40/4 and 40/20 is only about 23 Elo. I think that other changes may help more at blitz and less at long time controls (I believe that changes that are equivalent to a speed improvement help more at blitz).
Uri
The question is really simple.
If an engine performs more or less the same with or without SE, then what is the effect of SE? Not on the overall strength of the engine, but on how its performance changes at different time controls.

The performance is not the same with and without SE, because SE is not the only change; there are other changes as well.

If you want to test the effect of SE then you need to have SE as the only change between version X and version X+1.

Uri

Daniel Shawul · Post by **Daniel Shawul** » Mon Aug 02, 2010 10:22 pm

That was the only escape route available to them *, so they can take it ...

No offence, but the exaggerated SE benefit numbers are based on one's hunch, so far as I can tell. CEGT's and Bob's tests say otherwise.

If you want to test the effect of SE then you need to have SE as the only change between version X and version X+1.

X has been matched against X+1 at a time control that I believe no one used before releasing SE versions. See the results from Bob. Even if the 60+60 test is done, someone would say it doesn't count unless it is 200+200, and so on and so forth.

* 'SE is superb at long tc' believers

bob · Post by **bob** » Mon Aug 02, 2010 10:22 pm

Don wrote:I forgot to mention that these games are hyper blitz speed. I'm running 4 simultaneous games on a 2 core laptop at 60 seconds + 1 second Fischer time control.

And here is an update:
  RANK      ELO     +/-     Tme/Gme  Tot Gms  PLAYER
-------  -------  -----  ----------  -------  ----------------
     1    3000.0   84.6     106.010       68  komodo 1.2
     2    2974.6   84.6     105.078       68  komodo 1.2-noSing

I don't consider 1+1 to be "hyper-blitz". It is actually a pretty even-paced game if I play this as a human using a clock that supports it, such as on ICC. For hyper-blitz I think of things like 10 secs or less on the clock and 0.1 secs or less for an inc...

bob · Post by **bob** » Mon Aug 02, 2010 10:23 pm

Daniel Shawul wrote:What time control are you using ?

Here are some basic details of how I do this in Komodo: it is done in PV nodes only; there must be a hash table move available that was not generated from IID; I use a 0.80 pawn margin with a zero-window search always reduced by 4 ply; and the test is only done when the depth remaining is 5 or greater. I don't take into consideration whether the hash table score is a bound, nor do I consider the depth of the hash table entry. I noticed that Stockfish looks at some of these issues, but I don't.
I think yours is as constrained as it gets. Testing at PV nodes only leaves just a handful of nodes to test for singularity. And the 0.8 pawn margin, I think, is too big. It will miss many good positional moves. Most singular moves will be captures and devastating positional moves that are better than any other move. The 20-30 Elo advantage is more believable, especially if we consider that everyone does different forms of reductions and extensions.

IIRC Zappa used to use these singular extensions and had a capability to display singular moves in its PV. When I watched it play most of the time it was captures.

In Cray Blitz, and in the one version of Crafty released with singular extensions, we copied Hsu and added an * (asterisk) to the move if it was extended as singular (checks and evasions were obvious). We used a different character for recaptures since not every recapture was extended.
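For readers following along, the Komodo conditions quoted above (PV nodes only, a hash move not from IID, 0.80 pawn margin, zero-window search reduced by 4 ply, remaining depth of 5 or more) can be sketched roughly as below. This is not Komodo's code: every name here is hypothetical, the search is passed in as a callback, and anchoring the margin at the hash score is my assumption about how the margin is applied.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

SINGULAR_MARGIN = 80     # 0.80 pawn, expressed in centipawns
SINGULAR_REDUCTION = 4   # plies removed from the verification search
SINGULAR_MIN_DEPTH = 5   # minimum remaining depth to run the test


@dataclass
class HashEntry:
    move: str
    score: int              # bound type and entry depth are ignored
    from_iid: bool = False  # hypothetical flag: move came from IID


def qualifies(is_pv_node: bool, depth: int,
              entry: Optional[HashEntry]) -> bool:
    """Gate for the test: PV node, enough depth, real hash move."""
    return (is_pv_node and depth >= SINGULAR_MIN_DEPTH
            and entry is not None and not entry.from_iid)


def is_singular(entry: HashEntry, depth: int, moves: Iterable[str],
                zero_window: Callable[[str, int, int], int]) -> bool:
    """True if no alternative reaches hash score minus the margin.

    zero_window(move, beta, depth) is an assumed callback that searches
    `move` with the window (beta - 1, beta) and returns a score.
    """
    beta = entry.score - SINGULAR_MARGIN
    reduced_depth = depth - SINGULAR_REDUCTION
    for move in moves:
        if move == entry.move:
            continue  # only the alternatives are tested
        if zero_window(move, beta, reduced_depth) >= beta:
            return False  # an alternative comes within the margin
    return True


if __name__ == "__main__":
    # Toy stand-in for a search: fixed scores per move.
    scores = {"Qxd5": 150, "Nf3": 40, "h3": 10}
    entry = HashEntry("Qxd5", 150)
    if qualifies(True, 8, entry):
        print(is_singular(entry, 8, scores, lambda m, b, d: scores[m]))
```

With a margin this wide, an alternative has to be almost a pawn worse than the hash move before the hash move counts as singular, which fits the observation that most moves flagged this way are captures.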

bob · Post by **bob** » Mon Aug 02, 2010 10:25 pm

Ralph Stoesser wrote:
Daniel Shawul wrote:Well, I expected a better result at a longer TC like 10+10, which BTW is too long for anyone to test at unless you have a cluster.
So the results we see here are disappointing and that is a big understatement compared to previous hype.
Yes, but nevertheless, the Elo gain from 5+5 results compared to 10+10 results is remarkable. Because the difference is rather huge, it would be interesting to see some 60+60 results for comparison.

5 or so vs 15? That's not "huge" in my book unless you live on a microscope slide.
