Question regarding WAC number 2

Uri Blass
Posts: 10314
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Question regarding WAC number 2

Post by Uri Blass »

Looking at it again, here is one of Bob's posts in that thread, where
Bob admits that SE is productive for Stockfish.

see the last post of page 5
in the same link:
http://www.talkchess.com/forum/viewtopi ... =&start=40


"I finally stopped the test last night, error bar was down to +/- 8, difference was +18 Elo. Not insignificant, but also not in line what claims I had seen on freechess. One person there claimed +100 or so which would be remarkable for any change."
bhlangonijr
Posts: 482
Joined: Thu Oct 16, 2008 4:23 am
Location: Milky Way

Re: Question regarding WAC number 2

Post by bhlangonijr »

Uri Blass wrote:Looking at it again, here is one of Bob's posts in that thread, where
Bob admits that SE is productive for Stockfish.

see the last post of page 5
in the same link:
http://www.talkchess.com/forum/viewtopi ... =&start=40


"I finally stopped the test last night, error bar was down to +/- 8, difference was +18 Elo. Not insignificant, but also not in line what claims I had seen on freechess. One person there claimed +100 or so which would be remarkable for any change."
Yes, I saw the post. Another thing I think should be taken into account: from what I've read in the source code, Stockfish runs a reduced search for both the hash move and its sibling moves, in PV and non-PV search alike. It also implements a different hashing scheme to support these partial searches. Those additional searches might slow the search down considerably, so if SE is just bogus code, as Bob claims, what explains the fact that Stockfish's Elo does not go down with SE enabled?
Another fact that supports my "reasoning": Stockfish 1.8 is roughly 200 Elo points stronger than Glaurung 2.2. Apart from parameter tuning, there are not a LOT of new features between the two versions. SE is one of the major changes, so I think it might be doing something positive for Stockfish.
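
For readers who have not looked at this kind of code, the TT-based singularity test discussed in this thread can be sketched roughly as follows. This is only an illustration of the general idea, not Stockfish's actual code: the function name `is_singular`, the `search` callback signature, the half-depth reduction, and the margin of 50 are all assumptions made for the example.

```python
from typing import Callable, Hashable, Iterable


def is_singular(
    tt_move: Hashable,
    tt_value: int,
    moves: Iterable[Hashable],
    search: Callable[[Hashable, int, int, int], int],
    depth: int,
    margin: int = 50,  # illustrative margin, not taken from any engine
) -> bool:
    """Return True if every move other than tt_move fails low against
    (tt_value - margin) in a reduced-depth, null-window search.

    `search(move, alpha, beta, depth)` is a hypothetical callback that
    returns the score of `move` from the side to move's point of view.
    """
    reduced_depth = depth // 2   # assumed reduction for the verification search
    beta = tt_value - margin     # siblings must stay below this bound
    for move in moves:
        if move == tt_move:
            continue             # the hash move itself is excluded
        score = search(move, beta - 1, beta, reduced_depth)
        if score >= beta:
            return False         # a sibling comes close: not singular
    return True                  # only the hash move holds the value


if __name__ == "__main__":
    # Stub search that looks scores up in a table, just to exercise the logic.
    table = {"a": 100, "b": 20, "c": 10}

    def stub_search(move, alpha, beta, depth):
        return table[move]

    print(is_singular("a", 100, ["a", "b", "c"], stub_search, depth=8))
```

The extra reduced searches over the sibling moves are the cost being debated here: they are paid at every node where a hash move is available and the test is attempted, whether or not the extension ends up being applied.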
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Question regarding WAC number 2

Post by bob »

Uri Blass wrote:
bob wrote:
bhlangonijr wrote:
bob wrote:
bhlangonijr wrote:
bob wrote:
Uri Blass wrote:
jacobbl wrote:Interesting, it doesn't seem logical that so simple a rule is the optimal solution, but I will try it. But then I suspect you have advanced pruning/reductions in your engine.

Regards
Jacob
Of course there is no reason to believe that extending only checks is the optimal solution.

Stockfish, which is stronger than Crafty, extends moves that are not checks (for example, it uses singular extensions), and I guess that this is one of the reasons that Stockfish is stronger than Crafty.

Fortunately, _some_ of us don't have to "guess". :)

SE is _minimal_ improvement at best, zero in the testing I did...

No one knows what is optimal for extensions or reductions. Excessive extensions waste time and hurt performance. Excessive reductions or pruning misses important tactical/positional moves. Lots of trade-offs...
Bob, have you tried removing SE from Stockfish and test it using your cluster? Because there is a chance Crafty is not taking all advantages of SE.

Regards,
Yes I did. And I posted the results in a SE thread here a few weeks back. Removing SE from SF 1.8 made no difference at all in 2 30K game runs, in the 3rd it was a very small gain. I don't recall the specifics but the thread can be found in this forum...

The SE they use is the idea taken from IP* and friends, namely to extend TT "best moves" if the value appears to be interesting. IMHO this is a random extension, because you don't extend "singular moves" as is used in the Hsu/Campbell definition. You have to first find the move in the TT, which is a small percentage of all nodes in a middlegame, and you suffer when entries get replaced/overwritten since you can't extend that which you don't find in the TT. The overall idea of SE might be a good one, if done right. But not _this_ idea...
I found the post you have mentioned.
http://www.talkchess.com/forum/viewtopi ... =&start=40
I think that your tests strongly suggest that SE in Stockfish is a real improvement. If we assume that the SE implementation slows the search down, and you still got a small improvement, it shows that even at faster TC SE more than pays for the fact that Stockfish without SE searches faster. Don't you think so? I would not be surprised if the Elo difference is even bigger at longer TC...
If you look all the way thru the various threads, I ran the test at longer time controls as well... I don't see where you conclude "a real improvement" when with an error margin of +/- 4 the gain was "zero". I am not sure what your reasoning is based on.
Here is the relevant data at longer time control.

Name                      Elo    +    -   games  score  oppo.  draws
Stockfish 1.8 64bit       2796   16   16   1255    75%   2600    38%
Stockfish 1.8noSE 64bit   2781   15   15   1325    72%   2602    41%

You stopped the test at longer time control but the data clearly support the idea that you get a real improvement from SE in stockfish.
How does that "clearly support" anything? Error bar +/- 15? I have previously shown test results where the result after 1,000 games is nothing like the results after 10,000 and beyond...

I suppose I can go back and run this for a couple of weeks to definitively answer the question, when we have a break in Crafty testing...

But even if you take that +15, that is hardly "the main reason why SF beats Crafty" as I wrote...
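
To make the error-bar argument concrete, here is a rough sketch of how an Elo difference and an approximate 95% interval can be estimated from a head-to-head win/draw/loss count. It uses the plain logistic Elo formula with a normal approximation, which is an assumption of this example; the rating tool used for the tables above may use a different model. The game counts in the usage section are invented for illustration and are not results from this thread.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert an expected score in (0, 1) to an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, low, high) for a match result.

    Uses a normal approximation on the mean score per game; it is only
    meaningful while score +/- z*stderr stays strictly inside (0, 1).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    stderr = math.sqrt(variance / n)
    low = elo_from_score(score - z * stderr)
    high = elo_from_score(score + z * stderr)
    return elo_from_score(score), low, high


if __name__ == "__main__":
    # Same win/draw/loss proportions, 1,000 games versus 30,000 games.
    for w, d, l in [(400, 300, 300), (12000, 9000, 9000)]:
        elo, low, high = elo_with_interval(w, d, l)
        print(f"{w + d + l:6d} games: {elo:+.1f} Elo, interval [{low:+.1f}, {high:+.1f}]")
```

Because the standard error shrinks only with the square root of the number of games, going from roughly 1,000 games to 30,000 narrows the interval by a factor of about 5.5, which is why a difference of about 15 Elo with an error bar of about 15 settles very little.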
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Question regarding WAC number 2

Post by bob »

bhlangonijr wrote:
Yes, I saw the post. Another thing I think should be taken into account: from what I've read in the source code, Stockfish runs a reduced search for both the hash move and its sibling moves, in PV and non-PV search alike. It also implements a different hashing scheme to support these partial searches. Those additional searches might slow the search down considerably, so if SE is just bogus code, as Bob claims, what explains the fact that Stockfish's Elo does not go down with SE enabled?
Another fact that supports my "reasoning": Stockfish 1.8 is roughly 200 Elo points stronger than Glaurung 2.2. Apart from parameter tuning, there are not a LOT of new features between the two versions. SE is one of the major changes, so I think it might be doing something positive for Stockfish.
SE might work to some extent. It is absolutely _not_ a significant change, unless you call < 20 Elo significant. That is less than 10% of the difference between glaurung 2.x and SF 1.8... There are many more important changes that account for this, not SE.
smcracraft
Posts: 737
Joined: Wed Mar 08, 2006 8:08 pm
Location: Orange County California
Full name: Stuart Cracraft

Re: Question regarding WAC number 2

Post by smcracraft »

Can anyone hazard a guess what my program's problem is in solving WAC #2? (other than me.) ?

Thanks.

--Stuart

-- ** -- ** -- ** -- **
** -- ** -- ** -- ** BP
-- ** -- ** -- BK -- **
** -- ** -- ** BP ** --
BP ** BP ** -- WP -- **
WP BR ** BP WP WK ** --
-- WP -- WR -- ** -- WP
** -- ** -- ** -- ** --

black to move, castle = ----, nominal depth = 10, hashkey=99029fca297c6ac1, phashkey=4c2613800e489f44
eval = material[stm]=10001, material[stm^1]=10001, positional[stm]=1645, positional[stm^1]=-611 = 2256
phase = 4
! depth 99
i scanned maxdepth = 99
! go
maxdepth = 99
1.  5886   0.01        1 139 Rb3b2
2. -1409   0.01       25 3027 Pc4c3 Rd2d3
3.  5159   0.01       95 10130 Ph7h5 Pe3e4 Rb3b2
4.  2106   0.02      979 54151 Kf6e6 Kf3f2 Ke6d5 Kf2e1
5.  3544   0.02     1628 70697 Kf6e6 Pe3e4 Ke6f7 Pe4f5 Rb3b2
6.  2240   0.05     6713 142098 Kf6e6 Kf3f2 Ke6d5 Kf2e1 Rb3b8 Pb2b3
7.  3232   0.09    13218 150176 Kf6e6 Pe3e4 Ke6f7 Pe4f5 Rb3b8 Pf5f6 Rb8b2
8.  2162   0.82   162798 198055 Kf6e7 Kf3f2 Ke7d6 Kf2e1 Kd6d5 Rd2g2 Rb3b8 Rg2g7
9.  3440   1.40   261980 187798 Kf6e7 Pe3e4 Ke7f7 Pe4f5 Ph7h6 Kf3g2 Kf7f6 Rd2d1 Rb3b2 Rd1d2
10.  2248   7.62  1674891 219884 Kf6e7 Ph2h4 Rb3b6 Ph4h5 Ke7d6 Kf3g2 Kd6d5 Kg2f2 Rb6b8 Kf2e1
11.  1798  16.61  3271645 197026 Kf6e7 Pe3e4 Ke7e6 Pe4f5 Ke6f5 Ph2h4 Rb3b5 Kf3g2 Kf5e6 Kg2g1 Ke6d6 Pf4f5
12.  2041  78.64 14825428 188522 Ph7h5 Kf3g2 Rb3b8 Kg2f2 Kf6f7 Kf2f3
13. 
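
For anyone who wants to feed the same position to another engine for comparison, the diagram above can be converted to a FEN string with a short script. This is a sketch under stated assumptions: rank 8 is printed at the top and the a-file on the left (which agrees with moves such as Rb3b2 and Ph7h5 in the output), `--` and `**` both mark empty squares, and the second letter of each token is the usual piece letter. The function name `diagram_to_fen` is made up for this example, and the castling and en passant fields are simply set to `-`.

```python
DIAGRAM = """
-- ** -- ** -- ** -- **
** -- ** -- ** -- ** BP
-- ** -- ** -- BK -- **
** -- ** -- ** BP ** --
BP ** BP ** -- WP -- **
WP BR ** BP WP WK ** --
-- WP -- WR -- ** -- WP
** -- ** -- ** -- ** --
"""


def diagram_to_fen(diagram: str, side_to_move: str = "b") -> str:
    """Convert an 8x8 text diagram (rank 8 first) into a FEN string."""
    ranks = []
    for line in diagram.strip().splitlines():
        tokens = line.split()
        if len(tokens) != 8:
            raise ValueError(f"expected 8 squares per rank, got {len(tokens)}")
        fen_rank = ""
        empties = 0
        for token in tokens:
            if token in ("--", "**"):      # light or dark empty square
                empties += 1
                continue
            if empties:                    # flush the run of empty squares
                fen_rank += str(empties)
                empties = 0
            colour, piece = token[0], token[1]
            # FEN uses upper case for White and lower case for Black.
            fen_rank += piece.upper() if colour == "W" else piece.lower()
        if empties:
            fen_rank += str(empties)
        ranks.append(fen_rank)
    if len(ranks) != 8:
        raise ValueError(f"expected 8 ranks, got {len(ranks)}")
    # Castling and en passant are assumed unavailable ("-").
    return "/".join(ranks) + f" {side_to_move} - - 0 1"


if __name__ == "__main__":
    print(diagram_to_fen(DIAGRAM))
```

Running the same position through a second engine at the same depth makes it easier to tell whether a problem lies in the search or in the evaluation.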
Uri Blass
Posts: 10314
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Question regarding WAC number 2

Post by Uri Blass »

bob wrote:
SE might work to some extent. It is absolutely _not_ a significant change, unless you call < 20 Elo significant. That is less than 10% of the difference between glaurung 2.x and SF 1.8... There are many more important changes that account for this, not SE.
I agree that changes other than SE are responsible for the fact that Stockfish is stronger than Crafty.

Note that SE is not the only non-check extension that Stockfish has (others include the mate threat extension, the passed pawn extension, and the single evasion extension).

It would be interesting to know what rating advantage Stockfish gets from all of its extensions other than the check extension.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Question regarding WAC number 2

Post by bob »

Uri Blass wrote:
I agree that changes other than SE are responsible for the fact that Stockfish is stronger than Crafty.

Note that SE is not the only non-check extension that Stockfish has (others include the mate threat extension, the passed pawn extension, and the single evasion extension).

It would be interesting to know what rating advantage Stockfish gets from all of its extensions other than the check extension.
I can answer that. zero. From testing. I removed them from Crafty and found Elo went up. I removed them from SF 1.6, one by one, and found that the elo either went up slightly or did not change. Except for the check extension itself which is worthwhile although not +20 worthwhile...
Uri Blass
Posts: 10314
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Question regarding WAC number 2

Post by Uri Blass »

bob wrote:
I can answer that. zero. From testing. I removed them from Crafty and found Elo went up. I removed them from SF 1.6, one by one, and found that the elo either went up slightly or did not change. Except for the check extension itself which is worthwhile although not +20 worthwhile...
If this is the case, then I wonder why the Stockfish team keeps any extensions other than the check extension and SE.

I remember reading that they removed parts of the evaluation that might give less than 4 Elo in order to make the code simpler, so by the same logic I would expect them to remove extensions as well.

Uri
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Question regarding WAC number 2

Post by bob »

Uri Blass wrote:
If this is the case, then I wonder why the Stockfish team keeps any extensions other than the check extension and SE.

I remember reading that they removed parts of the evaluation that might give less than 4 Elo in order to make the code simpler, so by the same logic I would expect them to remove extensions as well.

Uri
There is a lot of urban myth concerning search extensions. It takes a lot of time to run the tests to check each one individually, and then check them in groups in case they are somehow dependent. I simply chose to run the tests on Crafty, similar to the "history counter test" I did back when I first started playing with reductions. I found the history counter did nothing. I then removed it from Fruit and found the same thing, verifying my result...

Not everyone is willing to run those kinds of tests. For me, a couple of hours here and there often reveals things that are surprising or different from what we have been believing for many years...
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Question regarding WAC number 2

Post by mcostalba »

bob wrote: I can answer that. zero. From testing. I removed them from Crafty and found Elo went up. I removed them from SF 1.6, one by one, and found that the elo either went up slightly or did not change. Except for the check extension itself which is worthwhile although not +20 worthwhile...
OK, I will remove all the extensions at once, apart from SE and the check extension, and test. If there is no advantage, we will remove them all in one go ;-)