Difficult ending for Stockfish

Uri Blass
Posts: 10296
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Difficult ending for Stockfish

Post by Uri Blass »

syzygy wrote:
Uri Blass wrote:
syzygy wrote:So that's what removing verification search as "simplification" gets you...

Interesting philosophy that is being applied now. If some good patch manages to get into SF by passing all tests, it then becomes the prey of various extremists that will simply repeatedly test it until its removal is "proven" to be a safe simplification.
The verification search was not some good patch that passed all tests so I do not see the basis for your claim.
Maybe not in this case, but as far as I understand the test for an at some point accepted patch to not be thrown out as "simplification" is as tough as the test for a new patch to be accepted into the tree. So for a patch to stay it has to survive the original acceptance test over and over again (and by statistics it won't).
No, the test is not as tough.

For a simplification you need to pass SPRT(-4,0) twice.
For other patches you usually need to pass SPRT(-1.5,4.5) at STC and SPRT(0,6) at long time control.
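
For anyone who wants to see the difference between these bounds concretely, here is a minimal Python sketch. It is not the testing framework's actual code. It assumes alpha = beta = 0.05, the logistic Elo model, and a normal approximation to the log-likelihood ratio (LLR); the function names and the 2000/6000/2000 result are made up for illustration.

```python
import math

def elo_to_score(elo):
    """Expected score for a given Elo difference (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald stopping bounds for the LLR (alpha and beta are assumed values)."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)

def llr(wins, draws, losses, elo0, elo1):
    """Normal-approximation LLR of H1 (elo1) against H0 (elo0)."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n                 # average score per game
    var = (wins + 0.25 * draws) / n - mean * mean   # per-game score variance
    s0, s1 = elo_to_score(elo0), elo_to_score(elo1)
    return n * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var)

lower, upper = sprt_bounds()

# Hypothetical dead-even result: the patch neither gains nor loses Elo.
simplification_llr = llr(2000, 6000, 2000, -4.0, 0.0)  # SPRT(-4,0)
gainer_llr = llr(2000, 6000, 2000, 0.0, 6.0)           # SPRT(0,6)

print("bounds:", lower, upper)
print("SPRT(-4,0) LLR:", simplification_llr)
print("SPRT(0,6) LLR:", gainer_llr)
```

With a dead-even result the LLR drifts toward acceptance under (-4,0) and toward rejection under (0,6), which is the sense in which the simplification test is easier to pass.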

In practice, Marco does not like patches that need many games to pass SPRT(-4,0) at long time control, so he asked to stop the test of one patch that had a chance of passing SPRT(-4,0) at long time control.

see eval_sim1
http://tests.stockfishchess.org/tests/user/Yery

Note that I do not think it is good to stop an SPRT in the middle, and other people agree with me. But the decision was Marco's, not mine or anyone else's, so the author of the patch stopped the test because he understood that even if the test passed, Marco was not going to accept it.
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Difficult ending for Stockfish

Post by Michel »

other people so the author of the patch decided to stop the test because he understood that even if the test pass marco is not going to accept it.
I do not think Marco ever said he would not apply it (why would he not?).

People retesting successful patches for removal until they succeed is theoretically possible. But I assume such unreasonable behaviour would not be tolerated.
Uri Blass
Posts: 10296
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Difficult ending for Stockfish

Post by Uri Blass »

Michel wrote:
other people so the author of the patch decided to stop the test because he understood that even if the test pass marco is not going to accept it.
I do not think Marco ever said he would not apply it (why would he not?).

People retesting successful patches for removal until they succeed is theoretically possible. But I assume such unreasonable behaviour would not be tolerated.
See the following thread:

https://groups.google.com/forum/?fromgr ... -SsYIZI5tc

marco:
"I suggest to limit LTC test of semplifications to 40k games to avoid wasting resources on dubious tests (as is happening now on an eval semplification test)."

marco later:
"I mean that after 40K games test is dropped. Not that patch is committed."

The author of the patch, yeri, later:

"Well I think that the current system of testing simplifications is already proving better than fixed games. I've tested 6 eval simplifications; and 5 of them failed well under 40000 games each; so that is resources saved.

But if you are not going to commit that patch nr 1 even if it would succeed, then by all means abort it now and save resources."

The author later stopped the test, probably because he got the impression that Marco was not going to commit the patch even if it passed.
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Difficult ending for Stockfish

Post by Michel »

marco later:
"I mean that after 40K games test is dropped. Not that patch is committed."
This was simply a proposal by Marco that was shown to be statistically unsound. So I assume he would not have insisted on it. Anyway the use of the SPRT(-4,0) has now become completely uncontroversial as a means for validating simplifications.
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: Difficult ending for Stockfish

Post by syzygy »

syzygy wrote:
Uri Blass wrote:
syzygy wrote:So that's what removing verification search as "simplification" gets you...

Interesting philosophy that is being applied now. If some good patch manages to get into SF by passing all tests, it then becomes the prey of various extremists that will simply repeatedly test it until its removal is "proven" to be a safe simplification.
The verification search was not some good patch that passed all tests so I do not see the basis for your claim.
Maybe not in this case, but as far as I understand the test for an at some point accepted patch to not be thrown out as "simplification" is as tough as the test for a new patch to be accepted into the tree. So for a patch to stay it has to survive the original acceptance test over and over again (and by statistics it won't).
It seems common sense has prevailed and removing a few lines of useful straightforward code only for the sake of removing code (for what? the ultimate goal of a chess engine without code?) was not considered worth it, this time.
Uri Blass
Posts: 10296
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Difficult ending for Stockfish

Post by Uri Blass »

Michel wrote:
marco later:
"I mean that after 40K games test is dropped. Not that patch is committed."
This was simply a proposal by Marco that was shown to be statistically unsound. So I assume he would not have insisted on it. Anyway the use of the SPRT(-4,0) has now become completely uncontroversial as a means for validating simplifications.
Maybe, but he did not respond to say so, so I guess the author of the patch got the impression that Marco was not going to accept the patch even if it passed, and that is why he stopped the test.
Uri Blass
Posts: 10296
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Difficult ending for Stockfish

Post by Uri Blass »

syzygy wrote:
syzygy wrote:
Uri Blass wrote:
syzygy wrote:So that's what removing verification search as "simplification" gets you...

Interesting philosophy that is being applied now. If some good patch manages to get into SF by passing all tests, it then becomes the prey of various extremists that will simply repeatedly test it until its removal is "proven" to be a safe simplification.
The verification search was not some good patch that passed all tests so I do not see the basis for your claim.
Maybe not in this case, but as far as I understand the test for an at some point accepted patch to not be thrown out as "simplification" is as tough as the test for a new patch to be accepted into the tree. So for a patch to stay it has to survive the original acceptance test over and over again (and by statistics it won't).
It seems common sense has prevailed and removing a few lines of useful straightforward code only for the sake of removing code (for what? the ultimate goal of a chess engine without code?) was not considered worth it, this time.
It is certainly not the ultimate goal of the people who suggest removing the code.

The goal is to improve Stockfish.
You can disagree with the people who suggested removing the verification code, but distorting their goal is something you should not do.

Certainly a chess engine without code cannot even play chess, and there is a limit to the number of lines that you can remove and still pass SPRT(-4,0) twice, as long as you do not try to remove the same code again and again.

I find Marco's behaviour inconsistent.
Stockfish evaluates some stalemate positions badly, which causes it to show a non-draw score for positions like the following:

[D]7k/6p1/8/8/8/8/6BP/7K w - - 0 1
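
If the diagram tag does not render for you, here is a small Python sketch that expands the piece-placement field of the FEN above into a text board. The function name is made up for illustration and the sketch does no validation of the FEN.

```python
def fen_to_rows(fen):
    """Expand the piece-placement field of a FEN into eight 8-character rows."""
    placement = fen.split()[0]
    rows = []
    for rank in placement.split("/"):
        row = ""
        for ch in rank:
            # A digit means that many empty squares; anything else is a piece letter.
            row += "." * int(ch) if ch.isdigit() else ch
        rows.append(row)
    return rows

rows = fen_to_rows("7k/6p1/8/8/8/8/6BP/7K w - - 0 1")

# FEN lists ranks from 8 down to 1; uppercase is White, lowercase is Black.
for rank_number, row in zip(range(8, 0, -1), rows):
    print(rank_number, " ".join(row))
print("  a b c d e f g h")
```

White has the king on h1, a bishop on g2 and a pawn on h2; Black has the king on h8 and a pawn on g7.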

A change to fix it did not pass the normal SPRT, which uses (-1.5,4.5) and (0,6).

The author of the change asked to test adding the code with SPRT(-4,0) twice, but Marco was against it. He said that he is against trying the same idea again and again until it passes.

I think that rejecting the stalemate fix even though it is not a regression is inconsistent with keeping the verification code even though it is not an improvement.

The only consistent thing in Marco's behaviour seems to be a preference for the past, as long as it is something that has been there for a long time and not something that he committed a few days ago. It seems that he is simply against changes.

Adding stalemate detection code is a change, so Marco is against it, and it is not enough for him to see proof that the change is not negative. He needs to see proof that the change is positive.

Removing the verification search code is a change, so Marco is against it unless you prove that it is a positive change. Proving that it is a non-negative change is not enough for him.

Another example is a patch of mine: Marco simplified it correctly and then reverted the patch in 9.1 for no reason except his fear of changes.

People explained to him that the simpler condition is practically equivalent (see the posts of lp in the link below), but he did not want to change it. Only recently somebody committed a patch that includes the simplification and passed SPRT(-4,0) 4 different times at 4 different time controls, so he accepted it.


https://groups.google.com/forum/?fromgr ... qeIZUagr6w

The latest example that I remember is that Marco did not like a simplification that removed GrainSize and required another test suggesting not only that it does not lose Elo but also that it gains Elo.

Note that removing GrainSize is logical for every chess player because it is better to have a more accurate evaluation. Dividing the evaluation of the position by 4 and multiplying it by 4, only to get a number that is divisible by 4, simply does not make sense, except at very fast time controls where it is more important to reach bigger depths through more cutoffs.
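
To illustrate what dividing by 4 and multiplying by 4 does to the evaluation, here is a minimal Python sketch. It is not Stockfish's actual code: the constant and function names are made up, and the exact rounding rule the engine used is not something I have checked.

```python
GRAIN_SIZE = 4  # the factor of 4 mentioned above; the name is illustrative

def quantize(score, grain=GRAIN_SIZE):
    """Round an integer evaluation down to a multiple of `grain`."""
    # Python's // floors toward negative infinity, whereas C++ integer
    # division truncates toward zero, so negative scores may round differently.
    return (score // grain) * grain

scores = [33, 34, 35, 36]
quantized = [quantize(s) for s in scores]
print(quantized)  # [32, 32, 32, 36]
```

Three different evaluations collapse into the same value, so the search sees more equal scores and can cut off more often, at the price of a less accurate evaluation.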