Jouni wrote: Yes, it's stunning that SF has gained only +23 since version 8 while the one- or two-man Komodo and Houdini teams gained +50. Obviously the SF team should look at the Komodo/Houdini code now. And I don't understand the endless tuning of time management. Isn't it such an easy thing that, once programmed, it never needs to change?
1) Time management is not something that never needs to change. There are many possible ideas that may or may not work, and people have not tried all of them.
2) It is not surprising that SF has gained only +23 since version 8 while Houdini gained more.
What is surprising to me is that so many people donate computer time to Stockfish when Stockfish is optimized only for bullet, and the team does not test the real value of every patch it accepts.
SPRT does not give an unbiased estimate of a patch's Elo, and I think it would be better to also test every patch that passes in a fixed number of games against the previous version (say 40,000 games at each of STC, LTC and VLTC: 15+0.15, 60+0.6 and 240+2.4).
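A fixed-length match, unlike SPRT, gives an estimate whose error bar is easy to compute. A minimal sketch of the arithmetic (the function name and any example figures are mine, not from any testing framework):

```python
import math

def elo_estimate(wins, draws, losses):
    # Elo difference and 95% confidence interval from a fixed-length match.
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score (trinomial model: win=1, draw=0.5, loss=0)
    var = (wins * (1 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    se = math.sqrt(var / n)               # standard error of the mean score

    def to_elo(s):
        s = min(max(s, 1e-9), 1 - 1e-9)   # guard against 0% / 100% scores
        return -400 * math.log10(1 / s - 1)

    return (to_elo(score),
            to_elo(score - 1.96 * se),    # lower 95% bound
            to_elo(score + 1.96 * se))    # upper 95% bound
```

With 40,000 games and a typical draw rate, a 51% score maps to roughly +7 Elo with a confidence interval only a few Elo wide, which is the kind of per-patch, per-time-control measurement this proposal asks for.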
I think it would be better for computer chess if the Stockfish team stopped testing new patches in the near future and simply tested the patches it accepted in the past against the previous version with 40,000 games at these time controls, so we could know better whether there are patches that scale well (earn more Elo at LTC).
If we find a few patches that scale well out of the many that do not, then I think we should test them again with 100,000 games at all time controls to verify that we did not get a lucky run.
I believe the knowledge gained from this process could help improve Stockfish and other programs in the future, because people would have a better idea of whether a patch scales well before testing it.
Today people have almost no idea about the value of patches at different time controls (the number of games needed to pass SPRT means almost nothing).
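The bias claim can be illustrated with a small simulation: an SPRT run stops as soon as the log-likelihood ratio crosses a bound, so among the runs that pass, the observed score systematically overstates the true strength, and the game count varies wildly. A sketch using the trinomial normal approximation of the LLR (similar in spirit to what testing frameworks use, not any framework's exact code; all parameters are illustrative):

```python
import math
import random

def llr(wins, draws, losses, elo0, elo1):
    # Normal (trinomial) approximation of the SPRT log-likelihood ratio
    # for H1: elo = elo1 versus H0: elo = elo0.
    n = wins + draws + losses
    s = (wins + 0.5 * draws) / n
    var = (wins * (1 - s) ** 2 + draws * (0.5 - s) ** 2 + losses * s ** 2) / n
    if var == 0:
        return 0.0
    p = lambda e: 1 / (1 + 10 ** (-e / 400))
    s0, s1 = p(elo0), p(elo1)
    return n * (s1 - s0) * (2 * s - s0 - s1) / (2 * var)

def sprt_run(true_elo, draw_rate=0.4, elo0=0, elo1=20,
             alpha=0.05, beta=0.05, max_games=200_000, rng=random):
    # Simulate one SPRT run against an opponent `true_elo` weaker;
    # returns (passed, observed_score, games_played).
    upper = math.log((1 - beta) / alpha)   # pass bound
    lower = math.log(beta / (1 - alpha))   # fail bound
    expected = 1 / (1 + 10 ** (-true_elo / 400))
    p_win = expected - draw_rate / 2       # split non-draws to match true_elo
    w = d = l = 0
    for g in range(1, max_games + 1):
        r = rng.random()
        if r < p_win:
            w += 1
        elif r < p_win + draw_rate:
            d += 1
        else:
            l += 1
        if g % 100 == 0:                   # check the bounds every 100 games
            v = llr(w, d, l, elo0, elo1)
            if v >= upper:
                return True, (w + 0.5 * d) / g, g
            if v <= lower:
                return False, (w + 0.5 * d) / g, g
    return False, (w + 0.5 * d) / max_games, max_games
```

Running many such simulations with a true strength halfway between the bounds, roughly half the runs pass, and the passing runs report an average score noticeably above the true one: the pass/fail verdict and the number of games to reach it say little about the patch's real Elo value.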
I believe this lack of knowledge is bad for Komodo and Houdini and also bad for Stockfish, because knowing the value of past patches can help suggest better patches in the future.