Stockfish no progress in 2month and half , why ?

JJJ · Post by **JJJ** » Thu Aug 31, 2017 3:49 am

If non-functional patches don't cause an Elo loss, that means most of your green patches lately (not always) don't give any Elo, so you are adding patches and hoping to get lucky with them.

Eelco de Groot · Post by **Eelco de Groot** » Thu Aug 31, 2017 5:05 am

Maybe the [-4, 0] and [-3, 1] patches should get a different color. I suggest pink. The Stockfish team does not add them hoping to get lucky. Indeed, I think there have not been an awful lot of [0, 4] and [0, 5] patches in the last couple of months, so I don't think the Stockfish people expected a lot of Elo progress. Compared to Stockfish 8 the change is still substantial.

The simplification patches are supposed to be "Zero Elo" patches. They are not all non-functional. Some of them may even lose a little Elo, but that is factored in. On the whole they are supposed to be at least neutral. They are meant to keep the code consistent, readable, and up to date with new compiler architectures and language updates, or to prepare the code for further changes. In the long run they help Stockfish. And sometimes just making some changes is good; it is like loosening some screws, so that in other places you can retune things better.

That graph looks like Mount Everest. There can be some plateaus, even temporary cliffs and ravines along the way. Eventually you might reach a top. For a mountaineer, or for a mathematician just hoping to 'model' or mathematically optimize play, that can mean he has reached his goal. No further optimization is possible in this place. To get higher, he must first find another mountain.

Personally I think sometimes it would have been better if people had chosen another Drosophila for AI than chess and left chess as a human game. Make Stockfish only play 3D chess or something. That would be fun, but also very hard to visualize. I have the utmost admiration for Leela, the Go program from Gian-Carlo Pascutto. It can even make use of the floating point calculations of your video card! But it has been a long time since I last played a game, I'd have to learn again, so unfortunately this beautiful piece of software has not been used an awful lot.

Uri Blass · Post by **Uri Blass** » Thu Aug 31, 2017 6:16 am

JJJ wrote: Why don't you test something like this:

Remove all green patches and test a version with only the non-functional patches, then do the opposite: remove all non-functional patches and test a version with only the green patches.

At least you will know better, and quickly, whether the problem comes from there or not.

Many green patches should have added at least +20 Elo. That is not the case. Even 10 Elo would have been nice, but almost 0 looks wrong to me. I think there was the same problem just after Stockfish 8, when many green patches did not increase Elo much for a while.

I am not sure that it is so easy to remove non-consecutive patches without introducing bugs.

At least in theory, code may depend on previous code, so in both cases the program may even crash if you simply try it, or lose a lot of Elo.

Frank Brenner · Post by **Frank Brenner** » Thu Aug 31, 2017 8:46 pm

JJJ wrote: Why don't you test something like this:

Remove all green patches and test a version with only the non-functional patches, then do the opposite: remove all non-functional patches and test a version with only the green patches.

At least you will know better, and quickly, whether the problem comes from there or not.

Many green patches should have added at least +20 Elo. That is not the case. Even 10 Elo would have been nice, but almost 0 looks wrong to me. I think there was the same problem just after Stockfish 8, when many green patches did not increase Elo much for a while.

This is indeed a good idea.

But you can be sure of the following:

The SF people who are responsible for this kind of decision are "full width" ignoring everything written by unknown people like us.

Joerg Oster · Post by **Joerg Oster** » Thu Aug 31, 2017 10:09 pm

JJJ wrote: Why don't you test something like this:

Remove all green patches and test a version with only the non-functional patches, then do the opposite: remove all non-functional patches and test a version with only the green patches.

At least you will know better, and quickly, whether the problem comes from there or not.

Many green patches should have added at least +20 Elo. That is not the case. Even 10 Elo would have been nice, but almost 0 looks wrong to me. I think there was the same problem just after Stockfish 8, when many green patches did not increase Elo much for a while.

Do you understand that non-functional patches that have been tested in the framework are also green patches?

Frank Brenner · Post by **Frank Brenner** » Thu Aug 31, 2017 10:47 pm

Joerg Oster wrote:
JJJ wrote: Why don't you test something like this:

Remove all green patches and test a version with only the non-functional patches, then do the opposite: remove all non-functional patches and test a version with only the green patches.

At least you will know better, and quickly, whether the problem comes from there or not.

Many green patches should have added at least +20 Elo. That is not the case. Even 10 Elo would have been nice, but almost 0 looks wrong to me. I think there was the same problem just after Stockfish 8, when many green patches did not increase Elo much for a while.
Do you understand that non-functional patches that have been tested in the framework are also green patches?

Don't worry about the colors. It's clear what he meant.

lucasart · Post by **lucasart** » Fri Sep 01, 2017 9:23 am

mcostalba wrote:
Michel wrote: Probably the majority of the patches that pass STC are lucky runs these days (this will happen for 1 neutral patch in 20). However, most of those lucky runs will be caught by the LTC test. This somehow creates the perception that the STC test is not a good predictor for the LTC test, leading people to make misguided calls for increasing the STC TC.
This is a comment that makes sense (a novelty in this thread).

In these 2 months there has been a huge number of tests and attempts by many people, no fewer than in the past, and for me this is the most important point. It means developers' interest in SF is still high.

Also, finding good patches is a statistical process: sometimes you find 3 in a row, sometimes you fish for months for nothing....

Still too early to tell if we reached a plateau with current development model or it is just a temporary glitch.

Looks like we need to take a Bayesian approach to the problem. If indeed the success rate of patches is getting lower, we should use stricter tests (lowering alpha) to keep the type I error the same.

First estimate the a priori probability of any patch passing. Just look at the pass/fail status of the last few patches to estimate it.

Then apply Bayes' formula twice, and you get the a posteriori probability for a patch passing STC+LTC. And recalibrate alpha accordingly (it should probably be lower than 5%).
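
The two Bayes updates described above can be sketched as follows. This is only an illustration under loud assumptions that are not in the thread: a patch is either "good" or "neutral", each test stage passes a neutral patch with probability `alpha` (the "1 in 20" figure) and a good patch with probability `power` (0.95 is a made-up value), and STC and LTC are treated as independent given the patch's true quality. The function names are hypothetical and have nothing to do with the actual testing framework.

```python
import math


def bayes_update(prior, pass_if_good, pass_if_neutral):
    """P(patch is good | it passed one test stage), by Bayes' formula."""
    evidence = prior * pass_if_good + (1.0 - prior) * pass_if_neutral
    return prior * pass_if_good / evidence


def posterior_after_stc_ltc(prior, alpha=0.05, power=0.95):
    """Apply Bayes' formula twice: once for STC, once for LTC.

    Assumption: both stages share the same alpha and power and are
    independent given the true quality of the patch.
    """
    after_stc = bayes_update(prior, power, alpha)
    return bayes_update(after_stc, power, alpha)


def alpha_for_target(prior, target, power=0.95):
    """Per-stage alpha that yields a chosen posterior after both stages.

    Obtained by solving posterior_after_stc_ltc(prior, alpha) == target
    for alpha under the same simplifying assumptions.
    """
    return power * math.sqrt(prior * (1.0 - target) / (target * (1.0 - prior)))


if __name__ == "__main__":
    # Hypothetical priors: the lower the success rate, the less a pass means.
    for prior in (0.20, 0.05, 0.01):
        post = posterior_after_stc_ltc(prior)
        print(f"prior={prior:.2f} -> P(good | passed STC+LTC)={post:.3f}")
    print(f"alpha needed at prior 0.01 for a 0.95 posterior: "
          f"{alpha_for_target(0.01, 0.95):.4f}")
```

In this toy model, a falling prior lowers the probability that a patch which passed both stages is really good, and `alpha_for_target` shows how much stricter each stage would have to be to compensate.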

syzygy · Post by **syzygy** » Fri Sep 01, 2017 12:45 pm

Frank Brenner wrote:
Joerg Oster wrote:
JJJ wrote: Why don't you test something like this:

Remove all green patches and test a version with only the non-functional patches, then do the opposite: remove all non-functional patches and test a version with only the green patches.

At least you will know better, and quickly, whether the problem comes from there or not.

Many green patches should have added at least +20 Elo. That is not the case. Even 10 Elo would have been nice, but almost 0 looks wrong to me. I think there was the same problem just after Stockfish 8, when many green patches did not increase Elo much for a while.
Do you understand that non-functional patches that have been tested in the framework are also green patches?
Don't worry about the colors. It's clear what he meant.

If he really meant to say that many green (non-simplification) patches are supposed to add +20 Elo at least, then he is just showing he has an awful lot to learn about engine development.

JJJ · Post by **JJJ** » Fri Sep 01, 2017 1:41 pm

In theory, these green patches are worth 1-2 Elo each. I don't know how many green patches passed between these two regression tests; I believe more than 10. So why shouldn't it be 20 Elo? Most of the time these patches add up, so why not this time?

syzygy · Post by **syzygy** » Fri Sep 01, 2017 1:56 pm

JJJ wrote: In theory, these green patches are worth 1-2 Elo each. I don't know how many green patches passed between these two regression tests; I believe more than 10. So why shouldn't it be 20 Elo? Most of the time these patches add up, so why not this time?

OK, you meant many green patches taken together. I thought you were saying that many of the green patches should add 20 Elo each.

How many non-simplification patches were applied between June 22 (+27 regression test) and August 26 (+29)?

Let's check: https://github.com/official-stockfish/S ... its/master
- Count all weak squares
- Use moveCount history for reduction
- Tweak connected pawns seed[] array values
- Queens vs. Minors imbalance

Many of the other patches are simplifications for which a 1 Elo loss is not necessarily expected, but accepted.
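
As a rough check of the arithmetic, here is a small sketch using only the numbers quoted in this thread: the four patches listed above, JJJ's "in theory" estimate of 1-2 Elo per patch, and the two regression results. It assumes per-patch gains simply add up and ignores both the error bars of the regression tests (not given here) and any small losses from simplifications.

```python
# Non-simplification patches as listed above.
patches = [
    "Count all weak squares",
    "Use moveCount history for reduction",
    "Tweak connected pawns seed[] array values",
    "Queens vs. Minors imbalance",
]

# JJJ's "in theory" estimate of the gain per green patch, in Elo.
elo_per_patch_low, elo_per_patch_high = 1, 2

# Regression test results quoted above.
regression_june_22 = 27
regression_august_26 = 29

# Assumption: gains are additive across patches.
expected_low = len(patches) * elo_per_patch_low
expected_high = len(patches) * elo_per_patch_high
observed = regression_august_26 - regression_june_22

print(f"{len(patches)} patches -> expected gain {expected_low} to {expected_high} Elo")
print(f"observed difference between regression tests: {observed} Elo")
```

With four such patches rather than "more than 10", the expected gain under these assumptions is far smaller than 20 Elo.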
