WDL

Ovyron · Post by **Ovyron** » Sat Jul 04, 2020 6:16 am

zullil wrote: ↑Sat Jul 04, 2020 2:37 am I don't know how well the model fits the actual fishtest data.

And I can just wave my hand in the air and magically know that their WDL won't fit the data at all and it'll just mislead people that assume it fits.

zullil wrote: ↑Sat Jul 04, 2020 2:37 am Don't care

It's fine if you don't care what happens to other people, but don't try to discourage me from trying to spread the information about the damaging aspects of WDL.

If WDL fit the data, then King vs King would correctly show that White has a 0% chance of winning.

zullil · Post by **zullil** » Sat Jul 04, 2020 12:54 pm

Ovyron wrote: ↑Sat Jul 04, 2020 6:16 am
zullil wrote: ↑Sat Jul 04, 2020 2:37 am I don't know how well the model fits the actual fishtest data.
And I can just wave my hand in the air and magically know that their WDL won't fit the data at all and it'll just mislead people that assume it fits.

zullil wrote: ↑Sat Jul 04, 2020 2:37 am Don't care
It's fine if you don't care what happens to other people, but don't try to discourage me from trying to spread the information about the damaging aspects of WDL.

If WDL fit the data then King vs King would correctly show white has 0% chance of winning

I'm not interested in discouraging you. It just seemed to me, based on the last sentence of your initial post, that you didn't understand how to interpret WDL output. Maybe I misunderstood that post.

Here's the official statement from the Stockfish Readme:

UCI_ShowWDL
If enabled, show approximate WDL statistics as part of the engine output. These WDL numbers model expected game outcomes for a given evaluation and game ply for engine self-play at fishtest LTC conditions (60+0.6s per game).
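
For anyone who wants to read these numbers programmatically: the option is switched on with the standard UCI command `setoption name UCI_ShowWDL value true`, after which the `info` lines carry a `wdl` field. Below is a minimal sketch of pulling that field out of a line. The sample line is invented for illustration (its numbers are not real engine output), and `parse_wdl` is a hypothetical helper name, not part of any library.

```python
import re

# Hypothetical sample line; the numbers are made up for illustration only.
SAMPLE_INFO = (
    "info depth 20 seldepth 28 multipv 1 score cp -20 "
    "wdl 200 500 300 nodes 1000000 pv e2e4 e7e5"
)

# Matches "wdl W D L" where each value is a non-negative integer (per-mille).
WDL_PATTERN = re.compile(r"\bwdl (\d+) (\d+) (\d+)\b")


def parse_wdl(info_line):
    """Return (win, draw, loss) in per-mille for the side to move, or None."""
    match = WDL_PATTERN.search(info_line)
    if match is None:
        return None
    win, draw, loss = (int(group) for group in match.groups())
    return win, draw, loss


print(parse_wdl(SAMPLE_INFO))
```

The three values are per-mille figures, so 200 500 300 reads as 20% win, 50% draw, 30% loss for the side to move.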

zullil · Post by **zullil** » Sat Jul 04, 2020 1:11 pm

Ovyron wrote: ↑Fri Jul 03, 2020 8:00 am
I still can't wrap my head around something like "White has a 50% chance to win this", because for that to be useful I also need to know black's winning chances. If black has 0% chance to win this then this position is a great one to aim for. If black also has 50% chance to win this then the expected performance is 50%, and I'd rather play into one where White's chances are only 30% but black's are only 10%.

But WDL has no way to show that difference, so it falls flat on its face.

Stockfish's WDL output is always shown from the perspective of the side having the move, as indicated in the FEN. So let's assume that it's White's move and the output is 500 500 0. That would correspond to your first case, of Black having 0% chance of winning. If the output were 500 0 500, that would correspond to your second case, of each side having a 50% chance of winning (with no possibility of a draw).

So your final sentence is false.
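
To make the comparison concrete, here is a small sketch that turns a WDL triple into an expected score (win = 1, draw = 0.5, loss = 0) for the three cases discussed in the quoted post. The function name `expected_score` is my own, and the third triple (300 600 100) is just the "30% win, 10% loss" case with the remainder assumed to be draws.

```python
def expected_score(win, draw, loss):
    """Expected points per game for the side to move (win=1, draw=0.5, loss=0)."""
    total = win + draw + loss
    return (win + 0.5 * draw) / total


# WDL triples in per-mille, from the side to move's point of view.
cases = {
    "500 500 0": (500, 500, 0),    # 50% win, no losing chances
    "500 0 500": (500, 0, 500),    # 50% win, 50% loss, no draws
    "300 600 100": (300, 600, 100),  # 30% win, 10% loss, rest assumed drawn
}

for label, triple in cases.items():
    print(f"{label}: expected score {expected_score(*triple):.2f}")
```

The three triples give 0.75, 0.50 and 0.60 respectively, so the output does distinguish the situations described: the win figure alone is the same in the first two cases, but the loss figure is not.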

syzygy · Post by **syzygy** » Sat Jul 04, 2020 1:55 pm

Ovyron wrote: ↑Fri Jul 03, 2020 11:54 pm
zullil wrote: ↑Fri Jul 03, 2020 1:19 pm If WDL shows 200 500 300, for example, then the interpretation is that the side to move has a 20% chance of winning and a 30% chance of losing.
That predicts that 50% of the games will be drawn. The rest are decided games. How accurate is that? Where's the data that shows when 200 500 300 is shown, that half the games are actually drawn?

Because if they are, then I stand corrected and tip my hat to WDL. I have no way of knowing when a position has a 50% chance of being decided either way, so WDL would be providing incredibly useful information just like that.

But if in reality when that position is played out the decided games are nowhere near 50%, then WDL is just smoke and mirrors. People are making wrong decisions because of faulty information WDL shows them.

The real information is reported score and depth. The WDL printed by SF is just a re-interpretation of the reported score and depth.

Most people are very happy with getting such numbers even if they don't mean anything. I have seen this before a couple of times in varying contexts. There is probably a good business idea here (write an app that predicts something and sell it, the quality of your model is of no importance).
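
The question in the quoted post (when 200 500 300 is shown, are half the games actually drawn?) is in principle checkable if one has game records. Here is a minimal sketch of such a check, assuming a list of (WDL shown, actual result) pairs. The records below are invented purely to show the mechanics; they are not fishtest data, and `observed_frequencies` is a hypothetical helper.

```python
from collections import Counter, defaultdict

# Hypothetical records: (WDL shown for the side to move, actual game result).
# Results are "W", "D" or "L" from the same side's point of view.
records = [
    ((200, 500, 300), "D"),
    ((200, 500, 300), "L"),
    ((200, 500, 300), "D"),
    ((200, 500, 300), "W"),
    ((500, 500, 0), "W"),
    ((500, 500, 0), "D"),
]


def observed_frequencies(records):
    """Group games by the WDL shown and return observed per-mille (W, D, L)."""
    counts = defaultdict(Counter)
    for predicted, result in records:
        counts[predicted][result] += 1
    table = {}
    for predicted, counter in counts.items():
        games = sum(counter.values())
        table[predicted] = tuple(
            round(1000 * counter[key] / games) for key in "WDL"
        )
    return table


for predicted, observed in observed_frequencies(records).items():
    print("shown", predicted, "observed", observed)
```

With real data one would need many games per bucket, played under the conditions the model was fitted to, before drawing any conclusion about fit.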

Joerg Oster · Post by **Joerg Oster** » Sat Jul 04, 2020 2:24 pm

syzygy wrote: ↑Sat Jul 04, 2020 1:55 pm The real information is reported score and depth. The WDL printed by SF is just a re-interpretation of the reported score and depth.

That's not fully correct.
Change to

The WDL printed by SF is just a re-interpretation of the reported score and current move number.

zullil · Post by **zullil** » Sat Jul 04, 2020 2:39 pm

Joerg Oster wrote: ↑Sat Jul 04, 2020 2:24 pm
syzygy wrote: ↑Sat Jul 04, 2020 1:55 pm The real information is reported score and depth. The WDL printed by SF is just a re-interpretation of the reported score and depth.
That's not fully correct.
Change to
The WDL printed by SF is just a re-interpretation of the reported score and current move number.

Right. Although the reported score likely varies from iteration to iteration of the search, the current depth of the search is not directly used in the production of the WDL output.
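
To illustrate what "a re-interpretation of the reported score and current move number" can look like, here is a rough sketch assuming a logistic curve whose midpoint and spread drift with game ply. The coefficients, the ply cap and the clamp range are placeholders I made up for illustration; they are not Stockfish's fitted values, and this is not a copy of its code.

```python
import math


def win_rate_permille(score_cp, game_ply):
    """Per-mille win chance for the side to move under an assumed logistic model."""
    # All constants below are hypothetical placeholders, not fitted values.
    m = min(game_ply, 240) / 64.0
    a = 100.0 + 20.0 * m  # score (centipawns) at which the win chance is 50%
    b = 60.0 + 5.0 * m    # spread of the curve; larger means flatter
    x = max(-2000.0, min(2000.0, float(score_cp)))
    return int(0.5 + 1000.0 / (1.0 + math.exp((a - x) / b)))


def wdl(score_cp, game_ply):
    """Return (win, draw, loss) in per-mille; note that search depth is not an input."""
    win = win_rate_permille(score_cp, game_ply)
    loss = win_rate_permille(-score_cp, game_ply)
    return win, 1000 - win - loss, loss


for score in (0, 100, 300):
    print(score, wdl(score, game_ply=40))
```

The point of the sketch is only the shape of the mapping: the same score and game ply always produce the same triple, whatever depth the search has reached.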

syzygy · Post by **syzygy** » Sat Jul 04, 2020 2:44 pm

Joerg Oster wrote: ↑Sat Jul 04, 2020 2:24 pm
syzygy wrote: ↑Sat Jul 04, 2020 1:55 pm The real information is reported score and depth. The WDL printed by SF is just a re-interpretation of the reported score and depth.
That's not fully correct.
Change to
The WDL printed by SF is just a re-interpretation of the reported score and current move number.

You are right, thanks for the correction. I thought "ply" meant depth, but it means "game_ply()".
Maybe it could be improved by taking into account game phase instead. (And contempt would seem to play a role as well.)

Apparently the default value of UCI_ShowWDL has now been changed to "false".

Ovyron · Post by **Ovyron** » Sun Jul 05, 2020 12:32 am

syzygy wrote: ↑Sat Jul 04, 2020 1:55 pm Most people are very happy with getting such numbers even if they don't mean anything.

No, they're happy because they think those numbers mean something, and they think they mean something very useful.

But I'll just shrug and hope I face someone who makes a mistake because they relied on WDL. Selfishly, something like this being around can only benefit me (and letting people know the numbers don't mean anything is against my interests).

Knowledge is power.

Ovyron · Post by **Ovyron** » Sun Jul 05, 2020 12:36 am

DOUBLE POST NOTE - If WDL worked as advertised, I'd be the first to switch. Accurate predictive information about game outcomes would make scores obsolete, and WDL would be the best thing that could be used to rank moves and variations.

This thing is called WDL without actually being WDL.

MikeB · Post by **MikeB** » Sun Jul 05, 2020 1:27 am

Ovyron wrote: ↑Sun Jul 05, 2020 12:36 am DOUBLE POST NOTE - If WDL worked as advertised I'd be the first to switch, accurate predictive information of game outcomes would make scores obsolete and WDL would be the best thing that could be used to rank moves and variations.

This thing is called WDL without actually being WDL.

This is all it is, as stated by the authors:

#### UCI_ShowWDL
If enabled, show approximate WDL statistics as part of the engine output.
These WDL numbers model expected game outcomes for a given evaluation and
game ply for engine self-play at fishtest LTC conditions (60+0.6s per game).

No one says you have to like it or use it. I think it's interesting data.
