TCEC Question

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

Leo
Posts: 1080
Joined: Fri Sep 16, 2016 6:55 pm
Location: USA/Minnesota
Full name: Leo Anger

Re: TCEC Question

Post by Leo »

I heard something about SF now employing deeper tactics.
Advanced Micro Devices fan.
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: TCEC Question

Post by Ovyron »

Cornfed wrote: Thu Jul 02, 2020 12:46 am Arguably you could find a bunch of starting positions that favor Lc0 by nature...run things again and get a different result.
People have been trying to do that at Playchess. It turns out that when the people with Lc0 try to play the best openings for Leela and the people with Stockfish clones try to play the best openings for Stockfish, Stockfish comes up ahead. By a mile. Not even close.

What happens is that Leela's clock gets to a point that it's not enough for her to keep up, and she gets murdered tactically, and this is regardless of opening.

The truth appears when both sides are trying their best to win, instead of cherry picked openings, so it's telling Stockfish wins in both scenarios.
Cornfed
Posts: 511
Joined: Sun Apr 26, 2020 11:40 pm
Full name: Brian D. Smith

Re: TCEC Question

Post by Cornfed »

Ovyron wrote: Thu Jul 02, 2020 2:29 am
Cornfed wrote: Thu Jul 02, 2020 12:46 am Arguably you could find a bunch of starting positions that favor Lc0 by nature...run things again and get a different result.
People have been trying to do that at Playchess. It turns out that when the people with Lc0 try to play the best openings for Leela and the people with Stockfish clones try to play the best openings for Stockfish, Stockfish comes up ahead. By a mile. Not even close.

What happens is that Leela's clock gets to a point that it's not enough for her to keep up, and she gets murdered tactically, and this is regardless of opening.

The truth appears when both sides are trying their best to win, instead of cherry picked openings, so it's telling Stockfish wins in both scenarios.
Very interesting. I presume it is pretty much all quick play games?
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: TCEC Question

Post by Ovyron »

Cornfed wrote: Thu Jul 02, 2020 3:55 am Very interesting. I presume it is pretty much all quick play games?
This has been observed in 20 minutes +3 seconds time controls.

Note that the trend was the opposite back in July 2019. Back then Leela was queen: the openings that were the strongest in Stockfish clone vs Stockfish clone games would fail against Leela, and the most important tournaments were won by Leela users. This caused an opening revolution where some lines' scores were flipped on their heads (the moves Leela was playing were added to books and used by Stockfish clone users to beat other Stockfish clone users), which ultimately led to opening variety getting lost. You play the best lines or you perish.

TCEC's generic lines would stand no chance against this tremendous work that people put into their chess lines, so the most relevant games to know engines' strength are happening at Playchess or the Engine Masters Tournament of InfinityChess. And... huh... - Spoiler Alert - Brainfish is the strongest Stockfish clone, as soon as a better clone appears someone will use it to win the tourneys and we will know, but so far all the bells and whistles of other clones have proven useless, and people don't even use Brainfish's Cerebellum library...
Leo
Posts: 1080
Joined: Fri Sep 16, 2016 6:55 pm
Location: USA/Minnesota
Full name: Leo Anger

Re: TCEC Question

Post by Leo »

SF officially won TCEC 18
Advanced Micro Devices fan.
Nay Lin Tun
Posts: 708
Joined: Mon Jan 16, 2012 6:34 am

Re: TCEC Question

Post by Nay Lin Tun »

Ovyron wrote: Thu Jul 02, 2020 4:19 am
Cornfed wrote: Thu Jul 02, 2020 3:55 am Very interesting. I presume it is pretty much all quick play games?
This has been observed in 20 minutes +3 seconds time controls.

Note that the trend was the opposite back in July 2019. Back then Leela was queen: the openings that were the strongest in Stockfish clone vs Stockfish clone games would fail against Leela, and the most important tournaments were won by Leela users. This caused an opening revolution where some lines' scores were flipped on their heads (the moves Leela was playing were added to books and used by Stockfish clone users to beat other Stockfish clone users), which ultimately led to opening variety getting lost. You play the best lines or you perish.

TCEC's generic lines would stand no chance against this tremendous work that people put into their chess lines, so the most relevant games to know engines' strength are happening at Playchess or the Engine Masters Tournament of InfinityChess. And... huh... - Spoiler Alert - Brainfish is the strongest Stockfish clone, as soon as a better clone appears someone will use it to win the tourneys and we will know, but so far all the bells and whistles of other clones have proven useless, and people don't even use Brainfish's Cerebellum library...

A few months ago, I tested Brainfish on 4 CPUs vs Leela on a GTX 1060. Brainfish lost to Leela in her main line book exit. The tournament games kept repeating exactly the same line, so I had to stop. (There is no point testing engine strength in one specific line/position again and again.)
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: TCEC Question

Post by Ovyron »

Nay Lin Tun wrote: Thu Jul 02, 2020 6:01 am ( There is no point testing engines strength in one specific line/ position again and again).
As I said, people put tremendous work into their chess lines. You may have a decent book, use it for 3 days, and find that people have already developed counter-lines against whatever you have.

Lines keep evolving as people keep changing the moves for better ones. Back in the "pre-AlphaZero" times theory had stagnated, and even my analysis of the opening position hit 0.00 and stayed there for years. After Leela and these events it has fallen to a solid 0.13, and it has become a dangerous place where you make one slip and get crunched positionally by an Lc0 or tactically by a Stockfish.

No chess line is repeated, at least not at the top where people are actively looking to play into hard positions for potential opponents or trying to find holes in people's books.

The edge of chess theory is getting tested and that's the only relevant chess, because all the other lines that aren't getting played have been proved as inferior.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: TCEC Question

Post by Milos »

dkappe wrote: Wed Jul 01, 2020 12:52 am Just to throw some more fuel on the fire, the GPU server was rebooted after 26 games because admins thought there might be something amiss. Before reboot, SF +4. After reboot: SF +1. Who knows.
Before reboot, SF +4. After reboot: SF +4. Fanboys of NN usually don't know. :lol:
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: TCEC Question

Post by Milos »

Dann Corbit wrote: Mon Jun 29, 2020 5:30 am
Cornfed wrote: Mon Jun 29, 2020 2:28 am
Dann Corbit wrote: Sun Jun 28, 2020 4:16 am In less than 1000 games practically any outcome is possible amongst approximate equals.
I guess that they are very close to equal, but SF had some fortunate outcomes.
And if SF is stronger, it is not by an enormous margin, as evidenced by the draw count.
I think the proverbial 'sample size' answer just kind of begs the question.

What does "fortunate" mean? Did LZ0 stay out late partying the night before?
If I flip a fair coin 100 times, 50 heads and 50 tails as the actual outcome is not likely[*]. The possible outcomes form a Gaussian curve and a 1 SD wide swath holds lots of different possibilities.
After 71 games SF leads 37.5 vs 33.5 (and game 72 looks like it will end in the Fish's favor as well...). A 4 to 5 pt lead at this point is actually reasonably significant. That said, there are more games to be played and Game 72 has LCZero defending the Latvian Counter Gambit...which is bad. Has SF yet to defend it? I don't know. The Devil is in the details.

EDIT: It has, the game before and SF lost...just as LCZero looks to at the moment.
[*] OK, it is the most likely SINGLE outcome, but the probability is enormously close to 49/51 and 51/49 and 48/52 and 52/48, etc, with the probability tailing off gradually.

To convince yourself, get a PRNG that generates random numbers between zero and one, run it one hundred times for each of 1000 cycles, and record the different outcomes (numbers above and below one half) that actually occur. You will see some 50/50 outcomes, but you will also see some that are off a bit and a few that are way off. Remember that the "opponents" of "above a half" and "below a half" have exactly the same strength.

For another really funny outcome, see how many numbers are exactly one half with your generator. If it is an 8 byte floating point number and the values are uniformly distributed I would guess zero results of exactly one half for an individual value will show up in all 100,000 emitted elements. Of course, I would insist on testing for equality using == rather than the more usual definition because 1/2 is a special number that can be represented exactly and that is the odd outcome I refer to.
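The experiment described in the quote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `exact_prob` and `simulate` and the fixed seed are made up for this example and are not from the thread, and Python's built-in `random` module stands in for "a PRNG". It computes the exact binomial probability of each split and then runs the 1000 cycles of 100 draws.

```python
import math
import random
from collections import Counter


def exact_prob(heads: int, flips: int = 100) -> float:
    """Exact probability of `heads` heads in `flips` fair coin flips."""
    return math.comb(flips, heads) / 2 ** flips


def simulate(cycles: int = 1000, flips: int = 100, seed: int = 0) -> Counter:
    """Run `cycles` rounds of `flips` uniform draws in [0, 1).

    Returns a Counter mapping "number of draws above 0.5" to how many
    rounds produced that count. The seed is an arbitrary choice made
    only so that the run is repeatable.
    """
    rng = random.Random(seed)
    outcomes = Counter()
    for _ in range(cycles):
        above = sum(1 for _ in range(flips) if rng.random() > 0.5)
        outcomes[above] += 1
    return outcomes


if __name__ == "__main__":
    # 50/50 is the most likely single split, but only about 8% likely.
    for heads in (50, 49, 48, 45, 40):
        print(f"P({heads}/{100 - heads}) = {exact_prob(heads):.4f}")

    # Empirical spread over 1000 cycles of 100 draws each.
    outcomes = simulate()
    for above in sorted(outcomes):
        print(f"{above:3d} above 0.5: {outcomes[above]:4d} cycles")
```

The exact probability of a 50/50 split is about 0.0796, and 49/51 is only slightly less likely, which is the point of the footnote: the single most likely outcome is still an uncommon one.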
Man, how can you be so clueless on the statistics? What you write here and in the other thread about LoS is frankly laughable. Try reading a bit what Kai is writing.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: TCEC Question

Post by Milos »

ernest wrote: Wed Jul 01, 2020 11:35 pm
Leo wrote: Wed Jul 01, 2020 9:12 pm Much to my surprise LCO is really getting beat up. AB engines are not dead. Not even close.
I do repeat my question : what about the LeelaRatio of the current TCEC configuration ?
Maybe this is a clue to the match outcome...
What is LeelaRatio? An imaginary number used by Leela fanboys. Give me a break. A/B engines are playing on noticeably weaker hardware than a 3990X that costs $3400 and uses 280W of power. Moneywise, on current hardware Lc0 has a 2x advantage (and that is assuming a 2080 Ti, not the V100 that Lc0 is actually using). Powerwise, 4x.
It's actually embarrassing how privileged NN engines are compared to A/B.