Larry Kaufman chose a position similar to this: finding a2-a4! is a very difficult task for a program, even by today's standards.
Only Komodo finds it; neither Fritz 11 nor Stockfish does.
Lc0 with latest test30 nets is vastly superior positionally
-
- Posts: 51
- Joined: Mon Feb 20, 2017 8:29 am
- Location: Rialto, Venice
-
- Posts: 51
- Joined: Mon Feb 20, 2017 8:29 am
- Location: Rialto, Venice
Re: Lc0 with latest test30 nets is vastly superior positionally
Another position that I suggest adding is this one. It is very difficult for a program to find b3, with the idea of answering Nc5 with Rb1! and b4 (what does Leela play?)
-
- Posts: 204
- Joined: Tue Oct 15, 2013 2:34 am
- Location: US
- Full name: Mike Babigian
Re: Lc0 with latest test30 nets is vastly superior positionally
More powerful hardware will mask over some of the search problem; however, as anyone that has let lc0 think to millions of nodes will tell you, it is not enough.
AB may not be necessary, but a vastly improved MCTS is, at a minimum. In past decades nearly all Elo was gained through breakthroughs in search and pruning. It is time to put that vast treasure trove of knowledge to work for NNs. When we do, the tactical weakness will disappear.
Last edited by mbabigian on Wed Jan 09, 2019 5:41 pm, edited 1 time in total.
“Censorship is telling a man he can't have a steak just because a baby can't chew it.” ― Mark Twain
-
- Posts: 204
- Joined: Tue Oct 15, 2013 2:34 am
- Location: US
- Full name: Mike Babigian
Re: Lc0 with latest test30 nets is vastly superior positionally
More powerful hardware will mask over some of the search problem; however, as anyone that has let lc0 think to millions of nodes will tell you, it is not enough.
AB may not be necessary, but a vastly improved MCTS is, at a minimum. In past decades nearly all Elo was gained through breakthroughs in search and pruning. It is time to put that vast treasure trove of knowledge to work for NNs. When we do, the tactical weakness will disappear.
That's not to say they won't beat SF with a tactically weak version, but the weakness needs fixing regardless.
-
- Posts: 204
- Joined: Tue Oct 15, 2013 2:34 am
- Location: US
- Full name: Mike Babigian
Re: Lc0 with latest test30 nets is vastly superior positionally
Sorry about the double post. Was replying with my phone browser and something went haywire. Can't delete the first one...
-
- Posts: 1766
- Joined: Wed Jun 03, 2009 12:14 am
Re: Lc0 with latest test30 nets is vastly superior positionally
mbabigian wrote: ↑Wed Jan 09, 2019 6:31 am
I don't believe more training will substantially improve tactical strength. It appears the search technique used is plain weak. I theorize that some improved search method will do more to add elo than better training can at this point. Perhaps a hybrid MCTS, AB search like Mark and Larry have tried will work better, but I don't believe LC0's weak tactics can be solved via smarter networks. New search methods should be tried.
My two cents.
absolutely, but that goes for everything. it's only now that the leela team is able to replicate deepmind's parameters; everything before was guesswork in a lot of places. and imo the current approach should be enough to equal or surpass SF10, but test40 should reveal whether or not that's the case.
as i understand it albert silver is doing a lot of experimentation re non-zero approaches for deusX, which is another vast area to explore.
one thing i haven't understood is why (& i may be wrong) tactics have peaked before the first LR drop, & iirc actually regressed after. if that's the case it seems like it should be a solvable & maybe reversible issue.
-
- Posts: 3293
- Joined: Wed Mar 08, 2006 8:15 pm
Re: Lc0 with latest test30 nets is vastly superior positionally
I am sceptical of all "positional" test suites, and I removed STS from my test suites! Reason: Houdini 6 was still the best, and that suite can't detect any progress from SF8 -> SF9 -> SF10, so it is quite useless. BTW, in Kai's 200-position set with a 3s limit I got: Houdini6 109, Komodo123 103 and SF10 98. Houdini is a real positional master. Maybe there is no such thing as positional play at all; only the score matters.
Jouni
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Lc0 with latest test30 nets is vastly superior positionally
Jouni wrote: ↑Sat Jan 12, 2019 5:49 pm
I am sceptical of all "positional" test suites, and I removed STS from my test suites! Reason: Houdini 6 was still the best, and that suite can't detect any progress from SF8 -> SF9 -> SF10, so it is quite useless. BTW, in Kai's 200-position set with a 3s limit I got: Houdini6 109, Komodo123 103 and SF10 98. Houdini is a real positional master. Maybe there is no such thing as positional play at all; only the score matters.
You have a pretty large variance in the test; engines, especially on many threads, often switch the best move on this positional test suite. Also, I recommended using the solution found between time/2 and time/1 (in Polyglot this is easily doable). I used the suite five times, to have a more reliable picture, and the results are (5 x 200 = 1000 positions):
Code: Select all
Lc0 v20.1 ID32458: 712/1000
Houdini 6.03: 558/1000
Komodo 12.3: 556/1000
Stockfish 10: 524/1000
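Ethereal 11.0: 457/1000
The standard deviation of the result seems to be about 15, so Houdini and Komodo do seem a bit stronger than SF, but all of them far behind Leela. Observe also that Ethereal is significantly lower than the top 3 regular engines. And again, Leela seems vastly superior to any regular engine on this suite, despite being only on par with top regular engines in games in my conditions (CPU/GPU).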
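The figure of about 15 for the standard deviation, cited in this thread for these results, is roughly what a simple binomial model gives. A minimal sketch, assuming each of the 1000 position attempts is an independent trial with the same solve probability (an idealization, since the same 200 positions are repeated five times); `binomial_sd` and `z_gap` are illustrative names, not taken from any tool used in the thread:

```python
import math

# Scores from the table above, out of 1000 (5 runs x 200 positions).
SCORES = {
    "Lc0 v20.1 ID32458": 712,
    "Houdini 6.03": 558,
    "Komodo 12.3": 556,
    "Stockfish 10": 524,
    "Ethereal 11.0": 457,
}
N = 1000


def binomial_sd(solved: int, n: int) -> float:
    """Standard deviation of a solved-count under a binomial model (assumption)."""
    p = solved / n
    return math.sqrt(n * p * (1 - p))


def z_gap(a: int, b: int, n: int) -> float:
    """Gap between two scores in units of their combined standard deviation."""
    return (a - b) / math.hypot(binomial_sd(a, n), binomial_sd(b, n))


for name, solved in SCORES.items():
    print(f"{name}: {solved}/{N}, sd ~ {binomial_sd(solved, N):.1f}")

print(f"Houdini vs Stockfish gap: {z_gap(558, 524, N):.1f} sd")
print(f"Lc0 vs Houdini gap: {z_gap(712, 558, N):.1f} sd")
```

Under this assumption the Houdini vs Stockfish gap is about 1.5 standard deviations, while the Lc0 vs Houdini gap is above 7, which fits the reading that the top three regular engines are close and Leela is well clear of them on this suite.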
-
- Posts: 3657
- Joined: Wed Nov 18, 2015 11:41 am
- Location: hungary
Re: Lc0 with latest test30 nets is vastly superior positionally
Laskos wrote: ↑Sun Jan 13, 2019 2:10 pm
...
I used the suite five times, to have a more reliable picture, and the results are (5 x 200 = 1000 positions)
Code: Select all
Lc0 v20.1 ID32458: 712/1000
Houdini 6.03: 558/1000
Komodo 12.3: 556/1000
Stockfish 10: 524/1000
Ethereal 11.0: 457/1000
The standard deviation of the result seems to be about 15, so Houdini and Komodo do seem a bit stronger than SF, but all of them far behind Leela. Observe also that Ethereal is significantly lower than the top 3 regular engines. And again, Leela seems vastly superior to any regular engine on this suite, despite being only on par with top regular engines in games in my conditions (CPU/GPU).
Your results confirm the earlier experience: Leela is stronger in positional play than the top AB engines, and this behavior is what compensates for its weakness in tactical/endgame play.
But the main question is how far Leela can go relative to AB engines with this unbalanced chess knowledge.
-
- Posts: 204
- Joined: Tue Oct 15, 2013 2:34 am
- Location: US
- Full name: Mike Babigian
Re: Lc0 with latest test30 nets is vastly superior positionally
Even though nodes are counted differently per program, it would be interesting to see a fixed node count test done on tactical test suites. I think it would be just as illuminating. As node counts double I'd expect AB engines to tactically improve faster than LC0.
If this is true, smarter networks will be held back by the weak search until the problem is taken seriously.
I'd also be curious to see which approach solves more problems at similar node counts (despite the difficulties of comparing node counts between programs).
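A fixed-node sweep like this could be organized roughly as follows. This is only a sketch: `best_move` is a hypothetical hook that would wrap a real engine (for example a UCI engine driven with a node limit), and `toy_engine` is a made-up stand-in so the example runs on its own. The positions and thresholds are placeholders, not measurements.

```python
from typing import Callable, Dict, List, Tuple

# (position id, expected best move); placeholder entries, not real test positions.
Suite = List[Tuple[str, str]]
# Hypothetical hook: (position, node budget) -> move the engine prefers.
BestMoveFn = Callable[[str, int], str]


def doubling_budgets(start: int, steps: int) -> List[int]:
    """Node budgets that double at each step: start, 2*start, 4*start, ..."""
    return [start * 2 ** i for i in range(steps)]


def solve_counts(suite: Suite, best_move: BestMoveFn, budgets: List[int]) -> Dict[int, int]:
    """Number of suite positions solved at each node budget."""
    return {
        nodes: sum(1 for pos, expected in suite if best_move(pos, nodes) == expected)
        for nodes in budgets
    }


def toy_engine(threshold: Dict[str, int]) -> BestMoveFn:
    """Stand-in for a real engine: it 'finds' the solution once the node
    budget reaches a made-up per-position threshold."""
    def best_move(pos: str, nodes: int) -> str:
        return "solution" if nodes >= threshold[pos] else "other"
    return best_move


suite = [("pos1", "solution"), ("pos2", "solution"), ("pos3", "solution")]
budgets = doubling_budgets(1000, 4)  # 1000, 2000, 4000, 8000
fast = toy_engine({"pos1": 1000, "pos2": 2000, "pos3": 4000})
slow = toy_engine({"pos1": 1000, "pos2": 8000, "pos3": 16000})

print(solve_counts(suite, fast, budgets))
print(solve_counts(suite, slow, budgets))
```

Comparing the two curves per doubling, rather than at a single node count, sidesteps part of the problem that engines count nodes differently: what matters is how quickly each engine's solve count grows as its own budget doubles.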