Pi4Chess wrote: ↑Fri May 21, 2021 11:56 pm
Very interesting results. So the new net architecture seems to be better with less time/CPU power compared to the older one?
It's really counter-intuitive. So how do we know this is the right direction for a global Elo gain at LTC in the future?
It is not counter-intuitive to get less Elo at a longer time control.
In practice, the higher the level, the harder it is to gain Elo.
The problem is that the results suggest the smaller gain is not only due to more draws.
When I compare test 1 and test 2, I find fewer wins and more losses in test 2, which is not something I would naturally expect
from a genuine improvement.
test 1:Total: 10000 W: 1559 L: 934 D: 7507 Elo +21.74
test 2:Total: 20000 W: 1381 L: 1044 D: 17575 Elo +5.85
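For reference, the Elo figures above follow directly from the match scores under the standard logistic expected-score model. This is a sketch of that arithmetic, not the testing framework's actual code:

```python
import math

def elo_from_results(wins, losses, draws):
    """Estimate the Elo difference implied by a match score
    under the standard logistic (expected-score) model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # a draw counts as half a point
    return -400 * math.log10(1 / score - 1)

print(round(elo_from_results(1559, 934, 7507), 2))    # test 1 -> 21.74
print(round(elo_from_results(1381, 1044, 17575), 2))  # test 2 -> 5.85
```

Plugging in the two results above reproduces the reported +21.74 and +5.85, so the drop is fully explained by the score, not by a different rating formula.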
What I mean by counter-intuitive is that a bigger/more complex net is faster and better with less computing power.
And your point about wins/losses relates to the draw-convergence problem at higher Elo.
Last edited by Pi4Chess on Mon May 24, 2021 12:03 pm, edited 4 times in total.
Pi4Chess wrote: ↑Fri May 21, 2021 11:56 pm
Very interesting results. So the new net architecture seems to be better with less time/CPU power compared to the older one?
It's really counter-intuitive. So how do we know this is the right direction for a global Elo gain at LTC in the future?
Indeed, for Pi-class hardware this is a significant improvement. I have already ported it to DroidFish and I see a major improvement in play at very fast TC, which makes it more entertaining for human play, since it makes more suitable moves for human opponents at very fast time control. The "old" NNUE Stockfish would make weaker moves at very fast TC; this one plays more human-like moves. As an example, without an opening book, at depth 6 from the starting position, Stockfish would have played 1.Nc3; now it plays 1.e4.
I will add play-by-depth and play-by-node-count options and a few other features (like an opening book and move randomization), and then port it to the Harmon Chess app.
I will do some tournament tests to see if there is an improvement for Pi 4 with 2m+1s or 3m+1s time control between nets. Thanks for your input.
I guess that there is a curve.
Suppose you have some minimal number of neurons, say seven. They probably won't play great chess no matter how much you train them.
Now suppose you have one hundred trillion neurons. They can make really smart choices, but they will take a very long time to train and the calculations will be slow.
I guess it will be a somewhat parabolic or Gaussian curve, and I doubt we have found the sweet spot.
It is well known that on good hardware the bigger nets do better than the smaller ones.
But now that the news is out, no doubt people will be clamoring to praise Albert Silver for his double sized net idea.
Maybe they should ask him what he did to train it too, because hey, http://www.cegt.net/40_40%20Rating%20Li ... liste.html
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
I see the Stockfish developers' latest work as validation that Albert Silver first had the correct methodology for NNUE architecture and net training. Chess engine progress is being made, and it's being done by following Albert's approach.
Stephen Ham wrote: ↑Mon May 24, 2021 8:06 pm
Well stated, Dann!
I see the Stockfish developers' latest work as validation that Albert Silver first had the correct methodology for NNUE architecture and net training. Chess engine progress is being made, and it's being done by following Albert's approach.
Dann Corbit wrote: ↑Mon May 24, 2021 11:59 am
I guess that there is a curve.
Suppose you have some minimal number of neurons, say seven. They probably won't play great chess no matter how much you train them.
Now suppose you have one hundred trillion neurons. They can make really smart choices, but they will take a very long time to train and the calculations will be slow.
I guess it will be a somewhat parabolic or Gaussian curve, and I doubt we have found the sweet spot.
It is well known that on good hardware the bigger nets do better than the smaller ones.
But now that the news is out, no doubt people will be clamoring to praise Albert Silver for his double sized net idea.
Maybe they should ask him what he did to train it too, because hey, http://www.cegt.net/40_40%20Rating%20Li ... liste.html
Are you implying that AS was the first to experiment with, or have a modicum of success with, 512x nets?
Testing the new Stockfish 45 MB net
SF 21-05-24-15 vs Stockfish 13
Time control : 40 moves in 2 minutes, repeating
Games : 100
Cores : 20
Hash table : 2 GB
Openings : gambits.pgn
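A setup like the one above could be reproduced with a match runner such as cutechess-cli; this is a hypothetical invocation (the engine binary paths and names are assumptions, not from the original post):

```shell
# 40 moves in 2 minutes repeating -> tc=40/120 (seconds); Hash is given in MB
cutechess-cli \
  -engine cmd=./sf-21-05-24-15 name=SF-dev \
  -engine cmd=./stockfish13 name=SF13 \
  -each proto=uci tc=40/120 option.Hash=2048 \
  -openings file=gambits.pgn format=pgn order=random \
  -games 100 -concurrency 20 -repeat \
  -pgnout results.pgn
```

The `-repeat` flag plays each opening from both sides, which is the usual way to reduce opening bias in engine-vs-engine matches.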
Stephen Ham wrote: ↑Mon May 24, 2021 8:06 pm
Well stated, Dann!
I see the Stockfish developers' latest work as validation that Albert Silver first had the correct methodology for NNUE architecture and net training. Chess engine progress is being made, and it's being done by following Albert's approach.
Bigger really is better.
All the best,
-Steve-
I'm sorry, this is complete bullshit. Jjosh (known for Leelenstein) and I both showed that larger nets can work months before Albert Silver even knew about NNUE. Also, this is a completely new architecture, not just more neurons; Albert didn't contribute to this innovation at all. We also tried even bigger nets, in the ballpark of 120 MB, well before FatFritz2 arrived (and those were just as strong as FatFritz2). Funnily enough, I was the one on Discord who specifically told Albert that bigger nets are worth a shot. Others and I even helped him get his trainer set up.
Saying that it's Albert's approach is disrespectful to the many other net trainers who did it months before NNUE was merged into official SF, including myself.
Albert is getting credit for work he simply didn't do.
Stephen Ham wrote: ↑Mon May 24, 2021 8:06 pm
Well stated, Dann!
I see the Stockfish developers' latest work as validation that Albert Silver first had the correct methodology for NNUE architecture and net training. Chess engine progress is being made, and it's being done by following Albert's approach.
Bigger really is better.
All the best,
-Steve-
You do realize that HalfKAv2 is very different from HalfKP, right?
Can you point out exactly where "Albert's methodology" helped make this possible?
Here's an image; edit it by pointing at the parts influenced by Albert.
I'll help you, Albert didn't influence anything at all.