First, thanks for supplying more details. My speculations were based on what you had reported before.

Michael Sherwin wrote:
Mark Lefler was kind enough to ask me in a PM to error-check the above!

mjlef wrote:
Of course we listen to requests. But I do not think the Romi learning is anything like the learning Google/DeepMind did. They used 5000 special Tensor Processing Units (TPUs), each costing thousands of dollars. Right now, this is way beyond our resources. Romi learning is most likely simply saving important positions in a persistent hash. In future games, these are reloaded into the main hash table, so the new game, if it encounters one of these positions, already has deep search information for them. The thing is, this helps not at all if a different line is played. My old program NOW had this feature, but I did it not so much to make it stronger, but instead to avoid losing lines during tournaments. You can think of it as a self-correcting book, which benefits if the same line is tried on Komodo.

shrapnel wrote:
So Reinforcement Learning will be introduced by the Komodo team (if they are capable of it) only on REQUEST.

Jesse Gersenson wrote:
If people want learning in Komodo, let your voice be heard; Mark and Larry are open to feature requests, especially those requested by a lot of people.
Looks like the thrashing Stockfish received still hasn't convinced them.
*** Unbelievable.
Larry and I often discuss Monte Carlo Tree Search, and we are interested in trying it. We have also discussed uses for neural networks. Small NNs could be useful on present PCs, but the massive NN used in AlphaGo Zero is currently beyond what we, and most chess engine users, can afford.
We listen, and we try to add what we think people want. But we do not have endless resources; we can afford to buy roughly one new server each year. It is not a matter of being "convinced". We simply cannot afford it at the moment. Hopefully GPUs in graphics cards will keep getting faster, and perhaps NN hardware will be added to future PCs or at least become much less expensive.
mjlef wrote:
But I do not think the Romi learning is anything like the learning Google/DeepMind did.

There is circumstantial evidence that it is similar in nature. Reinforcement learning involves accumulating rewards (and penalties) for every position in a database. It could be a persistent hash, as suggested.
mjlef wrote:
Romi learning is most likely simply saving important positions in a persistent hash.

Or the way Romi actually does it: in a tree of all played games. This is superior to a persistent hash, because only the subtree below the current position is loaded into the game hash. This has the advantage that only useful information ends up in the game hash.
mjlef wrote:
The thing is, this helps not at all if a different line is played.

Not true. Rewards and penalties are greater near the leaves and over time are backpropagated to the root. Since every node is the root of a subtree, every move benefits from backpropagation. This produces a meaningful differentiation, a gradual determination of which move gives better results. So, for example, at the actual root Romi will settle on, say, 1.e4 or 1.d4 as best and will always play that opening move, just like AlphaZero always played 1.d4.
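To make that concrete, here is a minimal sketch of the two mechanisms described above: seeding the hash from only the current subtree, and backpropagating rewards from the leaves toward the root. The node layout, reward size, and halving decay are my own illustrative assumptions, not Romi's actual code.

#include <cstdint>
#include <functional>
#include <vector>

struct Node {
    std::uint64_t key;            // Zobrist key of the position
    int reward = 0;               // accumulated reward/penalty
    std::vector<Node*> children;  // moves played from this position
};

// Seed only the subtree below the current position into the game hash,
// so nothing irrelevant gets loaded. 'store' stands in for the engine's
// hash-store routine (a hypothetical hook, not a real API).
void seedHash(const Node* current,
              const std::function<void(std::uint64_t, int)>& store) {
    store(current->key, current->reward);
    for (const Node* child : current->children)
        seedHash(child, store);
}

// After a game, walk the played line from the leaf back toward the root,
// adding a reward that shrinks at each step. 'line' holds the game's
// nodes, root first; 'result' is +1 for a White win, -1 for a Black win.
void backpropagate(std::vector<Node*>& line, int result) {
    int bonus = result * 64;      // illustrative base reward, largest at the leaf
    for (auto it = line.rbegin(); it != line.rend(); ++it) {
        (*it)->reward += bonus;
        if (bonus == 0) break;    // nothing left to propagate
        bonus /= 2;               // decay toward the root
    }
}

At move selection the accumulated reward could then be folded into each candidate move's stored score, so repeated games gradually steer the engine toward openings that have scored well.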
mjlef wrote:
You can think of it as a self-correcting book, which benefits if the same line is tried on Komodo.

Philosophically, I would argue that what RomiChess does has nothing to do with a book. In a book an engine can usually choose randomly among acceptable moves; in RomiChess only the learned best move is played in its Monkey See Monkey Do "book". That is separate from Romi's reinforcement learning. It just so happens that the stored tree of all Romi's games can handle both a "book" and the rewards/penalties for reinforcement learning.
Thanks Mark for inviting me to share some details!
So this is a type of persistent hash, although instead of loading many positions, it keeps a database and loads positions based on the current board position. You are right that any position leading to the positions in the database would benefit, but in practice the game diverges quickly, except for some very narrow forced lines.
This type of learning is very different from a neural network, which finds generalized patterns based on many positions and is not tied to a specific node. Of course, the same reinforcement learning would also train the program to prefer some opening lines over others, and to learn which endings are wins, losses, or draws, like a slightly inaccurate tablebase.
I simply do not understand this: "Since every node is a root to a subtree, every move benefits from backpropagation." Only positions encountered in the current search tree that were already in the database can benefit. Once something changes (a piece is captured), no moves past that point can benefit unless they transpose back into a position in the database.
You give no details on how it "learns". The simplest thing is to save as many positions as possible. Statistics on how winning or losing a position is could be useful in deciding which positions to keep (also depth, how popular a position is, and how much the score changed compared with its siblings).
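One hypothetical way to combine those factors into a single retention score, purely for illustration (the weights are made up and would need tuning):

struct EntryStats {
    int depth;       // search depth of the stored result
    int visits;      // how often the position has been reached
    int scoreSwing;  // |new score - old score| in centipawns
};

// Higher priority means the position is more worth keeping.
int retentionPriority(const EntryStats& e) {
    return 4 * e.depth + 2 * e.visits + e.scoreSwing;
}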
In NOW my system was quite simple. It had a relatively small, fixed-size table saved to disk, containing hash positions (keys, scores, bounds...). It kept them sorted by hash key for fast updating. Before each search, the positions were placed at the appropriate points in the main hash table. After each completed search, hash positions were selected to add to the table, or to update a table entry if they had better information (better bounds, more depth...). Positions that changed the score, especially at the root, were given priority in replacement. It added positions based on the PV. Given that this was a 32-bit program, long ago, having a large database was not practical. It did change future play, which was the point, and when fed the same opening moves it did gain strength. Overall, transpositions to other positions were pretty rare; the branching factor in chess is just so large. But now that we have bigger memory, maybe it would help more.
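A rough sketch of that kind of sorted, fixed-size persistent table; the entry fields, table size, and replacement rule are my assumptions for illustration, not NOW's actual code.

#include <algorithm>
#include <cstddef>
#include <cstdint>

struct PersistEntry {
    std::uint64_t key;    // Zobrist key of the position
    std::int16_t  score;  // stored score
    std::int8_t   depth;  // search depth
    std::int8_t   bound;  // exact / lower / upper bound flag
};

constexpr std::size_t kTableSize = 4096;  // small, fixed-size table
PersistEntry table[kTableSize];           // kept sorted by key on disk

// Keeping the table sorted by key makes lookups a binary search.
PersistEntry* find(std::uint64_t key) {
    auto* it = std::lower_bound(table, table + kTableSize, key,
        [](const PersistEntry& e, std::uint64_t k) { return e.key < k; });
    return (it != table + kTableSize && it->key == key) ? it : nullptr;
}

// After a completed search, keep whichever entry carries better
// information. Only depth is compared here; a fuller rule would also
// weigh bounds and score changes at the root, as described above.
void maybeUpdate(const PersistEntry& fresh) {
    if (PersistEntry* old = find(fresh.key)) {
        if (fresh.depth > old->depth) *old = fresh;
    }
    // Inserting brand-new keys (evicting a low-priority entry) is
    // omitted for brevity.
}

Before each search, the stored entries would be copied into the main hash table at their usual slots, as described above.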