catastrophic forgetting
Posted: Thu May 09, 2019 6:16 pm
I am trying to train neural networks for my chess variant playing program Nebiyu. It supports more than 10 chess variants and can also play Go/Hex/Reversi/Amazons/Checkers. Focusing only on the chess variants, training a separate network for each variant would be cumbersome, so I am thinking of training one network for all of them. However, there is a well-known issue: neural networks tend to forget old data while learning new information, the so-called "catastrophic forgetting". That is, if I train the network first on standard chess and then on suicide chess, it will essentially keep only the weights needed to play suicide chess well. One way to solve this is to play training games for all variants at the same time and train the network on all of them together. I could feed in the variant type through an input plane (which I am planning to do). DeepMind has already investigated this problem on Atari games and came up with a different solution that lets you train on games in sequence and still remember past data. I don't fully understand the paper, but here it is: https://arxiv.org/pdf/1612.00796.pdf
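A minimal sketch of the simultaneous-training idea, in plain Python. Everything named here is an assumption for illustration: the `VARIANTS` list, the 8x8 board size, the use of one constant plane per variant (a one-hot encoding, rather than a single plane), and the helper names `variant_planes` and `mixed_batch`. None of it is taken from Nebiyu.

```python
import random

# Hypothetical variant list; the real names and their order are assumptions.
VARIANTS = ["standard", "suicide", "crazyhouse"]
BOARD_SIZE = 8  # assumes every variant is played on an 8x8 board


def variant_planes(variant):
    """Return one constant plane per variant: all 1.0 for the active one, else all 0.0."""
    planes = []
    for name in VARIANTS:
        value = 1.0 if name == variant else 0.0
        planes.append([[value] * BOARD_SIZE for _ in range(BOARD_SIZE)])
    return planes


def mixed_batch(positions_by_variant, batch_size, rng):
    """Draw a batch that mixes variants, so no variant is ever trained in isolation."""
    batch = []
    for _ in range(batch_size):
        variant = rng.choice(VARIANTS)
        position = rng.choice(positions_by_variant[variant])
        # The variant planes would be stacked onto the usual board planes for this position.
        batch.append((position, variant_planes(variant)))
    return batch


# Placeholder strings stand in for real encoded training positions.
positions = {
    "standard": ["standard_pos_a", "standard_pos_b"],
    "suicide": ["suicide_pos_a", "suicide_pos_b"],
    "crazyhouse": ["crazyhouse_pos_a", "crazyhouse_pos_b"],
}
batch = mixed_batch(positions, batch_size=6, rng=random.Random(0))
```

Because every batch draws from all variants, each gradient step sees old and new games together, which is the point of training them at the same time instead of one after another.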
I think this is the correct path towards general AI, i.e. one brain for multiple tasks instead of a specialized brain (neural network) for each task.
Any thoughts?