I made an engine for the game Breakthrough and successfully used a NN for its eval. The NN is a simple fully connected MLP with one hidden layer. I wrote the NN and the training code in C++ from scratch, using plain vector<> loops rather than SIMD/AVX and whatnot, and the optimizer is plain SGD with momentum, so nothing fancy. Because the number of training positions keeps growing, training is getting slow, so I decided to switch to a library or tool better suited to the task. It would also let me quickly test different NN architectures.
The problem is that the input is very big and sparse. My input representation is [64][24], i.e. each square can be in one of 24 states, so exactly 64 of the 1536 inputs are active at any time. My C++ code looks something like this:
Code: Select all
#define HIDDEN 32
#define SQUARES 64
#define UNITS 24

float getScore(const vector<int> &squares) {
    float output = 0.0f;
    for (int i = 0; i < HIDDEN; i++) {
        // Input is one-hot per square, so the dot product reduces to
        // picking one weight per square and summing.
        float score = 0.0f;
        for (int j = 0; j < SQUARES; j++) {
            score += hiddenWeights[(i * SQUARES + j) * UNITS + squares[j]];
        }
        score = relu(score);
        output += outputWeights[i] * score;
    }
    return tanh(output);
}
I'm only at the beginning of learning Keras/TensorFlow. I tried manually 'flattening' the [SQUARES][UNITS] input into a vector of 1536 floats. For the model I used something like this:
Code: Select all
model = Sequential()
model.add(Dense(32, input_dim=1536))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('tanh'))