Keras/TensorFlow for very sparse inputs

derjack
Posts: 16
Joined: Fri Dec 27, 2019 8:47 pm
Full name: Jacek Dermont

Keras/TensorFlow for very sparse inputs

Post by derjack »

Hello,

I made an engine for the game Breakthrough and successfully used a NN for its eval. The NN is a simple fully connected MLP with one hidden layer. I wrote the NN and the training code in C++ from scratch, using plain vector<> loops rather than SIMD/AVX, and the algorithm is plain SGD with momentum, so nothing fancy. Because the number of positions keeps growing, training is getting slow, so I decided to switch to a library or tool suited to the task, which would also let me quickly test different NN architectures.

The problem is that the input is very big and sparse. My input representation is [64][24], so each square can be in one of 24 states. Exactly 64 of the 1536 inputs are therefore set to one at any time. My C++ code looks something like this:
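
As a minimal sketch of the encoding described above: square j in state s corresponds to flat input j * 24 + s, so 64 indices describe a position completely. The helper names (to_indices, to_dense, from_dense) and the example position are made up for illustration and are not part of the original engine.

```python
SQUARES = 64
UNITS = 24  # possible states per square

def to_indices(states):
    """Map per-square states (0..UNITS-1) to flat indices into the 1536 inputs."""
    assert len(states) == SQUARES
    return [j * UNITS + s for j, s in enumerate(states)]

def to_dense(indices):
    """Expand flat indices into the full 0/1 vector of length SQUARES * UNITS."""
    dense = [0.0] * (SQUARES * UNITS)
    for idx in indices:
        dense[idx] = 1.0
    return dense

def from_dense(dense):
    """Recover the active indices from a dense 0/1 row (e.g. one row of a CSV)."""
    return [i for i, v in enumerate(dense) if v != 0]

states = [(3 * j) % UNITS for j in range(SQUARES)]  # arbitrary example position
indices = to_indices(states)
dense = to_dense(indices)
```

Going from the dense row back to the indices loses nothing, which is why storing only the 64 indices per position is enough.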

Code:

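#include <algorithm>
#include <cmath>
#include <vector>
using std::vector;

#define HIDDEN 32
#define SQUARES 64
#define UNITS 24

// Trained weights, filled in elsewhere.
vector<float> hiddenWeights(HIDDEN * SQUARES * UNITS);  // laid out as [hidden][square][state]
vector<float> outputWeights(HIDDEN);

inline float relu(float x) { return std::max(x, 0.0f); }

// squares[j] is the state (0..UNITS-1) of square j.
float getScore(const vector<int> &squares) {
    float output = 0.0f;
    for (int i = 0; i < HIDDEN; i++) {
        float score = 0.0f;
        // Add only the one active weight per square; all other inputs are zero.
        for (int j = 0; j < SQUARES; j++) {
            score += hiddenWeights[(i * SQUARES + j) * UNITS + squares[j]];
        }
        score = relu(score);
        output += outputWeights[i] * score;
    }

    return std::tanh(output);
}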
So the body of the nested loop runs only HIDDEN * SQUARES = 2048 times, instead of the HIDDEN * 1536 = 49152 multiply-adds a dense dot product would need. So far so good.
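
To make the shortcut concrete, here is a small sketch checking that the per-square lookup gives the same hidden activations as a full dot product with the 0/1 input vector. The sizes are shrunk and the weights are random, purely for illustration; the weight layout is assumed to match the C++ indexing above.

```python
import random

HIDDEN, SQUARES, UNITS = 4, 8, 3  # small sizes so the check runs instantly
N_INPUTS = SQUARES * UNITS

random.seed(0)
# weights[i][k]: weight from flat input k to hidden unit i
weights = [[random.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(HIDDEN)]
squares = [random.randrange(UNITS) for _ in range(SQUARES)]

def hidden_sparse(squares):
    """One lookup per square: HIDDEN * SQUARES additions in total."""
    return [sum(weights[i][j * UNITS + s] for j, s in enumerate(squares))
            for i in range(HIDDEN)]

def hidden_dense(squares):
    """Full dot product with the 0/1 vector: HIDDEN * N_INPUTS multiply-adds."""
    x = [0.0] * N_INPUTS
    for j, s in enumerate(squares):
        x[j * UNITS + s] = 1.0
    return [sum(w * v for w, v in zip(weights[i], x)) for i in range(HIDDEN)]

sparse = hidden_sparse(squares)
dense = hidden_dense(squares)
```

Both functions return the same values; the sparse version simply skips the terms that are multiplied by zero.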

I'm only just starting to learn Keras/TensorFlow. I tried manually flattening the [SQUARES][UNITS] input into a vector of 1536. For the model I used something like this:

Code:

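from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation

# 1536 dense inputs -> 32 hidden units (ReLU) -> 1 output (tanh)
model = Sequential()
model.add(Dense(32, input_dim=1536))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('tanh'))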
It works: a sanity check on a small data set gives a loss similar to my implementation's. But the speed is only comparable, and it is very memory hungry: I can fit only tens of thousands of positions instead of millions. I have a CSV where each row holds the 1536 inputs plus the target, and most of the inputs are zeroes.

I read something about SparseTensor and/or one-hots, but I can't wrap my head around them. Ideally I would feed in a vector of indexes, and Keras would convert it into the sparse vector on the fly and compute efficiently. Or maybe I can use the 2D input after all, even though this isn't a CNN. I would appreciate any pointers, maybe even a minimal example of how to achieve this.
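
One possible way to feed indexes directly, sketched under assumptions rather than taken from anyone's working code: an Embedding layer holds one row of 32 weights per flat input, and summing the 64 looked-up rows gives the same hidden pre-activation as a bias-free Dense(32) on the 1536-wide 0/1 vector. The SumOverSquares layer, the optimizer settings and the random training data are placeholders for illustration.

```python
import numpy as np
import tensorflow as tf

SQUARES, UNITS, HIDDEN = 64, 24, 32

class SumOverSquares(tf.keras.layers.Layer):
    """Sums the per-square weight rows: (batch, SQUARES, HIDDEN) -> (batch, HIDDEN)."""
    def call(self, x):
        return tf.reduce_sum(x, axis=1)

# Input: 64 flat indices per position, each equal to square * UNITS + state.
inputs = tf.keras.Input(shape=(SQUARES,), dtype="int32")
# Row k of the embedding table plays the role of the Dense weights for input k.
x = tf.keras.layers.Embedding(SQUARES * UNITS, HIDDEN)(inputs)
x = SumOverSquares()(x)
x = tf.keras.layers.Activation("relu")(x)
outputs = tf.keras.layers.Dense(1, activation="tanh")(x)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="sgd", loss="mse")

# Random stand-in data, only to show the expected shapes and dtypes.
rng = np.random.default_rng(0)
states = rng.integers(0, UNITS, size=(1000, SQUARES))
x_train = (np.arange(SQUARES) * UNITS + states).astype("int32")
y_train = rng.uniform(-1, 1, size=(1000, 1)).astype("float32")
model.fit(x_train, y_train, batch_size=128, epochs=1, verbose=0)
```

Stored this way, a position takes 64 int32 values (256 bytes) instead of 1536 float32 values (6144 bytes). Like the C++ code, this sketch has no bias on the hidden layer, whereas Dense(32) adds one by default.
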
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: Keras/TensorFlow for very sparse inputs

Post by Rein Halbersma »

I have done something very similar for 10x10 draughts, explained here: http://laatste.info/bb3/viewtopic.php?f=53&t=8327
The key is understanding tf.gather and tf.reduce_sum: gather looks up the weights for the active indices, and reduce_sum adds them up.
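
A minimal sketch of how those two ops fit together for the 64 x 24 input from this thread. It is not taken from the linked post; the weight table, batch size and random positions are assumptions for illustration. The last two lines build the explicit 0/1 vector only as a reference to compare against.

```python
import tensorflow as tf

SQUARES, UNITS, HIDDEN = 64, 24, 32

# Hypothetical weight table: one row of HIDDEN weights per flat input.
weights = tf.random.normal([SQUARES * UNITS, HIDDEN], seed=1)

# A batch of 2 positions, each given as 64 flat indices (square * UNITS + state).
states = tf.random.uniform([2, SQUARES], maxval=UNITS, dtype=tf.int32, seed=2)
indices = tf.range(SQUARES) * UNITS + states   # shape (2, 64)

rows = tf.gather(weights, indices)             # shape (2, 64, 32)
hidden = tf.reduce_sum(rows, axis=1)           # shape (2, 32)

# Reference: the same result through an explicit 0/1 vector and a matmul.
multi_hot = tf.reduce_sum(tf.one_hot(indices, SQUARES * UNITS), axis=1)  # (2, 1536)
reference = tf.matmul(multi_hot, weights)
```

The gather path only ever touches 64 rows of the table per position, while the matmul path multiplies all 1536 inputs, most of them zero.
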
derjack
Posts: 16
Joined: Fri Dec 27, 2019 8:47 pm
Full name: Jacek Dermont

Re: Keras/TensorFlow for very sparse inputs

Post by derjack »

Nice, I had actually seen that post before. I couldn't ask on that forum because my account there still hasn't been activated :P. So your training data contains only the indexes?
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: Keras/TensorFlow for very sparse inputs

Post by Rein Halbersma »

derjack wrote: Sun Jan 17, 2021 8:59 am Nice, I had actually seen that post before. I couldn't ask on that forum because my account there still hasn't been activated :P. So your training data contains only the indexes?
Yes, as described in that forum post, each pattern can take on 3**8 values, and each position only has one active pattern instance stored as an index.
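
As an illustration of where 3**8 comes from, here is a sketch that assumes a pattern covers 8 squares with 3 possible contents each and reads them as the digits of a base-3 number. The digit order and the EMPTY/WHITE/BLACK numbering are assumptions; the linked post may use a different convention.

```python
# Assumed encoding of one square's contents; the linked post may order these differently.
EMPTY, WHITE, BLACK = 0, 1, 2

def pattern_index(squares):
    """Read the 8 squares of a pattern as the digits of a base-3 number."""
    assert len(squares) == 8
    index = 0
    for state in squares:
        index = index * 3 + state
    return index

NUM_VALUES = 3 ** 8  # number of distinct contents of an 8-square pattern

all_empty = pattern_index([EMPTY] * 8)
all_black = pattern_index([BLACK] * 8)
example = pattern_index([WHITE, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, BLACK])
```

Every index falls in the range 0 to NUM_VALUES - 1, so it can be used directly as a row number in a weight table.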