Matchbox learning

Discussion of chess software programming and technical issues.

Moderator: Ras

User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Matchbox learning

Post by sje »

A few decades ago, Martin Gardner wrote about a manually operated computer that played hexapawn (3x3 board, two rows of pawns) and included an effective learning algorithm. The computer was built from a couple of dozen small matchboxes and a bunch of colored jellybeans. Each matchbox was labeled with a possible hexapawn board position and contained jellybeans that corresponded to the available moves. Operation of the computer involved selectively removing jellybeans that represented moves in lost games.
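The jellybean scheme Gardner described can be sketched in a few lines of code. This is only an illustration — the names, the seeding rule, and the "never empty a box" detail are my own assumptions, not Gardner's exact rules: one bag of move-beans per position, a random draw to pick a move, and bean removal along the path of a lost game.

```python
import random

# One "matchbox" per position, holding move "jellybeans".  The names and
# details here are illustrative assumptions, not Gardner's exact rules.
boxes = {}  # position -> list of candidate moves (the beans)

def choose_move(position, legal_moves):
    """Open the box for this position (seeding it on first visit)
    and draw a bean at random."""
    beans = boxes.setdefault(position, list(legal_moves))
    return random.choice(beans)

def learn_from_loss(moves_played):
    """Punish a lost game: remove one bean for each (position, move)
    played, but never empty a box completely."""
    for position, move in moves_played:
        beans = boxes.get(position, [])
        if move in beans and len(beans) > 1:
            beans.remove(move)
```

Over many games the losing beans disappear, so the random draw converges on the moves that never lost — which is essentially how the matchbox machine "learned".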

Symbolic is now using a tangentially related algorithm for book learning. After a couple of hundred seed games using a conventional book, it has been building its current book based only on its past ICC play. This book now contains about 120,000 positions based on about 2,500 games and is increasing daily.

Each matchbox is a position hash, and the jellybeans are the historical win/lose/draw counts. The position list and jellybean distribution are updated semi-automatically (I have to run a batch file) every day or so.
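As a sketch, the matchbox-and-jellybean bookkeeping described above might look like this (the hash values, result labels, and scoring rule are illustrative assumptions, not Symbolic's actual implementation):

```python
from collections import defaultdict

# book: position hash -> [wins, losses, draws] ("jellybean" counts).
# Result labels and the scoring rule are assumptions for illustration;
# they are not Symbolic's actual implementation.
book = defaultdict(lambda: [0, 0, 0])
RESULT_INDEX = {"win": 0, "loss": 1, "draw": 2}

def update_book(position_hashes, result):
    """After a finished game, credit every position reached with the
    game's result from the book side's point of view."""
    for h in position_hashes:
        book[h][RESULT_INDEX[result]] += 1

def book_score(h):
    """A simple move-preference score: wins minus losses, draws neutral."""
    wins, losses, draws = book[h]
    return wins - losses
```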

The working hypothesis is that a small book tuned to a program's peculiarities is operationally superior to a much larger book generated from play in a general, high strength population.

The two main disadvantages of this approach are:

1) It needs a good variety of opponents.

2) Significant modifications to the search or evaluation code will likely degrade the effectiveness of a book tuned to prior behavior.
Mark
Posts: 216
Joined: Thu Mar 09, 2006 9:54 pm

Re: Matchbox learning

Post by Mark »

sje wrote:A few decades ago, Martin Gardner wrote about a manually operated computer that played hexapawn (3x3 board, two rows of pawns) and included an effective learning algorithm. The computer was built from a couple of dozen small matchboxes and a bunch of colored jellybeans. Each matchbox was labeled with a possible hexapawn board position and contained jellybeans that corresponded to the available moves. Operation of the computer involved selectively removing jellybeans that represented moves in lost games.
I remember having a great time making Gardner's matchbox computer when I was a kid! It didn't take long before the computer learned to make the best moves. I had always wanted to apply it to a more complicated game, but never got around to it.

Regards,

Mark
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Matchbox learning

Post by sje »

I recall that one of Gardner's readers went on to produce a physical octopawn computer; it must have needed hundreds of matchboxes.

Perhaps it's not stretching the truth too much to say that Gardner created the first six-man tablebase for a board game.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Matchbox learning

Post by bob »

Here's the downside... If I notice you using this on ICC, you can expect crafty to teach you a terrible lesson that will come home to roost when we next meet in a CCT/ACCA event. :)

One has to be careful how one uses the things one learns, lest there be traps set. :)
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Matchbox learning

Post by sje »

bob wrote:Here's the downside... If I notice you using this on ICC, you can expect crafty to teach you a terrible lesson that will come home to roost when we next meet in a CCT/ACCA event. :)

One has to be careful how one uses the things one learns, lest there be traps set. :)
Yes, I've foreseen this, and that's why there's a need for a variety of opponents and an assumption that they're not collaborating.

Oh, and I might have a surprise or two of my own come tournament day.
mathmoi
Posts: 290
Joined: Mon Mar 13, 2006 5:23 pm
Location: Québec
Full name: Mathieu Pagé

Re: Matchbox learning

Post by mathmoi »

bob wrote:
Here's the downside... If I notice you using this on ICC, you can expect crafty to teach you a terrible lesson that will come home to roost when we next meet in a CCT/ACCA event. :)

One has to be careful how one uses the things one learns, lest there be traps set. :)
Hi,

I don't understand why you think this learning technique is flawed. Can you explain?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Matchbox learning

Post by bob »

mathmoi wrote:
Hi,

I don't understand why you think this learning technique is flawed. Can you explain?
Easily. Suppose we play an opening line where white (my side) has an easy winning position, but I make my program intentionally play badly so that yours learns that this is a good opening for black. Then, on tournament day, you walk right into the opening line that I "taught" you, but this time my program isn't playing like an idiot; it is playing to win, and does so easily.

That's the downside to learning, and it is why I don't depend on it at all for preparing a book for tournament play.
mathmoi
Posts: 290
Joined: Mon Mar 13, 2006 5:23 pm
Location: Québec
Full name: Mathieu Pagé

Re: Matchbox learning

Post by mathmoi »

bob wrote:
Easily. Suppose we play an opening line where white (my side) has an easy winning position, but I make my program intentionally play badly so that yours learns that this is a good opening for black. Then, on tournament day, you walk right into the opening line that I "taught" you, but this time my program isn't playing like an idiot; it is playing to win, and does so easily.

That's the downside to learning, and it is why I don't depend on it at all for preparing a book for tournament play.
Hi,

OK, this is a problem common to any learning chess engine. I thought you were pointing out a flaw in the "matchbox learning applied to chess openings" approach specifically.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Matchbox learning

Post by bob »

mathmoi wrote:
Hi,

OK, this is a problem common to any learning chess engine. I thought you were pointing out a flaw in the "matchbox learning applied to chess openings" approach specifically.
No, sorry. I was talking about exactly what you concluded: _any_ learning can be used against you if you expose it to the public through a server like ICC or whatever. :)

Humans are devious creatures... :)

I will add that this approach is probably more exploitable than the usual one, since it learns which moves are good or bad in a more absolute way, whereas traditional learning might also factor in how the evaluation changed after leaving book, along with static evaluation and a bit of randomness.
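That blend of signals might be sketched like this (the weights, record format, and function names are illustrative assumptions, not any engine's actual code): score each candidate book move from its result counts, fold in the post-book evaluation trend, and add a little noise so the choice isn't fully predictable.

```python
import random

def pick_book_move(candidates, rng=random.random):
    """candidates: list of (move, wins, losses, draws, eval_after_book).
    The 0.5 and 0.1 weights are arbitrary illustrative values."""
    best_move, best_score = None, float("-inf")
    for move, wins, losses, draws, eval_after in candidates:
        games = wins + losses + draws
        # Result term: win fraction minus loss fraction, draws neutral.
        result_score = (wins - losses) / games if games else 0.0
        # Blend in the evaluation trend after leaving book, plus noise.
        score = result_score + 0.5 * eval_after + 0.1 * rng()
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```

The randomness term is what makes the book harder to "teach": an opponent can no longer rely on steering it into a single poisoned line every game.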
Tony

Re: Matchbox learning

Post by Tony »

bob wrote:
Easily. Suppose we play an opening line where white (my side) has an easy winning position, but I make my program intentionally play badly so that yours learns that this is a good opening for black. Then, on tournament day, you walk right into the opening line that I "taught" you, but this time my program isn't playing like an idiot; it is playing to win, and does so easily.

That's the downside to learning, and it is why I don't depend on it at all for preparing a book for tournament play.
If the learning didn't notice that you played an idiot move, then I wouldn't call it learning but rather repeating.

Tony