Matchbox learning

Discussion of chess software programming and technical issues.

Moderator: Ras

User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Matchbox learning

Post by sje »

A few decades ago, Martin Gardner wrote about a manually operated computer that played hexapawn (3x3 board, two rows of pawns) and included an effective learning algorithm. The computer was built from a couple of dozen small matchboxes and a bunch of colored jellybeans. Each matchbox was labeled with a possible hexapawn board position and contained jellybeans that corresponded to the available moves. Operation of the computer involved selectively removing jellybeans that represented moves in lost games.
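The jellybean scheme Gardner described can be sketched in a few lines of code. This is only an illustration — the names, the seeding rule, and the "never empty a box" detail are my own assumptions, not Gardner's exact rules: one bag of move-beans per position, a random draw to pick a move, and bean removal along the path of a lost game.

```python
import random

# One "matchbox" per position, holding move "jellybeans".  The names and
# details here are illustrative assumptions, not Gardner's exact rules.
boxes = {}  # position -> list of candidate moves (the beans)

def choose_move(position, legal_moves):
    """Open the box for this position (seeding it on first visit)
    and draw a bean at random."""
    beans = boxes.setdefault(position, list(legal_moves))
    return random.choice(beans)

def learn_from_loss(moves_played):
    """Punish a lost game: remove one bean for each (position, move)
    played, but never empty a box completely."""
    for position, move in moves_played:
        beans = boxes.get(position, [])
        if move in beans and len(beans) > 1:
            beans.remove(move)
```

Over many games the losing beans disappear, so the random draw converges on the moves that never lost — which is essentially how the matchbox machine "learned".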

Symbolic is now using a tangentially related algorithm for book learning. After a couple of hundred seed games using a conventional book, it has been building its current book based only on its past ICC play. This book now contains about 120,000 positions based on about 2,500 games and is increasing daily.

Each matchbox is a position hash, and the jellybeans are the historical win/lose/draw counts. The position list and jellybean distribution are updated semi-automatically (I have to run a batch file) every day or so.
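As a sketch, the matchbox-and-jellybean bookkeeping described above might look like this (the hash values, result labels, and scoring rule are illustrative assumptions, not Symbolic's actual implementation):

```python
from collections import defaultdict

# book: position hash -> [wins, losses, draws] ("jellybean" counts).
# Result labels and the scoring rule are assumptions for illustration;
# they are not Symbolic's actual implementation.
book = defaultdict(lambda: [0, 0, 0])
RESULT_INDEX = {"win": 0, "loss": 1, "draw": 2}

def update_book(position_hashes, result):
    """After a finished game, credit every position reached with the
    game's result from the book side's point of view."""
    for h in position_hashes:
        book[h][RESULT_INDEX[result]] += 1

def book_score(h):
    """A simple move-preference score: wins minus losses, draws neutral."""
    wins, losses, draws = book[h]
    return wins - losses
```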

The working hypothesis is that a small book tuned to a program's peculiarities is operationally superior to a much larger book generated from play in a general, high strength population.

The two main disadvantages of this approach are:

1) It needs a good variety of opponents.

2) Significant modifications to the search or evaluation code will likely degrade the effectiveness of a book tuned to prior behavior.
Mark
Posts: 216
Joined: Thu Mar 09, 2006 9:54 pm

Re: Matchbox learning

Post by Mark »

sje wrote:A few decades ago, Martin Gardner wrote about a manually operated computer that played hexapawn (3x3 board, two rows of pawns) and included an effective learning algorithm. The computer was built from a couple of dozen small matchboxes and a bunch of colored jellybeans. Each matchbox was labeled with a possible hexapawn board position and contained jellybeans that corresponded to the available moves. Operation of the computer involved selectively removing jellybeans that represented moves in lost games.
I remember having a great time making Gardner's matchbox computer when I was a kid! It didn't take long before the computer learned to make the best moves. I had always wanted to apply it to a more complicated game, but never got around to it.

Regards,

Mark
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Matchbox learning

Post by sje »

I recall that one of Gardner's readers went on to produce a physical octopawn computer; it must have needed hundreds of matchboxes.

Perhaps it's not stretching the truth too much to say that Gardner created the first six-man tablebase for a board game.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Matchbox learning

Post by bob »

Here's the downside... If I notice you using this on ICC, you can expect crafty to teach you a terrible lesson that will come home to roost when we next meet in a CCT/ACCA event. :)

One has to be careful how one uses the things one learns, lest there be traps set. :)
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Matchbox learning

Post by sje »

bob wrote:Here's the downside... If I notice you using this on ICC, you can expect crafty to teach you a terrible lesson that will come home to roost when we next meet in a CCT/ACCA event. :)

One has to be careful how one uses the things one learns, lest there be traps set. :)
Yes, I've foreseen this, and that's why there's a need for a variety of opponents and an assumption that they're not collaborating.

Oh, and I might have a surprise or two of my own come tournament day.
mathmoi
Posts: 290
Joined: Mon Mar 13, 2006 5:23 pm
Location: Québec
Full name: Mathieu Pagé

Re: Matchbox learning

Post by mathmoi »

bob wrote:
Here's the downside... If I notice you using this on ICC, you can expect crafty to teach you a terrible lesson that will come home to roost when we next meet in a CCT/ACCA event. :)

One has to be careful how one uses the things one learns, lest there be traps set. :)
Hi,

I don't understand why you think this learning technique is flawed. Can you explain?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Matchbox learning

Post by bob »

mathmoi wrote:
Hi,

I don't understand why you think this learning technique is flawed. Can you explain?
Easily. Suppose we play an opening line where white (my side) has an easy winning position, but I make my program intentionally play badly so that yours learns that this is a good opening for black. Then, on tournament day, you walk right into the opening line that I "taught" you, but this time my program isn't playing like an idiot; it is playing to win, and does so easily.

That's the downside to learning, and it is why I don't depend on it at all for preparing a book for tournament play.
mathmoi
Posts: 290
Joined: Mon Mar 13, 2006 5:23 pm
Location: Québec
Full name: Mathieu Pagé

Re: Matchbox learning

Post by mathmoi »

bob wrote:
Easily. Suppose we play an opening line where white (my side) has an easy winning position, but I make my program intentionally play badly so that yours learns that this is a good opening for black. Then, on tournament day, you walk right into the opening line that I "taught" you, but this time my program isn't playing like an idiot; it is playing to win, and does so easily.

That's the downside to learning, and it is why I don't depend on it at all for preparing a book for tournament play.
Hi,

OK, this is a problem common to any learning chess engine. I thought you were pointing out a flaw in the "matchbox learning applied to chess openings" approach specifically.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Matchbox learning

Post by bob »

mathmoi wrote:
Hi,

OK, this is a problem common to any learning chess engine. I thought you were pointing out a flaw in the "matchbox learning applied to chess openings" approach specifically.
No, sorry. I was talking about exactly what you concluded: _any_ learning can be used against you if you expose it to the public through a server like ICC or whatever. :)

Humans are devious creatures... :)

I will add that this approach is probably more exploitable than the usual one, since it learns which moves are good or bad in a more absolute way, whereas traditional learning might also factor in how the evaluation changed after leaving book, along with static evaluation and a bit of randomness.
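That blend of signals might be sketched like this (the weights, record format, and function names are illustrative assumptions, not any engine's actual code): score each candidate book move from its result counts, fold in the post-book evaluation trend, and add a little noise so the choice isn't fully predictable.

```python
import random

def pick_book_move(candidates, rng=random.random):
    """candidates: list of (move, wins, losses, draws, eval_after_book).
    The 0.5 and 0.1 weights are arbitrary illustrative values."""
    best_move, best_score = None, float("-inf")
    for move, wins, losses, draws, eval_after in candidates:
        games = wins + losses + draws
        # Result term: win fraction minus loss fraction, draws neutral.
        result_score = (wins - losses) / games if games else 0.0
        # Blend in the evaluation trend after leaving book, plus noise.
        score = result_score + 0.5 * eval_after + 0.1 * rng()
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```

The randomness term is what makes the book harder to "teach": an opponent can no longer rely on steering it into a single poisoned line every game.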
Tony

Re: Matchbox learning

Post by Tony »

bob wrote:
Easily. Suppose we play an opening line where white (my side) has an easy winning position, but I make my program intentionally play badly so that yours learns that this is a good opening for black. Then, on tournament day, you walk right into the opening line that I "taught" you, but this time my program isn't playing like an idiot; it is playing to win, and does so easily.

That's the downside to learning, and it is why I don't depend on it at all for preparing a book for tournament play.
If the learning didn't notice that you played an idiot move, then I wouldn't call it learning but rather repeating.

Tony