Some collections of randomly generated PGN game files


sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Some collections of randomly generated PGN game files

Post by sje »

Some collections of randomly generated PGN game files just for you.

Here are the gzip versions of random PGN game collections with 1, 10, 100, 1K, 10K, and 100K games.

These can be useful for testing a PGN parser and game termination identification code. Each game has appropriate values for all PGN tags including the Termination tag.

https://dl.dropboxusercontent.com/u/316 ... rg1.pgn.gz
https://dl.dropboxusercontent.com/u/316 ... g10.pgn.gz
https://dl.dropboxusercontent.com/u/316 ... 100.pgn.gz
https://dl.dropboxusercontent.com/u/316 ... 000.pgn.gz
https://dl.dropboxusercontent.com/u/316 ... 000.pgn.gz
https://dl.dropboxusercontent.com/u/316 ... 000.pgn.gz
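
As a quick sanity check for termination identification code, one could tally the Termination tag values directly from a downloaded file and compare them with what the parser reports. This is only a rough sketch: the helper names `tally_tag` and `tally_tag_in_gz` are made up for illustration, the regular expression assumes one tag pair per line, and Latin-1 decoding is an assumption about the files.

```python
import gzip
import re
from collections import Counter

# Matches a PGN tag pair on a line of its own, e.g. [Termination "..."].
# Assumption: one tag pair per line, which is the usual export layout.
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def tally_tag(lines, tag_name="Termination"):
    """Count how often each value of tag_name occurs in an iterable of lines."""
    counts = Counter()
    for line in lines:
        match = TAG_RE.match(line)
        if match and match.group(1) == tag_name:
            counts[match.group(2)] += 1
    return counts


def tally_tag_in_gz(path, tag_name="Termination"):
    """Stream a gzip-compressed PGN file without unpacking it to disk."""
    # Latin-1 never fails to decode; treating it as the file encoding is an assumption.
    with gzip.open(path, "rt", encoding="latin-1") as handle:
        return tally_tag(handle, tag_name)
```

Calling `tally_tag_in_gz("rg100.pgn.gz")` on a downloaded file should return counts that sum to the number of games, provided every game carries exactly one Termination tag.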

Sizes (compressed):

-rw-r--r--@ 1 sje  staff      1707 May 16 02:25 rg1.pgn.gz
-rw-r--r--@ 1 sje  staff      8594 May 16 02:26 rg10.pgn.gz
-rw-r--r--@ 1 sje  staff     90550 May 16 02:26 rg100.pgn.gz
-rw-r--r--@ 1 sje  staff    865555 May 16 02:26 rg1000.pgn.gz
-rw-r--r--@ 1 sje  staff   8685618 May 16 02:27 rg10000.pgn.gz
-rw-r--r--@ 1 sje  staff  86508026 May 16 02:36 rg100000.pgn.gz

Sizes (uncompressed):

-rw-r--r--@ 1 sje  staff       3386 May 16 02:25 rg1.pgn
-rw-r--r--@ 1 sje  staff      20167 May 16 02:26 rg10.pgn
-rw-r--r--@ 1 sje  staff     235747 May 16 02:26 rg100.pgn
-rw-r--r--@ 1 sje  staff    2292827 May 16 02:26 rg1000.pgn
-rw-r--r--@ 1 sje  staff   23044160 May 16 02:27 rg10000.pgn
-rw-r--r--@ 1 sje  staff  229739008 May 16 02:36 rg100000.pgn
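
The average compression factor is about 62.3% for the larger collections; that is, the gzip files are roughly 37.7% of the uncompressed size. The smallest files compress less well.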
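
The percentage can be checked from the two listings above. The sketch below simply recomputes the space saving (one minus compressed size over uncompressed size) for each file; the byte counts are copied from the listings, and the function name `space_saving` is made up for illustration.

```python
# Sizes in bytes copied from the listings: name -> (compressed, uncompressed).
SIZES = {
    "rg1": (1707, 3386),
    "rg10": (8594, 20167),
    "rg100": (90550, 235747),
    "rg1000": (865555, 2292827),
    "rg10000": (8685618, 23044160),
    "rg100000": (86508026, 229739008),
}


def space_saving(compressed, uncompressed):
    """Fraction of the original size that compression removes."""
    return 1.0 - compressed / uncompressed


for name, (gz_bytes, raw_bytes) in SIZES.items():
    print(f"{name:10s} {100 * space_saving(gz_bytes, raw_bytes):5.1f}% saved")
```

The saving climbs from about half for the single-game file to about 62.3% for the three largest files, which is presumably where the quoted figure comes from.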
Look
Posts: 365
Joined: Thu Jun 05, 2014 2:14 pm
Location: Iran
Full name: Mehdi Amini

Re: Some collections of randomly generated PGN game files

Post by Look »

[...]
sje wrote: The average compression factor is about 62.3%.
AFAIK the best format for compression is .zipx, but I do not know of any Windows freeware that can compress files to it (though some freeware can open it).
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: Some collections of randomly generated PGN game files

Post by matthewlai »

Look wrote: AFAIK the best format for compression is .zipx, but I do not know of any Windows freeware that can compress files to it (though some freeware can open it).
zipx is just WinZip's own container format, and it is seldom used (http://kb.winzip.com/kb/entry/7).

XZ/LZMA2 is the best actual compression method from the list of algorithms supported in zipx.

You can use LZMA2 in the 7z format instead, which is an open format with much better support among free and open-source programs on all OSes.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Some collections of randomly generated PGN game files

Post by sje »

I used gzip because it is well supported: it and its companion tools (gunzip, zcat) have been included by default in every Unix-like system for many years.

----

For those authors who have written a PGN parser, it might be interesting to compare speed benchmarks for parsing the 219 MiB rg100000.pgn file.
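
For anyone wanting a baseline to compare against, here is a minimal timing sketch. It is not a PGN parser: it only counts lines that start with `[Event `, on the assumption that each game begins with an Event tag, so the time it reports is a floor for reading the file rather than a figure for real parsing. The function names are made up for illustration.

```python
import gzip
import time


def count_games(lines):
    """Count games by counting Event tag lines (assumes one per game)."""
    return sum(1 for line in lines if line.startswith("[Event "))


def benchmark(path):
    """Return (games, seconds) for one pass over a .pgn or .pgn.gz file."""
    opener = gzip.open if path.endswith(".gz") else open
    start = time.perf_counter()
    # Latin-1 decoding is an assumption; it never raises on arbitrary bytes.
    with opener(path, "rt", encoding="latin-1") as handle:
        games = count_games(handle)
    elapsed = time.perf_counter() - start
    return games, elapsed
```

A real benchmark would swap `count_games` for the parser under test and report games per second, ideally on the uncompressed file so that decompression time is not mixed in.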

----

If a randomly selected move has a 10% chance of being reasonable, then the chance that the first five moves of a random game are all reasonable is 0.1^5 = 0.00001, so in a collection of 100,000 random games only about one game will have its first five moves be reasonable. The phrase "Shakespeare from monkeys on typewriters" comes to mind.
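
The arithmetic behind this estimate is just N times p to the power k, for N games, a per-move probability p, and a prefix of k moves. The sketch below assumes that each move is an independent trial and that "move" means a single move by one side; the function name is made up for illustration.

```python
def expected_reasonable_games(p_reasonable, prefix_moves, n_games):
    """Expected number of games whose first prefix_moves moves are all reasonable.

    Assumes every move is independently reasonable with probability p_reasonable.
    """
    return n_games * p_reasonable ** prefix_moves


# With p = 0.1 and 100,000 games: a five-move prefix gives about 1 game,
# and a six-move prefix gives about 0.1 games.
for k in range(1, 8):
    print(f"first {k} moves: {expected_reasonable_games(0.1, k, 100_000):.4g} games")
```

Each extra move in the prefix divides the expected count by ten, so a collection ten times larger is needed for every additional reasonable move.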