Nalimov errors

jkominek
Posts: 69
Joined: Tue Sep 04, 2018 5:33 am
Full name: John Kominek

Re: Nalimov errors

Post by jkominek »

Hmm. The potential of bit-rot of the Nalimov generator is curious (i.e. ranging from unexpected to mildly alarming). Given that Jonathan said he ran datacomp, and that his md5sum command is running against an .emd file, my suspicion is that datacomp has not survived the transition from 32-bit to 64-bit architectures.

To start, partition the problem in half. Using the example of kbbk, take the downloaded versions of the file pair, decompress them, then compare the uncompressed files. If the uncompressed files match what you have generated afresh, then the problem is not tbgen, but datacomp.

And, that's what I find.

Code:

$ ls -l OLD
total 2076
-rw-rw-r-- 1 jkominek jkominek 873642 May  4 11:54 kbbk.nbb
-r--r--r-- 1 jkominek jkominek 205549 Feb 24  2002 kbbk.nbb.emd
-rw-rw-r-- 1 jkominek jkominek 789885 May  4 11:54 kbbk.nbw
-r--r--r-- 1 jkominek jkominek 249216 Feb 24  2002 kbbk.nbw.emd

$ ls -l NEW
total 2152
-rw-rw-r-- 1 jkominek jkominek 873642 May  4 11:40 kbbk.nbb
-rw-rw-r-- 1 jkominek jkominek 176132 May  4 11:41 kbbk.nbb.emd
-rw-rw-r-- 1 jkominek jkominek 789885 May  4 11:40 kbbk.nbw
-rw-rw-r-- 1 jkominek jkominek 289158 May  4 11:41 kbbk.nbw.emd
-rw-rw-r-- 1 jkominek jkominek   1386 May  4 11:40 kbbk.tbs
-rw-rw-r-- 1 jkominek jkominek  28644 May  4 11:40 kbk.nbb
-rw-rw-r-- 1 jkominek jkominek  27243 May  4 11:40 kbk.nbw
-rw-rw-r-- 1 jkominek jkominek     99 May  4 11:40 kbk.tbs

Notably, the sizes of the kbbk.*.emd files differ between the versions archived in 2002 and the ones just generated on a 64-bit Intel machine, so there you are. As a double-check, md5sum (or diff) on the old and new .nbb/.nbw files confirms that the uncompressed tables are identical.
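
The double-check on the uncompressed files can be scripted. This is only a minimal sketch: the function name compare_tables is made up for illustration, and the OLD and NEW directory layout is the one from the listing above.

```shell
#!/bin/sh
# Compare every uncompressed table (*.nbb, *.nbw) in one directory with the
# same-named file in another. Prints one line per file and returns non-zero
# if any file differs or is missing. compare_tables is a hypothetical name.
compare_tables() {
    old_dir=$1
    new_dir=$2
    status=0
    for old in "$old_dir"/*.nbb "$old_dir"/*.nbw; do
        [ -e "$old" ] || continue          # glob matched nothing
        name=$(basename "$old")
        if [ ! -e "$new_dir/$name" ]; then
            echo "$name: MISSING"
            status=1
        elif cmp -s "$old" "$new_dir/$name"; then
            echo "$name: identical"
        else
            echo "$name: DIFFERENT"
            status=1
        fi
    done
    return $status
}

# Example: compare_tables OLD NEW
```

cmp -s compares byte for byte and stays silent, so this does not depend on any checksum tool being installed.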

The Kadatach compression code is difficult to read and comprehend, at least for me. It would take a programmer of the caliber of Dann or Ronald to narrow down the issue and debug it. Not that I would expect either of them to try, but this fragment from complib.c caught my eye, even if it's not the culprit.

Code:

  if (p->bits > 15)
  {
    /* I am lazy bastard -- package-merge algorithms is too complicated */
    /* so I'll simply scale frequences down; sooner or later maximal    */
    /* bit length will not exceed 31                                    */
    /* In the production-quality library I used [-1 +3 -2] transform    */
    /* but as long as 31-bit boundary is unlikely to be reached it is   */
    /* not necessary to bother about better [but complicated] solutions */
    assert (p == first_sorted);
    q = p;
    do
      q->freq = (q->freq + 1) >> 1;
    while ((q = q->son[0]) != 0);
    goto restart;
  }

Non-portability of datacomp aside, there is a practical question: will the compressed files you generated be interpreted correctly by an application, even though they differ from the reference files? Hard to say without close scrutiny, but probably not.

For Jonathan, are you sure you want to generate the tables rather than just download?
jkominek

In my post above I basically walked through Dann's debugging advice. To contribute something further that may help a little, attached is an md5sum file of the uncompressed 3- to 5-man Nalimov tablebases.

As far as I am aware, there are no checksums of the uncompressed files available to download, for example from http://kirill-kryukov.com/chess/tablebases-online/ or http://tablebase.sesse.net/. No one thought it worth the bother, I suppose. In retrospect that was an oversight. If all your uncompressed files pass the comparison then things are very likely correct on the generation side.
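
Checking a directory of uncompressed tables against such a list might look like the sketch below. It assumes GNU md5sum (for -c and --quiet); the function name verify_tables is made up for illustration, and the list file is whatever name you save the checksums under.

```shell
#!/bin/sh
# Check uncompressed tables in a directory against a checksum list in
# md5sum format ("<hash>  <filename>" per line).
# verify_tables is a hypothetical name; GNU md5sum is assumed.
verify_tables() {
    table_dir=$1
    list=$2
    # md5sum resolves the listed names relative to the current directory,
    # so make the list path absolute and run inside the table directory.
    case $list in
        /*) ;;
        *) list=$PWD/$list ;;
    esac
    ( cd "$table_dir" && md5sum -c --quiet "$list" )
}

# Example: verify_tables NEW md5.uncompressed
```

With --quiet only the failing files are printed, and the exit status is non-zero if any file fails.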
jkominek

P.S. Oddly, I'm not seeing the file attachment. Maybe there is an attachment filter that it did not make it through. Here is a cut and paste of the text.

Code:

7f6f2e5a74f7c7a06c9a014936906b3f  kbbbk.nbb
6a7a4dff964b114e6ed10da9120d000b  kbbbk.nbw
c5cf40d81d1bb23d90b743528edb2260  kbbkb.nbb
2935b7a48d0300e466796088efac46b4  kbbkb.nbw
1cbec2a73fe82cb82ccd0db32fd1a750  kbbk.nbb
9b328b9c3ecb98fca904e9dcba91d1fa  kbbk.nbw
5bb80367ee21ad8ec9b549bfce3a50b9  kbbkn.nbb
46c2936dcd79c49d22c7e85499211556  kbbkn.nbw
aea878ebf3cd3ab9ac30b965b0d4a4dd  kbbkp.nbb
dc73149cf68c69e642175e52ed4f958b  kbbkp.nbw
2fabb5dfbc944dedadb85bce8ea9a49b  kbbkq.nbb
1e809add34d9e479c055cc356e53eeb6  kbbkq.nbw
77696fa2899af833aceec0957dec8c91  kbbkr.nbb
ae05ae5150eff54409c812dd467afb56  kbbkr.nbw
6ffc23625aa5f48ff98339a65fe75ba2  kbbnk.nbb
0039a9a885eb1c425440744a4f9a5e3e  kbbnk.nbw
f03f47c8140cbb7f68bfbef73d912eb9  kbbpk.nbb
ca3555dae79b5a75620538c01122b6d1  kbbpk.nbw
d1fc4f697af0b6f509d15ebb8f2721e2  kbkb.nbb
d1fc4f697af0b6f509d15ebb8f2721e2  kbkb.nbw
677e2576de3d0af6aedbcc965df5d4d2  kbk.nbb
cfbf5e26ccaa06a6113ed2279a8f8573  kbk.nbw
f55536ef803458d1fb360d2f483fbb3c  kbkn.nbb
fee0e89bd04cb4aebf0d57e4c8bd6ed5  kbkn.nbw
2c65b6f3a8ed662eeeddfc957bca0a49  kbkp.nbb
285e61a87c8e96d2e73cada81e1f399c  kbkp.nbw
82ae9415e9f51bfba8b74e841ed1b6dc  kbnkb.nbb
dbfc9df4bebbd3a9521356402d240022  kbnkb.nbw
1b59f54192c44aa80dd0ce7a8dcbcb22  kbnk.nbb
d3d129125cc899447c7386765b369b87  kbnk.nbw
b058bae0ac4ef70a9a0aaa8af42c7772  kbnkn.nbb
6e02cc7b7296a0323b63f0a35f1f4a24  kbnkn.nbw
9d36a4af539dba7213cce546a5cd1411  kbnkp.nbb
1ac65ccd3c4e023427d9a9a111d72599  kbnkp.nbw
427bb0e15a7d0150cddc284e3e401506  kbnkq.nbb
41681f9ccef83e47fbb86d3edb5c819d  kbnkq.nbw
86de2f50e0a021e9423d0bd046b9b8aa  kbnkr.nbb
70c03b29ecc020dd1231deb09aaa0d06  kbnkr.nbw
46d78e9d1fd489c8d9c9e58475c460ef  kbnnk.nbb
a5f7e256479f15ec049942424fcab29b  kbnnk.nbw
c642da6926d40211cea1672aeaf2565e  kbnpk.nbb
3911608822c07171f1fe4411ea9eea2a  kbnpk.nbw
e5d799a7c6afe70dab3d99295b428a27  kbpkb.nbb
34bb77a9f73c43bd245acd3ab878337e  kbpkb.nbw
4cdc47fdb66f953b69e637c4bcd6b45a  kbpk.nbb
fa908a18169448c47b2f1c26b2db31a7  kbpk.nbw
d8b80dcc4cbdc38afd1feb04dbdfedc5  kbpkn.nbb
64898061f54697ee9b26f7f9220df1de  kbpkn.nbw
7aaf86036ebd921df33edcc857639677  kbpkp.nbb
29ea775bb4f3529b91b5ee705374b008  kbpkp.nbw
6c4047f06e8bdd0e41fc92a672164936  kbpkq.nbb
4e2f154784441a03772c81090b9841f7  kbpkq.nbw
f5973871f95f57c0d4019064986fbc3f  kbpkr.nbb
ec4eb808a8d0650e44268f8a704c2e00  kbpkr.nbw
00baf9e1c848d777fd33881dd45dde61  kbppk.nbb
fa7aa1555da700105b5386f68dc804f9  kbppk.nbw
677e2576de3d0af6aedbcc965df5d4d2  knk.nbb
4c12d7b3a0169b25c32b46acfb3d4cfb  knk.nbw
645158ed3b0eae03b27f967cfd17b84a  knkn.nbb
645158ed3b0eae03b27f967cfd17b84a  knkn.nbw
13edabc4b8fe45d83f95e4c0125d3812  knkp.nbb
d2b43bd3ecb682ac9533439afccbeff3  knkp.nbw
4829595a404f24b4a747599aef096a30  knnkb.nbb
5fa3193f909bf97bea251584691c428f  knnkb.nbw
55e77b1c9cf3e9969739efc443a6fb1d  knnk.nbb
1cfd84c85c8b2e9893c3888d8d86f121  knnk.nbw
db341b748161b935a718cd147f79b68a  knnkn.nbb
c41b34b8df957f15eb74c1cd85573c48  knnkn.nbw
b4c09bd8beed3910ad64dce65e2ad1f6  knnkp.nbb
e972d5e6fbecc8b2a46d01d12648452d  knnkp.nbw
63a85d044582a74c43eb03763c7d4a90  knnkq.nbb
bf673abeed419896b29a32861fcae41f  knnkq.nbw
be56f1d3a8fa3a69bd4894091c302fb6  knnkr.nbb
eea0b52d168df46d6e05e9bbc6a1f7c8  knnkr.nbw
444837d76eae14fa28e1a32d54a06d87  knnnk.nbb
0276f0f53be05c7dabf5a76a18f3fbfe  knnnk.nbw
d14e78945545b97c1144f21e4f1c645d  knnpk.nbb
5350a876a0ecdec1388947560dd58c7e  knnpk.nbw
239a3355e3e1db3e573c5055be642f7b  knpkb.nbb
a77f3565d84135b24109edbc638752fb  knpkb.nbw
ce65734fc5d0d47c2b719c2510b85b82  knpk.nbb
18a59bbe90f12ad829d7120237457be3  knpk.nbw
eee08b5cb8b6b751bf3aa7465304f20a  knpkn.nbb
3571ae64f72f4d5417312a37eb6d058d  knpkn.nbw
743b43a24717d41fc7d48c3ff1e6a13e  knpkp.nbb
1117e8ee7029a7192a4902e1780aabe3  knpkp.nbw
44c42da15607860d1decbf83ecc95e1d  knpkq.nbb
7b95df8f835d7271e6f56a4664134bce  knpkq.nbw
82062c3e686a5c50ea39b50334873cbf  knpkr.nbb
6dbe0137455437a7c520796070971950  knpkr.nbw
0bb6a4505a0f5bcedffb411f1353f41a  knppk.nbb
7742c7c47049959d9fc3bee2d69f5eb4  knppk.nbw
8b74ec78182f98fe6f86b32cf3a8e544  kpk.nbb
754275fe3e54f61cc2f4bb90cf9a9bdf  kpk.nbw
b28c7928551aac1753a05b6b2e3f3b8d  kpkp.nbb
d638d0138b7bc0a98292182953c6040d  kpkp.nbw
d0051d3403393ae949f8aa28447e2218  kppkb.nbb
865f27f02d8743c2923a22bae5a62f8c  kppkb.nbw
f30e4d0fb05023a35e685c855926cbb7  kppk.nbb
d140b5c11db710831d543d633b69a663  kppk.nbw
78a1c682e616e9239f4d9dfb8c5c0686  kppkn.nbb
9220d3ce5738d6ab9842fe4d548f26ad  kppkn.nbw
6d1d366de735594473a766184af93e4e  kppkp.nbb
878cbad09fd3c85492903fe2efab1df9  kppkp.nbw
b76bd6836698df74da2be1eb9d51ab66  kppkq.nbb
b1c86f1bc6f940e42ed135c2af1aa52e  kppkq.nbw
17241dd0a43c83a7b934ed9cff92af00  kppkr.nbb
e9c03670e54697d832a6ac3690bdc844  kppkr.nbw
b11c5cc5c5b8d847f8029a810239b94f  kpppk.nbb
f40b5343b24ac42e7b05df6cb31b9722  kpppk.nbw
74e2624c7b8e29a79085582047c251db  kqbbk.nbb
2e607c7da9d209236d622a8d3bcdc7c1  kqbbk.nbw
26fb3e9754c77bb6e14878ac14781770  kqbkb.nbb
bd0a3fcdad2a927b326f530d769882aa  kqbkb.nbw
4991ed48ad3b0040aa5eb42a0277e4b9  kqbk.nbb
4feb08dab9cf875e67644f9309aaf589  kqbk.nbw
030d2f22c6fee4dc4588409faf12ffb2  kqbkn.nbb
e63f8e1bd37e6b4f2dd8c963c46c4bf0  kqbkn.nbw
3b1c87a28175b70cb412e1b42deef8d3  kqbkp.nbb
d3bbb9047eef000cda18364e931f15ed  kqbkp.nbw
3a11757ab71f24739db2e24f03456b50  kqbkq.nbb
011ed29ebd4f2a265e6e41c0b22ffd0b  kqbkq.nbw
712d2825593f860bba055ee36cbde0a7  kqbkr.nbb
50da9210c229d672d79da5939a86b6db  kqbkr.nbw
34347b5d45e8cc51bc1eccea36c232e2  kqbnk.nbb
dbf9aed906775c889bd97b398f4add4e  kqbnk.nbw
293afced9ebd3c4e6f67ee64b785ee71  kqbpk.nbb
ba4f105f685c7b7fd1da91b41506088b  kqbpk.nbw
27a68e43aa39559ba3fef9ced6df2ae2  kqkb.nbb
1cba83f652de71f6c981d069fb64bb35  kqkb.nbw
c97594cafea155acc81e1f9c437f4c6c  kqk.nbb
4d22e2d9a48f9f391b847405ca280a60  kqk.nbw
36730b4f220c11d7f78398810ed6a616  kqkn.nbb
4abfab4eed5af52210f7645cc96baa6a  kqkn.nbw
f8bcd713a24e10401a13d32e54021380  kqkp.nbb
2fe77b74132dc33aacb7b5a93acfae8b  kqkp.nbw
3f0af353b0ad818260aef7d56b2a5d78  kqkq.nbb
3f0af353b0ad818260aef7d56b2a5d78  kqkq.nbw
a802eee14f1dcce2ef17883194cf46c0  kqkr.nbb
2429ff09fff231039450de9f9ee687d2  kqkr.nbw
42ded7948cbfc77b67b6cb0e86ad9171  kqnkb.nbb
0e6310c5598e08e7b33da9c4736becdf  kqnkb.nbw
a8acba7bec03c55a491c2693edacd363  kqnk.nbb
5887bea4d4acfb6a5dee327e41351f27  kqnk.nbw
89d5d56982d5fb0de43d654934c4381b  kqnkn.nbb
51d690bb71dc27c173d21b1d9491f01c  kqnkn.nbw
05f4587fe7a77df6d49f8da6a72b4c78  kqnkp.nbb
da5b6160cdfbd6b32ae903f8472dbcea  kqnkp.nbw
d4f848e50272669f4aeeb653ff741520  kqnkq.nbb
9c34ae652ad7070c1c937c2ba9d917c5  kqnkq.nbw
8170b72a906f313d7718c8864c53d43e  kqnkr.nbb
c145f429e71a4e6454165dfd2c9461af  kqnkr.nbw
b4977c4f85f7cb1385f88d5dc1eadd52  kqnnk.nbb
a478fe19e8ed5f0c693a653f0d61f899  kqnnk.nbw
8ba1977dd939ce6eb2186ac734ac0bb8  kqnpk.nbb
3b8634e4c32e014af1396db985adf71c  kqnpk.nbw
ab288e5e6f210f0f8c73955ef5e879e0  kqpkb.nbb
275e61462feeaaa160078c2e51a13ccc  kqpkb.nbw
f0cee9b50f7c351b40ab38240b85c2e2  kqpk.nbb
7addc0868502f065d8946fd7d0fee066  kqpk.nbw
e2c1c05b184f619503f27228f6a5e377  kqpkn.nbb
c46ece69f191a6e452cb42db1effa65c  kqpkn.nbw
b4519b69bb32a1650773760dac96ffef  kqpkp.nbb
a0a2fc565be61c0a4e1784edba9415d6  kqpkp.nbw
2f8ea11aa1fa99a1a6a741c6b122fdc1  kqpkq.nbb
d402860c2def6fd5c8a6c6e4aa54595a  kqpkq.nbw
66c74d88c98e24f11195c00d051bc0ab  kqpkr.nbb
cbc8e8501ba9d266f7f1a1bfe346f20a  kqpkr.nbw
a11e244281265109c1a3f22e2823647b  kqppk.nbb
10a790053c8a4fdeec63a54d24c54cd7  kqppk.nbw
05e96f22233b65f7a5ad3e1ca223ebfe  kqqbk.nbb
d49d51b7eaf06c0816318d4b99e4265e  kqqbk.nbw
a3a27015679bbe235facdc66644fd274  kqqkb.nbb
d61255b775ca4323dcc7a399c4379816  kqqkb.nbw
7e5e6d9d38111abf4edb4c34081675b4  kqqk.nbb
6745b9ad4ea0ac982d9fc2b4db127994  kqqk.nbw
a957476a59433e7c65429b43fadff517  kqqkn.nbb
9138493dec0298624aa01339dbdd3dc4  kqqkn.nbw
1bb499e8b2219fa1098624da4aaa1a54  kqqkp.nbb
467f4343fd4bea07d1e14d4597acb04b  kqqkp.nbw
e80bc2c25b5501cecf4b1629cf054549  kqqkq.nbb
3ef00bebeb5d576e06d50154758184e0  kqqkq.nbw
239a70ba7d27b8372dac0a82c2dd555f  kqqkr.nbb
2b2c55190ddb7389e6a3cb3189c792eb  kqqkr.nbw
2233e05e18f636750b3a6a7859c2dd28  kqqnk.nbb
eaa8d3b28b95c9450fc857c5b3afe935  kqqnk.nbw
4d798357096f21880c3df8b0bc397908  kqqpk.nbb
9fa3b6f64e6742497f2dcb31c62b701b  kqqpk.nbw
9abf5ec29eaae0ac1d90cfd8e7f391f7  kqqqk.nbb
2a3732cd5577b83aa51c7f1aeeb07977  kqqqk.nbw
4ceb645ea88ef16199ec3ba2eaa19f5d  kqqrk.nbb
6574932dd70c9c86560b99e49413351a  kqqrk.nbw
1e6b710f5230410cda5af804f51cb581  kqrbk.nbb
8552419aa538ebc2892f1a2dee436f3e  kqrbk.nbw
853072bc6cc25dc1c06c25d545f16820  kqrkb.nbb
00344e91317781703aeccfa9594b10ac  kqrkb.nbw
1de03809b42dcdb70e63c480f6ad3c9d  kqrk.nbb
2b171c21617f65ee3c86735efec8d425  kqrk.nbw
59304aabd957c3849b5952f7c6f1d13e  kqrkn.nbb
bc5bb1968e3876769b4a14c096114ec0  kqrkn.nbw
5806002518eeadbd19220408b1d30199  kqrkp.nbb
a421f24e4a1c109418c2063de062ccd1  kqrkp.nbw
f00d0a3d30107b4070ba22b23f6b4e89  kqrkq.nbb
131e610188b3bdea84b713b98767dedc  kqrkq.nbw
108c0febbd4e4612fc7b58d511e1b121  kqrkr.nbb
a8346595f81f56e3321631130dca4dfa  kqrkr.nbw
e555485bc4ba6c3d47f37f84553d79e7  kqrnk.nbb
82f76e270fe2e850923a50bee502b8d1  kqrnk.nbw
079201d1cb2d5ea53e910d4973dba207  kqrpk.nbb
2cfda83fbf568e69b7fc0340c740f848  kqrpk.nbw
8b3d707d8074cf9e60f0df3264cb511a  kqrrk.nbb
307845ea2b366bb9c4596201eac68230  kqrrk.nbw
3b90b5ac3059910f05a234bae7fe398b  krbbk.nbb
5171dfe44c5aeed35eb9e9f436bd3dc7  krbbk.nbw
4191b7ffd9eb3c67faa380238c71dcb7  krbkb.nbb
eb255516135db54a3a18919fc58dd7fa  krbkb.nbw
c556227adf775d6793ae916b8e1c79f7  krbk.nbb
ee71e3db3c7cf369e181cdfa24eb6d8c  krbk.nbw
5dc97d11e21311489a53e5f8a3871bf8  krbkn.nbb
c404ace24fad5023ec5d180dc53792fd  krbkn.nbw
fcaf772cf0665f61572cf53c1362fd8c  krbkp.nbb
b2e746ea35315548848df144c71fcfc6  krbkp.nbw
e1923a428df810591936ba160149e5c8  krbkq.nbb
ed35132d6111fcc1153dbfb549861684  krbkq.nbw
5c5ec578e14286528869764b042787c4  krbkr.nbb
8c64b97aa75f879e8ebf2b83f4626f74  krbkr.nbw
bece5e02042e782ac132d123ab683967  krbnk.nbb
e566985caef50b067485d2087ab35b24  krbnk.nbw
8c7c6f5a52b4ced1ffb20bf2f691ec1e  krbpk.nbb
4bb8e93038bb70e814efe65cdd226d35  krbpk.nbw
b276e6f5f0cd6a3293c34ac0d7685dfb  krkb.nbb
e96957e27ddcd6110e5a97e2352022a6  krkb.nbw
67a33cc90adaca43067ed08c9ccea386  krk.nbb
2d90e96edc95e729a56922e98958399a  krk.nbw
2cdcc1ceecf5f0cc8790ecb0d2226722  krkn.nbb
63ef6d59a5bb3163bc7f37f6cb0890a8  krkn.nbw
b74892dbdb345564519c1038515f6b8f  krkp.nbb
e6848fdea557a13ec1cf8bdb7df0205e  krkp.nbw
9bc8cf05fc17aa2d299566a2ba6fcbd1  krkr.nbb
9bc8cf05fc17aa2d299566a2ba6fcbd1  krkr.nbw
476f453996511d09d3d6d468c9e9151d  krnkb.nbb
c50eb2d59f51df06a60cefd76aa9f136  krnkb.nbw
52e9668bf3034747ed9c96846c3cf5a6  krnk.nbb
1d3e4dfacdf08c4e5afe93fd98e3ab7a  krnk.nbw
b16ae4961eca4bfbc81c1a3c7a9a701c  krnkn.nbb
abc69b8358fca2487c1e2a2b814cf5ff  krnkn.nbw
5780853de7ee29b59b243547eb45a28c  krnkp.nbb
e6eeea5347d8f30ab96ec3f6039e29b3  krnkp.nbw
5bd53acfb8aa3af58cefe6950dd5402f  krnkq.nbb
d6e9770f73bf2289638b221e324ae125  krnkq.nbw
a04877ae8dd6bc812ef2761d981c2ae7  krnkr.nbb
7f0c0620e1c62624cec9ac6992b3b726  krnkr.nbw
d8f2cbddd6bfa9e005f35b169a97df4c  krnnk.nbb
fed4ddc8e79d9a3e6375fd07603acd65  krnnk.nbw
66ca26510b1f81eb78db30e86b80c61b  krnpk.nbb
c7769251bff3053ae71097d25aedcd5c  krnpk.nbw
9370bbb949a6fa024b6b4e3d9aa4b2de  krpkb.nbb
9bd21d4adaa0d286c2a799c4fac70550  krpkb.nbw
85671e29dd8c9a123300f055c4a37b1e  krpk.nbb
7a49300ea5a69c3981446fb2f73c514b  krpk.nbw
0f552a75e3e6de685fbc3ab1ba1b3ba9  krpkn.nbb
ce672a0f6bdebc746f87e173eafdec50  krpkn.nbw
9e5fa61be94d264b60c189e4ecc2069e  krpkp.nbb
1064421189dec434f5b8826d97f11c21  krpkp.nbw
7097247e1f3601f8990689ab5768abee  krpkq.nbb
c8ff18f31306701c96fa88eaf45065ac  krpkq.nbw
ef2c2a909681819baba253a692034f95  krpkr.nbb
8d542df285e4eaa73c2318b9bc79cbcd  krpkr.nbw
b319479712fdf3eb7a0ebcb13d365a9d  krppk.nbb
dafc3cd7731d039461cc5f3828f71ac7  krppk.nbw
e7a9ffd1b66e1f1ac194edccbd7a1a47  krrbk.nbb
0efa04cebccdfe3db99b3e8133c7e9f4  krrbk.nbw
5832c0606fa82ee5b56b1069da34f037  krrkb.nbb
996a9aa4b87d5145fc932f9cf3ed87dd  krrkb.nbw
7bc3d263af0d844e14c5f4a5055a41b6  krrk.nbb
dd1dc62050ffac32d783e5ed2144607b  krrk.nbw
29c9a47dcaf26052a148f4c7140f815d  krrkn.nbb
55112f8d7e8ecff88ee94bd8e7ea8e15  krrkn.nbw
3ce929621b32e5d01e70e9e9b3ab0b34  krrkp.nbb
0a8e0a707fb773e7b1bcdf2dbd5052a8  krrkp.nbw
785ade030d322dc7d0d93f6b7c6e7851  krrkq.nbb
091f13fefd2510fd40feaaa9f8c54cd8  krrkq.nbw
286c136d0b0715ecaa99d0aac6989e09  krrkr.nbb
ab189d8b843996bcddb9ea8c9ed38846  krrkr.nbw
d84027e7f4d63318a950fc9de3ba4141  krrnk.nbb
412250658f6163fd02ead174f3935e12  krrnk.nbw
99638be13dde8c86bb6eef81c860ffc4  krrpk.nbb
fdfd002bc7fc70d4ebfb06f4d237cae0  krrpk.nbw
1c0e8222d877501667e9d2fef7aa0012  krrrk.nbb
843eee90b5a93795d6b7321fccd81abe  krrrk.nbw
syzygy
Posts: 5683
Joined: Tue Feb 28, 2012 11:56 pm

Re: Nalimov errors

Post by syzygy »

jkominek wrote: Tue May 04, 2021 6:47 pm
Hmm. The potential of bit-rot of the Nalimov generator is curious (i.e. ranging from unexpected to mildly alarming). Given that Jonathan said he ran datacomp, and that his md5sum command is running against an .emd file, my suspicion is that datacomp has not survived the transition from 32-bit to 64-bit architectures.

Doesn't datacomp.exe take a block size as a compression parameter? Different block size, different md5sum...

If datacomp.exe produced different files on 64-bit systems, those files would not work. It seems unlikely that the OP did not test his files.
jkominek

That is a good point, and one I had overlooked. Yes, block size is an optional parameter to DATACOMP.EXE. Importantly, the default value is 64k, not the 8k that, if I remember correctly, was used to compress the official release. Bob Hyatt has posted multiple times about his experiments showing that 8k blocks were the optimal speed vs. size tradeoff (for the technology of the late 90s).

Code:

$ wine ../../DATACOMP.EXE 
datacomp -- multidimentional data compression utility
copyright (c) 1991--1998 by Andrew Kadatach
usage: datacomp command file1 file2 file3 ...
  commands:
  e[:NNN] -- encode files using block of NNN bytes (default is 65536)
             (all files will be encoded using first file statistics)
             suffix ".emd" will be added to each name of file with encoded data
  d       -- decode original files out of ".emd"-files
             last suffix will be removed from each name
  t       -- test integrity of compressed ".emd"-files
  v       -- test integrity of compressed ".emd"-files and decompression speed

The test results confirm an 8k block size.

Code:

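# Try each candidate block size on both side-to-move files.
# Note: 8196 is the value as run (the filenames below carry it); 8k is
# strictly 8192, which the later test uses and which gives the same checksums.
for BLK in 1024 2048 4096 8196 16384 32768 65536
do
  for STM in b w
  do
    FILE=kbbk.nb${STM}
    wine ../../DATACOMP.EXE e:$BLK "$FILE"
    mv "$FILE.emd" "$BLK.$FILE.emd"
  done
done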

$ md5sum *.emd
1578fb7c564e9d5a46d2b433a2a24b77  1024.kbbk.nbb.emd
5bc4fc8876e6ee603e9fc59c577850df  1024.kbbk.nbw.emd
958b733bd3ca93239d1bb3ad8b9fdae3  16384.kbbk.nbb.emd
8172f3a15de739752438e55c51ea4041  16384.kbbk.nbw.emd
f0d681f23232b14676930c2ad8632cbb  2048.kbbk.nbb.emd
59434e6d12f74e2faf0a4d352f8406a3  2048.kbbk.nbw.emd
8cc992d847521f6c6b4b9aff47bcb8fd  32768.kbbk.nbb.emd
ee28ee94e0b08e2b12ae27d9a8c04545  32768.kbbk.nbw.emd
03f6c3760467fc3bf7f2233d8c650405  4096.kbbk.nbb.emd
d3ff9f8645273000e41803b80c4495d3  4096.kbbk.nbw.emd
12671d86e3ca8b08308a2a8f8763300c  65536.kbbk.nbb.emd
0a061f7f7888dae45cd11a7e472e2fd5  65536.kbbk.nbw.emd
6d3bee7a798a4829fa97f93e52bc1f56  8196.kbbk.nbb.emd
b99e42a83337a23545b4ac7518aed92d  8196.kbbk.nbw.emd

Reference checksums cut and pasted from sesse.net:

Code:

6d3bee7a798a4829fa97f93e52bc1f56  kbbk.nbb.emd
b99e42a83337a23545b4ac7518aed92d  kbbk.nbw.emd

Thank you for setting us straight on this discrepancy.
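
The by-eye comparison against the reference checksums can also be automated. A minimal sketch, assuming GNU md5sum and files named <block>.<table>.emd as in the loop above; the function name matching_block_size is made up for illustration.

```shell
#!/bin/sh
# Print the block size whose compressed file has the given reference MD5.
# Expects files named "<block>.<table>.emd" in the current directory.
# matching_block_size is a hypothetical name; GNU md5sum is assumed.
matching_block_size() {
    table=$1          # e.g. kbbk.nbb
    reference=$2      # expected MD5 hex digest
    for f in *."$table".emd; do
        [ -e "$f" ] || continue             # glob matched nothing
        sum=$(md5sum "$f" | cut -d' ' -f1)  # keep only the hash column
        if [ "$sum" = "$reference" ]; then
            echo "${f%%.*}"                 # text before the first dot
            return 0
        fi
    done
    return 1                                # no candidate matched
}

# Example: matching_block_size kbbk.nbb 6d3bee7a798a4829fa97f93e52bc1f56
```

It returns non-zero when no candidate matches, which would point back at the compressor rather than the block size.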
jkominek

One more post as a word of warning, because this caught me by surprise. A usage gotcha: passing multiple files on one command line is a mistake if you want output that matches the reference files.

Little test script:

Code:

$ cat doit.sh
cat md5.kbbk

echo
wine ../../DATACOMP.EXE e:8192 kbbk.nbb
wine ../../DATACOMP.EXE e:8192 kbbk.nbw
echo
md5sum -c md5.kbbk
echo
\rm *.emd

wine ../../DATACOMP.EXE e:8192 kbbk.nbb kbbk.nbw
echo
md5sum -c md5.kbbk
echo
\rm *.emd

wine ../../DATACOMP.EXE e:8192 kbbk.nbw kbbk.nbb
echo
md5sum -c md5.kbbk
echo
\rm *.emd

Results:

Code:

$ ./doit.sh
6d3bee7a798a4829fa97f93e52bc1f56  kbbk.nbb.emd
b99e42a83337a23545b4ac7518aed92d  kbbk.nbw.emd

analyzing "kbbk.nbb" pass 1: 873642 bytes parsed
analyzing "kbbk.nbb" pass 2: 873642 bytes parsed
  -1891,      -7,      -9,      -2,   -3782,     -15,     -16,    -364,
    -14,    -315,    -495,      -1,     -21,    -301,    -414,    -113,
   -405,    -504,    -357,      -5,    -486,    -468,    -351,    -423
pass 1: 873642 bytes parsed
pass 2: 873642 bytes read, 205549 bytes written (4.250 : 1)
analyzing "kbbk.nbw" pass 1: 789885 bytes parsed
analyzing "kbbk.nbw" pass 2: 789885 bytes parsed
     -7,      -9,      -6,   -1653,      -1,      -2,      -8,     -18,
    -13,   -1770,      -5,     -14,    -315,     -12,     -27,     -17,
   -273,      -4,     -15,    -252,      -3,     -21,    -245,     -16
pass 1: 789885 bytes parsed
pass 2: 789885 bytes read, 249216 bytes written (3.169 : 1)

kbbk.nbb.emd: OK
kbbk.nbw.emd: OK

analyzing "kbbk.nbb" pass 1: 873642 bytes parsed
analyzing "kbbk.nbb" pass 2: 873642 bytes parsed
  -1891,      -7,      -9,      -2,   -3782,     -15,     -16,    -364,
    -14,    -315,    -495,      -1,     -21,    -301,    -414,    -113,
   -405,    -504,    -357,      -5,    -486,    -468,    -351,    -423
pass 1: 873642 bytes parsed
pass 2: 873642 bytes read, 205549 bytes written (4.250 : 1)
pass 1: 789885 bytes parsed
pass 2: 789885 bytes read, 262657 bytes written (3.007 : 1)

kbbk.nbb.emd: OK
kbbk.nbw.emd: FAILED
md5sum: WARNING: 1 computed checksum did NOT match

analyzing "kbbk.nbw" pass 1: 789885 bytes parsed
analyzing "kbbk.nbw" pass 2: 789885 bytes parsed
     -7,      -9,      -6,   -1653,      -1,      -2,      -8,     -18,
    -13,   -1770,      -5,     -14,    -315,     -12,     -27,     -17,
   -273,      -4,     -15,    -252,      -3,     -21,    -245,     -16
pass 1: 789885 bytes parsed
pass 2: 789885 bytes read, 249216 bytes written (3.169 : 1)
pass 1: 873642 bytes parsed
pass 2: 873642 bytes read, 262061 bytes written (3.334 : 1)

kbbk.nbb.emd: FAILED
kbbk.nbw.emd: OK
md5sum: WARNING: 1 computed checksum did NOT match
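
Evidently the statistics gathered from the first file are applied to every later file on the command line, just as the usage text warns ("all files will be encoded using first file statistics"). Note that no "analyzing" passes are printed for the second file in either of the two-file runs.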
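
A wrapper that sidesteps the gotcha by running one invocation per file could look like this. It is only a sketch: compress_each and the DATACOMP override variable are made-up names, and the default command simply mirrors the invocation used in this thread.

```shell
#!/bin/sh
# Compress each table with its own datacomp invocation, so that every file
# is analyzed and encoded on its own.
# compress_each and DATACOMP are hypothetical names; the default command
# is the wine invocation used in the test script above.
compress_each() {
    block=$1
    shift
    for f in "$@"; do
        # Left unquoted on purpose so "wine ../../DATACOMP.EXE" splits
        # into a command and its first argument.
        ${DATACOMP:-wine ../../DATACOMP.EXE} "e:$block" "$f" || return 1
    done
}

# Example: compress_each 8192 kbbk.nbb kbbk.nbw
```

The loop stops at the first failing invocation, so a partial run is not mistaken for a complete one.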
syzygy

jkominek wrote: Tue May 04, 2021 9:39 pm
That is a good point, and one I had overlooked. Yes, block size is an optional parameter to DATACOMP.EXE. Importantly, the default value is 64k, not the 8k that, if I remember correctly, was used to compress the official release. Bob Hyatt has posted multiple times about his experiments showing that 8k blocks were the optimal speed vs. size tradeoff (for the technology of the late 90s).

Thanks for confirming my suspicion :)

I remember those posts about 8k blocks, but I didn't know the default block size.