JGN: A PGN Replacement

mar · Post by **mar** » Sun Nov 14, 2021 12:31 pm

hgm wrote: ↑Sun Nov 14, 2021 11:45 am The problem is that it would pretty much require a complete rewrite of the WinBoard front-end to make it use wide characters. If the back-end is to remain using UTF-8 and normal characters (as would be required for XBoard), you would have to do back and forth conversions at any point where these interact.

actually this is exactly what I do in my recent projects: internally I store everything in utf-8 and convert on the fly, through wrapped calls, into a wide-char buffer on the stack whenever I need to open a file and so on. I also store paths in a canonical (normalized) form where I replace backslashes with forward slashes, but that's another problem.
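A minimal sketch of that design, assuming Python in place of the C/C++ wrappers the post has in mind. The helper names `normalize_path`, `to_wide` and `from_wide` are hypothetical and only illustrate keeping UTF-8 internally and converting at the API boundary:

```python
def normalize_path(path: str) -> str:
    """Canonical form: forward slashes only (hypothetical helper)."""
    return path.replace("\\", "/")


def to_wide(utf8: bytes) -> bytes:
    """Convert internal UTF-8 bytes to UTF-16LE code units, as a wide-char API expects."""
    return utf8.decode("utf-8").encode("utf-16-le")


def from_wide(wide: bytes) -> bytes:
    """Convert UTF-16LE code units coming back from the API into internal UTF-8."""
    return wide.decode("utf-16-le").encode("utf-8")


# Everything is kept as UTF-8; the wide form exists only for the duration of a call.
internal = normalize_path("C:\\games\\Žižka.pgn").encode("utf-8")
wide = to_wide(internal)
```

The conversion is lossless in both directions for valid input, so the wide form never needs to be stored.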

originally I was concerned about losing the ability to index a character at a specific position directly, which wide chars give you (assuming UCS-2 or UTF-32; UTF-16 is a bit more complicated due to surrogate pairs, but the BMP fits nicely and I don't think anyone really needs code points outside it). but this can be solved in utf-8 as well, by iterating and/or caching where performance is critical
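Indexing by iteration can be sketched as follows, assuming the input is already valid UTF-8. The function names are hypothetical; the only fact relied on is that UTF-8 continuation bytes have the bit pattern `10xxxxxx`:

```python
def codepoint_offset(data: bytes, index: int) -> int:
    """Return the byte offset of the index-th code point (0-based) in valid UTF-8."""
    if index < 0:
        raise IndexError("negative code point index")
    seen = 0
    for offset, byte in enumerate(data):
        # Any byte that is not a continuation byte starts a new code point.
        if byte & 0xC0 != 0x80:
            if seen == index:
                return offset
            seen += 1
    raise IndexError("code point index out of range")


def codepoint_at(data: bytes, index: int) -> str:
    """Return the index-th code point as a one-character string."""
    start = codepoint_offset(data, index)
    end = start + 1
    # Extend over the continuation bytes that belong to this code point.
    while end < len(data) and data[end] & 0xC0 == 0x80:
        end += 1
    return data[start:end].decode("utf-8")
```

Each lookup is linear in the string length. Where that matters, the offsets found during one pass can be cached and reused.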

overall I'm happy with this decision (using utf-8 for everything) as it simplified a lot of things for me. on unix-based systems, I don't have to do any conversions whatsoever.

hgm wrote: ↑Sun Nov 14, 2021 11:45 am Whether this should be UTF-8 or UTF-16, and whether this should be announced through a BOM, is really outside the scope of a standard for game notation: it is an OS property. It is unfortunate that different encodings still exist, but as long as they do, one can expect there will be file-conversion tools for these formats.

utf-8 seems nicer, since it's backward-compatible with ascii. so a proper pgn that sticks to the standard by using only ascii characters would still load fine as utf-8.
a BOM at the beginning of the file wouldn't hurt either, as comparing the first 3 bytes of a file should be easy enough - and even that isn't necessary, because it's easy to check whether a sequence of non-ascii bytes is a valid utf-8 sequence
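Both checks can be sketched in a few lines of Python. The function names are hypothetical, and the Latin-1 fallback for input that is not valid UTF-8 is an assumption of this sketch, not something proposed in the thread:

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop the 3-byte UTF-8 BOM (EF BB BF) if the data starts with it."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def is_valid_utf8(data: bytes) -> bool:
    """Check validity by attempting a strict decode."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def decode_pgn_bytes(data: bytes, fallback: str = "latin-1") -> str:
    """Decode as UTF-8 when valid, otherwise use the assumed fallback encoding."""
    data = strip_utf8_bom(data)
    if is_valid_utf8(data):
        return data.decode("utf-8")
    return data.decode(fallback)
```

Pure ASCII input passes the validity check unchanged, which is the backward-compatibility point made above.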

hgm wrote: ↑Sun Nov 14, 2021 11:45 am I have not looked into this lately, but the problem used to be that Windows API supported UTF-16, and not UTF-8. So to properly display the non-ascii characters in dialogs, or allow their entry in text edits there, you would have to use the wchar versions of the API calls.

yes, absolutely. it's unfortunate that Microsoft decided to go down this path back then. I guess in text edits it might work out of the box (I mean the editing itself), though. I haven't done any WinAPI-based UI programming in ages, so I'm not sure

Ras · Post by **Ras** » Sun Nov 14, 2021 3:24 pm

hgm wrote: ↑Sun Nov 14, 2021 11:45 am I have not looked into this lately, but the problem used to be that Windows API supported UTF-16, and not UTF-8.

That's because back in the 1990s when MS designed NT, UTF-8 existed only as a loose idea, so they went with UCS-2, which had provisions for extensions and would become UTF-16 starting with Windows 2000. However, the rest of the world has basically moved on to UTF-8, in particular the whole WWW.

The Windows API has conversion functions MultiByteToWideChar() and WideCharToMultiByte(), but they should only be used via a wrapper because you need to call them two times: once for figuring out what buffer size you need, and then again to do the actual conversion.
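A minimal sketch of such a wrapper, assuming Python's `ctypes` rather than C. The name `utf8_to_wide` is hypothetical, and so is the fallback branch for non-Windows systems, which exists only so the sketch runs anywhere:

```python
import ctypes
import sys

CP_UTF8 = 65001                  # Windows code page identifier for UTF-8
MB_ERR_INVALID_CHARS = 0x00000008  # fail on invalid input instead of substituting


def utf8_to_wide(data: bytes) -> str:
    """Decode UTF-8 bytes with the two-call MultiByteToWideChar pattern."""
    if not data:
        return ""  # the API reports failure for zero-length input
    if sys.platform != "win32":
        # Assumption of this sketch: no WinAPI here, so let Python decode.
        return data.decode("utf-8")

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    convert = kernel32.MultiByteToWideChar
    convert.argtypes = [ctypes.c_uint, ctypes.c_uint32, ctypes.c_char_p,
                        ctypes.c_int, ctypes.c_wchar_p, ctypes.c_int]
    convert.restype = ctypes.c_int

    # Call 1: no output buffer, so the return value is the required length
    # in wide characters.
    needed = convert(CP_UTF8, MB_ERR_INVALID_CHARS, data, len(data), None, 0)
    if needed == 0:
        raise ctypes.WinError(ctypes.get_last_error())

    # Call 2: perform the actual conversion into a buffer of that size.
    buffer = ctypes.create_unicode_buffer(needed)
    written = convert(CP_UTF8, MB_ERR_INVALID_CHARS, data, len(data),
                      buffer, needed)
    if written == 0:
        raise ctypes.WinError(ctypes.get_last_error())
    return buffer[:written]
```

Passing the byte length explicitly, rather than -1, means the result carries no terminating NUL, so the slice returns exactly the converted characters.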

dangi12012 · Post by **dangi12012** » Sun Nov 14, 2021 3:44 pm

You guys know what really would solve many problems with pgn? Just use db3, which is sql.
You have a standard - the file can be queried with existing software - you can print it in the shell etc...

All problems you are discovering now are solved there - and you don't need to parse anything since it's a binary format. You can instantly use it in any programming language because all have backends for sqlite. You can insert and edit like pgn.
Best of all: It's smaller than pgn - can be queried instantly - and is industry tested and ready to be used in any programming language.

We even have a git repo: http://www.talkchess.com/forum3/viewtop ... =7&t=78464
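For illustration only, this is roughly what storing and querying games in SQLite looks like with Python's standard `sqlite3` module. The table layout, column names and helper functions are hypothetical and are not taken from the linked repository:

```python
import sqlite3


def create_game_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create a hypothetical one-table schema for games."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS games (
               id     INTEGER PRIMARY KEY,
               white  TEXT NOT NULL,
               black  TEXT NOT NULL,
               result TEXT NOT NULL,
               moves  TEXT NOT NULL
           )"""
    )
    return conn


def add_game(conn, white, black, result, moves) -> int:
    """Insert one game and return its row id."""
    cur = conn.execute(
        "INSERT INTO games (white, black, result, moves) VALUES (?, ?, ?, ?)",
        (white, black, result, moves),
    )
    conn.commit()
    return cur.lastrowid


def games_won_as_white(conn, player) -> list:
    """Return (id, black, moves) for every game the player won with White."""
    return conn.execute(
        "SELECT id, black, moves FROM games "
        "WHERE white = ? AND result = '1-0' ORDER BY id",
        (player,),
    ).fetchall()
```

The moves are still stored as text in this sketch, so reading them back into a board representation would need a move parser either way.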

Ras · Post by **Ras** » Sun Nov 14, 2021 4:01 pm

dangi12012 wrote: ↑Sun Nov 14, 2021 3:44 pm Just use db3 which is sql.

See the xkcd I posted above.

mar · Post by **mar** » Sun Nov 14, 2021 4:02 pm

dangi12012 wrote: ↑Sun Nov 14, 2021 3:44 pm You guys know what really would solve many problems with pgn? Just use db3, which is sql.
You have a standard - the file can be queried with existing software - you can print it in the shell etc...

All problems you are discovering now are solved there - and you don't need to parse anything since it's a binary format. You can instantly use it in any programming language because all have backends for sqlite. You can insert and edit like pgn.
Best of all: It's smaller than pgn - can be queried instantly - and is industry tested and ready to be used in any programming language.

We even have a git repo: http://www.talkchess.com/forum3/viewtop ... =7&t=78464

you seem to misunderstand what PGN actually is meant for - a portable chess game interchange format.
it's compact and easily readable/editable by humans, and as a text format it is easily extensible.

if you want fast queries - go with a custom binary format and you'll beat any sql out there, that's not what PGN was designed for.

instead you propose to use sqlite3 - which is very good - but it's a full blown sql engine, do you realize that?!
so you propose to depend on 230k lines of third party code just because of that?
you need orders of magnitude less code to actually parse pgn.
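To give a rough sense of scale, here is a deliberately minimal sketch that splits a single PGN game into its tag pairs and its movetext. The function name is hypothetical, and the sketch ignores comments, variations, NAGs and multi-game files, so it is far from a complete parser:

```python
import re

# One tag pair per line: [Name "value"], with \" and \\ as escapes in the value.
TAG_PAIR = re.compile(r'^\[(\w+)\s+"((?:[^"\\]|\\.)*)"\]\s*$')


def parse_pgn_game(text: str):
    """Return (tags, movetext) for a single PGN game."""
    tags = {}
    movetext_lines = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = TAG_PAIR.match(line)
        # Tag pairs are only recognised before the movetext starts.
        if match and not movetext_lines:
            name, value = match.groups()
            # Undo the backslash escapes inside the quoted value.
            tags[name] = re.sub(r"\\(.)", r"\1", value)
        else:
            movetext_lines.append(line)
    return tags, " ".join(movetext_lines)
```

A full reader also has to validate the moves against the board, but that work is needed regardless of the container format.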

Sopel · Post by **Sopel** » Sun Nov 14, 2021 10:19 pm

dangi12012 wrote: ↑Sun Nov 14, 2021 3:44 pm Just use db3 which is sql.

You have no clue what pgn is and what problems it's solving apparently.

dangi12012 · Post by **dangi12012** » Sun Nov 14, 2021 10:22 pm

Sopel wrote: ↑Sun Nov 14, 2021 10:19 pm You have no clue what pgn is and what problems it's solving apparently.

Yes please teach me how textfiles work. It will be so enlightening.

Sopel · Post by **Sopel** » Sun Nov 14, 2021 10:25 pm

dangi12012 wrote: ↑Sun Nov 14, 2021 10:22 pm Yes please teach me how textfiles work. It will be so enlightening.

You're fixated on sqlite as a solution to every problem on Earth. You cannot be taught anything. I'm just saying you should refrain from providing input in threads you have no clue about.

dangi12012 · Post by **dangi12012** » Mon Nov 15, 2021 12:25 am

Sopel wrote: ↑Sun Nov 14, 2021 10:25 pm You're fixated on sqlite as a solution to every problem on Earth. You cannot be taught anything. I'm just saying you should refrain from providing input in threads you have no clue about.

Teach me with your insane knowledge about... Textfiles. Such advanced topics require intense discussion.
