delete informant symbols before a given move or ply nr in big pgn database

Jonathan003
Posts: 239
Joined: Fri Jul 06, 2018 4:23 pm
Full name: Jonathan Cremers

Post by Jonathan003 »

Hi,
Does anyone know whether it is possible to delete informant symbols before a given move or ply number in a big PGN database, while preserving the informant symbols from that move or ply number onward?
If it's not possible, may I ask a good programmer like Ferdy to make such a tool, please?
Ferdy already made the tool 'move annotation modifier' for me, which removes the informant symbols for one color in a big PGN database.
I want the tool to be able to handle big PGN databases with hundreds of thousands of games.
The reason I want this tool is to be able to remove sidelines from a big PGN created with obk2pgn only later in the games, after a given move or ply number, by removing games with a ? in the annotation.

This is an example of what the output of obk2pgn looks like for a white book:


[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "-"]
[White "?"]
[Black "?"]
[Result "1/2-1/2"]

1.Nc3 c5? 2.Nf3? Nc6? 3.e4!! g6? 4.Bb5!! Bg7? 5.O-O!! d6? 6.Re1!! Bd7? 7.
a4!! Rc8? 8.d3!! Nf6? 9.Nd5!! O-O? 10.Bg5!! a6? 11.Bxf6!! exf6? 12.Bc4!!
f5? 13.c3!! fxe4? 14.dxe4!! 1/2-1/2

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "-"]
[White "?"]
[Black "?"]
[Result "1/2-1/2"]

1.Nc3 c5? 2.e4!! a6? 3.Nge2!! d6? 4.g3!! Nc6? 5.Bg2!! g6? 6.d3!! Bg7? 7.
Be3!! Nf6? 8.h3!! Bd7? 9.Qd2!! b5? 10.Bh6!! Bxh6? 11.Qxh6!! Nd4? 12.Nxd4!!
cxd4? 13.Ne2!! e5? 14.O-O!! 1/2-1/2

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "-"]
[White "?"]
[Black "?"]
[Result "1/2-1/2"]

1.Nc3 c5? 2.e4!! a6? 3.Nge2!! d6? 4.g3!! g6? 5.Bg2!! Nc6? 1/2-1/2

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "-"]
[White "?"]
[Black "?"]
[Result "1/2-1/2"]

1.Nc3 c5? 2.e4!! a6? 3.Nge2!! d6? 4.g3!! g6? 5.Bg2!! Bg7? 6.d4!! cxd4? 7.
Nxd4!! Nf6? 8.O-O!! O-O? 9.b3!! Nc6? 10.Nxc6!! bxc6? 11.Bb2!! Qa5? 12.Na4
!! Bg4? 13.Qe1!! Qh5? 14.f3!! 1/2-1/2
Ras
Posts: 2487
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: delete informant symbols before a given move or ply nr in big pgn database

Post by Ras »

You could try (CLI):

sed 's/\([a-zA-Z0-8]\|^\)\([!\?]\+\)/\1/g' input.pgn > output.pgn
Works under Linux, should work under Windows with Cygwin or maybe also WSL.

Example input:

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "-"]
[White "?"]
[Black "?"]
[Result "1/2-1/2"]

1.Nc3 c5? 2.e4!! a6? 3.Nge2!! d6? 4.g3!! g6? 5.Bg2!! Bg7? 6.d4!! cxd4? 7.
Nxd4!! Nf6? 8.O-O!! O-O? 9.b3!! Nc6? 10.Nxc6!! bxc6? 11.Bb2!! Qa5? 12.Na4
!! Bg4? 13.Qe1!! Qh5? 14.f3!! 1/2-1/2
Example output:

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "-"]
[White "?"]
[Black "?"]
[Result "1/2-1/2"]

1.Nc3 c5 2.e4 a6 3.Nge2 d6 4.g3 g6 5.Bg2 Bg7 6.d4 cxd4 7.
Nxd4 Nf6 8.O-O O-O 9.b3 Nc6 10.Nxc6 bxc6 11.Bb2 Qa5 12.Na4
 Bg4 13.Qe1 Qh5 14.f3 1/2-1/2
Rasmus Althoff
https://www.ct800.net
Jonathan003
Posts: 239
Joined: Fri Jul 06, 2018 4:23 pm
Full name: Jonathan Cremers

Re: delete informant symbols before a given move or ply nr in big pgn database

Post by Jonathan003 »

There are no informant symbols left in the example? I don't want all informant symbols to be removed, only those before a given move or ply number. I use a 64-bit Windows system.
Ras
Posts: 2487
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: delete informant symbols before a given move or ply nr in big pgn database

Post by Ras »

Jonathan003 wrote: Tue Oct 20, 2020 5:11 pm
There are no informant symbols anymore in the example? I don't want all informant symbols to be removed. Only before a given move or ply number.
Oh, I thought it was about the '?' and '!' in the game part itself. Then I don't understand what exactly you want to be removed. In the example game I posted, can you manually remove what you don't want to have and post it?
Jonathan003 wrote: I use a Windows 64bit system.
Cygwin is always an option - I used that when I was still on Windows.
Rasmus Althoff
https://www.ct800.net
Guenther
Posts: 4605
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: delete informant symbols before a given move or ply nr in big pgn database

Post by Guenther »

He wants them removed only for a certain number of early moves (he did not say how many, though).
In that case the regex could be adapted to match the move number(s) in front, but then it also needs to handle line breaks.
This is still best done with an editor capable of regex search/replace on Windows.
https://rwbc-chess.de
Jonathan003
Posts: 239
Joined: Fri Jul 06, 2018 4:23 pm
Full name: Jonathan Cremers

Re: delete informant symbols before a given move or ply nr in big pgn database

Post by Jonathan003 »

That's right, that's what I want.
I have no experience with regex.

I want to be able to choose how many of the first plies have no informant symbols in the output.
If I choose to remove the informant symbols before the 10th ply, for example,

I want this:

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "-"]
[White "?"]
[Black "?"]
[Result "1/2-1/2"]

1.Nc3 c5? 2.e4!! a6? 3.Nge2!! d6? 4.g3!! g6? 5.Bg2!! Bg7? 6.d4!! cxd4? 7.
Nxd4!! Nf6? 8.O-O!! O-O? 9.b3!! Nc6? 10.Nxc6!! bxc6? 11.Bb2!! Qa5? 12.Na4
!! Bg4? 13.Qe1!! Qh5? 14.f3!! 1/2-1/2
To look like this:

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "-"]
[White "?"]
[Black "?"]
[Result "1/2-1/2"]

1.Nc3 c5 2.e4 a6 3.Nge2 d6 4.g3 g6 5.Bg2 Bg7? 6.d4!! cxd4? 7.
Nxd4!! Nf6? 8.O-O!! O-O? 9.b3!! Nc6? 10.Nxc6!! bxc6? 11.Bb2!! Qa5? 12.Na4
!! Bg4? 13.Qe1!! Qh5? 14.f3!! 1/2-1/2
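
A minimal Python sketch of the ply-based removal described here, using only the standard library. It assumes the movetext contains no comments or variations (as in the obk2pgn output shown) and that annotations are written as '!' and '?' suffixes, possibly wrapped onto the next line. The names strip_early_marks and first_kept_ply are made up for illustration; this is not an existing tool.

```python
import re

# One SAN move, then optional whitespace (which may be a line break),
# then an optional run of '!' / '?' symbols.
# Assumption: no comments or variations appear in the movetext.
MOVE_RE = re.compile(
    r"(?P<san>(?:O-O-O|O-O|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](?:=[QRBN])?)[+#]?)"
    r"(?P<gap>\s*)(?P<mark>[!?]+)?"
)


def strip_early_marks(movetext, first_kept_ply):
    """Drop '!'/'?' symbols on every ply numbered below first_kept_ply."""
    ply = 0

    def fix(match):
        nonlocal ply
        ply += 1  # every matched SAN move is one ply
        mark = match.group("mark") or ""
        if ply < first_kept_ply:
            mark = ""  # early ply: remove the symbols, keep the whitespace
        return match.group("san") + match.group("gap") + mark

    return MOVE_RE.sub(fix, movetext)
```

Called as strip_early_marks(movetext, 10) on the movetext of the example game, this yields the desired output shown above. Tag lines such as [Date "????.??.??"] should not be passed to it; only the movetext of one game at a time.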
Jonathan003
Posts: 239
Joined: Fri Jul 06, 2018 4:23 pm
Full name: Jonathan Cremers

Re: delete informant symbols before a given move or ply nr in big pgn database

Post by Jonathan003 »

Maybe it can be done somehow with an advanced text editor like UltraEdit?

And maybe it helps to first change the format with pgn-extract to something like this:

1. Nc3 c5 $2 2. Nf3 $2 Nc6 $2 3. e4 $3 g6 $2 4. Bb5 $3 Bg7 $2 5. O-O $3 d6
$2 6. Re1 $3 Bd7 $2 7. a4 $3 Rc8 $2 8. d3 $3 Nf6 $2 9. Nd5 $3 O-O $2 10.
Bg5 $3 a6 $2 11. Bxf6 $3 exf6 $2 12. Bc4 $3 f5 $2 13. c3 $3 fxe4 $2 14.
dxe4 $3 1/2-1/2

1. Nc3 c5 $2 2. e4 $3 a6 $2 3. Nge2 $3 d6 $2 4. g3 $3 Nc6 $2 5. Bg2 $3 g6
$2 6. d3 $3 Bg7 $2 7. Be3 $3 Nf6 $2 8. h3 $3 Bd7 $2 9. Qd2 $3 b5 $2 10. Bh6
$3 Bxh6 $2 11. Qxh6 $3 Nd4 $2 12. Nxd4 $3 cxd4 $2 13. Ne2 $3 e5 $2 14. O-O
$3 1/2-1/2

1. Nc3 c5 $2 2. e4 $3 a6 $2 3. Nge2 $3 d6 $2 4. g3 $3 g6 $2 5. Bg2 $3 Nc6
$2 1/2-1/2

1. Nc3 c5 $2 2. e4 $3 a6 $2 3. Nge2 $3 d6 $2 4. g3 $3 g6 $2 5. Bg2 $3 Bg7
$2 6. d4 $3 cxd4 $2 7. Nxd4 $3 Nf6 $2 8. O-O $3 O-O $2 9. b3 $3 Nc6 $2 10.
Nxc6 $3 bxc6 $2 11. Bb2 $3 Qa5 $2 12. Na4 $3 Bg4 $2 13. Qe1 $3 Qh5 $2 14.
f3 $3 1/2-1/2
The symbol '$2' stands for '?', the symbol '$1' stands for '!', and the symbol '$3' stands for '!!'
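
With the annotations written as separate $N tokens, the same idea becomes a simple token filter. The following is a hedged sketch, assuming the pgn-extract style shown above (tokens separated by whitespace, no comments or variations); strip_early_nags and first_kept_ply are hypothetical names.

```python
import re

NAG_RE = re.compile(r"\$\d+")               # numeric annotation glyph, e.g. $2
SAN_START_RE = re.compile(r"[KQRBNOa-h]")   # first character of a SAN move


def strip_early_nags(movetext, first_kept_ply):
    """Drop $N tokens attached to plies numbered below first_kept_ply."""
    out = []
    ply = 0
    # Splitting on a captured group keeps the whitespace tokens,
    # so untouched parts of the text are reproduced exactly.
    for token in re.split(r"(\s+)", movetext):
        if NAG_RE.fullmatch(token):
            if ply < first_kept_ply:
                # Drop the NAG together with the whitespace before it.
                if out and out[-1].isspace():
                    out.pop()
                continue
        elif SAN_START_RE.match(token):
            ply += 1  # move numbers and the result start with a digit
        out.append(token)
    return "".join(out)
```

Because a dropped NAG takes its preceding whitespace with it, a line break directly before a removed $N disappears as well, so some lines in the output can become longer than in the input.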
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: delete informant symbols before a given move or ply nr in big pgn database

Post by Ferdy »

Uploaded a python script using python-chess lib.

* Install python 3.8 or newer.
* Install python-chess lib

  * Open command prompt
  * type pip install python-chess

* Download the repo, or pc_0001.py file
Usage:

python pc_0001.py --input in.pgn --no-nag-ply 10
The output will be saved in out_in.pgn

type python pc_0001.py -h to see the help description.
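
For a database with hundreds of thousands of games it helps to process one game at a time rather than loading the whole file. The following is a standard-library-only sketch of that idea and is not the code of pc_0001.py. The names iter_games, rewrite_pgn and the transform callback are hypothetical, and it assumes every game starts with tag lines beginning with '[' and that no movetext line starts with '['.

```python
from typing import Callable, Iterable, Iterator, List, TextIO, Tuple


def iter_games(lines: Iterable[str]) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (tag_lines, movetext_lines) for one game at a time."""
    headers: List[str] = []
    moves: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("["):
            if moves:  # a tag line after movetext starts the next game
                yield headers, moves
                headers, moves = [], []
            headers.append(line)
        elif line.strip():  # blank separator lines are skipped
            moves.append(line)
    if headers or moves:
        yield headers, moves


def rewrite_pgn(src: Iterable[str], dst: TextIO,
                transform: Callable[[str], str]) -> int:
    """Apply transform to each game's movetext; return the game count."""
    count = 0
    for headers, moves in iter_games(src):
        movetext = transform("\n".join(moves))
        dst.write("\n".join(headers) + "\n\n" + movetext + "\n\n")
        count += 1
    return count


# Hypothetical usage with placeholder file names:
# with open("in.pgn", encoding="utf-8") as src, \
#         open("out.pgn", "w", encoding="utf-8") as dst:
#     rewrite_pgn(src, dst, lambda movetext: movetext)
```

Only one game is held in memory at any time, so the file size should not matter much. The transform callback is where a per-game function that strips the early annotations would be plugged in.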