A PGN parser

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: A PGN parser

Post by mcostalba »

G.B. Harms wrote:Oh I misread the opening post in my hurry. First I see it takes a little more than 22 seconds, and in that case my parser is probably much faster, although, secondly, I generate pseudo legal moves only and don't make/unmake to check legality. However the move genertator can take a LegalOnly flag and then it makes/unmakes in some edge cases only, so probably still fast enough. I'm also pretty sure that SF move generator is quite a bit faster than mine.
How can you avoid to make move if, for check legality, you have to play all the PGN game?
G.B. Harms
Posts: 42
Joined: Sat Aug 08, 2009 2:18 pm
Location: Almere

Re: A PGN parser

Post by G.B. Harms »

Thanks for correction, I haven't looked at it a long time.
Indeed it does makeMove, see PgnPlayer.h line 114. It means generating pseudo legal as I do is enough. On my system it parses and plays, including legality checking, about 50000 games in 3-4 seconds.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: A PGN parser

Post by mcostalba »

G.B. Harms wrote:Thanks for correction, I haven't looked at it a long time.
Indeed it does makeMove, see PgnPlayer.h line 114. It means generating pseudo legal as I do is enough. On my system it parses and plays, including legality checking, about 50000 games in 3-4 seconds.
Thanks for the links!

I have seen that although move generation is needed, it can be restricted to only the sensible subset with just a look at the san move, for instance if the PGN move contains 'x' then it is a capture and you can generate just captures.

As you can guess, I have not started from scratch with the move generator :-) I was using the vanilla SF one and instructing it to generate all the moves, I thought it would have been fast enough, and actually it was, but he problem was checking for the san move match that, thanks to profiling, I found it was the slowest part. So I have used the low-level move generator's functions to generate only what is needed and indirectly reduce the load for the san comparison checking. It took me a while to adapt the move generator, but now times are less than half!

Code: Select all

$ ../parser/parser fics.pgn 

Analizing...done
Processing...done
Sorting...done
Writing to files...done

Games: 129207
Moves: 8186293
Games/second: 13093
Moves/second: 829579
MBytes/second: 10.351
Size of positions index (MB): 93.6847
Size of games index (MB): 1.97154
Positions index: fics.pgn.kidx
Games index: fics.pgn.gidx
Processing time (ms): 9868
G.B. Harms
Posts: 42
Joined: Sat Aug 08, 2009 2:18 pm
Location: Almere

Re: A PGN parser

Post by G.B. Harms »

Glad it helped. Indeed just generating the (capture) moves for the specific piece type also gave me the most speed-up. I can probably further refine it but for my purpose it is fast enough now.

Based on the source of the pgn I should also not always call makeMove() with parameter check_legal=true, since the source already checked it (in my case LittleBlitzer). That will probably give a nice speed up too.

And I must take a look at SF's move generation code since its perft is so much faster than mine :shock:
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: A PGN parser

Post by Sven »

mcostalba wrote:I have seen that although move generation is needed, it can be restricted to only the sensible subset with just a look at the san move, for instance if the PGN move contains 'x' then it is a capture and you can generate just captures.
Wouldn't it be sufficient to only generate moves to the given target square, and also only for pieces of the given type? For instance "e4" -> only pawn moves to e4 or "Ng5" -> only knight moves to g5? A simple intersection of two bitboards in the latter case ...
MahmoudUthman
Posts: 234
Joined: Sat Jan 17, 2015 11:54 pm

Re: A PGN parser

Post by MahmoudUthman »

mcostalba wrote: I have seen that although move generation is needed, it can be restricted to only the sensible subset with just a look at the san move, for instance if the PGN move contains 'x' then it is a capture and you can generate just captures
why do you need move generation , wouldn't a full legality check with shortcuts be faster ?
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: A PGN parser

Post by Sven »

MahmoudUthman wrote:
mcostalba wrote: I have seen that although move generation is needed, it can be restricted to only the sensible subset with just a look at the san move, for instance if the PGN move contains 'x' then it is a capture and you can generate just captures
why do you need move generation , wouldn't a full legality check with shortcuts be faster ?
:D We had the same thought in the same minute!
MahmoudUthman
Posts: 234
Joined: Sat Jan 17, 2015 11:54 pm

Re: A PGN parser

Post by MahmoudUthman »

Sven Schüle wrote:
MahmoudUthman wrote:
mcostalba wrote: I have seen that although move generation is needed, it can be restricted to only the sensible subset with just a look at the san move, for instance if the PGN move contains 'x' then it is a capture and you can generate just captures
why do you need move generation , wouldn't a full legality check with shortcuts be faster ?
:D We had the same thought in the same minute!
:D
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: A PGN parser

Post by jwes »

Sven Schüle wrote:
mcostalba wrote:I have seen that although move generation is needed, it can be restricted to only the sensible subset with just a look at the san move, for instance if the PGN move contains 'x' then it is a capture and you can generate just captures.
Wouldn't it be sufficient to only generate moves to the given target square, and also only for pieces of the given type? For instance "e4" -> only pawn moves to e4 or "Ng5" -> only knight moves to g5? A simple intersection of two bitboards in the latter case ...
If you need to check for legality, you would also need to check if the moving piece is pinned. You could also need to check for pins to disambiguate moves, e.g. Ng5 is not ambiguous in this position.
[d]rnb1kbnr/ppp1qppp/8/8/3pN3/5N2/PPP2PPP/R1BQKB1R w KQkq - 2 6
User avatar
hgm
Posts: 27828
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: A PGN parser

Post by hgm »

WinBoard used to employ the stupid way for SAN disambiguation: generate all legal moves, and see if there was one that matched the info given in SAN, where missing from-square and disambiguators were treated as wild cards. Legality was checked by making the move, and generating all pseudo-legal moves for the other side to see if any of those captured the King.

This was OK for checking legality while playing games, but a real pain when loading large collections of PGN. So I sped it up by generating only moves for the given piece, and only checking them for legality if the pseudo-legal move is ambiguous. I felt no need to make a pedantic parser that would flag unambiguous moves that moved into check. If a move is ambiguous you really don't know how to proceed, but unambiguous moves you can always make, no matter how illegal they are, and continue to reconstruct the game beyond them. This seemed preferable over ignoring the rest of the game. If there is only one of the mentioned piece type, even generating the pseudo-legal moves is unnecessary; if the SAN says Qd4 you just move the Queen to d4, and who cares that it came from e1? The purpose was to reconstruct the game so positions could be searched in it, not to check the game for legality or pseudo-legality.

If you know you are dealing with orthodox Chess the fastest way to disambiguate moves of pieces from which there is more than one on the board is to do retrograde move generation of the mentioned piece from the given to-square, to see which copy of the sought piece you hit. Only if you hit more than one you must decide based on the SAN disambiguator, and if that is not decisive (e.g. because there is none), you have to test which of the remaining candidates is not pinned.

You could use magic bitboard to generate retrograde Rook and Bishop moves, but alignment would already be a very good filter. E.g. if you encounter Rf2 you could just look up the attack set of a Rook on f2 on an empty board in a 64-entry table, and if the other Rook is outside that set you are done. If a disambiguator was contained in the move you are even done without generating moves. This leaves so few cases that it doesn't matter much what method you use to solve the remaining ones. Bishop moves are of course 'never' ambiguous, and Pawn moves are always disambiguated in SAN (even if they were not ambiguous in the first place).