"letter characters (" A-Za-z "), digit characters (" 0-9 "), the underscore (" _ "), the plus sign (" + "), the octothorpe sign (" # "), the equal sign ("="), the colon (":"), and the hyphen ("-"). "
Parsing a PGN file is done by reading one character at a time and using its category to identify the end of the current token.
Some tokens consist of only one character (the standard calls them "self terminating").
Some of those are usually simply ignored:
spaces (' ' '\t' '\v' '\r' '\n')
angle brackets ('<' '>')
period ('.')
others represent a token:
( --> variation start
) --> variation end
* --> end of a game with an unknown or otherwise unavailable result
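To make the single-character cases concrete, here is a minimal Python sketch of the classification step. The category names and the `classify` helper are my own illustration, not something taken from the standard, and the symbol set is just the list of characters quoted at the top.

```python
import string

# Single characters that are simply skipped, as listed above.
IGNORED = set(" \t\v\r\n<>.")

# Single characters that are complete tokens (names are illustrative).
SELF_TERMINATING = {
    "(": "VARIATION_START",
    ")": "VARIATION_END",
    "*": "GAME_END",
}

# Symbol characters: letters, digits and "_+#=:-".
SYMBOL_CHARS = set(string.ascii_letters + string.digits + "_+#=:-")


def classify(ch):
    """Return a coarse category for one input character."""
    if ch in IGNORED:
        return "IGNORED"
    if ch in SELF_TERMINATING:
        return SELF_TERMINATING[ch]
    if ch in SYMBOL_CHARS:
        return "SYMBOL"
    # Everything else ('{', ';', '$', '[', ...) starts a longer token
    # or is non-compliant input.
    return "OTHER"
```

Using sets and a dict keeps each lookup constant time, which is all a character-at-a-time reader needs.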
Other tokens are instead composed of several characters, and the first character determines how to find the last:
{ --> read next chars until char == '}' --> comment
; --> read next chars until char == '\n' --> comment
$ --> read next chars while char is a digit --> NAG
a symbol character (see above) --> read next chars while char is a symbol character
- if the first char is not a digit --> a move
- if all the chars are digits --> a move number
- if it is "0-1" or "1-0" or "1/2-1/2" --> end of the game
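The rules above fit in one small loop. The following is a rough Python sketch under a few assumptions of mine: the function name `tokenize_movetext` and the token kind names are made up, '/' is accepted inside a symbol only so that "1/2-1/2" stays in one piece, and unknown characters (for example '!' or '?') are silently skipped rather than reported.

```python
import string

SYMBOL_CHARS = set(string.ascii_letters + string.digits + "_+#=:-")
RESULTS = {"1-0", "0-1", "1/2-1/2"}


def tokenize_movetext(text):
    """Yield (kind, value) pairs for the movetext of one game."""
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch in " \t\v\r\n<>.":
            i += 1  # ignored single-character tokens
        elif ch == "(":
            yield ("VARIATION_START", ch)
            i += 1
        elif ch == ")":
            yield ("VARIATION_END", ch)
            i += 1
        elif ch == "*":
            yield ("RESULT", ch)
            i += 1
        elif ch in "{;":
            # Brace comments end at '}', rest-of-line comments at '\n'.
            end = text.find("}" if ch == "{" else "\n", i + 1)
            if end == -1:
                end = n  # tolerate an unterminated comment
            yield ("COMMENT", text[i + 1:end].strip())
            i = end + 1
        elif ch == "$":
            j = i + 1
            while j < n and text[j] in string.digits:
                j += 1
            yield ("NAG", text[i + 1:j])
            i = j
        elif ch in SYMBOL_CHARS:
            j = i + 1
            # Assumption: allow '/' so "1/2-1/2" is read as one token.
            while j < n and (text[j] in SYMBOL_CHARS or text[j] == "/"):
                j += 1
            word = text[i:j]
            if word in RESULTS:
                kind = "RESULT"
            elif word.isdigit():
                kind = "MOVE_NUMBER"
            else:
                kind = "MOVE"
            yield (kind, word)
            i = j
        else:
            i += 1  # tolerant choice: skip anything unrecognised
```

For example, `list(tokenize_movetext("1. e4 e5 *"))` gives a move number, two moves and a result. Checking the result strings before the digit test matters, because "1-0" and "0-1" start with a digit but are not move numbers.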
As for the tag pairs:
[ --> skip spaces
--> read next chars while char is a symbol character --> tag name
--> skip spaces
--> read next chars until char == ']' and the previous char != '\' --> tag value
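A tag pair reader along those lines could look like the Python sketch below. It is only an illustration: `parse_tag_pair` is a name I made up, it handles one tag pair per call, and stripping the surrounding quotes and undoing backslash escapes afterwards is my own addition to the steps listed above.

```python
import re
import string

SPACES = " \t\v\r\n"
SYMBOL_CHARS = set(string.ascii_letters + string.digits + "_+#=:-")


def parse_tag_pair(line):
    """Parse one '[Name "Value"]' tag pair into (name, value)."""
    i, n = 0, len(line)
    while i < n and line[i] in SPACES:
        i += 1
    if i >= n or line[i] != "[":
        raise ValueError("not a tag pair: %r" % line)
    i += 1
    # skip spaces, then read the tag name
    while i < n and line[i] in SPACES:
        i += 1
    j = i
    while j < n and line[j] in SYMBOL_CHARS:
        j += 1
    name = line[i:j]
    # skip spaces, then read up to the first ']' not preceded by '\'
    i = j
    while i < n and line[i] in SPACES:
        i += 1
    j = i
    while j < n and not (line[j] == "]" and line[j - 1] != "\\"):
        j += 1
    raw = line[i:j].strip()
    # Assumption: the value is a quoted string; drop the quotes if present.
    if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        raw = raw[1:-1]
    # Undo backslash escapes such as \" and \\ in a single left-to-right pass.
    value = re.sub(r"\\(.)", r"\1", raw)
    return name, value
```

Calling `parse_tag_pair('[Event "F/S Return Match"]')` returns the name `Event` and the value `F/S Return Match`. A missing closing bracket is tolerated here: the value simply runs to the end of the input.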
That's all.
After that you actually have to interpret the SAN moves, which can be annoying, for example when a move is unambiguous only because one of the candidate pieces is pinned.
You also have to decide how tolerant you want to be with non-compliant input.
But at its core it's really simple.