Till now whenever working with pgn files I use a 3rd party tool to convert it to algebraic notation (e2e4, etc).
However I'm wondering how parsers get around the following scenario.
Say this is your board:
4k3/8/6K1/2n5/8/3P4/5n2/8 b KQkq - 0 1
and black makes this move in the PGN file
Nxd3 how should it know which Knight actually moved, since both knights can legally move to d3? Seems to me a full pgn parser would almost have to track legal moves and various other chess rules just to keep the board state accurate.
Using just algebraic representation it's easy since it's a literal move x to y.
Any comments or suggestions is greatly appreciated.
-Josh
PGN parsing conundrum
Moderators: hgm, Rebel, chrisw
-
- Posts: 1342
- Joined: Wed Mar 08, 2006 9:41 pm
- Location: Morgantown, WV, USA
-
- Posts: 292
- Joined: Tue Jul 07, 2009 4:56 am
Re: PGN parsing conundrum
Nxd3 is not a legally formatted move in that position. The relevant moves in standard algebraic notation are Ncxd3 and Nfxd3. Of course a pgn parser can't be expected to deal with ambiguous moves.
-
- Posts: 750
- Joined: Mon Mar 27, 2006 7:45 pm
- Location: Finland
Re: PGN parsing conundrum
A bad PGN parser would probably just pick the first move that matches "Nxd3". A good parser should notice the ambiguity and throw an error.
TrueSeems to me a full pgn parser would almost have to track legal moves and various other chess rules just to keep the board state accurate.
-
- Posts: 792
- Joined: Wed Jul 19, 2006 9:58 am
Re: PGN parsing conundrum
What caused me a lot of headaches when debugging my PGN parser (for the database statistics maker), was the opposite... when a move was described as ambiguous when it wasn't.
e.g. Ndxf3, where the "d" was not needed. I never found away of working around it.
e.g. Ndxf3, where the "d" was not needed. I never found away of working around it.
-
- Posts: 750
- Joined: Mon Mar 27, 2006 7:45 pm
- Location: Finland
Re: PGN parsing conundrum
Did your PGN parser first convert legal moves to SAN format and then compare them to the parsed string? I used to do that, and while it's easy it's also very inflexible and slow. It's better to just do it the hard way by parsing the string properly.Richard Allbert wrote:What caused me a lot of headaches when debugging my PGN parser (for the database statistics maker), was the opposite... when a move was described as ambiguous when it wasn't.
e.g. Ndxf3, where the "d" was not needed. I never found away of working around it.
-
- Posts: 5228
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: PGN parsing conundrum
Or better : it corrects the move by looking at next moves wich knight moved
ilari wrote:A bad PGN parser would probably just pick the first move that matches "Nxd3". A good parser should notice the ambiguity and throw an error.
TrueSeems to me a full pgn parser would almost have to track legal moves and various other chess rules just to keep the board state accurate.
-
- Posts: 778
- Joined: Sat Jul 01, 2006 7:11 am
Re: PGN parsing conundrum
A quantum pgn parser The board is left in an indeterminate state until it is resolved by a subsequent observation.Vinvin wrote:Or better : it corrects the move by looking at next moves wich knight moved
-
- Posts: 792
- Joined: Wed Jul 19, 2006 9:58 am
Re: PGN parsing conundrum
It does this...
Which I'm sure is the most inefficient method possible.... .
(This isn't my job, it's a hobbby )
It then tries to "make" the returned move. If it's illegal, then I enter a slow mode where I generate the SAN internally for every legal move in the current position, and compare. If this fails, give up, move to next game.
It worked on debug checking of a database of 960k games on all but 28 moves... and these were all "incorrect" ambiguity, as described above.
Richard
Code: Select all
uint santomove(const string sanmove)
{
uint length = sanmove.length();
uint last = length-1;
ASS(length>1&&length<8);
// cout<<"\n move in "<<sanmove;
bool capture = false;
bool promote = false;
bool fileambig = false;
bool rankambig = false;
bool found = false;
string::size_type locater = sanmove.find("x",0);
if(locater != string::npos) capture = true;
locater = sanmove.find("=",0);
if(locater != string::npos) promote = true;
// if(capture) cout<<" capture "; else cout<<" not capture ";
// if(promote) cout<<" promote "; else cout<<" not promote ";
uint piece=0;
uint side = SIDENOW;
uint ff,tf,fr,tr,from=0,to=0,cap=0,prom=0,flag=0,move;
ASS(side>=cW&&side<=cB);
if(sanmove == "0-0" || sanmove == "O-O" || sanmove == "0-0+" || sanmove == "O-O+" || sanmove == "0-0#" || sanmove == "O-O#")
{
cap = pE; flag = FlagCA;
if(side==cW) { from = E1; to = G1; }
else { from = E8; to = G8; }
}
else if(sanmove == "0-0-0" || sanmove == "O-O-O" || sanmove == "0-0-0+" || sanmove == "O-O-O+" || sanmove == "0-0-0#" || sanmove == "O-O-O#")
{
cap = pE; flag = FlagCA;
if(side==cW) { from = E1; to = C1; }
else { from = E8; to = C8; }
}
else if(sanmove[0] >= 'a')//pawn move?
{
// cout<<" pawn move ";
ASS(sanmove[0] >= 'a' && sanmove[0] <= 'h');
if(side == cW) piece = pwP;
else piece = pbP;
//if we don't have a capture, the first letter will be the file of the target square
if(!capture) //e4 or e8=Q
{
ASS(sanmove[1] >= '1' && sanmove[1] <= '8');
tf = chartofile(sanmove[0]);
tr = chartorank(sanmove[1]);
to = fr2sq(tf,tr);
ASS(BRDPCE(to)==pE);
ASS(onbrd(to));
//could have been a 2 move start
if(side==cW)
{
if(ranks[to]==RANK4)
{
if(BRDPCE(to+S)==pE) from=to+S+S;
else from=to+S;
}
else
{
from = to+S;
}
}
else
{
if(ranks[to]==RANK5)
{
if(BRDPCE(to+N)==pE) from=to+N+N;
else from=to+N;
}
else
{
from = to+N;
}
}
}//end of if !capture
else //form will be exf4
{
//cout<<" pcap ";
tf = chartofile(sanmove[2]);
tr = chartorank(sanmove[3]);
if(side==cW) fr = tr-1;
else fr = tr+1;
ff = chartofile(sanmove[0]);
to = fr2sq(tf,tr);
// cout<<"\n to = "<<printsquare(to);
cap = BRDPCE(to);
// cout<<" f "<<ff<<" r "<<fr;
from = fr2sq(ff,fr);
// cout<<" ? "<<printsquare(to);
ASS(onbrd(from));
ASS(onbrd(to));
ASS( (cap==pE && flag==FlagEP) || piecegood(cap) );
if(BRDPCE(to)==pE) flag=FlagEP; //ep cap
}
}
else//not pawn move
{
//identify the moving piece
switch(sanmove[0])
{
case 'K': if (side==cW) piece=pwK;else piece=pbK; break;
case 'Q': if (side==cW) piece=pwQ;else piece=pbQ; break;
case 'R': if (side==cW) piece=pwR;else piece=pbR; break;
case 'B': if (side==cW) piece=pwB;else piece=pbB; break;
case 'N': if (side==cW) piece=pwN;else piece=pbN; break;
default: {cout<<"\n piece unidentified, move "<<sanmove<<endl<<endl;exit(1);}
}
ASS(piecegood(piece));
// cout<<" piece move "<<piecechar(piece)<<" ";
//idendtify ambiguity
if(sanmove[1] >='1' && sanmove[1] <= '8')
{
rankambig=true;
// cout<<" rankambig ";
}
else
{
if(capture)
{
if(sanmove[1] >='a' && sanmove[1] <= 'h')
{
fileambig=true;
// cout<<" fileambig ";
}
}
else
{
if((sanmove[1] >='a' && sanmove[1] <= 'h') && (sanmove[2] >='a' && sanmove[2] <= 'h') )
{
fileambig=true;
// cout<<" fileambig2 ";
}
}
}
//now identify the to sq
if(!capture)
{
if(!rankambig && !fileambig)
{
tf = chartofile(sanmove[1]);
tr = chartorank(sanmove[2]);
to = fr2sq(tf,tr);
ASS(onbrd(to));
ASS(BRDPCE(to)==pE);
}
else
{
tf = chartofile(sanmove[2]);
tr = chartorank(sanmove[3]);
to = fr2sq(tf,tr);
ASS(onbrd(to));
ASS(BRDPCE(to)==pE);
}
}
else//is capture
{
if(!rankambig && !fileambig) //Nxd4
{
tf = chartofile(sanmove[2]);
tr = chartorank(sanmove[3]);
to = fr2sq(tf,tr);
cap = BRDPCE(to);
ASS(onbrd(to));
ASS(BRDPCE(to)!=pE);
}
else //Nexd4
{
tf = chartofile(sanmove[3]);
tr = chartorank(sanmove[4]);
to = fr2sq(tf,tr);
cap = BRDPCE(to);
ASS(onbrd(to));
ASS(BRDPCE(to)!=pE);
}
}
uint ar;
uint af;
uint i,pcenum,sq,tsq;
const int *movedir;
//now need the from sqaure, start with ambig, it's easier, just look for the piece on relevant file or rank
if(rankambig)
{
ar = chartorank(sanmove[1]);
ASS(ranks[N*ar]==ar);
for(i = N*ar; !(i&0x88); i+=E)
{
if(BRDPCE(i)==piece) { from=i; break; }
}
}
else if(fileambig)
{
af = chartofile(sanmove[1]);
// cout<<" ambigfile = "<<af;
for(i = af; !(i&0x88); i+=N)
{
// cout<<" trying "<<printsquare(i);
if(BRDPCE(i)==piece) {
from=i; break; }
}
}
else//not ambig, need to find the piece using the to square and the known piece type
{
pcenum = PCENUM(piece);
while(pcenum)
{
sq = PCESQ(piece, pcenum);
// cout<<" onsq "<<printsquare(sq);
ASS(onbrd(sq));
for(movedir = dirPieces[piece]; *movedir ; movedir++)
{
if(ISSLIDE(piece))
{
for(tsq = sq+*movedir; !(tsq & 0x88); tsq += *movedir)
{
// cout<<" to tsq "<<printsquare(tsq);
if(tsq==to) {from=sq; found = true; break;}
if(BRDPCE(tsq)!=pE) break;
}
}
else
{
if(sq+*movedir==to) {from=sq; found = true; break;}
}
}
if(found) break;
pcenum--;
}
}
}
if(promote)
{
ASS(ranks[to]==RANK1 || ranks[to]==RANK8);
if(sanmove[last]=='#' || sanmove[last]=='+') last--;
switch (sanmove[last])
{
case 'Q': if(side==cW) prom = pwQ; else prom = pbQ; break;
case 'R': if(side==cW) prom = pwR; else prom = pbR; break;
case 'B': if(side==cW) prom = pwB; else prom = pbB; break;
case 'N': if(side==cW) prom = pwN; else prom = pbN; break;
default: cout<<"\n sanmove prom piece not found";exit(1);ASS(true==false);break;
}
}
move = (from | (to<<7) | (cap<<16) | (prom<<20) | flag);
// cout<<"\n move is "<<printmove(move)<<" "<<move;
// cout<<" from "<<FROM(move)<<" to "<<TO(move)<<" flag "<<FLAG(move)<<" cap "<<CAP(move);
#ifdef DEBUG
gen_all_moves(NULLMOVE);
found = false;
uint i;
//first loop, looking for move and if it gives check, or mate
for( i = 0; i < MOVECOUNT(PLYNOW); ++i)
{
if(MOVE(PLYNOW,i) == move)
found=true;
}
ASS(found);
#endif
return move;
}
(This isn't my job, it's a hobbby )
It then tries to "make" the returned move. If it's illegal, then I enter a slow mode where I generate the SAN internally for every legal move in the current position, and compare. If this fails, give up, move to next game.
It worked on debug checking of a database of 960k games on all but 28 moves... and these were all "incorrect" ambiguity, as described above.
Richard