Quite amazing. How exactly did you create the file? Fully randomly?
There was a whole talkchess thread about it and some documents created, see links at the end.
In short, the positions are "fully random" in the sense that every legal (reachable from the start position) position had the same probability of being included in the sample. It was constructed like this:
Create a set S that contains all legal chess positions and optionally some non-legal positions. John Tromp constructed such a set of size N ~= 8.7e45.
Create a one-to-one mapping between the positions in S and the integers 1, 2, ..., N. Tromp constructed such a mapping too.
Create a random sample from the integers 1, 2, ..., N. Convert each integer to the corresponding chess position.
For each position, determine if it is legal. I created a program that can do this in most cases. The remaining positions have to be analyzed by hand.
The file I linked to before contains all legal positions (and some non-legal positions) that resulted from applying the above procedure.
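The procedure above is rejection sampling over a ranked superset, and it can be sketched in a few lines. This is only an illustration, not Tromp's actual mapping or my legality checker: `sample_legal`, `toy_unrank` and `toy_is_legal` are hypothetical names, and the toy set S here is just "two kings anywhere on the board" (N = 64*64), with "kings not adjacent and not on the same square" standing in for real legality.

```python
import random

def sample_legal(n_total, unrank, is_legal, n_draws, rng=None):
    """Draw n_draws integers uniformly from 1..n_total, decode each one,
    and keep only the decoded positions that pass the legality test."""
    rng = rng or random.Random()
    kept = []
    for _ in range(n_draws):
        index = rng.randint(1, n_total)  # inclusive on both ends
        candidate = unrank(index)
        if is_legal(candidate):
            kept.append(candidate)
    return kept

# Toy stand-in for the real construction (an assumption for illustration):
# S = all ordered pairs (white king square, black king square).
N_TOY = 64 * 64

def toy_unrank(index):
    """One-to-one mapping from 1..N_TOY to (wk, bk) square pairs."""
    return divmod(index - 1, 64)

def toy_is_legal(position):
    """Kings must be more than one king step apart (so never on the same square)."""
    wk, bk = position
    file_gap = abs(wk % 64 % 8 - bk % 8)
    rank_gap = abs(wk // 8 - bk // 8)
    return max(file_gap, rank_gap) > 1

if __name__ == "__main__":
    sample = sample_legal(N_TOY, toy_unrank, toy_is_legal, 10, random.Random(1))
    print(sample)
```

Because every integer is equally likely and the mapping is one-to-one, every member of S is equally likely, so the positions that survive the legality filter are uniform over the legal ones. The non-legal members of S only cost extra draws. Python integers are arbitrary precision, so `rng.randint` also works for an N as large as 8.7e45.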
There will be people who say you cannot tell whether a position is
drawn unless you can prove it with a tablebase.
Mathematically correct maybe, but not realistic, because there is a drawing
margin, and within that margin a position will stay drawish. Sure, I can think
of a position which looks drawish and only after >60 ply of analysis suddenly
turns out not to be a draw after all. But then with retro analysis of the (imaginary?)
game you can see that White or Black could have avoided such a position.
Ergo, in general, if Black plays correctly, and White not too badly, then the percentage
of drawn positions in chess is 100 pct (unless one side makes a mistake).
Adding to that, there are of course positions which are not the result of a normal game:
first, 1) arbitrary positions, without any requirement that they are the (legal) result of a normal move
sequence; second, 2) positions which are the result of a 'game', i.e. a legal sequence of moves,
no matter how crazy this sequence may have been.
Rough subjective estimate of the nr of drawn positions:
4 +/- 2 pct for category 1)
and 7 +/- 3 pct for category 2).
Just a first estimate, I may be wrong; it's difficult to estimate this without looking at
the real nrs, which someone else might do (SF7 at 25 ply with a score within +/- 0.15
may count as a 'draw', although that of course isn't always certain, but probably
is in >99 pct of such positions).
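The counting rule suggested above (a score within +/- 0.15 counts as a 'draw') could be applied to a list of engine scores roughly like this. It is a sketch only: `draw_fraction` is a hypothetical helper, the scores are assumed to be in pawns from a fixed-depth search, and the numbers in the example are made-up placeholders, not real engine output.

```python
def draw_fraction(scores_in_pawns, threshold=0.15):
    """Fraction of positions whose engine score lies within +/- threshold."""
    if not scores_in_pawns:
        raise ValueError("need at least one score")
    draws = sum(1 for score in scores_in_pawns if abs(score) <= threshold)
    return draws / len(scores_in_pawns)

if __name__ == "__main__":
    # Placeholder scores for illustration only (not measured data).
    example_scores = [0.00, 0.10, -0.15, 3.20, -7.50, 0.40, 12.00, -0.05]
    print(draw_fraction(example_scores))
```

The threshold is a parameter so the effect of a wider or narrower drawing margin can be checked on the same sample.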
TCEC positions are not relevant. They arise from biased openings and strong engines.
But if we are sampling uniformly from all legal positions, then my intuition is that almost all positions will be very imbalanced, and almost all imbalanced positions will be winning for one side.
If we sample uniformly, say at around move 20, from games between strong engines, with no opening book, then most positions will be drawish. But in an actual experiment one has to be careful not to sample exactly after n moves, because one side may have captured a piece at the end of n full moves and the other side may be about to recapture, so sampling exactly after n moves will be biased.