Finally got the thing to run. Not exactly a fair comparison yet, as the new version is compiled using PGO while the old version is not. I will work on that next. But for the results... this is using just one CPU on the E5345 box I mentioned. I picked the first position from my test file and ran both. Had to (obviously) use different depths, since the old program doesn't do LMR or anywhere near the forward pruning that 23.4 is doing. But check out the NPS numbers:
log.001: time=9.63 mat=0 n=38507117 fh=91% nps=4.0M
log.002: time: 6.39 cpu:100% mat:0 n:25801056 nps:4017283
For this position, the NPS is almost exactly the same, which is pretty damned good. It likely means that the old program can reach about 4.5M nps with PGO (assuming a 10% gain; it could be a bit more). I sort of expected the old version to be faster since the new version has a bit more in the eval, but as it stands they end up close.
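As a rough check on the figures above, the NPS values can be recomputed from the node counts and times in the two log lines. This is only a sketch: the 10% PGO gain is the assumption stated in the post, and the function and variable names are made up for illustration.

```python
def nodes_per_second(nodes: int, seconds: float) -> float:
    """Plain nodes / elapsed time, as in the log lines."""
    return nodes / seconds


# Values copied from log.001 (new version) and log.002 (old version).
new_nps = nodes_per_second(38_507_117, 9.63)
old_nps = nodes_per_second(25_801_056, 6.39)

# Assumption from the post: PGO is worth roughly 10% for the old version.
PGO_GAIN = 0.10
old_nps_with_pgo = old_nps * (1.0 + PGO_GAIN)

print(f"new: {new_nps / 1e6:.2f}M nps")
print(f"old: {old_nps / 1e6:.2f}M nps")
print(f"old with assumed PGO gain: {old_nps_with_pgo / 1e6:.2f}M nps")
```

The recomputed value for log.002 comes out slightly above the 4017283 printed in the log, presumably because the displayed time is rounded.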
And when you add in SMP, where both versions go to over 30M nodes per second, it seems that, at least for Crafty, that 1000x number is correct. Got some cleanup to do (old version only uses protocol version 1 stuff, but my referee expects "move xxx", so I have to get that to work next).
I will add that for the above position, new version (log.001) searched to depth=19, old version (log.002) searched to depth=12. Tried to find something close to comparable. Quite a difference in depth, but the plies are nowhere near equivalent. Be interesting to see how this old version performs on the cluster test...
old crafty vs new crafty on new hardware.
Moderators: hgm, Rebel, chrisw
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: old crafty vs new crafty on new hardware.
Ack. More than I thought. Need some protover 2 stuff (myname), which my referee needs for the PGN. Old Crafty only used "setboard", while the new version will accept a straight FEN string by itself. The referee does not send setboard; it looks easier to fix the referee than the old version. Most likely there are other things as well. Looks like something to play around with for the weekend. Really want to get an Elo for this 1995 version, if I can...
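The referee-side fix for the setboard difference could look something like the sketch below. It is not the actual referee code: `position_command` and its `needs_setboard` flag are hypothetical names, and the only behavior assumed is what the post describes (the old version wants the position prefixed with "setboard", the new one accepts a bare FEN string).

```python
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"


def position_command(fen: str, needs_setboard: bool) -> str:
    """Build the line the referee sends to set up a position.

    needs_setboard is a hypothetical per-engine flag: True for an engine
    that only understands "setboard <FEN>", False for one that accepts
    the FEN string by itself.
    """
    fen = fen.strip()
    # Tolerate input that already carries the prefix.
    if fen.startswith("setboard "):
        fen = fen[len("setboard "):].strip()
    return f"setboard {fen}" if needs_setboard else fen


print(position_command(START_FEN, needs_setboard=True))
print(position_command(START_FEN, needs_setboard=False))
```

Keeping the difference in one small function on the referee side leaves the old engine's source untouched, which matters if the goal is to rate the old version as it was.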
-
- Posts: 13447
- Joined: Wed Mar 08, 2006 9:02 pm
- Location: Dallas, Texas
- Full name: Matthew Hull
Re: old crafty vs new crafty on new hardware.
What version is old Crafty, 9.x or thereabouts, or earlier than that?
Matthew Hull
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: old crafty vs new crafty on new hardware.
This is 10.18, which is all I have. This played in the 1996 WMCCC event during the summer (Jakarta event). Back in 1995 I had the complete disk failure that lost all old versions, and we discovered that our tape backup system was merrily writing backup tapes that could not be read. Versions thru 9 were done very early in 1995. If you look at the comments, most new versions (major versions) were done quickly as major features were added...
Version 10.0 was a new book format (learning, etc.) and was started in August/September 1995. In looking at the comments, most changes were related to that. We didn't release 10.18 until Jakarta was done, and at that point the new versions started to slow down. Thru the middle of the 10.x series, I was releasing a new version almost daily, fixing bugs or adding features that were requested, many of which did not improve the chess playing (annotate code, analyze mode for analysis, etc.).
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: old crafty vs new crafty on new hardware. Some results.
This is quite early, but is perhaps a bit surprising. Looks like I have this 10.x version working on the cluster (will have to wait for a complete run and check the PGN for oddities to make sure it is not losing on time excessively or anything). Here are the results compared to 23.4 and some of the lower-rated programs in my test group:
Code: Select all
Name                Elo    +    -  games  score  oppo.  draws
Crafty-23.4        2703    4    4  30000    66%   2579    22%
Crafty-23.3        2693    4    4  30000    65%   2579    22%
Crafty-23.1        2622    4    4  30000    55%   2579    23%
Glaurung 2.2       2606    3    3  60277    46%   2636    22%
Toga2              2599    3    3  60275    45%   2636    23%
Fruit 2.1          2501    3    3  60248    32%   2636    21%
Glaurung 1.1 SMP   2444    3    3  60267    26%   2636    17%
Crafty-10.18       2326   19   19   1327    20%   2580    14%
I was thinking this would be much worse. To clarify what the above is...
Everything is running on our cluster. This is the cluster I have used to post _all_ results here in recent years; the hardware is about 4 years old, as previously mentioned. Crafty-10.18 is about 10% slower than it should be, as I have yet to tackle the PGO stuff. Took a lot of work to make the old version work with the more modern xboard protocol. Had a lot of fun with force and such.
I'll report the final results for this run, although this will not be the overall "final results." Got to make sure nothing odd is happening in the PGN, and then get the PGO working.
More later, but at least it seems to be playing... All this really measures is "how far behind is 1995 Crafty, giving everyone equal (and modern) hardware?" I'd suspect it would not be as far behind if everyone was on a P5/90; will work on that angle later.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: old crafty vs new crafty on new hardware. Some results.
Did find one small bug and have re-started. The en passant target convention changed. In 1995 my FEN parser assumed that the target was the square the pawn stopped on, not the square the pawn passed over. This caused an occasional loss on time for old Crafty in the few positions with EP captures. There were not many, statistically, but I have fixed it and have re-started the test. Will let it run for 15 minutes or so to see if I see any other time losses that should not happen in an increment game...
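To make the two en passant conventions concrete, here is a minimal sketch. It is not Crafty's parser; the helper names are hypothetical, and it only relies on the standard FEN rule that the en passant field names the square the pawn passed over.

```python
def ep_field(fen: str) -> str:
    """Return the en passant field (4th field) of a FEN string, or '-'."""
    fields = fen.split()
    return fields[3] if len(fields) > 3 else "-"


def pawn_square_from_target(target: str) -> str:
    """Standard FEN target (square passed over) -> square the pawn stands on."""
    file_letter, rank = target[0], target[1]
    if rank == "3":  # white pawn went from rank 2 to rank 4
        return file_letter + "4"
    if rank == "6":  # black pawn went from rank 7 to rank 5
        return file_letter + "5"
    raise ValueError(f"not a legal en passant target: {target}")


def target_from_pawn_square(square: str) -> str:
    """Square the pawn stopped on (the 1995 reading) -> standard FEN target."""
    file_letter, rank = square[0], square[1]
    if rank == "4":
        return file_letter + "3"
    if rank == "5":
        return file_letter + "6"
    raise ValueError(f"no double push ends on {square}")


# Position after 1. e4: the FEN field is e3, while the pawn stands on e4.
fen_after_e4 = "rnbqkbnr/pppppppp/8/8/4P3/8/PPPPPPPP/RNBQKBNR b KQkq e3 0 1"
target = ep_field(fen_after_e4)
print(target, pawn_square_from_target(target))
```

A parser that reads the field as the pawn's own square looks for a pawn on e3 here, which is the kind of mismatch described above.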
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: old crafty vs new crafty on new hardware. Some results.
Looks pretty good so far. 6000+ games, 6 lost on time, all by old Crafty. In two of those it was winning, 1 was lost, and 3 were just games. Not going to try to fix this, as this is what was in 1995...
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: old crafty vs new crafty on new hardware. Some results.
OK, are these running head to head with no time handicap?
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: old crafty vs new crafty on new hardware. Some results.
Hi Bob,
I think these numbers are proving that software is a bigger contributor to chess improvement than hardware. I am surprised that YOUR test (which I consider biased) is proving MY point in this case.
Don
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: complete results
Don wrote: Ok, Are these running head to head with no time handicap?
I've explained that several times now. This is version 10.x, compiled on my E5345 box, running perfectly normally. Same time control that I used to produce the 23.4 results. No time handicap. Same (actually almost the same) hash sizes. 23.4 uses a power of 2 since bucketsize=4; 10.x uses the older Belle approach, which means 3/4 of a power of 2. So 10.x is using 3/4 the hash size of 23.4... can't fix that without changing the hash, and then it would not be quite 10.x any more.
Here are the final numbers:
Code: Select all
Name                Elo    +    -  games  score  oppo.  draws
Crafty-23.4-2      2749    3    3  30000    66%   2626    22%
Crafty-23.4-1      2746    3    3  30000    66%   2626    22%
Crafty-10.18-1     2388    4    4  30000    22%   2626    14%
Crafty-10.18-2     2387    4    4  30000    22%   2626    14%
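To put the gap in the final numbers in perspective, the sketch below averages the two runs of each version and converts the rating difference into an expected score with the standard logistic Elo formula. That formula is assumed here purely for illustration; the rating tool that produced the table may use a different model (for instance one that accounts for draws), so treat the result as approximate.

```python
def expected_score(elo_diff: float) -> float:
    """Standard logistic Elo expectation for the higher-rated side."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# Ratings copied from the final table above.
final_ratings = {
    "Crafty-23.4-2": 2749,
    "Crafty-23.4-1": 2746,
    "Crafty-10.18-1": 2388,
    "Crafty-10.18-2": 2387,
}

new_avg = (final_ratings["Crafty-23.4-1"] + final_ratings["Crafty-23.4-2"]) / 2
old_avg = (final_ratings["Crafty-10.18-1"] + final_ratings["Crafty-10.18-2"]) / 2
gap = new_avg - old_avg

print(f"rating gap on equal hardware: {gap:.1f} Elo")
print(f"expected score for 23.4 head to head: {expected_score(gap):.3f}")
```

Note that the two versions did not play each other in this test; both were rated against the same pool of opponents, so the head-to-head figure is an extrapolation.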