Crafty tests show that Software has advanced more.

bob
Posts: 20914
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Crafty tests show that Software has advanced more.

Post by bob » Mon Sep 13, 2010 3:48 am

michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
mhull wrote:
Don wrote:
mhull wrote:
Don wrote:The main problem is the 64 bit vs 32 bit difference.
If you believe that, then why are you testing 32-bit Rebel on 64-bit modern hardware? That's rendering Rebel as cripple-ware, because it's not optimized for 64-bit. So it's unfair by your definition.
A 32 bit program is not crippled on a 64 bit machine. Run a 32 bit program on a 32 bit machine and then time it on a 64 bit machine and you will see it runs just as well.

Then do the same experiment with a 64 bit program and your eyes will be opened.
Don wrote: There is this argument that 32 bit is not the right way to write a program that runs on a 64 bit machine. But I don't think anyone has actually proved that. It's difficult to prove because it's a whole different way of writing a program so you cannot just compare 2 programs.
Usually, sowing a little FUD is a resource for a not-very-strong argument. I'll sprinkle some agent orange on it by saying most people think crafty is one of the faster searchers. Sure, it's not proof, just like your doubts.
Don wrote:The primary argument in favor of 64 bit is Rybka,
It could hardly have been so in the 1990s when crafty was born.
Because there was none in the 90's!

Bob was heavily criticized for this design choice, and day after day we had to hear the argument "How many bitboard programs are in the top 10?" blah, blah, blah. Few believe in bitboards, or at least, it was a common strategy to disregard the technique as not practical. Many times I suspected it was a commercial strategy to disregard what you were not doing, but that was my impression. I followed this argument very closely over the years because I started to program my move generator by 1997 in bitboards. I never did anything different and I was very interested to hear the comments (now bob extrapolates that I criticize his choice :roll:).

Rybka was finally the break that FINALLY convinced people that the technique could lead a program to the top. If the #1 is a bitboard program, of course, the previous statement was proven.

Miguel


Sorry, but this is wrong. Chess 4.x was bitboard in 1974. As were the Russians with Kaissa. As was Tom Truscott with Duchess in 1977. Bitboards have been around since the beginning of computer chess. We had dozens of bitboard programs well before Rybka became yet another bitboard program. Rybka was not "the beginning". It was well after "the end" of the debate.
Very well known facts, but irrelevant to what I wrote.

Rybka was not after the end of the debate, Rybka was the end of the debate.

Miguel
What is that based on? We had dozens of bitboard programs prior to Rybka. I do not know of any after Rybka that were not there before it was modified to use bitboards... There is a limit to how much credit one can give to Rybka. It's certainly strong. It was not the "final coronation" for bitboard programs. It came to that party _way_ late.
Who is talking about credit?

When Rybka became #1, it was no longer possible to come up with any type of argument to demonstrate the alleged "inferiority" of bitboards (even if bitboards had nothing to do with R becoming #1). The fact is all the discussions we had seen in the past evaporated.

Miguel
Exactly _what_ discussions? The only person I recall in the past 10-15 years continually bashing bitboards was Vincent D. I've not seen anybody else suggest that they were not working, and I have seen dozens of new programs using them. The magic move generation development showed that many were using them. Etc. If Vincent carries that much weight with you, fine. But I have not seen anyone else suggest bitboards are no good. And I haven't seen anyone at all suggest such in the past 5 years, well before Rybka existed or used bitboards.

This is wishful thinking.
It took me a few seconds to find a post from C. Theron criticizing bitboards.
Yes, Vincent too.

>>>>You have just explained why the bitboarders are less handicapped on 64 bits
>>>>machines.
>>>>
>>>>You have not explained why they are supposed to have "a bit of performance
>>>>advantage on 64 bits processors".
>>>>
>>>>
>>>>
>>>> Christophe

>>>
>>>Clearly, nothing beats the ugliness of bitboards.
>>>
>>>
>>>
>>> Christophe

I cannot believe you do not remember any of these posts.

Miguel
"ugliness" has nothing to do with performance. Any others you can recall? CT has not posted here in at least 5 years or so, so as I said, nothing recently.
That is the whole point!! because all of those discussions happened before Rybka became #1

You want more posts? Why? so you can keep ignoring them ;-) ? You answered the second but ignored the first one, which said

>>>>You have just explained why the bitboarders are less handicapped on 64 bits
>>>>machines.

That is directly related to performance.

Miguel
Sorry, but that is an exaggeration. OK, Rybka using bitboards might have silenced CT. He's not been active for 10 years now. But bitboards were here to stay 10 years ago. We had an active group working on them when I started in late 1994. We now have hundreds of bitboard programs running around. This is not "post Rybka". In fact, one could quite logically make the case that all the bitboard use in the past convinced _Vas_ to switch. Not the other way around...

BubbaTough
Posts: 1154
Joined: Fri Jun 23, 2006 3:18 am

Re: Crafty tests show that Software has advanced more.

Post by BubbaTough » Mon Sep 13, 2010 3:50 am

bob wrote:
BubbaTough wrote:
bob wrote:
BubbaTough wrote:
bob wrote:
You don't like my using a program from 1995, that was designed to run on 1995 hardware and be fast, and now because it happens to be able to use the last 8 years of hardware improvements, that is a no-no. It is breaking some mythical rule that you attribute to me but I do not recall ever making such.
I don't understand why one would need to pick something that is still actively developed in both time periods. The odds that one thing is a good representative of the state of software at both points is pretty low. I know comparing the program I wrote in 1990 with my current program, while interesting (to me), would not really add much to this debate.

If you need source code to get things running, you could pick the best open source code from both periods. One advantage of this, is that the best open source is often extremely influential on other programs of the time. Stockfish from this period would be fine for representing top open source from this period. Not sure what represents the best open source from the past period you are trying to compare against...perhaps Crafty? If so, Crafty 1995 vs. Stockfish 2010 would be an interesting comparison.


-Sam
The point is that it is very hard to compare program A(1995) to program B(2010). It's simply easier to run all the tests when you have the source of both, and it is (to me) a bit easier to use the same program on both ends. Each program does things differently, each can easily produce different results in these tests. Some, for example, won't see the big speedup I see in Crafty. Comparing apples to oranges always makes things less clear.
I don't understand why it's harder to compare top 1995 open source to top 2010 open source than top 1995 Crafty to top 2010 Crafty. The point is not to determine Crafty software progress, the point is to determine software progress (redefined by me as open source software progress because I know you want source code).

I understand it's easier to have the source code; luckily you do have the Stockfish 2010 source code (and Crafty 1995 source code). Regarding comparing apples to oranges...well, the whole point of the exercise is to compare apples to oranges, so I guess we are stuck there.

-Sam
Simple. There are two components: hardware and software. If one gets more from hardware than the other, then comparing the software advantage is impossible because you will be measuring hardware when you test both on the same machine. We've already seen that Rebel sucks on new hardware. This affects its overall Elo as well.

Much easier if there is only one variable measured at a time. Same version on two different hardware platforms. Different versions on same hardware platform. Then you know which is hardware and which is software. If you mix, you won't.
Yes, you want to change one variable at a time. I was kind of assuming you wanted to gather 4 results:

1. 1995 hardware with 2010 software
2. 2010 hardware with 1995 software
3. 2010 hardware with 2010 software
4. 1995 hardware with 1995 software

Nice and simple, assuming you can gather results comparing them. I am assuming you would be playing games on a private server or something so that the old hardware can play against new hardware to ensure the rating pools are joined. I guess an easier but less accurate way would be to take a few measurements, and then artificially cripple the nodes per second of the version supposedly running on old hardware but actually run it on new hardware.
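The four-result design above can be sketched as a small calculation. This is only an illustration under stated assumptions: the function name `decompose`, the dictionary layout, and the ratings in `example` are hypothetical placeholders, not measurements from anyone's tests. It shows how the four ratings separate into a hardware gain, a software gain, and an interaction term that is nonzero when one program gets more out of the new hardware than the other.

```python
def decompose(elo):
    """Split four ratings into hardware gain, software gain, and interaction.

    `elo` maps (hardware_year, software_year) -> rating measured in one
    joined rating pool. All four combinations must be present.
    """
    base = elo[(1995, 1995)]

    # Hardware gain: same software, old hardware -> new hardware.
    hw_gain_old_sw = elo[(2010, 1995)] - base
    hw_gain_new_sw = elo[(2010, 2010)] - elo[(1995, 2010)]

    # Software gain: same hardware, old software -> new software.
    sw_gain_old_hw = elo[(1995, 2010)] - base
    sw_gain_new_hw = elo[(2010, 2010)] - elo[(2010, 1995)]

    # Interaction: how much more the new software gets out of the new
    # hardware than the old software does. Zero means the two effects
    # simply add up.
    interaction = hw_gain_new_sw - hw_gain_old_sw

    return {
        "hardware_gain_old_software": hw_gain_old_sw,
        "hardware_gain_new_software": hw_gain_new_sw,
        "software_gain_old_hardware": sw_gain_old_hw,
        "software_gain_new_hardware": sw_gain_new_hw,
        "interaction": interaction,
    }


# Placeholder ratings, chosen only to exercise the arithmetic.
example = {
    (1995, 1995): 0,
    (2010, 1995): 600,
    (1995, 2010): 500,
    (2010, 2010): 1200,
}

for name, value in decompose(example).items():
    print(name, value)
```

If the interaction term comes out large, the "software gain" depends on which hardware you measure it on, which is the concern about mixing the two variables.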

Anyway, however you want to gather your results, my question was only regarding what software represents the 2010 software. My recommendation was Stockfish.

-Sam

michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 7:30 pm
Location: Chicago, Illinois, USA

Re: Crafty tests show that Software has advanced more.

Post by michiguel » Mon Sep 13, 2010 3:52 am

bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
mhull wrote:
Don wrote:
mhull wrote:
Don wrote:The main problem is the 64 bit vs 32 bit difference.
If you believe that, then why are you testing 32-bit Rebel on 64-bit modern hardware? That's rendering Rebel as cripple-ware, because it's not optimized for 64-bit. So it's unfair by your definition.
A 32 bit program is not crippled on a 64 bit machine. Run a 32 bit program on a 32 bit machine and then time it on a 64 bit machine and you will see it runs just as well.

Then do the same experiment with a 64 bit program and your eyes will be opened.
Don wrote: There is this argument that 32 bit is not the right way to write a program that runs on a 64 bit machine. But I don't think anyone has actually proved that. It's difficult to prove because it's a whole different way of writing a program so you cannot just compare 2 programs.
Usually, sowing a little FUD is a resource for a not-very-strong argument. I'll sprinkle some agent orange on it by saying most people think crafty is one of the faster searchers. Sure, it's not proof, just like your doubts.
Don wrote:The primary argument in favor of 64 bit is Rybka,
It could hardly have been so in the 1990s when crafty was born.
Because there was none in the 90's!

Bob was heavily criticized for this design choice, and day after day we had to hear the argument "How many bitboard programs are in the top 10?" blah, blah, blah. Few believe in bitboards, or at least, it was a common strategy to disregard the technique as not practical. Many times I suspected it was a commercial strategy to disregard what you were not doing, but that was my impression. I followed this argument very closely over the years because I started to program my move generator by 1997 in bitboards. I never did anything different and I was very interested to hear the comments (now bob extrapolates that I criticize his choice :roll:).

Rybka was finally the break that FINALLY convinced people that the technique could lead a program to the top. If the #1 is a bitboard program, of course, the previous statement was proven.

Miguel


Sorry, but this is wrong. Chess 4.x was bitboard in 1974. As were the Russians with Kaissa. As was Tom Truscott with Duchess in 1977. Bitboards have been around since the beginning of computer chess. We had dozens of bitboard programs well before Rybka became yet another bitboard program. Rybka was not "the beginning". It was well after "the end" of the debate.
Very well known facts, but irrelevant to what I wrote.

Rybka was not after the end of the debate, Rybka was the end of the debate.

Miguel
What is that based on? We had dozens of bitboard programs prior to Rybka. I do not know of any after Rybka that were not there before it was modified to use bitboards... There is a limit to how much credit one can give to Rybka. It's certainly strong. It was not the "final coronation" for bitboard programs. It came to that party _way_ late.
Who is talking about credit?

When Rybka became #1, it was no longer possible to come up with any type of argument to demonstrate the alleged "inferiority" of bitboards (even if bitboards had nothing to do with R becoming #1). The fact is all the discussions we had seen in the past evaporated.

Miguel
Exactly _what_ discussions? The only person I recall in the past 10-15 years continually bashing bitboards was Vincent D. I've not seen anybody else suggest that they were not working, and I have seen dozens of new programs using them. The magic move generation development showed that many were using them. Etc. If Vincent carries that much weight with you, fine. But I have not seen anyone else suggest bitboards are no good. And I haven't seen anyone at all suggest such in the past 5 years, well before Rybka existed or used bitboards.

This is wishful thinking.
It took me a few seconds to find a post from C. Theron criticizing bitboards.
Yes, Vincent too.

>>>>You have just explained why the bitboarders are less handicapped on 64 bits
>>>>machines.
>>>>
>>>>You have not explained why they are supposed to have "a bit of performance
>>>>advantage on 64 bits processors".
>>>>
>>>>
>>>>
>>>> Christophe

>>>
>>>Clearly, nothing beats the ugliness of bitboards.
>>>
>>>
>>>
>>> Christophe

I cannot believe you do not remember any of these posts.

Miguel
"ugliness" has nothing to do with performance. Any others you can recall? CT has not posted here in at least 5 years or so, so as I said, nothing recently.
That is the whole point!! because all of those discussions happened before Rybka became #1

You want more posts? Why? so you can keep ignoring them ;-) ? You answered the second but ignored the first one, which said

>>>>You have just explained why the bitboarders are less handicapped on 64 bits
>>>>machines.

That is directly related to performance.

Miguel
Sorry, but that is an exaggeration. OK, rybka using bitboards might have silenced CT.
And others. That was the whole point. Thank you.

Miguel
PS: What follows is irrelevant.

He's not been active for 10 years now. But bitboards were here to stay 10 years ago. We had an active group working on them when I started in late 1994. We now have hundreds of bitboard programs running around. This is not "post Rybka". In fact, one could quite logically make the case that all the bitboard use in the past convinced _Vas_ to switch. Not the other way around...

bob
Posts: 20914
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Crafty tests show that Software has advanced more.

Post by bob » Mon Sep 13, 2010 3:57 am

michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
mhull wrote:
Don wrote:
mhull wrote:
Don wrote:The main problem is the 64 bit vs 32 bit difference.
If you believe that, then why are you testing 32-bit Rebel on 64-bit modern hardware? That's rendering Rebel as cripple-ware, because it's not optimized for 64-bit. So it's unfair by your definition.
A 32 bit program is not crippled on a 64 bit machine. Run a 32 bit program on a 32 bit machine and then time it on a 64 bit machine and you will see it runs just as well.

Then do the same experiment with a 64 bit program and your eyes will be opened.
Don wrote: There is this argument that 32 bit is not the right way to write a program that runs on a 64 bit machine. But I don't think anyone has actually proved that. It's difficult to prove because it's a whole different way of writing a program so you cannot just compare 2 programs.
Usually, sowing a little FUD is a resource for a not-very-strong argument. I'll sprinkle some agent orange on it by saying most people think crafty is one of the faster searchers. Sure, it's not proof, just like your doubts.
Don wrote:The primary argument in favor of 64 bit is Rybka,
It could hardly have been so in the 1990s when crafty was born.
Because there was none in the 90's!

Bob was heavily criticized for this design choice, and day after day we had to hear the argument "How many bitboard programs are in the top 10?" blah, blah, blah. Few believe in bitboards, or at least, it was a common strategy to disregard the technique as not practical. Many times I suspected it was a commercial strategy to disregard what you were not doing, but that was my impression. I followed this argument very closely over the years because I started to program my move generator by 1997 in bitboards. I never did anything different and I was very interested to hear the comments (now bob extrapolates that I criticize his choice :roll:).

Rybka was finally the break that FINALLY convinced people that the technique could lead a program to the top. If the #1 is a bitboard program, of course, the previous statement was proven.

Miguel


Sorry, but this is wrong. Chess 4.x was bitboard in 1974. As were the Russians with Kaissa. As was Tom Truscott with Duchess in 1977. Bitboards have been around since the beginning of computer chess. We had dozens of bitboard programs well before Rybka became yet another bitboard program. Rybka was not "the beginning". It was well after "the end" of the debate.
Very well known facts, but irrelevant to what I wrote.

Rybka was not after the end of the debate, Rybka was the end of the debate.

Miguel
What is that based on? We had dozens of bitboard programs prior to Rybka. I do not know of any after Rybka that were not there before it was modified to use bitboards... There is a limit to how much credit one can give to Rybka. It's certainly strong. It was not the "final coronation" for bitboard programs. It came to that party _way_ late.
Who is talking about credit?

When Rybka became #1, it was no longer possible to come up with any type of argument to demonstrate the alleged "inferiority" of bitboards (even if bitboards had nothing to do with R becoming #1). The fact is all the discussions we had seen in the past evaporated.

Miguel
Exactly _what_ discussions? The only person I recall in the past 10-15 years continually bashing bitboards was Vincent D. I've not seen anybody else suggest that they were not working, and I have seen dozens of new programs using them. The magic move generation development showed that many were using them. Etc. If Vincent carries that much weight with you, fine. But I have not seen anyone else suggest bitboards are no good. And I haven't seen anyone at all suggest such in the past 5 years, well before Rybka existed or used bitboards.

This is wishful thinking.
It took me a few seconds to find a post from C. Theron criticizing bitboards.
Yes, Vincent too.

>>>>You have just explained why the bitboarders are less handicapped on 64 bits
>>>>machines.
>>>>
>>>>You have not explained why they are supposed to have "a bit of performance
>>>>advantage on 64 bits processors".
>>>>
>>>>
>>>>
>>>> Christophe

>>>
>>>Clearly, nothing beats the ugliness of bitboards.
>>>
>>>
>>>
>>> Christophe

I cannot believe you do not remember any of these posts.

Miguel
"ugliness" has nothing to do with performance. Any others you can recall? CT has not posted here in at least 5 years or so, so as I said, nothing recently.
That is the whole point!! because all of those discussions happened before Rybka became #1

You want more posts? Why? so you can keep ignoring them ;-) ? You answered the second but ignored the first one, which said

>>>>You have just explained why the bitboarders are less handicapped on 64 bits
>>>>machines.

That is directly related to performance.

Miguel
Sorry, but that is an exaggeration. OK, rybka using bitboards might have silenced CT.
And others. That was the whole point. Thank you.

Miguel
PS: What follows is irrelevant.
If you want to believe that, feel free. I consider it to be utter nonsense. Again, I have not seen any "bitboard detractors" in the last 5+ years, except perhaps for Vincent. Bitboard Rybka didn't exist then. I believe that most could read the discussions about bitboards, and make the decision about whether to try them or not, without worrying about whether just Rybka used bitboards or not.

I believe that Rybka uses bitboards because they work. Because so many were already working on them. Not the other way around. No technically competent person would believe bitboards are less efficient than a big array of squares. Some might not like the syntax, or the different way of thinking it requires. But I can't imagine anyone with any programming skills not seeing the obvious advantages they offer. Particularly on 64 bit hardware...
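To make the 64-bit point concrete, here is a minimal sketch of the idea. It is written in Python for readability and is not taken from Crafty, Rybka, or any other engine; the square numbering (a1 = bit 0, h1 = bit 7, h8 = bit 63) and all names are assumptions for illustration. The whole board fits in one 64-bit word, so the attack set for every knight on the board comes from a handful of shifts and masks, with no loop over squares.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF          # keep results inside 64 bits
FILE_A = 0x0101010101010101          # one bit per rank on the a-file
FILE_H = 0x8080808080808080          # one bit per rank on the h-file
FILE_AB = FILE_A | (FILE_A << 1)     # a- and b-files
FILE_GH = FILE_H | (FILE_H >> 1)     # g- and h-files


def knight_attacks(knights):
    """Return a bitboard of all squares attacked by the given knights.

    Each shift moves every knight at once; the file masks drop knights
    that would wrap around the edge of the board.
    """
    attacks = 0
    attacks |= (knights & ~FILE_H) << 17    # up 2, right 1
    attacks |= (knights & ~FILE_A) << 15    # up 2, left 1
    attacks |= (knights & ~FILE_GH) << 10   # up 1, right 2
    attacks |= (knights & ~FILE_AB) << 6    # up 1, left 2
    attacks |= (knights & ~FILE_A) >> 17    # down 2, left 1
    attacks |= (knights & ~FILE_H) >> 15    # down 2, right 1
    attacks |= (knights & ~FILE_AB) >> 10   # down 1, left 2
    attacks |= (knights & ~FILE_GH) >> 6    # down 1, right 2
    return attacks & MASK64


def squares(bitboard):
    """List the square indices (0..63) whose bits are set."""
    return [sq for sq in range(64) if (bitboard >> sq) & 1]


# A knight on b1 (bit 1) attacks d2, a3 and c3.
print(squares(knight_attacks(1 << 1)))
```

On a 64-bit machine each of those shifts and masks is one register operation; on 32-bit hardware every one of them has to be split across two words, which is where the handicap comes from.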

Rybka has nothing to do with that understanding. Other than the fact that Vas followed the discussions and also concluded that they work. To believe otherwise requires an over-active imagination.

He's not been active for 10 years now. But bitboards were here to stay 10 years ago. We had an active group working on them when I started in late 1994. We now have hundreds of bitboard programs running around. This is not "post Rybka". In fact, one could quite logically make the case that all the bitboard use in the past convinced _Vas_ to switch. Not the other way around...

bob
Posts: 20914
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Crafty tests show that Software has advanced more.

Post by bob » Mon Sep 13, 2010 4:02 am

BubbaTough wrote:
bob wrote:
BubbaTough wrote:
bob wrote:
BubbaTough wrote:
bob wrote:
You don't like my using a program from 1995, that was designed to run on 1995 hardware and be fast, and now because it happens to be able to use the last 8 years of hardware improvements, that is a no-no. It is breaking some mythical rule that you attribute to me but I do not recall ever making such.
I don't understand why one would need to pick something that is still actively developed in both time periods. The odds that one thing is a good representative of the state of software at both points is pretty low. I know comparing the program I wrote in 1990 with my current program, while interesting (to me), would not really add much to this debate.

If you need source code to get things running, you could pick the best open source code from both periods. One advantage of this, is that the best open source is often extremely influential on other programs of the time. Stockfish from this period would be fine for representing top open source from this period. Not sure what represents the best open source from the past period you are trying to compare against...perhaps Crafty? If so, Crafty 1995 vs. Stockfish 2010 would be an interesting comparison.


-Sam
The point is that it is very hard to compare program A(1995) to program B(2010). It's simply easier to run all the tests when you have the source of both, and it is (to me) a bit easier to use the same program on both ends. Each program does things differently, each can easily produce different results in these tests. Some, for example, won't see the big speedup I see in Crafty. Comparing apples to oranges always makes things less clear.
I don't understand why it's harder to compare top 1995 open source to top 2010 open source than top 1995 Crafty to top 2010 Crafty. The point is not to determine Crafty software progress, the point is to determine software progress (redefined by me as open source software progress because I know you want source code).

I understand it's easier to have the source code; luckily you do have the Stockfish 2010 source code (and Crafty 1995 source code). Regarding comparing apples to oranges...well, the whole point of the exercise is to compare apples to oranges, so I guess we are stuck there.

-Sam
Simple. There are two components: hardware and software. If one gets more from hardware than the other, then comparing the software advantage is impossible because you will be measuring hardware when you test both on the same machine. We've already seen that Rebel sucks on new hardware. This affects its overall Elo as well.

Much easier if there is only one variable measured at a time. Same version on two different hardware platforms. Different versions on same hardware platform. Then you know which is hardware and which is software. If you mix, you won't.
Yes, you want to change one variable at a time. I was kind of assuming you wanted to gather 4 results:

1. 1995 hardware with 2010 software
2. 2010 hardware with 1995 software
3. 2010 hardware with 2010 software
4. 1995 hardware with 1995 software

Nice and simple, assuming you can gather results comparing them. I am assuming you would be playing games on a private server or something so that the old hardware can play against new hardware to ensure the rating pools are joined. I guess an easier but less accurate way would be to take a few measurements, and then artificially cripple the nodes per second of the version supposedly running on old hardware but actually run it on new hardware.

Anyway, however you want to gather your results, my question was only regarding what software represents the 2010 software. My recommendation was Stockfish.

-Sam
If you compare Stockfish to something else, you still have that extra degree of freedom. If you run Stockfish on old/new hardware it might do better or worse on either than a different program. Yet you end up comparing new Stockfish to some other old program to measure software change. But is what you measure all software? If you take a 1995 program that was very good on that hardware and run it on current hardware and compare to Stockfish, are you just measuring software improvement? Or software + some hardware, since the old program won't use new hardware features while Stockfish might?

In my case, there were no hidden issues, so it seemed easier and more accurate. I'd like to see a Stockfish comparison since it seems to be almost equal to Rybka. But the problem is, what to compare it against to make sure there is no hardware interference?

michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 7:30 pm
Location: Chicago, Illinois, USA

Re: Crafty tests show that Software has advanced more.

Post by michiguel » Mon Sep 13, 2010 4:39 am

bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
bob wrote:
michiguel wrote:
mhull wrote:
Don wrote:
mhull wrote:
Don wrote:The main problem is the 64 bit vs 32 bit difference.
If you believe that, then why are you testing 32-bit Rebel on 64-bit modern hardware? That's rendering Rebel as cripple-ware, because it's not optimized for 64-bit. So it's unfair by your definition.
A 32 bit program is not crippled on a 64 bit machine. Run a 32 bit program on a 32 bit machine and then time it on a 64 bit machine and you will see it runs just as well.

Then do the same experiment with a 64 bit program and your eyes will be opened.
Don wrote: There is this argument that 32 bit is not the right way to write a program that runs on a 64 bit machine. But I don't think anyone has actually proved that. It's difficult to prove because it's a whole different way of writing a program so you cannot just compare 2 programs.
Usually, sowing a little FUD is a resource for a not-very-strong argument. I'll sprinkle some agent orange on it by saying most people think crafty is one of the faster searchers. Sure, it's not proof, just like your doubts.
Don wrote:The primary argument in favor of 64 bit is Rybka,
It could hardly have been so in the 1990s when crafty was born.
Because there was none in the 90's!

Bob was heavily criticized for this design choice, and day after day we had to hear the argument "How many bitboard programs are in the top 10?" blah, blah, blah. Few believe in bitboards, or at least, it was a common strategy to disregard the technique as not practical. Many times I suspected it was a commercial strategy to disregard what you were not doing, but that was my impression. I followed this argument very closely over the years because I started to program my move generator by 1997 in bitboards. I never did anything different and I was very interested to hear the comments (now bob extrapolates that I criticize his choice :roll:).

Rybka was finally the break that FINALLY convinced people that the technique could lead a program to the top. If the #1 is a bitboard program, of course, the previous statement was proven.

Miguel


Sorry, but this is wrong. Chess 4.x was bitboard in 1974. As were the Russians with Kaissa. As was Tom Truscott with Duchess in 1977. Bitboards have been around since the beginning of computer chess. We had dozens of bitboard programs well before Rybka became yet another bitboard program. Rybka was not "the beginning". It was well after "the end" of the debate.
Very well known facts, but irrelevant to what I wrote.

Rybka was not after the end of the debate, Rybka was the end of the debate.

Miguel
What is that based on? We had dozens of bitboard programs prior to Rybka. I do not know of any after Rybka that were not there before it was modified to use bitboards... There is a limit to how much credit one can give to Rybka. It's certainly strong. It was not the "final coronation" for bitboard programs. It came to that party _way_ late.
Who is talking about credit?

When Rybka became #1, it was no longer possible to come up with any type of argument to demonstrate the alleged "inferiority" of bitboards (even if bitboards had nothing to do with R becoming #1). The fact is all the discussions we had seen in the past evaporated.

Miguel
Exactly _what_ discussions? The only person I recall in the past 10-15 years continually bashing bitboards was Vincent D. I've not seen anybody else suggest that they were not working, and I have seen dozens of new programs using them. The magic move generation development showed that many were using them. Etc. If Vincent carries that much weight with you, fine. But I have not seen anyone else suggest bitboards are no good. And I haven't seen anyone at all suggest such in the past 5 years, well before Rybka existed or used bitboards.

This is wishful thinking.
It took me a few seconds to find a post from C. Theron criticizing bitboards.
Yes, Vincent too.

>>>>You have just explained why the bitboarders are less handicapped on 64 bits
>>>>machines.
>>>>
>>>>You have not explained why they are supposed to have "a bit of performance
>>>>advantage on 64 bits processors".
>>>>
>>>>
>>>>
>>>> Christophe

>>>
>>>Clearly, nothing beats the ugliness of bitboards.
>>>
>>>
>>>
>>> Christophe

I cannot believe you do not remember any of these posts.

Miguel
"ugliness" has nothing to do with performance. Any others you can recall? CT has not posted here in at least 5 years or so, so as I said, nothing recently.
That is the whole point!! because all of those discussions happened before Rybka became #1

You want more posts? Why? so you can keep ignoring them ;-) ? You answered the second but ignored the first one, which said

>>>>You have just explained why the bitboarders are less handicapped on 64 bits
>>>>machines.

That is directly related to performance.

Miguel
Sorry, but that is an exaggeration. OK, Rybka using bitboards might have silenced CT.
And others. That was the whole point. Thank you.

Miguel
PS: What follows is irrelevant.
If you want to believe that, feel free. I consider it to be utter nonsense. Again, I have not seen any "bitboard detractors" in the last 5+ years, except perhaps for Vincent. Bitboard Rybka didn't exist then. I believe that most could read the discussions about bitboards, and make the decision about whether to try them or not, without worrying about whether just Rybka used bitboards or not.

I believe that Rybka uses bitboards because they work. Because so many were already working on them. Not the other way around. No technically competent person would believe bitboards are less efficient than a big array of squares. Some might not like the syntax, or the different way of thinking it requires. But I can't imagine anyone with any programming skills not seeing the obvious advantages they offer. Particularly on 64 bit hardware...
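To make the contrast with "a big array of squares" concrete, here is a minimal sketch. It is not taken from Crafty or Rybka: the a1 = bit 0 square numbering and every name below are assumptions chosen for illustration. It computes all squares attacked by white pawns, once with two shifts on a 64-bit word and once by visiting the board square by square.

```python
# Illustrative sketch only. Assumed layout: a1 = bit 0, h1 = bit 7, h8 = bit 63.
# None of these names come from a real engine.
MASK64 = 0xFFFFFFFFFFFFFFFF
FILE_A = 0x0101010101010101
FILE_H = 0x8080808080808080


def square(name):
    """Map a square name such as 'e4' to a bit index 0..63."""
    return (int(name[1]) - 1) * 8 + (ord(name[0]) - ord("a"))


def bitboard(*names):
    """Build a bitboard with one bit set per named square."""
    bb = 0
    for name in names:
        bb |= 1 << square(name)
    return bb


def white_pawn_attacks_bitboard(pawns):
    """Attacked squares for every white pawn at once, using two shifts."""
    left = (pawns & ~FILE_A) << 7    # captures toward the a-file
    right = (pawns & ~FILE_H) << 9   # captures toward the h-file
    return (left | right) & MASK64   # Python ints are unbounded, so trim to 64 bits


def white_pawn_attacks_array(board):
    """Same result from a 64-entry square array, one square at a time."""
    attacks = 0
    for sq in range(64):
        if board[sq] != "P":
            continue
        file, rank = sq % 8, sq // 8
        if rank == 7:
            continue  # nothing to attack beyond the last rank
        if file > 0:
            attacks |= 1 << (sq + 7)
        if file < 7:
            attacks |= 1 << (sq + 9)
    return attacks


pawn_squares = ("a2", "e4", "h2")
pawns = bitboard(*pawn_squares)
board = ["."] * 64
for name in pawn_squares:
    board[square(name)] = "P"
```

Both functions return the same set of squares; the bitboard version does a fixed amount of work regardless of how many pawns are on the board, which is the kind of advantage a native 64-bit word makes cheap.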
This is impossible.

I made a minor tangential comment to Matt and you read whatever you wanted to read. I needed 3-4 emails trying to make you understand what my point was (not whatever you thought my point was). Yet, you keep ignoring what I bring. C. Theron had some programming skills and failed to see the advantages of BB. It is in my previous email!!

And BTW, CT was not inactive for 10 years. He got a new version not *that* long ago:
http://www.lokasoft.nl/chess_tiger_2007

Miguel

Rybka has nothing to do with that understanding. Other than the fact that Vas followed the discussions and also concluded that they work. To believe otherwise requires an over-active imagination.

He's not been active for 10 years now. But bitboards were here to stay 10 years ago. We had an active group working on them when I started in late 1994. We now have hundreds of bitboard programs running around. This is not "post Rybka". In fact, one could quite logically make the case that all the bitboard use in the past convinced _Vas_ to switch. Not the other way around...

bob
Posts: 20914
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Crafty tests show that Software has advanced more.

Post by bob » Mon Sep 13, 2010 1:47 pm

I agree it is impossible. If you want to believe that Vas converting Rybka to bitboards ended the bitboard debate, feel free. The bitboard debate was ended _before_ Vas did the conversion. It is almost certainly what made him do the conversion, otherwise why go to the effort? We have hundreds of bitboard programs. Almost all started _before_ Rybka became bitboards.

This is an argument that can only rely on opinion. It can't be proved, so it is pretty pointless. If you want to believe that Vas ended the bitboard debate, that's your opinion. I (and many others) have a completely _different_ view of this, however...

BubbaTough
Posts: 1154
Joined: Fri Jun 23, 2006 3:18 am

Re: Crafty tests show that Software has advanced more.

Post by BubbaTough » Mon Sep 13, 2010 3:29 pm

bob wrote:
BubbaTough wrote:
bob wrote:
BubbaTough wrote:
bob wrote:
BubbaTough wrote:
bob wrote:
You don't like my using a program from 1995, that was designed to run on 1995 hardware and be fast, and now because it happens to be able to use the last 8 years of hardware improvements, that is a no-no. It is breaking some mythical rule that you attribute to me but I do not recall ever making such.
I don't understand why one would need to pick something that is still actively developed in both time periods. The odds that one thing is a good representative of the state of software at both points is pretty low. I know comparing the program I wrote in 1990 with my current program, while interesting (to me), would not really add much to this debate.

If you need source code to get things running, you could pick the best open source code from both periods. One advantage of this, is that the best open source is often extremely influential on other programs of the time. Stockfish from this period would be fine for representing top open source from this period. Not sure what represents the best open source from the past period you are trying to compare against...perhaps Crafty? If so, Crafty 1995 vs. Stockfish 2010 would be an interesting comparison.


-Sam
The point is that it is very hard to compare program A(1995) to program B(2010). It's simply easier to run all the tests when you have the source of both, and it is (to me) a bit easier to use the same program on both ends. Each program does things differently, each can easily produce different results in these tests. Some, for example, won't see the big speedup I see in Crafty. Comparing apples to oranges always makes things less clear.
I don't understand why it's harder to compare top 1995 open source to top 2010 open source than top 1995 crafty to top 2010 crafty. The point is not to determine crafty software progress, the point is to determine software progress (redefined by me as open source software progress because I know you want source code).

I understand it's easier to have the source code; luckily you do have the Stockfish 2010 source code (and Crafty 1995 source code). Regarding comparing apples to oranges...well, the whole point of the exercise is to compare apples to oranges, so I guess we are stuck there.

-Sam
Simple. There are two components: hardware and software. If one program gets more from hardware than the other, then comparing the software advantage is impossible because you will be measuring hardware when you test both on the same machine. We've already seen that Rebel sucks on new hardware. This affects its overall Elo as well.

Much easier if there is only one variable measured at a time. Same version on two different hardware platforms. Different versions on same hardware platform. Then you know which is hardware and which is software. If you mix, you won't.
Yes you want to change one variable at a time. I was kind of assuming you wanted to gather 4 results:

1. 1995 hardware with 2010 software
2. 2010 hardware with 1995 software
3. 2010 hardware with 2010 software
4. 1995 hardware with 1995 software

Nice and simple, assuming you can gather results comparing them. I am assuming you would be playing games on a private server or something so that the old hardware can play against new hardware to ensure the rating pools are joined. I guess an easier but less accurate way would be to take a few measurements, and then artificially cripple the nodes per second of the version supposedly running on old hardware while actually running it on new hardware.
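The four-way comparison can be sketched as a small calculation. This is only an illustration of the bookkeeping: the function name, the dictionary keys, and the ratings fed in are invented placeholders, not measurements from anyone's tests. It assumes all four configurations have been rated in one joined pool.

```python
def decompose_gains(old_hw_old_sw, new_hw_old_sw, old_hw_new_sw, new_hw_new_sw):
    """Split Elo differences from a 2x2 hardware/software design.

    Each argument is the rating of one configuration in a single joined pool.
    """
    hw_gain_old_sw = new_hw_old_sw - old_hw_old_sw
    hw_gain_new_sw = new_hw_new_sw - old_hw_new_sw
    sw_gain_old_hw = old_hw_new_sw - old_hw_old_sw
    sw_gain_new_hw = new_hw_new_sw - new_hw_old_sw
    return {
        "hardware_gain_with_old_software": hw_gain_old_sw,
        "hardware_gain_with_new_software": hw_gain_new_sw,
        "software_gain_on_old_hardware": sw_gain_old_hw,
        "software_gain_on_new_hardware": sw_gain_new_hw,
        # Non-zero when the new software benefits more (or less) from the
        # new hardware than the old software does.
        "interaction": hw_gain_new_sw - hw_gain_old_sw,
    }


# Placeholder ratings, invented purely for illustration (not measurements).
example = decompose_gains(
    old_hw_old_sw=0,
    new_hw_old_sw=700,
    old_hw_new_sw=300,
    new_hw_new_sw=1100,
)
```

If the interaction term comes out near zero, hardware and software gains can be reported separately; if not, the two are entangled and a single "hardware vs. software" number depends on which row or column is chosen.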

Anyway, however you want to gather your results, my question was only regarding what software represents the 2010 software. My recommendation was Stockfish.

-Sam
If you compare stockfish to something else, you still have that extra degree of freedom. If you run stockfish on old/new hardware it might do better or worse on either than a different program. Yet you end up comparing new stockfish to some other old program to measure software change. But is what you measure all software? If you take a 1995 program that was very good on that hardware and run it on current hardware and compare to stockfish, are you just measuring software improvement? Or software + some hardware, since the old program won't use new hardware features while stockfish might?

In my case, there were no hidden issues, so it seemed easier and more accurate. I'd like to see a stockfish comparison since it seems to be almost equal to Rybka. But the problem is, what to compare it against to make sure there is no hardware interference?
I guess I understand what you are saying, and I have always agreed with you (in fact I consider it quite obvious) that hardware advances have contributed more than software, but by using modern crafty instead of a top program you seem to be lopping off around 300 elo of today's advances (not sure what the exact number is but I suspect it's pretty big).

Perhaps if you feel only modern crafty can be converted to older hardware, you could use the program to join the two rating groups (old vs. new) but still compare the top programs of the day to answer the proposed hypothesis. I expect the same conclusion (hardware advances have contributed more than software) but by a significantly smaller margin than the way you are currently measuring things.

There is no getting around there being some hidden issues. Picking an old program that benefits from 64 bit vs. not by itself introduces some issues. I don't really have an opinion what is fair and what is not on those kind of things...but the magnitude of choosing modern crafty vs. modern stockfish seems much larger than the magnitude of those kinds of choices.

-Sam

bob
Posts: 20914
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Crafty tests show that Software has advanced more.

Post by bob » Mon Sep 13, 2010 3:41 pm

We can perhaps "close this gap" with some estimation. The problem is, what is the delta between Rybka and Crafty, engine only? I don't know. I do know the delta between stockfish 1.8 and Crafty, and on my cluster it is in the close-to-200 range, actually slightly less, but close enough. No books of any kind. If I could run Rybka on my cluster I'd like to use it to answer this question. But I can't reliably take SF 1.8 on my cluster, and because it is _very_ close to Rybka on the rating lists, say that Rybka is about +200 better. Because Rybka may be stronger (or weaker) than SF 1.8 with no book. And I can't test that to determine whether it is true or not.

So all I can _reliably_ do is measure my improvement. Which gives a big nod to hardware over software by 2:1. Or by 800:360 to be precise. It would still be 800:560 to 800:660 depending on the Rybka rating. But even 100 Elo is significant; 200 is a 3:1 winning ratio when playing games, which is pretty big.
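The arithmetic behind those figures can be checked with a short sketch. It assumes the standard logistic Elo expectation formula and treats the expected score as a points ratio (draws count as half a point for each side); the function names are made up for this example.

```python
def expected_score(elo_diff):
    """Expected score (0..1) for the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def points_ratio(elo_diff):
    """Points scored by the stronger side per point scored by the weaker side."""
    e = expected_score(elo_diff)
    return e / (1.0 - e)


def gain_ratio(hardware_elo, software_elo):
    """Hardware gain divided by software gain."""
    return hardware_elo / software_elo


# 200 Elo: roughly a 3:1 points ratio; 100 Elo: a bit under 2:1.
ratio_200 = points_ratio(200)
ratio_100 = points_ratio(100)

# Hardware:software ratios for the figures quoted in the post.
measured = gain_ratio(800, 360)          # Crafty's own software gain
with_plus_200 = gain_ratio(800, 560)     # if the top engine is about +200
with_plus_300 = gain_ratio(800, 660)     # if the top engine is about +300
```

Under these assumptions a 200 Elo edge works out to an expected score of about 76%, which is where the "3:1" comes from, and 800:360 is a little over 2:1.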

I can close the gap some by using a parallel search, as the testing I have done shows Crafty getting a bigger Elo boost than SF, although not a huge one. But not just +10 Elo either, so it helps. But I don't test like that very often as it slows things down by a factor of 8.

Don
Posts: 5106
Joined: Tue Apr 29, 2008 2:27 pm

Re: Crafty tests show that Software has advanced more.

Post by Don » Mon Sep 13, 2010 3:59 pm

bob wrote: What is that based on? We had dozens of bitboard programs prior to Rybka. I do not know of any after Rybka that were not there before it was modified to use bitboards... There is a limit to how much credit one can give to Rybka. It's certainly strong. It was not the "final coronation" for bitboard programs. It came to that party _way_ late.
I'm not sure whether this is based on my comment that bitboards were a fad started by Rybka, but either way I want to clarify the issue, as I see it as the only thing worth commenting on in a thread where everything has been rehashed to death.

I agree with Bob that bitboard programs have been around a very long time. I was using them myself way before Rybka came out and Crafty always used them.

But that was not my point and I want that to be understood. Whether it was the right way to do it or not, what made it extremely popular was the fact that this new "rock star" Rybka was doing it. Personally I believe it's the best way to write a 64 bit program but that is not what made it so very popular.

Crafty also was doing it but until fairly recently almost all the top programs for 32 bit computers were 32 bit programs. That certainly did not sell anyone on the idea that 64 bit was the best way to write a program for a 32 bit machine.

Personally, I don't believe it is. I cannot prove that or back it up in any way but I would like to add that neither can you.
