Question About CPP-C#, Performance, and Square Represenation

Discussion of chess software programming and technical issues.

Moderator: Ras

User avatar
emadsen
Posts: 441
Joined: Thu Apr 26, 2012 1:51 am
Location: Oak Park, IL, USA
Full name: Erik Madsen

Re: Question About CPP-C#, Performance, and Square Represena

Post by emadsen »

The real "NPS" speed was somewhere between 2.5 and 4 Mnps.
For comparison, I get 600K nps from the start position. Though I hope to improve that with smarter object allocation.
Erik Madsen | My C# chess engine: https://www.madchess.net
Cheney
Posts: 104
Joined: Thu Sep 27, 2012 2:24 am

Re: Question About CPP-C#, Performance, and Square Represena

Post by Cheney »

Thanks for all the replies, I found all of this helpful :)

I have been able to get a perft 5 to complete in just about 2 seconds and I am not sure if I can get it to move any faster. I am happy with these results and moved onto fixing the search routine.

It appears that my search routine can execute at about 600Knps as well - no sorting, quiesence, or any heruistics. I only count nodes that are valid moves, no different than in perft.

However, when I did write in the quiescence, the Knps dropped a bit and it took much longer (which I will post another thread on).

Thanks again :)
spirch
Posts: 95
Joined: Fri Nov 09, 2012 12:36 am

Re: Question About CPP-C#, Performance, and Square Represena

Post by spirch »

Hi, did you ever try to see what going on with a profiler (other than the one from VS 2010)?

Before using one my implementation was doing about maybe 40,000 nodes/sec on perft which was very ugly. (about 2 minutes for perft 5)

I took time to profile it correctly and did a tons of test, I learned a lot about how the clr work since the profiler that i was using can show the actual clr produced.

It was also the first time that i was using a profiler.

Some results did surprise me.

Right now I can do perft 5 in 0.1 second and perft 6 in 2.5 seconds which mean i had some pretty bad code not knowing it and the funny thing is, I think I can still improve it since so far I did not implemented transposition table or no hashing or other kind of thing like that. Simply pure array/bitwise/loop/etc.

look around for one but i used the trial version of dotTrace
Cheney
Posts: 104
Joined: Thu Sep 27, 2012 2:24 am

Re: Question About CPP-C#, Performance, and Square Represena

Post by Cheney »

Hi,

Sorry to come back to this post but I could not help reading some other posts and coming up with the same question which I feel pertains to my posted subject.

I was reading Lucas's post http://talkchess.com/forum/viewtopic.ph ... 47&t=45096 about C++ optimizations.

It looks like (not being a C/C++ programmer anymore) he move from using bits to whole ints to gain more performance.

I know in C#, some say use ints over bytes just like use arrays over lists - all for performance. I have moved to using just ints, arrays, and away from casting and saw massive performance increase (also, I cleaned up some junk coding).

However, I am still stuck with the question about having to know assembly in order to optimize code? I feel the answer is "absolutely". I feel that moving to C/C++ might help performance a bit but to truly achieve performance like his, 100M+ nps, I will not achieve that just by rewriting my code in C/C++; I would have to analyze the asm generated and look for ways to optimize that by adjusting my C/C++. And at that, this is something I cannot do using C#

Am am close with this assumption or way off target?

Thanks again everyone :)
spirch
Posts: 95
Joined: Fri Nov 09, 2012 12:36 am

Re: Question About CPP-C#, Performance, and Square Represena

Post by spirch »

I did a quick look at his code; https://github.com/lucasart/chess

He does some very nice tricks 8-)

But I don't see any asm code?

Hes using hashing and very nice bitwise operation and maybe others stuff but in the end, the code is nicely done, this is where thing get important.

Also C++ give WAY more power to the programmer (ex; force inline and pointer) and managed code will always be slower than compiled code.


My engine is under .Net like you (using only safe code) and I can do around 50m+ nps without hashing and without any nice tricks. Only thing that I did so far is looking at the CLR produced to make sure my code wasn't doing stupid things and making sure I had no object allocation so GC would not get triggered.

Open your compiled code with something like http://ilspy.net/ and play around 8-) You can optimized the generated code of C#.
User avatar
emadsen
Posts: 441
Joined: Thu Apr 26, 2012 1:51 am
Location: Oak Park, IL, USA
Full name: Erik Madsen

Re: Question About CPP-C#, Performance, and Square Represena

Post by emadsen »

Cheney,

Don't take this the wrong way, but judging from the basic questions you're asking about quiescence search (in another thread) you're getting way ahead of yourself worrying about micro-optimizations. You can't improve performance unless the search algorithm is correct. Focus on that first. You're trying for a "big bang" approach that will end up driving you nuts- because you'll be churning out bugs much faster than you can identify them.

Write clean code. Write defensively so when things fail, they fail obviously. Measure, measure, measure to identify performance bottlenecks. Then refactor code for speed. In my opinion, an iterative approach like this works best.
Erik Madsen | My C# chess engine: https://www.madchess.net
Cheney
Posts: 104
Joined: Thu Sep 27, 2012 2:24 am

Re: Question About CPP-C#, Performance, and Square Represena

Post by Cheney »

Thanks Fred :)

I browsed the movegen code but did not go any deeper (which I would like to).

I was not thinking he was using asm but it appears he has such a knowledge of asm that it helps with the efficient c/c++ coding. I do not know if that means "tricks", whatever that means :). I can only fathom a performance trick example would be to use ints instead of bytes or pass ref between methods.

I write code but have do not know much or recall much about the behind the scenes details about managed/unmanaged code, CLR, compiling, etc. I just write code and now am looking to be more efficient and think it means I now need to start learning the "behind the scenes" details and how to restructure the code more efficiently.

I was able to make the code more efficient by moving to bitboards, making moves for the side that is up, no looping over the squares multiple times for multiple items, etc... and avoided the GC - no new objects, every class is static.

I'll check out the link you sent me. I also statred using the VS2010 profiler in but this has been my first time :). I was not following it. An example of somthing I did not follow was why it reported my GeneratePawnMoves was expensive. My logic - of course, there are 16 pawns on the board that I was making moves for:). That's changed :)

So I will continue on my current path hoping to find a way to achieve 50mnps. Hopefully, my math is right to calculate nps :)
Cheney
Posts: 104
Joined: Thu Sep 27, 2012 2:24 am

Re: Question About CPP-C#, Performance, and Square Represena

Post by Cheney »

Hi Erik,

No, I don't take anything the wrong way, lay it all on the table :)

I do read you crystal clear :). For this thread, I am just trying to keep the idea open to flush out the things I am thinking about down the road. If I learn something now, like something as simple as passing ref values between methods, then I want to build on that now, not rebuild later. Of course, this is inevitable to a certain degree :). But, if it is more than that, then I would rather do as you write, work on the moves and searches to enhance and eliminate bugs.

Using the qsearch as an example, I was not expecting the explosion and increase from ~7 seconds at depth 7 search to 30 seconds. I started making a poor man's mvv/lva which helped a bit and I finished the mvv/lva (with help from that other post and lot's of re-reading) and am back at ~9 seconds total search time.

However, I have looked at this project from a perft perspective to measure the fundamental performance. If others can do a perft and achieve 50mnps, why can't I?

Of couse, I have to assume they are performing a plain perft like mine with no "tricks" and the same math to calculate nps; if this is all equal, then I should be able to squeeze more out of it by focusing on movegeneration and the move arrays/tables.

That's the question on my mind, how to achieve, and the only way I eliminate the answer of "wite in C/C++" is to read how well other C# programmers are doing and get some advice.

Not to sound like I have no clue :shock: , but to be sure, nodes/sec is measured by the amount of nodes (or leaves? at perft 5 = 4,865,609) divided by the time it took?

Anyway, thanks for your advice with this - I'll try not to go too nuts on this obsessive and rewarding quest I have journied onto :)
lucasart
Posts: 3241
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Question About CPP-C#, Performance, and Square Represena

Post by lucasart »

spirch wrote:I did a quick look at his code; https://github.com/lucasart/chess

He does some very nice tricks 8-)

But I don't see any asm code?
Actually there is, in bitboard.h:

Code: Select all

inline int lsb(Bitboard b)
{
	Bitboard index;
	__asm__("bsfq %1, %0": "=r"(index): "rm"(b) );
	return index;
}
The problem is that:
- it only works for x86_64 and when the compiler is either GCC or ICC, but not MSVC (according to Stockfish's sources)
- i need a version for 32-bit ARM
- as for x86 I really don't care: who uses a 32 bit PC these days ? (a lot of people still use a 32-bit Windows XP on a x86_64 CPU but that's their problem)

For MacOSX should be the same as Linux hopefully (gcc for x86_64).
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
spirch
Posts: 95
Joined: Fri Nov 09, 2012 12:36 am

Re: Question About CPP-C#, Performance, and Square Represena

Post by spirch »

oops, ok it was a 1 liner that I didn't spot :wink:

I tried it on my computer, the test_perft() said something like 220 millions leaf per seconds

nice performance! :)