Question About CPP-C#, Performance, and Square Represenation

lucasart · Post by **lucasart** » Sat Oct 27, 2012 3:55 am

lucasart wrote:
bpfliegel wrote:Use the 'readonly' keyword for arrays (if preallocated with a static length), use private members if possible, make your classes sealed - and use static classes and members wherever possible. Obviously keeping a trade-off with your preferred architecture. Moves away somewhat from OO, I know.

Perft is 11 millon nodes/sec for me in WhiskySoda (on an i5 laptop), but I use magics, which helps a lot. In Portfish it is around 45 million nodes and that is .Net aswell I was too lazy to write a proper check detection in WhiskySoda, so just invalidate the move if it ends up in being check...

And use a decent profiler (EQATEC or Visual Studio built-in, I use both).

Good luck,
Balint
Sorry to spoil your C# party, but 11 Mnps or even 45 Mnps, on this kind of hardware, is NOT good.

My C/C++ code, which is not particularly smart, does a varied perft benchmark at about 100 Mnps on a Pentium Dual-Core E5300 @ 2.6 GHz, using 1 thread (and not cheating with a hash table of course). That's a 4 year old entry level computer.

I don't understand why people want to use C#. Here's what typically happens:
- they argue that it's much more high level than C, so they can write conceptual code without worrying about the details, and that efficiency doesn't matter. which is true (for a lot of projects efficiency is not that important, or is only important in small parts of the code)
- then their engine is so slow and lame (if it doesn't run out of memory before it finishes a game due to all the memory alloc and hazardous garbage collection), that they decide to optimize it.
- in the process of optimizing their code, they finally realize that everything that C# does that was supposedly better than C, is horribly inefficient and should be avoided.
- so not only they're not using the C# features that make life easier (if you dont care about performance) but they spend a lot of extra thought on how to avoid them. how to preallocate all your objects, and distribute the elements over preallocated objects etc, just to avoid tha garbage collection mechanism and so on.
- this is just plain stupid. In the end you spend less time and cause yourself less brain damage writing everything in C from the word go.

I don't mean to say that C# is useless. It is certainly useful (just like any truely high level language, eg. Python) when (1) you need to manipulate complex data structures, and when (2) performance is not important. However for a chess engine, both (1) and (2) are wrong. So you're just choosing the wrong tool for the job.
And reversly I don't claim that C is good at everything. If I want to write a program like the bash command ls, I will most likely write it in Python, precisely because (1) and (2) apply.

bpfliegel · Post by **bpfliegel** » Sat Oct 27, 2012 6:30 am

Lucas, you are fully correct, but you don't get the point.
This is fun!

Personally my motivation was with Portfish a) a little training on C# performance and see what could be achieved and how, b) have a pretty strong engine for non-C devices (for instance Windows RT now).
Now that caused some heavy brain damage and let's stay I stuck with the language...

Otherwise it's obvious that in order you want to be among the best, you plainly started off with the wrong technology, totally agree.

Best, Balint

emadsen · Post by **emadsen** » Sat Oct 27, 2012 8:36 pm

I don't understand why people want to use C#.

Mainly because writing a chess engine in C has been done to death.

I feel like I'm adding some genetic diversity to the pool of chess engines.

I fully concede your point that C is a better choice for the job. But C is a known quantity. The entire chess engine community already knows how strong of an engine can be written in C- extremely strong, stronger than the best human players.

What the chess engine community doesn't know is how strong of an engine can be written in C#. My curiosity got the better of me, so I started my MadChess project to answer the question. I'd like to conform to the highly abstract, object-oriented design intent of C# and see how strong of an engine I can write.

So in short, I say "respect." Respect to what you and other C/C++ programmers have accomplished. It is quite amazing how far you've pushed the state of the art. I think I speak for Balint too, when I say I'm not here to start a language war.

P.S. Where can I download your engine? I don't see a website link in your profile.

lucasart · Post by **lucasart** » Sun Oct 28, 2012 3:47 am

emadsen wrote: P.S. Where can I download your engine? I don't see a website link in your profile.

My two most recent projects are here:
http://wbec-ridderkerk.nl/html/details1 ... Check.html
https://github.com/lucasart/chess.git

Cheney · Post by **Cheney** » Mon Oct 29, 2012 9:57 pm

Lucas,

I started this whole thread on C++, C#, performance, and a few other concepts. I just refined my C# program to where it has moved from 90 seconds for perft five to two seconds. This is on an i5.

I am using bitboards and made the engine totally static and flat. No objects throughout the code, arrays for moves (no C# lists).

Just to be clear, you're saying in your code, perft 6, which is 119M nodes, you completed that in just about one second?

If this is true then that's where I show my inexperience

. Without rewriting direct assembly, just how much more can one optimize code?

I did some tests with some sample code that I rewrote from C# to C++ and I gained no performance increase. Maybe it is my choice of compiler or options/settings for compiling the code (VS Premium 2010, .NET 4, 64-bit Win7).

At this point, I am still learning about chess programming and will focus on my next challenges: move ordering, iterative deepening, Zobrist, etc, but am still curious about the potential with C/C++.

Thanks

Richard Allbert · Post by **Richard Allbert** » Tue Oct 30, 2012 7:55 am

Hi Lucas,

IMHO you are right.

I paid for a professional profiler whilst developing Jabba in C#, and with help from a few on this this forum (maybe you remember the thread) I got the speed to 60% of my C version.

It was fun, and I learnt a lot .... I remember especially being surprised to learn for example that structures can go on the stack or the heap, which is not what the books tell you!!

But I switched back to C because I became disappointed the loss of Strength due to speed. The time to write was much longer with C# because I spent so long with a profiler understanding where I could squeeze out performance. By contrast, the C version has had zero time invested in performance, I wrote and that was that.

HOWEVER

Where I really like C# is for writing things such as tournament managers, game referees, anything with file IO, inter process communication, a GUI.

These are much simpler and faster to write and debug in C# than C - especially inter process communication and multi threading.

As you say, there are different tools for different jobs.

Ciao

Richard

lucasart · Post by **lucasart** » Tue Oct 30, 2012 11:50 am

Cheney wrote:Just to be clear, you're saying in your code, perft 6, which is 119M nodes, you completed that in just about one second?

Yes. To be exact, I did a varied perft test, with a few positions, and the total was around 100 Mnps, on an Intel Duo Core E5300 @ 2.6 GHz, running one thread (a 4 year old entry level desktop PC).

Cheney wrote: If this is true then that's where I show my inexperience . Without rewriting direct assembly, just how much more can one optimize code?

This is a common misconception. These days, no human can beat a C compiler on any sufficiently large piece of code. Even the might Linus Torvaldes said that, and you can be sure that he tried!

Cheney wrote: I did some tests with some sample code that I rewrote from C# to C++ and I gained no performance increase. Maybe it is my choice of compiler or options/settings for compiling the code (VS Premium 2010, .NET 4, 64-bit Win7).

It's hard to say without looking at the code you actually wrote. But C++ is very vicious, and if you don't master the language fully (which few people do), it's very easy to write total and utter crap with it. Start using an std::list<> instead of a C-array, and you'll see...

If you want to write performance critical code, C is the only sane choice, unless you *truly master* C++ (which only few people do). And the reason is very simple: when you read C code, it's immediately clear how it will translate in assembly. Nothing is hidden from the programmer. So if you wanted to used a doubly linked list instead of a C-array in C, you'll have to code it yourself, and in doing so, you'll understand why it is so truly horrible it is in terms of performance:
- heap alloc all over the place, traversing the list is O(n) instead of O(1), and in doing so you maximize the cache misses... typically C#/Java programmers don't understand these subtilties and wonder why their code is so slow.
- in C, because you have to do what the C#/Java interpreter does yourself, you understand what is happenning.

Sven · Post by **Sven** » Tue Oct 30, 2012 1:54 pm

lucasart wrote:
Cheney wrote:Just to be clear, you're saying in your code, perft 6, which is 119M nodes, you completed that in just about one second?
Yes. To be exact, I did a varied perft test, with a few positions, and the total was around 100 Mnps, on an Intel Duo Core E5300 @ 2.6 GHz, running one thread (a 4 year old entry level desktop PC).

To be even more exact, your perft code calculated 100M leaves per second while saving all make/unmake operations of the last ply, and the real "NPS" speed was somewhere between 2.5 and 4 Mnps, depending on the average branching factor for the last ply in your test positions (here I assumed an average BF between 40 and 25).

No doubt that 2.5 ... 4 Mnps is impressive (it is faster than my best perft speed so far) but it is also not as "magic" as real 100 Mnps would be

Sven

lucasart · Post by **lucasart** » Tue Oct 30, 2012 2:09 pm

Sven Schüle wrote:
lucasart wrote:
Cheney wrote:Just to be clear, you're saying in your code, perft 6, which is 119M nodes, you completed that in just about one second?
Yes. To be exact, I did a varied perft test, with a few positions, and the total was around 100 Mnps, on an Intel Duo Core E5300 @ 2.6 GHz, running one thread (a 4 year old entry level desktop PC).
To be even more exact, your perft code calculated 100M leaves per second while saving all make/unmake operations of the last ply, and the real "NPS" speed was somewhere between 2.5 and 4 Mnps, depending on the average branching factor for the last ply in your test positions (here I assumed an average BF between 40 and 25).

No doubt that 2.5 ... 4 Mnps is impressive (it is faster than my best perft speed so far) but it is also not as "magic" as real 100 Mnps would be

Sven

bulk counting is not cheating. It's what everyone does:

Cheney wrote:Just to be clear, you're saying in your code, perft 6, which is 119M nodes, you completed that in just about one second?

From that message it was clear that we were talking about leaves, not interior nodes (perft(6) = 119,060,324 leaves).

But I hear what you're saying. When we look at NPS for a chess engine (not just a perft), we only count interior nodes. That's what I do in DiscoCheck anyway.

Sven · Post by **Sven** » Tue Oct 30, 2012 3:36 pm

lucasart wrote:bulk counting is not cheating.

I did not intend to accuse you or anyone else of cheating. Sorry if that was what you understood from my message. My intention was mainly to clarify the facts for other readers. At least Cheney appeared to be surprised about that huge "nps" number.

lucasart wrote:But I hear what you're saying. When we look at NPS for a chess engine (not just a perft), we only count interior nodes. That's what I do in DiscoCheck anyway.

If we talk about NPS in perft I think we should use the number of nodes that were really visited, whether these are only "interior" nodes or not. The calculated number of leaves per second (derived from the final perft result) is less useful than "real NPS" when intending to compare the speed of two perft implementations for arbitrary positions since "real NPS" does not depend much on the branching factor in contrast to bulk-counting based calculation of leaves. It is also less useful for another reason: you get a misleading impression of the "speed" of a perft implementation.

Furthermore, I do not agree to your statement "we only count interior nodes". Node counting in a chess engine usually counts *all* nodes that were visited.

Sven

Question About CPP-C#, Performance, and Square Represenation

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena