Question About CPP-C#, Performance, and Square Represenation

Cheney · Post by **Cheney** » Fri Oct 05, 2012 2:21 am

Hi Erik,

Of course, I'm doing some brain storming

and decided to check for square (bit) occupancy from a king's perspective. If I treat the king like a queen, it will reach out across all files, ranks, and diagonals. It sounds just like what you are referring to. I should be able to do this with bit shifting, the same way I make moves.

If I do this, as you say, for the side that is moving, then I should be able to just generate moves for one side at a time plus inlcude one additional process, validating check on the king for that side.

I read you on the long code, I need a longer screen just to see it all

. I did not totally consolidate but feel a few methods can be optimized this way.

Thanks again, I'll reply later with any results
-Cheney

Cheney · Post by **Cheney** » Fri Oct 19, 2012 1:16 am

Hi,

I finally completed a fair amount for the move generation and perft test. Using the new model, perft 5 now runs in under 5 seconds.

The changes I made:

* I kept the bitboard model but did not use magics.

* The squares array is just used as an index for holding pieces (which are now ints) and not used as the primary array for locating pieces and generating moves. All moves are generated based on the bitboards.

* I am generating moves just for the side that is moving.

* Moves are no longer an object with multiple bools and bytes/ints, it is just a single integer which I use bits to represent pieces, squares to/from, etc.

* I was able to use the bitboards and file/rank/diagonal masks to see if a piece is attacking the king. If true then I use the move generation to figure out if the attack is blocked or a legitimate check.

* I did continue to organize the code into smaller methods to remain organized.

Unfortunately, this all moves away from OO. It uses flat arrays and the bitboards; no lists, no stacks, no real objects.

I did like the way I could access members of a piece object before and even use polymorphism. I think your code was doing that as well. Since I have all my original code, I am going to try to integrate piece objects back into this, control how frequently I am creating new pieces, and see how it goes.

Thanks again

-Cheney

stegemma · Post by **stegemma** » Fri Oct 19, 2012 10:25 am

emadsen wrote:Do you reuse moves? If so how do you track scope? That is, how do you know when a move is no longer referenced (further up the search tree) and can be safely reused?
Do you make use of finalizers or any other language constructs to track this? Or when you are out of moves in the move buffer, do you just allocate many at once?

I reuse move objects, just assigning them the new values. My moves array is static, pre-dimensioned, and i start with an index to current move that point to the first one. Anytime i create a move, simply set move[index].from, to etc. to the new values and then increment index. When i play moves, i keep another index to the first one played for the current node. Keep index to the first move in the array for any node, keep another index to the played one and you're done. Never have to allocate moves, just increment/decrement indexes.

emadsen · Post by **emadsen** » Thu Oct 25, 2012 7:49 pm

Using the new model, perft 5 now runs in under 5 seconds.

Sounds like your efforts to refactor the code paid off. Glad to hear it.

I reuse move objects, just assigning them the new values. My moves array is static, pre-dimensioned, and i start with an index to current move that point to the first one. Anytime i create a move, simply set move[index].from, to etc. to the new values and then increment index. When i play moves, i keep another index to the first one played for the current node. Keep index to the first move in the array for any node, keep another index to the played one and you're done. Never have to allocate moves, just increment/decrement indexes.

That makes sense. My next priority is to limit object allocations. The biggest offender is moves created / garbage collected during searching. Using a move buffer similar to what you describe should greatly reduce allocations and improve performance.

bpfliegel · Post by **bpfliegel** » Thu Oct 25, 2012 9:14 pm

Use the 'readonly' keyword for arrays (if preallocated with a static length), use private members if possible, make your classes sealed - and use static classes and members wherever possible. Obviously keeping a trade-off with your preferred architecture. Moves away somewhat from OO, I know.

Perft is 11 millon nodes/sec for me in WhiskySoda (on an i5 laptop), but I use magics, which helps a lot. In Portfish it is around 45 million nodes and that is .Net aswell

I was too lazy to write a proper check detection in WhiskySoda, so just invalidate the move if it ends up in being check...

And use a decent profiler (EQATEC or Visual Studio built-in, I use both).

Good luck,
Balint

emadsen · Post by **emadsen** » Thu Oct 25, 2012 9:39 pm

I don't understand what you mean by a readonly array. Readonly just means the array itself cannot be reassigned. Its members can be read and written. I don't understand the perf benefit there.

In principle I understand the benefit of sealed methods (eliminates call through vtable) but in practice I haven't seen any measurable speedup.

Where can I download your WhiskySoda engine?

bpfliegel · Post by **bpfliegel** » Thu Oct 25, 2012 10:26 pm

emadsen wrote:Readonly just means the array itself cannot be reassigned.

Exactly! Could the compiler make some optimization on top of this fact? See for yourself...

WhiskySoda is not yet available for download, it's still in a quite early state.

Best,
Balint

Ps: 1) No GC - big speedup, 2) static and readonly - some speedup were the biggest tricks I found. Sealed is really marginal.

emadsen · Post by **emadsen** » Thu Oct 25, 2012 10:44 pm

OK Balint, thanks for your advice. I'll focus on no GC as it seems the most promising.

Perf issues in .NET can be counter-intuitive because it's not always clear what the JIT compiler can optimize. I suspect sealed may fall in this category. Still useful for code clarity- just not a big perf win.

bpfliegel · Post by **bpfliegel** » Thu Oct 25, 2012 10:46 pm

emadsen wrote:OK Balint, thanks for your advice. I'll focus on no GC as it seems the most promising.

Perf issues in .NET can be counter-intuitive because it's not always clear what the JIT compiler can optimize. I suspect sealed may fall in this category. Still useful for code clarity- just not a big perf win.

Yes, zero memory alloc FTW! That is a must unfortunately... Especially when you will go multithreaded...

Good luck, Balint

lucasart · Post by **lucasart** » Sat Oct 27, 2012 3:45 am

bpfliegel wrote:Use the 'readonly' keyword for arrays (if preallocated with a static length), use private members if possible, make your classes sealed - and use static classes and members wherever possible. Obviously keeping a trade-off with your preferred architecture. Moves away somewhat from OO, I know.

Perft is 11 millon nodes/sec for me in WhiskySoda (on an i5 laptop), but I use magics, which helps a lot. In Portfish it is around 45 million nodes and that is .Net aswell I was too lazy to write a proper check detection in WhiskySoda, so just invalidate the move if it ends up in being check...

And use a decent profiler (EQATEC or Visual Studio built-in, I use both).

Good luck,
Balint

Sorry to spoil your C# party, but 11 Mnps or even 45 Mnps, on this kind of hardware, is NOT good.

My C/C++ code, which is not particularly smart, does a varied perft benchmark at about 100 Mnps on a Pentium Dual-Core E5300 @ 2.6 GHz, using 1 thread (and not cheating with a hash table of course). That's a 4 year old entry level computer.

I don't understand why people want to use C#. Here's what typically happens:
- they argue that it's much more high level than C, so they can write conceptual code without worrying about the details, and that efficiency doesn't matter. which is true (for a lot of projects efficiency is not that important, or is only important in small parts of the code)
- then their engine is so slow and lame (if it doesn't run out of memory before it finishes a game due to all the memory alloc and hazardous garbage collection), that they decide to optimize it.
- in the process of optimizing their code, they finally realize that everything that C# does that was supposedly better than C, is horribly inefficient and should be avoided.
- so not only they're not using the C# features that make life easier (if you dont care about performance) but they spend a lot of extra thought on how to avoid them. how to preallocate all your objects, and distribute the elements over preallocated objects etc, just to avoid tha garbage collection mechanism and so on.
- this is just plain stupid. In the end you spend less time and cause yourself less brain damage writing everything in C from the word go.

Question About CPP-C#, Performance, and Square Represenation

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena

Re: Question About CPP-C#, Performance, and Square Represena