Rein Halbersma wrote:For C++11, my preferred option is using a std::vector with a flexible and quite well performing arena allocator. This will allocate from a stack-based fixed-sized buffer, and go to the heap when that runs out. Overhead per allcoation is a conditional (check if pointer is inside buffer) and a pointer increment.
Why do things in a simple and effecient way when there is always an ugly bloated solution I would kill for that.
Do you realize that the code is actually broken?
It doesn't align the allocs.
So if you use the same "arena" to allocate 3 bytes then one 32-bit int on ARM, you will get an exception once you try to access it.
But allocating heterogeneous types of different alignments is not the use case for allocators. See here for example code:
Instead, the use case is to have a variable move_list of type std::vector<move_t, short_alloc<move_t, N * sizeof(move_t)>> to store the generated moves in through a sequence of calls to move_list.push_back(current_moves). As long as the number of generated moves stays below 128 (or whatever number you like), this will only increment pointers into an arena object which is a thin wrapper around a stack-allocated char[128*sizeof(move_t)]. Only when the array has been exhausted, will the allocator go to the heap for additional memory. So it's both efficient and safe to use with arbitrary large move lists.
Rein Halbersma wrote:But allocating heterogeneous types of different alignments is not the use case for allocators.
Yes I understand the use case. The problem is that misuse of arena can break things (badly).
What's more important about vector is that it grows by powers of two when using push_back.
So you not only waste performance by copying/moving elements each time you push 2^n-th element (negligible),
but more importantly you waste the preallocated space (using 128*sizeof(int32_t) you are out stack memory already when pushing 29th element!!!).
reserve() and push_back would be much better and resize() would be best if you know in advance how much space you'll need.
Either way this solution is by far no replacement for VLAs and doesn't justify the added complexity.
Rein Halbersma wrote:But allocating heterogeneous types of different alignments is not the use case for allocators.
Yes I understand the use case. The problem is that misuse of arena can break things (badly).
What's more important about vector is that it grows by powers of two when using push_back.
So you not only waste performance by copying/moving elements each time you push 2^n-th element (negligible),
but more importantly you waste the preallocated space (using 128*sizeof(int32_t) you are out stack memory already when pushing 29th element!!!).
reserve() and push_back would be much better and resize() would be best if you know in advance how much space you'll need.
Well of course, you need to use move_list.reserve(128) once before you start pushing back moves (and that only adjusts a pointer, does not allocate anything). That way, the only time you are copying is when you are going to the heap anyway. There really is no overhead except for the conditional to check whether there is enough stack space left.
Either way this solution is by far no replacement for VLAs and doesn't justify the added complexity.
It's more complexity, granted, and for games like chess where you can infer the number of moves from popcounting bit patterns, VLAs are probably the preferred solution.
But for games like draughts, you need a recursive search to find multiple captures and these can grow pretty large (there are positions in 10x10 international draughts that have > 500 moves). Moreoever, it's not possible to infer the total number from popcounting alone. So for that, the flexibility of a std::vector + stack allocator is really an efficient solution, even if it requires a few extra statement to set up the arena and allocator.