8

I'm working on a memory pool for a small game engine.

The main use will be as a segregated storage; a pool contains object of a specific type and size. Currently the pools can be used to store anything, but allocations will be done in blocks of a specific size. Most of the memory need will be allocated at once, but "overgrowth" can be enabled if needed to assist in tuning (almost fixed size).

Problem is, I started to get somewhat paranoid when contemplating about memory alignment. I'm only used to raw memory management on 8 bit processors where everything is byte aligned.

I'm letting the user (me) specify the desired size of the blocks, which in the segregated storage case would be the size of the objects that I'm going to store in it.

The current approach is to allocate a chunk of memory blocks * (desired_size + header_size) big and place the objects in it, with a header for each block; objects would obviously be positioned directly behind this header.

What do I need to consider with regards to memory alignment in my scenario?

The answer I've come up with so far is that as long as desired_size represents n-byte aligned data; the header is correctly aligned and packed by the compiler as well as the actual data, everything stored in the block will be n-byte aligned.

n is whatever boundary required by the platform. I'm targeting x86 for the moment but I don't like to make any assumptions about the platform in my code.

Some of the resources I've used:

Edit

Uploaded small sample code which may helpful anybody as confused as me in the future here.

Community
  • 1
  • 1
Skurmedel
  • 20,625
  • 5
  • 49
  • 65

5 Answers5

8

Allocations with malloc are guaranteed to be aligned for any type provided by the compiler and hence any object[*].

The danger is when your header has a smaller alignment requirement than the maximum alignment requirement for your implementation. Then its size might not be a multiple of the max. alignment, and so when you try to cast/use buf + header_size as a pointer to something that does have the max. alignment, it's misaligned. As far a C is concerned, that's undefined behaviour. On Intel it works but is slower. On some ARMs it causes a hardware exception. On some ARMs it silently gives the wrong answer. So if you don't want to make assumptions about platform in your code, you must deal with it.

There are basically three tricks that I'm aware of to ensure that your header doesn't cause misalignment:

  • Use an implementation-specific alignment pragma to force the issue.
  • Use platform-specific knowledge of struct layout and alignment to make sure that its size just so happens to be a multiple of the max alignment requirement on the system. Typically this means, "stick an extra int in as padding if necessary to make it an 8-multiple rather than just a 4-multiple".
  • Make the header a union of every standard type, together with the struct that you actually want to use. Works well in C, but you'd have problems in C++ if your header isn't valid for membership of unions.

Alternatively, you can just define header_size not to be sizeof(header), but to be that size rounded up to a multiple of some chunky power of 2 that's "good enough". If you waste a bit of memory, so be it, and you can always have a "portability header" that defines this kind of thing in a way that isn't purely platform-independent, but makes it easy to adjust to new platforms.

[*] with a common exception being over-sized SIMD types. Since they're non-standard, and it would be wasteful to 16-align every allocation just because of them, they get hand-waved aside, and you need special allocation functions for them.

Steve Jessop
  • 265,622
  • 35
  • 449
  • 690
  • Might as well add that the same is true with all allocation functions (so `::operator new` and kin.) – GManNickG Aug 19 '10 at 22:52
  • Thanks for the answer. How do one determine the max. alignment of a type? If I understood your second paragraph correctly - let's assume x86 here for simplicity - this means that if header is 4 bytes big, it'll be 4 byte aligned, and data is 8 bytes big it will be 8 bytes aligned. This would mean I need to pad header with 4 bytes to make the data properly aligned, correct? – Skurmedel Aug 19 '10 at 23:19
  • I've been thinking of your "good enough" proposal, it sounds pragmatic; I like pragmatic. How would I know what is good enough, would this just make sure the header is aligned to the max. alignment of the platform or would it be dependent on the actual size of the stored data (making it less desirable)? – Skurmedel Aug 19 '10 at 23:27
  • 1
    Each type on a given platform has an alignment imposed by the compiler (at least, in cases where no packing pragma or attribute indicates otherwise). You need to find out what those alignments are, and find the largest. The *allocation*, and hence the header itself since you're placing it at the start of the allocation, will have that alignment. The *next byte after* the header will have that max alignment if (and only if) the size of the header is a multiple of the max. alignment. – Steve Jessop Aug 20 '10 at 14:23
  • As for "good enough" - the size of the largest type on your platform is "good enough", since nothing can have an alignment requirement bigger than its size. – Steve Jessop Aug 20 '10 at 14:23
  • Thanks, I think I'm grasping this now. This would mean that if the platform's max alignment were 8, but the header occupied something like 14 bytes I would just assume that the header was 16 bytes large because that would properly position the data in the block. – Skurmedel Aug 20 '10 at 15:56
  • 2
    That's right: leave 16 bytes for it, of which it will occupy 14 and the other 2 are unused. Although it's unlikely that your header will have size 14, since it will probably have something in it that requires 4-alignment, so the size must be a multiple of 4. It's 12 that you really want to watch out for and round up. – Steve Jessop Aug 20 '10 at 19:20
  • Thank you for the help, I should have everything I need to make at least the base use case work now. – Skurmedel Aug 20 '10 at 19:49
2

Your compiler will already align the members of objects and structures that will be stored in the pool. Using the default packing that's appropriate for your architecture, usually 8 for a 32-bit core. You just need to make sure that the address you generate from your pool is aligned accordingly. Which on a 32-bit operating system ought to be a multiple of 8 bytes.

Mis-aligning the objects can be very expensive when they cross a CPU cache line boundary. A double that straddles two cache lines takes as much as three times as long to be read or written.

Hans Passant
  • 897,808
  • 140
  • 1,634
  • 2,455
  • Actually the compiler will not necessarily align members. Eg .in VC++ alignment is subject to `#pragma pack` option or `/Zp` option. Code that deals with IO buffers (network, disk) frequently forces `#pragma pack(1)` for compacting the data on-disk or on-wire. And there are more options to control alignment, like `__declspec(align)`, http://msdn.microsoft.com/en-us/library/2e70t5y1%28VS.80%29.aspx http://msdn.microsoft.com/en-us/library/83ythb65%28v=VS.100%29.aspx – Remus Rusanu Aug 19 '10 at 22:47
  • 2
    Sigh, yes, you can override it. The implicit assumption is that you know what you're doing when you do. We wouldn't have this question if the OP already knew that. – Hans Passant Aug 19 '10 at 22:57
  • The compiler ALWAYS aligns members. The only thing is that you can override it to align differently. That is not the same as not aligning them. – Puppy Aug 19 '10 at 23:10
  • I was in fact aware that you could change the packing, but as Hans correctly assumes I'm far too inexperienced in these matters to understand the ramifications of such usage. Thanks for the answer, I'll have to wait to be able to upvote though (been generous today :)) – Skurmedel Aug 19 '10 at 23:13
  • This isn't about packing, this is about the address you generate from the pool. – Hans Passant Aug 20 '10 at 01:12
  • Hans Passant: Sorry I mmisunderstood. – Skurmedel Aug 20 '10 at 10:55
2

I've fiddled with memory alignment too. The first thing to understand is that each object has its own memory alignment requirements, which is (at most) the size of the object itself. Fortunately it's often smaller.

There are facilities of two types to help writing correct memory allocators (C++0x, they can be found in std::tr1 in most STL):

  • std::alignment_of gives the alignment as a compile-time value
  • std::aligned_storage gives an aligned storage according to its parameters

Working with the two of them, you'll get what you need.

However your design is slightly off-base, not by much mind, because your header won't have the same alignment requirement (in general) than the objects stored.

template <class T, size_t N>
struct Block
{
  // Header bits

  typedef std::aligned_storage<
      N*sizeof(T),
      std::alignment_of<T>::value
  >::type buffer_type;
  buffer_type mBuffer;
};

Two notes:

  • Block itself might have a different alignment, but it doesn't matter
  • You could use sizeof for the alignment, it's guaranteed to work, even if a bit wasteful

Finally, you can also succeed without all these, and some pointer arithmetic, but since it's offered....

Matthieu M.
  • 268,556
  • 42
  • 412
  • 681
  • Thanks for the answer, these look like they will come in handy. Although I think I might pass in the alignment instead since I would like to keep the pool non-type specific; but this only means I'll use alignment_of somewhere else. – Skurmedel Aug 20 '10 at 15:58
  • Things get a bit more complicated if you go the generic road, because of fragmentation. Don't hesitate to create a first type-specific (or at least alignment specific) pool then wrap it with a generic one which just manages a collection of them. – Matthieu M. Aug 20 '10 at 17:04
  • Matthieu, a good idea. Avoiding fragmentation was one of the points of having a pool in the first place since I suspect I will have a quite massive amount of minor allocations. – Skurmedel Aug 20 '10 at 17:06
1

Use http://msdn.microsoft.com/en-us/library/bb982233.aspx - std::alignment_of. This ensures that for every T you allocate into your pool, you will know the alignment and can ensure that it fits. I'm not going to pretend to know/understand the actual logic you would use to make this information into alignment guarantees, but it does exist and is available for use.

Puppy
  • 141,834
  • 35
  • 244
  • 454
1

As far as I know boost::pool is based on alexandrescu's 'allocator of small objects' explained in his book "modern c++ design". I is must read book (since you are writing this stuff for learning too)

f0b0s
  • 2,850
  • 23
  • 29