How to serialize integers into memory

Here is the analysis of a recent bug I’ve stumbled upon. My initial reaction was that the problem is in the compiler (or that “These are wrong bees”). Consider the code below. We copy 64 integers into a properly allocated destination buffer and yet, if compiled with -O3 switch this code crashes with segfault!


// Compile with (g++-5):
// g++ -std=c++11 -O3 && a.out
#include <memory>
#include <cstdint>

using namespace std;

__attribute__ ((noinline))
void SerializeTo(const uint64_t * const & src, size_t len, uint8_t* dest) {
  for (size_t i = 0; i < len; ++i) {
    *reinterpret_cast<uint64_t*>(dest) = src[i];
    dest += sizeof(src[i]);
  }
}

int main() {
 unique_ptr<uint64_t[]> d(new uint64_t[64]);

 unique_ptr<uint8_t[]> tmp(new uint8_t[1024]);

 SerializeTo(d.get(), 64, tmp.get() + 4);

 return 0;
}

You can see it yourself by running it with g++ on x64 architecture. A quick session with gdb shows that it fails on the instruction movaps. g++ is smart enough to perform vectorized optimization and copy 2 64 bit integers at once. Unfortunaltely movaps requires that its destination memory operand would be 16 byte aligned but SerializeTo can not possibly know whether dest is aligned or not. To my surprise g++ does not generate any preamble code that handles possible unalignment issues the way I would expect him to, nor it decides to fallback to instructions that handle unligned words. Is g++ in his rights?

g++ is correct to generate this code because we broke some basic rules and this code has undefined behavior (UB) due to incorrect cast reinterpret_cast<uint64_t*>(dest). See “Type Aliasing” paragraph here. According to these “strict aliasing” rules we can not cast from uint8_t* to uint64_t* and if we do it causes an UB. We made g++ believe that it can perform aligned write due to our cast to uint64_t*.

Before we talk about solution, lets check how common such mistake is. Many google projects, for example, have the following macro #define UNALIGNED_STORE64(_p, _val) (*reinterpret_cast<uint64_t *>(_p) = (_val)) for x64 bit architecture. The name of the macro eloquently shows that it’s meant for storing integer into possibly unaligned address. There are millions other appearences of reinterpret_cast in github and I estimate that vast part of them do not follow strict aliasing rules.

So how do we copy an integer into specific memory address? We could, of course copy it byte by byte but that’s slow. The only acceptable solution I know of is to use memcpy(dest, src, sizeof(uint64)). In optimized mode, the compiler recognizes memcpy and replaces it with specific CPU instructions according to the target architecture. In case of uint64_t on x64 it translates to a single mov instruction. While it maybe suboptimal (it’s still possible to use vectorized instructions) it’s nethertheless correct.

I think it’s a bit sad that currently C++ language does not have a dedicated tool in the language that explicitly allows storing a computer word in memory and instead we need to rely on an external function but that’s state of matters at this moment.

If you know any other standard way of performing this task correctly please tell me.

Share Comments
comments powered by Disqus