Some Thoughts about Aliasing in C++

Eric Lengyel   •   September 18, 2017

[Edit: A detailed specification about the disjoint qualifier is available here.]

At the end of Section 3.10, the C++ standard states explicitly which types an object may be accessed through, and thus when pointers can alias. It does not allow simple things like this:

float f;
...
int i = *reinterpret_cast<int *>(&f);

Pointers to int and float are not related in a way that supports aliasing, and this is technically undefined behavior. At the end of Section 5.2.10, the standard talks about type puns and says you can do this:

int i = reinterpret_cast<int&>(f);

I read this as an intention to give the language enough expressiveness to let a programmer reinterpret the bits of a float as an int. However, the spec also states that this type pun is equivalent to the pointer reinterpretation above, which many view as implying that it’s still undefined behavior.
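For comparison, one way to read the same bits without accessing a float through an int pointer is to copy the object representation with std::memcpy, which the standard permits for trivially copyable types. The following is only a minimal sketch, assuming that float is 32 bits wide; the name FloatBits is hypothetical and not part of the standard or of this post.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the bit pattern of a float as a 32-bit
// unsigned integer by copying bytes, so no float is read through an int.
inline std::uint32_t FloatBits(float f)
{
    static_assert(sizeof(std::uint32_t) == sizeof(float), "sketch assumes a 32-bit float");

    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}
```

Compilers typically reduce a small fixed-size copy like this to a single register move, but it is more verbose than either cast above.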

In my opinion, the spec is being too restrictive about what can alias and what must not alias. I don’t think it’s a good idea for any compiler to assume that two pointers don’t alias the same storage based solely on the types of objects they point to. There are plenty of good reasons for a programmer to interpret the bits in some chunk of memory in more than one way.

We would still like to be able to tell the compiler that some things don’t alias, though, in order to enable various optimizations. The C99 standard introduced the restrict keyword to C to address this a long time ago, but it has not been officially added to C++. Most compilers support it anyway through implementation-dependent decorations such as __restrict.

The restrict keyword is applied to a pointer as in the following example:

int *restrict ptr;

This tells the compiler that ptr points to storage that no other pointer could possibly point to. In my opinion, applying restrict to the pointer is the wrong approach. I submit that things would work out better if the storage itself could be marked as non-aliased, and this post just contains some notes about how that would work.
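For reference, the pointer-based approach looks roughly like this in C++ today when written with the __restrict extension. This is a sketch only: __restrict is not standard C++, although GCC, Clang, and MSVC all accept it, and the function Scale and its parameters are hypothetical names chosen for illustration.

```cpp
// __restrict is a compiler extension, not standard C++.
// Scale is a hypothetical function used only for illustration.
void Scale(float *__restrict out, const float *__restrict in, float s, int n)
{
    // The caller promises that out and in never overlap, so the compiler
    // is free to vectorize the loop without checking for overlap.
    for (int i = 0; i < n; i++)
    {
        out[i] = in[i] * s;
    }
}
```

Note that the promise lives in the function signature: nothing about the arrays passed in records whether they actually overlap, and the compiler does not check the call sites.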

I will use the keyword disjoint below, but it should be regarded as a stand-in for some potentially better choice to be determined later.

A new qualifier

The disjoint keyword could be applied as a new qualifier to a type, similar to how const and volatile are currently applied.

disjoint int buffer[64];   // Declare non-aliased storage

Unlike the const and volatile qualifiers, the disjoint qualifier can be implicitly removed. Suppose we want to pass the above buffer to a function with the following signature.

void foo(int *data);

Making the call foo(buffer) would be perfectly fine. Inside the function foo, the compiler has to assume that the storage pointed to by data could be aliased, so it takes the safe route. If the function foo were instead declared as follows, then the compiler would be able to make extra optimizations under the assumption that the storage pointed to by data is not aliased.

void foo(disjoint int *data);

If buffer had not been declared with the disjoint qualifier, then it could not be passed to this version of foo. This is the opposite of const and volatile: the disjoint qualifier cannot be implicitly added to a type.

The disjoint qualifier changes the type of a pointer just like const and volatile. Any disjoint qualifiers applied to function parameters are included in the function’s signature, so the above two declarations of foo represent two distinct functions.
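Since no compiler implements disjoint, the conversion and overloading rules described so far can only be approximated in today's C++. The sketch below models them with a hypothetical wrapper class, DisjointPtr, that converts implicitly to a plain pointer but can be created from one only by an explicit construction. It captures the type rules alone, gives the optimizer no aliasing information, and is not a proposed implementation.

```cpp
#include <type_traits>

// Hypothetical stand-in for "pointer to disjoint-qualified storage".
template <typename T>
class DisjointPtr
{
public:
    // Adding the qualifier is never implicit, so the constructor is explicit.
    explicit DisjointPtr(T *p) : ptr(p) {}

    // Removing the qualifier is implicit, like passing buffer to foo(int *).
    operator T *() const { return ptr; }

private:
    T *ptr;
};

// Two distinct overloads, mirroring the two declarations of foo.
inline int foo(int *) { return 0; }               // storage may be aliased
inline int foo(DisjointPtr<int>) { return 1; }    // storage is not aliased

static_assert(std::is_convertible<DisjointPtr<int>, int *>::value,
              "the qualifier can be removed implicitly");
static_assert(!std::is_convertible<int *, DisjointPtr<int>>::value,
              "the qualifier cannot be added implicitly");
```

With both overloads visible, a plain int pointer selects the first foo and a DisjointPtr<int> selects the second. One difference from the real proposal is that the explicit constructor acts as an escape hatch for adding the qualifier by hand.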

A non-static member function could be declared disjoint as follows to indicate that the storage pointed to by this is not aliased. Such a member function could be called only for an object that was itself declared disjoint.

struct Bar
{
    void f() disjoint;    // *this has non-aliased storage
};

Bar A;
disjoint Bar B;

A.f();    // error: can't call disjoint function
B.f();    // OK

Since the disjoint qualifier cannot be implicitly added to a type, we need a way to allocate storage on the heap as disjoint. It would not hurt anything to change the behavior of the new operator so that it always returns a disjoint-qualified type:

disjoint Bar *C = new Bar;

In existing code, the disjoint qualifier would simply be implicitly removed on assignment without any consequences.

The disjoint qualifier could appear in multi-level pointers anywhere that the const or volatile qualifiers could appear. For example:

int *disjoint *ptr;

Here, ptr is a pointer to disjoint storage containing a pointer to (possibly aliased) int.
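The placement follows the same reading rule as const in the same position. As an analogy only, the sketch below uses const where disjoint would appear, since const is what current compilers can check; the function name ReadThroughTwoLevels is hypothetical.

```cpp
#include <type_traits>

// Same shape as "int *disjoint *ptr", with const standing in for disjoint.
using TwoLevel = int *const *;

// The qualifier applies to the intermediate pointer, not to the int.
static_assert(std::is_const<std::remove_pointer<TwoLevel>::type>::value,
              "the pointer that ptr points to is qualified");
static_assert(!std::is_const<std::remove_pointer<std::remove_pointer<TwoLevel>::type>::type>::value,
              "the int at the end of the chain is not qualified");

// Hypothetical demonstration function.
inline int ReadThroughTwoLevels()
{
    int value = 7;
    int *const inner = &value;    // qualified intermediate pointer
    int *const *ptr = &inner;     // pointer to that qualified pointer

    **ptr = 8;                    // OK: the int itself is unqualified
    return value;
}
```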