_Generic for Type Reification in C

David Priver, July 29th, 2022

Occasionally, you need to turn a type into a value. One example is constructing a tagged union that can hold pointers to different types:

// anypointer.h
#include <stdio.h>

// prefixes and other types elided for brevity
enum AnyPointerTag {
    UNINIT,
    INT,
    CHAR_STAR,
    DOUBLE,
};
typedef struct AnyPointer AnyPointer;
struct AnyPointer {
    enum AnyPointerTag tag;
    union {
        void* pointer;
        int* integer;
        char** charstar;
        double* double_;
    };
};

void
print_any(AnyPointer any){
    switch(any.tag){
        case UNINIT: return;
        case INT:
            printf("%d\n", *any.integer);
            return;
        case CHAR_STAR:
            printf("%s\n", *any.charstar);
            return;
        case DOUBLE:
            printf("%f\n", *any.double_);
            return;
        default: return;
    }
}

// example.c
#include "anypointer.h"

int main(void){
    int x = 3;
    AnyPointer any = {.tag=INT, .integer=&x};
    print_any(any); // prints 3
    return 0;
}

This example is small, so setting the tag manually when constructing the tagged union is not difficult. However, setting the wrong tag would be disastrous, as the code would misinterpret what is being pointed to.

// problem.c
#include "anypointer.h"

int main(void){
    int x = 3;
    AnyPointer any = {.tag=CHAR_STAR, .integer=&x};
    print_any(any); // undefined behavior: the int is read as a char*, likely a segfault
}

_Generic

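C11 added the rarely used feature _Generic. It somewhat resembles a switch statement, but it matches on types instead of integer constant expressions, and the selection happens at compile time. The controlling expression is only inspected for its type and is never evaluated. Its motivating use case was to simulate overloading in C, but it can select arbitrary expressions, not just functions.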

// generic.c
#include <stdio.h>

int main(void){
    // prints: "int: 3"
    printf(
        _Generic(3,
        int: "int: %d\n",
        float: "float: %f\n"), 3);
    return 0;
}

In practice, _Generic is almost always wrapped in a macro, so that the expression being inspected can be passed in as an argument.
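As a minimal sketch of that macro pattern, the hypothetical TYPE_NAME macro below (not part of the original examples) maps the type of an expression to a string. The default association is an assumption added here so that unlisted types still compile.

```c
#include <assert.h>
#include <string.h>

// TYPE_NAME is a hypothetical macro, for illustration only.
// Only the type of x is inspected; x itself is never evaluated.
#define TYPE_NAME(x) _Generic((x), \
    int: "int", \
    double: "double", \
    char*: "char*", \
    default: "unknown")

// Small usage check: each call selects its string at compile time.
void
type_name_demo(void){
    // character constants have type int in C
    assert(strcmp(TYPE_NAME('a'), "int") == 0);
    assert(strcmp(TYPE_NAME(1.5), "double") == 0);
    // a string literal decays to char* before matching
    assert(strcmp(TYPE_NAME("text"), "char*") == 0);
    // float is not listed, so the default association is chosen
    assert(strcmp(TYPE_NAME(1.5f), "unknown") == 0);
}
```

Dropping the default association turns the unlisted case into a compile error, which is the behavior the tagging macros in this article rely on.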

_Generic for Tagging

We can use the power of _Generic to automatically select the appropriate type tag for our tagged union.

// generic_tag.c
#include "anypointer.h"

#define TAG(x) _Generic(x, \
    int: INT, \
    char*: CHAR_STAR, \
    double: DOUBLE)

int main(void){
    int x = 3;
    AnyPointer any = {.tag=TAG(x), .pointer=&x};
    print_any(any); // prints 3
    return 0;
}

Going even further, we can completely automate the creation of our tagged union in a macro:

// generic_compound.c
#include "anypointer.h"

// NOTE: we now take a pointer instead of the value
#define TAG(x) _Generic(x, \
    int*: INT, \
    char**: CHAR_STAR, \
    double*: DOUBLE)

// parenthesized so the macro stays correct inside larger expressions
#define ANY(x) ((AnyPointer){.tag=TAG(x), .pointer=(x)})

int main(void){
    int x = 3;
    AnyPointer any = ANY(&x);
    print_any(any); // prints 3
    return 0;
}

Now, we can safely and concisely construct our tagged union without the risk of accidentally setting the wrong tag. Additionally, if we try to wrap a pointer to an unsupported type, we'll get a compilation error, because the _Generic has no default association and will fail to match any of its known types.

Use Cases

In practice, you won't reach for this technique very often. C code rarely needs this level of indirection, and too many macros like this can obfuscate the code. However, in my opinion there are times when a little "magic" like this is justified.

Argument parsing libraries

You can use a type-reifying macro in essentially the same way as in the above examples. When parsing arguments, you would switch on the type tag to determine what type conversion function to call or action to take.
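Here is a rough sketch of what that could look like. Every name in it (ArgTag, ArgTarget, ARG, parse_arg) is made up for this illustration and does not come from any real argument parsing library. It assumes each argument arrives as a single string and supports only int, double and string destinations.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

// Every name below is hypothetical, chosen for this sketch.
enum ArgTag { ARG_INT, ARG_DOUBLE, ARG_STRING };

typedef struct ArgTarget ArgTarget;
struct ArgTarget {
    enum ArgTag tag;
    void* pointer;
};

#define ARG_TAG(p) _Generic((p), \
    int*: ARG_INT, \
    double*: ARG_DOUBLE, \
    const char**: ARG_STRING)

#define ARG(p) ((ArgTarget){.tag=ARG_TAG(p), .pointer=(p)})

// Converts text according to the tag and stores the result through the
// pointer. Returns 0 on success, 1 if text is not valid for the type.
int
parse_arg(ArgTarget target, const char* text){
    assert(target.pointer && text);
    char* end;
    errno = 0;
    switch(target.tag){
        case ARG_INT: {
            long value = strtol(text, &end, 10);
            if(end == text || *end != '\0' || errno == ERANGE) return 1;
            if(value < INT_MIN || value > INT_MAX) return 1;
            *(int*)target.pointer = (int)value;
            return 0;
        }
        case ARG_DOUBLE: {
            double value = strtod(text, &end);
            if(end == text || *end != '\0' || errno == ERANGE) return 1;
            *(double*)target.pointer = value;
            return 0;
        }
        case ARG_STRING:
            // no conversion needed: keep pointing at the argument text
            *(const char**)target.pointer = text;
            return 0;
    }
    return 1;
}

// Usage: the destination variable's type picks the conversion.
void
parse_demo(void){
    int count = 0;
    double ratio = 0;
    const char* path = NULL;
    const char* text = "out.txt";
    int status = parse_arg(ARG(&count), "42");
    assert(status == 0 && count == 42);
    status = parse_arg(ARG(&ratio), "0.5");
    assert(status == 0 && ratio == 0.5);
    status = parse_arg(ARG(&path), text);
    assert(status == 0 && path == text);
}
```

The caller never names a tag: passing &count is enough for the parser to know it should call strtol.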

Serialization code

You can use a type-reifying macro like this, along with sizeof and offsetof to create a descriptor for a data type. The serializer/deserializer could then use that descriptor to parse binary data or something like JSON.
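A sketch of such a descriptor might look like the following. All of the names (FieldTag, FieldInfo, FIELD, Point, to_json) are invented for this example, the output is only JSON-like (strings are not escaped), and only int, double and char* members are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

// Every name below is hypothetical, chosen for this sketch.
enum FieldTag { FIELD_INT, FIELD_DOUBLE, FIELD_STRING };

typedef struct FieldInfo FieldInfo;
struct FieldInfo {
    const char* name;
    enum FieldTag tag;
    size_t offset;
    size_t size;
};

#define FIELD_TAG(x) _Generic((x), \
    int: FIELD_INT, \
    double: FIELD_DOUBLE, \
    char*: FIELD_STRING)

// ((T*)0)->member is used only for its type: neither _Generic nor
// sizeof evaluates it, so nothing is dereferenced.
#define FIELD(T, member) { \
    .name = #member, \
    .tag = FIELD_TAG(((T*)0)->member), \
    .offset = offsetof(T, member), \
    .size = sizeof(((T*)0)->member), \
}

typedef struct Point Point;
struct Point {
    int id;
    double x;
    char* label;
};

static const FieldInfo point_fields[] = {
    FIELD(Point, id),
    FIELD(Point, x),
    FIELD(Point, label),
};

// Writes object as JSON-like text into out (strings are not escaped).
// Returns 0 on success, 1 if the buffer is too small.
int
to_json(char* out, size_t cap, const void* object,
        const FieldInfo* fields, size_t count){
    assert(out && object && fields);
    int n = snprintf(out, cap, "{");
    if(n < 0 || (size_t)n >= cap) return 1;
    size_t used = (size_t)n;
    for(size_t i = 0; i < count; i++){
        const FieldInfo* f = &fields[i];
        const char* member = (const char*)object + f->offset;
        const char* sep = i ? ", " : "";
        size_t room = cap - used;
        switch(f->tag){
            case FIELD_INT:
                n = snprintf(out+used, room, "%s\"%s\": %d", sep,
                             f->name, *(const int*)member);
                break;
            case FIELD_DOUBLE:
                n = snprintf(out+used, room, "%s\"%s\": %g", sep,
                             f->name, *(const double*)member);
                break;
            case FIELD_STRING:
                n = snprintf(out+used, room, "%s\"%s\": \"%s\"", sep,
                             f->name, *(char* const*)member);
                break;
            default:
                return 1;
        }
        if(n < 0 || (size_t)n >= room) return 1;
        used += (size_t)n;
    }
    n = snprintf(out+used, cap-used, "}");
    return (n < 0 || (size_t)n >= cap-used) ? 1 : 0;
}

void
point_demo(void){
    Point p = {.id = 7, .x = 1.5, .label = "origin"};
    char buf[64];
    int status = to_json(buf, sizeof buf, &p, point_fields, 3);
    assert(status == 0);
    assert(strcmp(buf, "{\"id\": 7, \"x\": 1.5, \"label\": \"origin\"}") == 0);
}
```

The serializer never mentions Point. Adding a member to the struct only requires adding one FIELD line to the table, and the tag, offset and size follow from the member's declaration.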

Dynamic "objects"

Instead of a type tag, you could have a pointer to a vtable. This would allow you to convert statically typed objects to an interface.
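As a sketch of the vtable variant, the example below pairs an object pointer with a table of function pointers selected by _Generic. The Shape interface, the Rect and Square types, and the SHAPE macro are all hypothetical names picked for illustration.

```c
#include <assert.h>

// Every name below is hypothetical, chosen for this sketch.
typedef struct ShapeVtable ShapeVtable;
struct ShapeVtable {
    const char* name;
    double (*area)(const void* self);
};

// The "interface": a vtable pointer paired with the object it describes.
typedef struct Shape Shape;
struct Shape {
    const ShapeVtable* vtable;
    const void* self;
};

typedef struct Rect Rect;
struct Rect { double width, height; };

typedef struct Square Square;
struct Square { double side; };

static double
rect_area(const void* self){
    const Rect* r = self;
    return r->width * r->height;
}

static double
square_area(const void* self){
    const Square* s = self;
    return s->side * s->side;
}

static const ShapeVtable rect_vtable = {"rect", rect_area};
static const ShapeVtable square_vtable = {"square", square_area};

// Selects a vtable pointer instead of an enum tag.
#define SHAPE(p) ((Shape){ \
    .vtable = _Generic((p), \
        Rect*: &rect_vtable, \
        Square*: &square_vtable), \
    .self = (p)})

// Dispatches through the vtable without knowing the concrete type.
double
shape_area(Shape shape){
    return shape.vtable->area(shape.self);
}

void
shape_demo(void){
    Rect r = {2.0, 3.0};
    Square s = {4.0};
    Shape shapes[] = {SHAPE(&r), SHAPE(&s)};
    assert(shape_area(shapes[0]) == 6.0);
    assert(shape_area(shapes[1]) == 16.0);
}
```

Compared to the tag approach, there is no central switch to keep in sync: supporting a new type means writing its functions, its vtable, and one more line in the _Generic.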

It's a useful technique to have in your toolbox. Don't overuse it.

Conclusion

C11 added the means to reify types, a capability lacking from the C programming language before then. Using the _Generic construct, we can bridge one of the gaps between compile time information (types) and runtime information (type tags). Doing so allows you to implement things like an Any type or value introspection for data processing, among other use cases. Opportunities to use this technique are not too frequent, but when you do need it, it can replace error-prone manual boilerplate.

Appendix: Other Languages

In C++, you can achieve the same effect by writing a collection of trivial constexpr overloaded functions that return the type tag. This is more verbose, but can yield the same result.

If you are using GCC but for some reason can't use C11 features, you can use the __builtin_types_compatible_p extension instead.
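One possible way to do that, sketched below, is to chain __builtin_types_compatible_p with two other GCC extensions, __typeof__ and __builtin_choose_expr. This is my assumption about how the pieces fit together rather than the only way to do it. It is not standard C, although Clang accepts these builtins as well. The enum is repeated from anypointer.h so the block stands alone.

```c
#include <assert.h>

// same tags as anypointer.h, repeated so this block stands alone
enum AnyPointerTag { UNINIT, INT, CHAR_STAR, DOUBLE };

// GCC extensions, not standard C.
// The final (void)0 is selected for unsupported types, so using the
// result as a tag value fails to compile, much like an unmatched _Generic.
#define TAG(x) \
    __builtin_choose_expr( \
        __builtin_types_compatible_p(__typeof__(x), int*), INT, \
    __builtin_choose_expr( \
        __builtin_types_compatible_p(__typeof__(x), char**), CHAR_STAR, \
    __builtin_choose_expr( \
        __builtin_types_compatible_p(__typeof__(x), double*), DOUBLE, \
        (void)0)))

void
tag_demo(void){
    int i = 3;
    char* s = "text";
    double d = 1.5;
    assert(TAG(&i) == INT);
    assert(TAG(&s) == CHAR_STAR);
    assert(TAG(&d) == DOUBLE);
}
```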

In languages with first-class types, you can just store a pointer to the type instead.

All code in this article is released into the public domain.