Kevin Cuzner's Personal Blog

Electronics, Embedded Systems, and Software are my breakfast, lunch, and dinner.


Writing reusable USB device descriptors (and other constant data) with C++ constexpr

Jan 02, 2026

Several years ago I wrote a post which introduced my method of declaring XML comments in my source code and scanning them with a Python script to produce a generated byte array. I've used this several times over the years and as tends to happen, I now hate it. My biggest pet peeve has turned out to be its lack of flexibility. Every time I want to do something crazy, like create HID reports or add extensive audio descriptors (with their relatively complicated cross-referencing scheme), I end up having to make big changes to my Python. It just isn't simple enough! The other thing is that it's not very portable either. If I have some hardware that, for example, locks endpoint addresses to specific endpoint instances (a restriction that the STM32 USB peripheral doesn't have, but the SAMD21 does), it'll be yet another modification to the script.

I'd like to introduce in this post a fluent API written entirely using C++ constexpr which enables a syntax like this:

constexpr auto kHidEndpointIn = usb::EndpointDescriptor()
                                    .EndpointAddress(0x81)
                                    .Attributes(0x03)
                                    .MaxPacketSize(64)
                                    .Interval(1);
constexpr auto kHidEndpointOut = usb::EndpointDescriptor()
                                     .EndpointAddress(0x01)
                                     .Attributes(0x03)
                                     .MaxPacketSize(64)
                                     .Interval(1);

constexpr auto kConfigDescriptor =
    usb::ConfigurationDescriptor(0)
        .ConfigurationValue(1)
        .Attributes(0x80)
        .WithInterface(
            usb::InterfaceDescriptor()
                .InterfaceClass(0x03)
                .InterfaceSubClass(0x00)
                .WithEndpoint(kHidEndpointIn)
                .WithEndpoint(kHidEndpointOut));

To produce something like this in the .rodata section of my executable:

000014e1 <_ZL17kConfigDescriptor>:
    14e1:   00290209        eoreq   r0, r9, r9, lsl #4
    14e5:   80000101        andhi   r0, r0, r1, lsl #2
    14e9:   00040900        andeq   r0, r4, r0, lsl #18
    14ed:   00030200        andeq   r0, r3, r0, lsl #4
    14f1:   21090000        mrscs   r0, (UNDEF: 9)
    14f5:   01000111        tsteq   r0, r1, lsl r1
    14f9:   07001922        streq   r1, [r0, -r2, lsr #18]
    14fd:   40038105        andmi   r8, r3, r5, lsl #2
    1501:   05070100        streq   r0, [r7, #-256] @ 0xffffff00
    1505:   00400301        subeq   r0, r0, r1, lsl #6
    1509:   00000001        andeq   r0, r0, r1

Now, I'm not a C++ expert by any means. I'm almost certain I did things in a harder way than necessary. But my hope is that by telling my journey in getting to this point someone might find some benefit.


Motivation

Although it's been well over a decade, I'm still enamoured by USB. I enjoy writing device-side drivers and coming up with new ways to express simply the complexities of USB. A constant thorn in my side has been the descriptors. Starting from the VUSB driver and continuing with my "Teensy" phase, I used hand-composed arrays of uint8_t to encode descriptors. I soon tired of this and moved on to my XML-based method detailed in my post from 7 years ago. This worked well for a while, but I just had this "itch" in the back of my mind that there was a better way to do this. Something that would let me express the descriptors natively in the language I was using and that was more maintainable than a massive single-file Python XML ETL script.

At this point I should mention that I spent 2022-2024 exploring writing firmware in Rust. Perhaps I'll write a post about that adventure at some point, but suffice to say I got frustrated by the fact that returning -> impl Future in a trait is not object-safe and the workaround of using -> Box<dyn Future> requires using the heap which I'm staunchly against in embedded programming. But that's a rant for another day. I mention Rust because I came across usb-device's DescriptorWriter and became more convinced that there should be a way to express, within the language, the contents of a descriptor and have it generate for me a blob containing it. I wasn't a huge fan of the usb-device crate's implementation since it ends up writing the descriptor to a fixed-size RAM-based buffer which sounds wasteful, but the concept of having an API which generates the descriptors really sounded enticing.

I focus a lot on putting descriptors in the flash directly, but it's worth noting that it's perfectly valid to generate descriptors on-the-fly like the usb-device crate did. The main issue I have is that I think it ends up being less flash space to simply express the entire descriptor directly as data rather than expressing the instructions that produce the descriptor. There is not a lot of "wasted space" in descriptors. On the other hand, perhaps I'm just stuck on the VUSB/"Teensy" philosophy which expressed the descriptors as blobs in the flash data. Creating a descriptor dynamically has its benefits, such as allowing a device to dynamically describe some configuration without requiring every possible configuration to be described at compile-time.

C++ in embedded applications

This seems to be a point of controversy still, even in 2025. I decided to use C++ mainly because RAII, lambdas, and other near-zero-cost abstractions available in C++ seemed to me to be a great way to write expressive, concise, and maintainable code that a modern compiler would be more capable of optimizing. I don't write a lot of microcontroller firmware at work, but when I do I've used almost exclusively C++.

Many of the "cons" brought up in articles I read about C++ in embedded systems mention things like dynamic dispatch, heap allocations, exceptions, etc. as reasons not to use C++. And I agree! I just simply avoid using these features (although, I think dynamic dispatch has its place and a vtable is not that big of an overhead as long as you're careful to keep the call out of hot loops).

Truth be told, I'd rather be using Rust, but here we are.

Program Sections

For the uninitiated, programs for flash-based microcontrollers (and indeed most programs, at least on Linux) have their contents typically segmented into 3 different sections:

  • .text: This contains the executable program instructions
  • .data/.bss: This contains the variables for the program. The .data section typically has compile-time initialized values whereas the .bss (Block Starting Symbol) section has uninitialized or zero-initialized variables.
  • .rodata: This contains read-only non-executable data for the program.

The .text, .data, and .rodata sections (along with a few others) are included in the final flash data programmed to the microcontroller. When a microcontroller boots, it usually starts off by reading a specific address in the flash to establish where the program entry point (or "reset vector") lives. While we traditionally think of programs starting in main, there's actually a bunch of startup/initialization that has to happen before main runs. In particular, the startup code on a microcontroller has 3 primary responsibilities:

  1. Copy the .data segment from the read-only flash into the appropriate location within the microcontroller SRAM. While the initial values for .data are stored in the flash, the program itself is compiled against these variables being placed in the SRAM address region. The setting of these values in SRAM doesn't happen automatically: the initialization code takes care of this.
  2. Zero out the .bss segment in the SRAM. These variables need to be initialized to zero. We could have stored these variables in .data, but it's wasteful to just store a bunch of zeroes in the flash just to copy them into the RAM when we could have just noted which addresses in the ram need to be zeroed.
  3. Execute static constructors and other initialization functions. There's a list of function pointers in a .init section (treated similarly to .rodata with its placement in the flash) which should be called prior to main starting.

After doing these things and perhaps some initial configuration of peripherals or clocking, the program will finally call the main function.

Note

There are some specific cases where program instructions can end up in SRAM, copied there from the flash in a similar manner to .data. When I've done this I've usually just straight-up included those functions in .data and then called them normally, but in more security-conscious systems these kinds of functions might be added to a different segment in order to ensure they're located somewhere in the memory space properly marked for execution. So there might be more to the initialization phase than I've alluded to here.

On most microcontrollers, there is considerably more flash than SRAM. So our objective is to try and make as much of this constant data like USB descriptors live in the .rodata segment: This segment is placed in the flash like the .data segment, but rather than being copied into the SRAM before use, the program is compiled against these variables at their address within the flash. This saves on the SRAM usage and leaves it for data that actually has to change at runtime.
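To make the distinction concrete, here is a small sketch of how three kinds of globals typically get placed by a GCC-style embedded toolchain. The variable names are hypothetical and the placement comments describe common behavior, not a guarantee; checking the map file or objdump output is the only way to be sure for a given build.

```cpp
#include <cstdint>

// Const-qualified with a constant initializer: typically placed in .rodata,
// so it lives in flash and is read from there directly.
constexpr std::uint8_t kLangIdDescriptor[] = {0x04, 0x03, 0x09, 0x04};

// Mutable with a non-zero initial value: typically placed in .data. The
// initial value is stored in flash and copied to SRAM by the startup code.
std::uint8_t g_active_configuration = 1;

// Mutable and zero-initialized: typically placed in .bss. Nothing is stored
// in flash; the startup code just zeroes this region of SRAM.
std::uint8_t g_rx_buffer[64];
```

Only the first of these costs no SRAM, which is why the rest of this post cares so much about keeping descriptors const and compile-time initialized.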

constexpr

In C++11, the constexpr keyword was introduced. When placed on a function, this keyword notes that a function could potentially be executed at compile-time and puts restrictions on what the function can do in order to allow the compiler to achieve that. If the function is called in a context that allows the compiler to derive all the prerequisites for executing the function, it can also derive the result. This also means that constexpr functions are implicitly inline and the definition of the function needs to be present with its declaration in order to allow the compiler to evaluate it at compile-time. When placed on a variable declaration, constexpr instructs the compiler to require that the variable (which is now also implicitly const) have a value precomputed at compile-time rather than having a runtime constructor that runs during program initialization.
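As a minimal illustration of both uses of the keyword, the sketch below defines a constexpr function and a constexpr variable initialized from it. PackVersion, kBcdUsb and PackVersionAtRuntime are hypothetical names made up for this example; they are not part of the descriptor library described in this post.

```cpp
#include <cstdint>

// A constexpr function: usable at compile time when its arguments are
// constant expressions, and usable as an ordinary function otherwise.
constexpr std::uint16_t PackVersion(std::uint8_t major, std::uint8_t minor) {
  return static_cast<std::uint16_t>((major << 8) | minor);
}

// A constexpr variable: the compiler must be able to compute this value
// itself, so no runtime constructor is needed for it.
constexpr std::uint16_t kBcdUsb = PackVersion(0x02, 0x00);
static_assert(kBcdUsb == 0x0200, "value was computed by the compiler");

// The same function called with values only known at runtime.
std::uint16_t PackVersionAtRuntime(std::uint8_t major, std::uint8_t minor) {
  return PackVersion(major, minor);
}
```

The static_assert only compiles because kBcdUsb is a constant expression, which is a handy way to prove to yourself that something really was evaluated at compile time.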

Note

It's worth mentioning that C++20 introduced the constinit keyword which requires that a variable's value be known and computed at compile-time, but the value itself isn't treated as const. I didn't use that here, but it's worth noting.

Note

In C++14, the allowable control structures within a constexpr function were expanded to include for, while, and other such things.

With the above said, I have noticed that sometimes even a constexpr-decorated variable ends up being placed in the .data section rather than .rodata. I'm not entirely sure what causes this, though it tended to coincide with more complex types being stored within the target type or using an address of something that the compiler has trouble determining at compile-time in calling the constexpr function. But, sometimes the compiler would complain about using non-constexpr values in constexpr contexts, so I'm really not sure.

Using a struct/class to represent descriptor information

I started out pretty simply with defining structs for descriptors like this:

struct __attribute__((packed)) DeviceDescriptor {
  static constexpr uint8_t kType = 1;
  uint8_t bLength;
  uint8_t bDescriptorType = kType;
  uint16_t bcdUSB = 0x0200;
  uint8_t bDeviceClass = 0;
  uint8_t bDeviceSubClass = 0;
  uint8_t bDeviceProtocol = 0;
  uint8_t bMaxPacketSize0 = 8;
  uint16_t idVendor = 0x0;
  uint16_t idProduct = 0x0;
  uint16_t bcdDevice = 0x0;
  uint8_t iManufacturer = 0;
  uint8_t iProduct = 0;
  uint8_t iSerialNumber = 0;
  uint8_t bNumConfigurations = 0;
};

This example struct contains all the data for a USB Device Descriptor and uses the ((packed)) attribute (a.k.a #pragma pack) to remove space between the data members of the struct. This ensures that the layout as-specified in the struct matches exactly the descriptor specification.
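To show what the attribute buys, here is a small sketch comparing a packed and an unpacked struct laid out like the standard 7-byte endpoint descriptor. The struct names are hypothetical, and the attribute syntax assumes GCC or Clang.

```cpp
#include <cstdint>

// Without packing, the compiler may add padding so that the 16-bit member
// (and arrays of this struct) stay aligned. The size is typically 8 here.
struct UnpackedEndpoint {
  std::uint8_t bLength;
  std::uint8_t bDescriptorType;
  std::uint8_t bEndpointAddress;
  std::uint8_t bmAttributes;
  std::uint16_t wMaxPacketSize;
  std::uint8_t bInterval;
};

// With packing, the layout is exactly the 7 bytes the USB spec calls for.
struct __attribute__((packed)) PackedEndpoint {
  std::uint8_t bLength;
  std::uint8_t bDescriptorType;
  std::uint8_t bEndpointAddress;
  std::uint8_t bmAttributes;
  std::uint16_t wMaxPacketSize;
  std::uint8_t bInterval;
};

static_assert(sizeof(PackedEndpoint) == 7, "matches the endpoint descriptor");
static_assert(sizeof(UnpackedEndpoint) >= sizeof(PackedEndpoint),
              "padding can only make the unpacked struct larger");
```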

I then defined an API like this:

struct __attribute__((packed)) DeviceDescriptor {
...
  constexpr DeviceDescriptor() : bLength(sizeof(DeviceDescriptor)) {}

  constexpr auto& DeviceClass(uint8_t c) {
    bDeviceClass = c;
    return *this;
  }
  constexpr auto& DeviceSubclass(uint8_t c) {
    bDeviceSubClass = c;
    return *this;
  }
  constexpr auto& DeviceProtocol(uint8_t c) {
    bDeviceProtocol = c;
    return *this;
  }
  constexpr auto& MaxPacketSize(uint8_t c) {
    bMaxPacketSize0 = c;
    return *this;
  }
  constexpr auto& VendorProduct(uint16_t v, uint16_t p) {
    idVendor = v;
    idProduct = p;
    return *this;
  }
  constexpr auto& DeviceVersion(uint16_t v) {
    bcdDevice = v;
    return *this;
  }
...
};

This allows me to use the struct in a builder pattern to define the contents in a single statement during initialization:

constexpr auto kDeviceDescriptor = usb::DeviceDescriptor()
                                       .MaxPacketSize(64)
                                       .VendorProduct(0x16c0, 0x05dc)
                                       .DeviceVersion(0x0001);

Variable-length descriptors

So the above is fine for descriptors whose length is known in advance, but what about descriptors that are made up of multiple descriptors sent in a single transfer? The configuration descriptor is one example of this:

struct __attribute__((packed)) ConfigurationDescriptorData {
  static constexpr uint8_t kType = 2;
  uint8_t bLength;
  uint8_t bDescriptorType = kType;
  uint16_t wTotalLength = 0;
  uint8_t bNumInterfaces = 0;
  uint8_t bConfigurationValue = 0;
  uint8_t iConfiguration = 0;
  uint8_t bmAttributes = 0;
  uint8_t bMaxPower = 0;

  constexpr ConfigurationDescriptorData()
      : bLength(...), wTotalLength(...) {}
};

In USB, a configuration typically consists of one or more Interface instances which consist of one or more Endpoint instances. For some classes of USB devices there are also extra descriptors included as well. When the host requests a configuration descriptor, it starts out by requesting the first 9 bytes of the configuration descriptor (shown above). This covers just the configuration descriptor itself, which contains the wTotalLength member. This gives a total length of the concatenated descriptors that form the configuration descriptor. Using this information, the host can then request the full descriptor.
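As a worked example of that two-step fetch, the sketch below takes the first 9 bytes of the configuration descriptor from the disassembly near the top of this post and pulls out wTotalLength the way a host would. TotalLength and kConfigHeader are hypothetical names for illustration only.

```cpp
#include <cstdint>

// The first 9 bytes of the configuration descriptor shown in the .rodata
// dump earlier (the 32-bit words there are little-endian).
constexpr std::uint8_t kConfigHeader[9] = {0x09, 0x02, 0x29, 0x00, 0x01,
                                           0x01, 0x00, 0x80, 0x00};

// wTotalLength is a little-endian 16-bit value at byte offsets 2 and 3.
constexpr std::uint16_t TotalLength(const std::uint8_t* header) {
  return static_cast<std::uint16_t>(header[2] | (header[3] << 8));
}

// 9 (configuration) + 9 (interface) + 9 (HID) + 7 + 7 (endpoints) = 41
static_assert(TotalLength(kConfigHeader) == 41,
              "the host would request 41 bytes in its second transfer");
```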

This presents something of a challenge: We need to create a data structure which can store the information for multiple descriptors, without the length known in advance. Using templates is the obvious choice, as that's the only canonical way in C++ to create a structure whose size can vary. The question is how exactly to form these templates.

My first attempt at this was using variadic templates (i.e. templates with an unbounded number of arguments, template <typename T, typename... Trest>) and using either inheritance or member declarations to store the sub-descriptors. The idea was something like this:

template<typename T, typename... Trest>
struct __attribute__((packed)) ExtendableDescriptor : ExtendableDescriptor<Trest...> {
  ...

  template<typename M>
  constexpr ExtendableDescriptor<M, T, Trest...> AddDescriptor(M other) {
    ...
  }
};

This inheritance-based scheme ran into compilation issues which I never fully figured out. Again, I'm not a C++ expert. It also ended up with an API that I wasn't too fond of where I end up having to wrap the whole descriptor collection in a container, which isn't exactly what I was going for. I also suspect that it would run into packing issues as __attribute__((packed)) by definition only applies to the current class, not its ancestors (otherwise, the ancestors would have an inconsistent memory layout if they weren't packed normally).

I moved on to something that feels hackier, but actually works and results in the API that I wanted:

template<size_t N, typename Data>
struct __attribute__((packed)) ExtendableDescriptor : public Data {
  std::array<std::byte, N> others;

  constexpr ExtendableDescriptor() : Data() {}

  // Constructor from a smaller descriptor, appending a new sub-descriptor
  //
  // The caller must ensure that N is appropriately sized to contain both the
  // smaller descriptor's data and the new sub-descriptor.
  template <size_t M, typename T>
  constexpr ExtendableDescriptor(const ExtendableDescriptor<M, Data>& other,
                                 T t)
      : Data(other) {
    static_assert(N > M, "Cannot shrink a descriptor");
    // Use sizeof(T) so that it's trivially copyable for bit_cast
    std::array<std::byte, sizeof(T)> arr =
        std::bit_cast<std::array<std::byte, sizeof(T)>>(t);
    for (size_t i = 0; i < M; i++) {
      others[i] = other.others[i];
    }
    // Trust that N-M covers the correct number of bytes of T (it will
    // generally be less than sizeof(T)).
    for (size_t i = M; i < N; i++) {
      others[i] = arr[i - M];
    }
  }
};

The idea behind this method is that the descriptor inherits from some "Data" class which is meant to be the parent/top-level descriptor and can be grown by bit_cast-ing the child descriptors into the end of the byte array. What this does is (in a constexpr-safe manner) take each non-padding byte of T and map it to a value in the array. So long as the Data class is also packed, we shouldn't run into padding issues and the descriptors should end up consecutive. Here is an example of using this with the configuration descriptor:

template <size_t N = 0>
class __attribute__((packed)) ConfigurationDescriptor
    : public ExtendableDescriptor<N, ConfigurationDescriptorData> {
  using Base = ExtendableDescriptor<N, ConfigurationDescriptorData>;

 public:
  // Constructor for an empty configuration. Configurations require an index.
  constexpr ConfigurationDescriptor(uint8_t index) : Base(index) {}

  template <size_t M, typename T>
  constexpr ConfigurationDescriptor(const ConfigurationDescriptor<M>& other,
                                    T t)
      : Base(other, t) {}

  // Adds an interface
  template <typename T>
  constexpr auto WithInterface(T t) {
    auto other = ConfigurationDescriptor<T::Size() + T::ExtendedSize() +
                                         Base::ExtendedSize()>(*this, t);
    other.wTotalLength =
        decltype(other)::Size() + decltype(other)::ExtendedSize();
    other.bNumInterfaces += 1;
    return other;
  }
};

The WithInterface method is where the magic happens. It takes in some type which is assumed to be an interface descriptor. Through the magic of template argument deduction, the actual invocation of this method doesn't require any <> and can just take in the argument of the descriptor. The method then constructs a new expanded descriptor based on the current descriptor, passes in the new descriptor, and then modifies the appropriate fields of the parent/top-level descriptor to keep things coherent. In this case, the wTotalLength is incremented by the length of the new descriptor and bNumInterfaces is incremented by 1.

The other thing to note here is the default value on the size_t parameter: It's valid to have a descriptor that has 0 bytes of other descriptors. This is one of the main benefits of using std::array to hold the data array: It's valid to have a 0-sized std::array whereas a standard C-array can't have a size of 0. By starting a descriptor off at 0 (and making it the default), we end up with a very ergonomic (in my opinion) API:

constexpr auto kConfigDescriptor =
    usb::ConfigurationDescriptor(0)
        .ConfigurationValue(1)
        .Attributes(0x80)
        .WithInterface(...);
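The "grow by returning a bigger type" trick can be hard to see through all the USB details, so here is a stripped-down sketch of just that mechanism. Blob and With are hypothetical names, and this version copies plain byte arrays instead of using bit_cast on descriptor structs, so it only needs C++17. It is an illustration of the idea rather than the implementation described above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// A byte container whose size is part of its type. It starts out empty
// thanks to the default template argument and the 0-sized std::array.
template <std::size_t N = 0>
struct Blob {
  std::array<std::uint8_t, N> bytes{};

  // Returns a new, larger Blob holding the current bytes followed by
  // `extra`. M is deduced from the argument, so callers never write <>.
  template <std::size_t M>
  constexpr Blob<N + M> With(const std::array<std::uint8_t, M>& extra) const {
    Blob<N + M> out{};
    for (std::size_t i = 0; i < N; i++) {
      out.bytes[i] = bytes[i];
    }
    for (std::size_t i = 0; i < M; i++) {
      out.bytes[N + i] = extra[i];
    }
    return out;
  }
};

// Each call in the chain produces a different type: Blob<0>, Blob<2>, Blob<5>.
constexpr auto kBlob = Blob<>{}
                           .With(std::array<std::uint8_t, 2>{0x01, 0x02})
                           .With(std::array<std::uint8_t, 3>{0x03, 0x04, 0x05});

static_assert(kBlob.bytes.size() == 5, "size grew with each appended chunk");
static_assert(kBlob.bytes[4] == 0x05, "appended bytes land at the end");
```

Because every step returns by value and auto picks up the final type, the chained expression reads like a normal builder even though the type changes at every call.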

Descriptor Sizing

At this point, I need to discuss how I deal with descriptor lengths. Above you might have noticed the subtle detail of setting the DeviceDescriptor's bLength member with sizeof(DeviceDescriptor). While this works for that case, it breaks down badly for extendable descriptors. Typically, the bLength member of a descriptor refers only to the length of that descriptor itself, not including any appended descriptors. That means that using sizeof on an ExtendableDescriptor is not tenable.

To manage this situation, I ended up defining a simple class which provides a static function declaring some number which is meant to be used as the bLength value:

template <size_t N>
struct __attribute__((packed)) DescriptorSize {
  static constexpr size_t Size() { return N; }
  static constexpr size_t ExtendedSize() { return 0; }
};

Now, descriptors simply need to inherit from DescriptorSize and they'll be usable in a few convenient situations:

struct __attribute__((packed)) ConfigurationDescriptorData
    : public DescriptorSize<9> {
  ...
  uint8_t bLength;
  ...
  uint16_t wTotalLength;
  ...

  // Now bLength and wTotalLength can default to the explicitly defined
  // descriptor size!
  constexpr ConfigurationDescriptorData()
      : bLength(Size()), wTotalLength(Size()) {}
};

// The extendable descriptor can "override" (well, name-hide or "shadow")
// the ExtendedSize member so that it returns an appropriate value.
template <size_t N, typename Data>
class __attribute__((packed)) ExtendableDescriptor : public Data {
 public:
  // This hides the version from DescriptorSize (presumably, Data inherits
  // from that)
  static constexpr size_t ExtendedSize() { return N; }
  ...
};

// We can also appropriately-size the return type of functions that expand
// the descriptor!
struct ConfigurationDescriptor ... {
  ...
  template <typename T>
  constexpr auto WithInterface(T t) {
    auto other = ConfigurationDescriptor<T::Size() + T::ExtendedSize() +
                                         Base::ExtendedSize()>(*this, t);
    other.wTotalLength =
        decltype(other)::Size() + decltype(other)::ExtendedSize();
    ...
  }
};

So that covers bLength, but wTotalLength is still an issue. Above you can see that we're using the ExtendedSize function to set wTotalLength along with ensuring the new descriptor is large enough to contain the appended interface descriptor. In this situation, we simply wanted to expand the descriptor by the size of the appended descriptor and ensure wTotalLength is incremented accordingly. Why then did we not use sizeof here? The issue is twofold:

  • Padding: While __attribute__((packed)) guarantees that no additional padding bytes are placed between members for alignment, it seems to still be possible that trailing padding bytes may appear at the end of a class. I had this happen once. Maybe I made a typo and forgot ((packed)) in a class? Or maybe I ran into the next issue and it spiraled into this issue as well.
  • Metadata members: This isn't something I've introduced yet in this article (keep reading!), but in certain situations it's very convenient to have extra information about the descriptor that might be useful for the program to use as a constant, but shouldn't be included in the data sent with the descriptor. This data can be placed in a position where the bit_cast doesn't pick it up and include it in the expandable byte array.

The ExtendedSize function is therefore defined to be the additional bytes in the descriptor which contribute to its USB-facing payload. Summing Size and ExtendedSize gives the total length of the descriptor. We therefore augment the ExtendableDescriptor like so to make that work:

// The extendable descriptor can "override" (well, name-hide or "shadow")
// the ExtendedSize member so that it returns an appropriate value.
template <size_t N, typename Data>
class __attribute__((packed)) ExtendableDescriptor : public Data {
 public:
  // This hides the version from DescriptorSize (presumably, Data inherits
  // from that)
  static constexpr size_t ExtendedSize() { return N; }
  ...
};

We've now fully decoupled the size of the class from the size of the descriptor. And done so in a way that is still constexpr-safe!
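Here is a compact sketch of that decoupling that can be compiled on its own. DescriptorSize is copied from above, while HeaderData, Extendable and Tagged are hypothetical stand-ins for the real descriptor classes, trimmed down to just the size bookkeeping.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

template <std::size_t N>
struct DescriptorSize {
  static constexpr std::size_t Size() { return N; }
  static constexpr std::size_t ExtendedSize() { return 0; }
};

// A made-up 2-byte descriptor header.
struct __attribute__((packed)) HeaderData : public DescriptorSize<2> {
  std::uint8_t bLength = 2;
  std::uint8_t bDescriptorType = 0x22;
};

// Appends N payload bytes and hides DescriptorSize::ExtendedSize.
template <std::size_t N, typename Data>
struct __attribute__((packed)) Extendable : public Data {
  static constexpr std::size_t ExtendedSize() { return N; }
  std::array<std::uint8_t, N> others{};
};

// A metadata member declared after the payload. It makes sizeof larger but
// is not counted by Size() + ExtendedSize().
struct __attribute__((packed)) Tagged : public Extendable<3, HeaderData> {
  std::uint16_t payload_length =
      static_cast<std::uint16_t>(Size() + ExtendedSize());
};

static_assert(Tagged::Size() + Tagged::ExtendedSize() == 5,
              "USB-facing length: 2 header bytes plus 3 appended bytes");
static_assert(sizeof(Tagged) > Tagged::Size() + Tagged::ExtendedSize(),
              "sizeof also counts the metadata member");
```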

Descriptor "Tagging" or Metadata

I mentioned earlier that decoupling the total length of a descriptor from sizeof enables "metadata" members of descriptors, or data on a descriptor that doesn't constitute part of its payload. Consider the case of an HID (Human Interface Device) report descriptor: These descriptors are referenced in the HID descriptor that's included with the configuration/interface descriptor so that the host is made aware of their presence. That reference can itself be considered a descriptor:

// HID report descriptors are declared in the HID descriptor. This represents
// the entry in that descriptor for a given report descriptor.
struct __attribute__((packed)) HidReportDescriptorTag
    : public DescriptorSize<3> {
  uint8_t bDescriptorType = HidReportDescriptorData::kType;
  uint16_t wDescriptorLength;

  constexpr HidReportDescriptorTag(uint16_t length)
      : wDescriptorLength(length) {}
};

When declaring a report descriptor, it'd be nice if this tag was automatically created so that it doesn't have to be manually hardcoded. This is straightforwardly done by simply declaring an additional data member in the ExtendableDescriptor child class HidReportDescriptor:

// HID report descriptor builder
template <size_t N = 0>
class __attribute__((packed)) HidReportDescriptor
    : public ExtendableDescriptor<N, HidReportDescriptorData> {
  using Base = ExtendableDescriptor<N, HidReportDescriptorData>;

 public:
  // This tag is used in the HID descriptor to reference this descriptor
  HidReportDescriptorTag hid_tag;

  constexpr HidReportDescriptor()
      : Base(), hid_tag(Base::Size() + Base::ExtendedSize()) {}
  template <size_t M, typename T>
  constexpr HidReportDescriptor(const HidReportDescriptor<M>& other, T t)
      : Base(other, t), hid_tag(Base::Size() + Base::ExtendedSize()) {}

...
};

Since the data for parent classes is ordered before the data for child classes, we're assured that this hid_tag member is not going to be copied during the bit_cast operation that copies the contents of the ExtendableDescriptor's array and we're assured that it's not going to get caught in the ExtendedSize bytes which are included in the USB payload. The main thing we need to cover then is ensuring that the tag is set correctly when it's constructed each time. This is straightforwardly done by setting a default "base case" value in the default constructor and then making sure to also initialize it in the constructor invoked when expanding the descriptor. It's worth noting that so long as you don't have a default constructor for the tag type, the compiler will actually bark at you if you forget to initialize the value: constexpr requires that everything be explicitly initialized!

We can now use this tag when declaring a descriptor that requires the information:

constexpr auto kHidReport =
    usb::hid::HidReportDescriptor()
        .UsagePage16(0xFF00)  // Vendor
        .Usage(0x1)
        .Collection(0x1)  // Application
        ...
        .EndCollection();

constexpr auto kConfigDescriptor =
    usb::ConfigurationDescriptor(0)
        .ConfigurationValue(1)
        .Attributes(0x80)
        .WithInterface(
            usb::InterfaceDescriptor()
                .InterfaceClass(0x03)
                .InterfaceSubClass(0x00)
                .WithDescriptor(usb::hid::HidDescriptor().IncludeReport(
                    kHidReport.hid_tag)) // <-- Just reference it!
                ...
        );

Creating a distributed descriptor lookup table

Now that we can use constexpr to define our descriptors and they can be stored as constants, we need to provide a way for the descriptors to be used. While we can hardcode the association between the GET_DESCRIPTOR requests and the particular report to return (and many USB device implementations do this), I continue to be enamored by the pattern from the "Teensy" codebase where a table was declared which is then searched dynamically when a setup request arrives. It's small and code-efficient!

In the past, I've declared this table as a straight-up array. Here's an example from my old codegen:

const USBDescriptorEntry usb_descriptors[] = {
  { 0x0100, 0x0000, sizeof(device), device },
  { 0x0300, 0x0000, sizeof(lang), lang },
  { 0x0301, 0x0409, sizeof(manufacturer), manufacturer },
  { 0x0302, 0x0409, sizeof(product), product },
  { 0x0200, 0x0000, sizeof(configuration), configuration },
  { 0x2200, 0x0002, sizeof(hid_report), hid_report },
  { 0x0000, 0x0000, 0x00, NULL }
};

When a GET_DESCRIPTOR request arrives, the usb_descriptors array will be traversed until the "null" entry at the end is reached or the requested descriptor is found. My main problem with this method is that the descriptor table ends up needing to be centrally located: I can have one and only one table, and it has to include all the descriptors for the project.
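For reference, the traversal over that kind of table can be sketched as below. The field names of USBDescriptorEntry are guesses based on the initializer order above, and FindDescriptor, kDevice, kLang and kTable are hypothetical. I've marked the function constexpr only so the checks can run at compile time; in firmware it would be called at runtime from the setup request handler.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed layout, matching the initializer order of the table above.
struct USBDescriptorEntry {
  std::uint16_t wValue;
  std::uint16_t wIndex;
  std::size_t length;
  const void* addr;
};

// Walks the table until a match is found or the null entry is reached.
constexpr const USBDescriptorEntry* FindDescriptor(
    const USBDescriptorEntry* table, std::uint16_t wValue,
    std::uint16_t wIndex) {
  for (const USBDescriptorEntry* e = table; e->addr != nullptr; ++e) {
    if (e->wValue == wValue && e->wIndex == wIndex) {
      return e;
    }
  }
  return nullptr;
}

// Placeholder descriptor contents, just enough to have something to find.
constexpr std::uint8_t kDevice[18] = {18, 1};
constexpr std::uint8_t kLang[4] = {4, 3, 0x09, 0x04};

constexpr USBDescriptorEntry kTable[] = {
    {0x0100, 0x0000, sizeof(kDevice), kDevice},
    {0x0300, 0x0000, sizeof(kLang), kLang},
    {0x0000, 0x0000, 0, nullptr},
};

static_assert(FindDescriptor(kTable, 0x0300, 0x0000)->length == 4,
              "string descriptor 0 is found");
static_assert(FindDescriptor(kTable, 0x0200, 0x0000) == nullptr,
              "a missing descriptor returns nullptr");
```

A nullptr result is where the device would stall the control endpoint to tell the host the descriptor doesn't exist.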

In an ideal world, I should be able to add a USB descriptor anywhere in my codebase and have it included automatically in this table. In this implementation, I've accomplished this by way of a specially named code section. Firstly, here's how I've defined the descriptor entry:

struct _DescriptorTableEntry {
  const void* ptr;
  size_t length;
  union {
    uint16_t wValue;
    struct {
      uint8_t index;  // LSByte I hope
      uint8_t type;   // MSByte I hope
    } type_index;
  };
  uint16_t wIndex;

  constexpr _DescriptorTableEntry()
      : ptr(nullptr), length(0), wValue(0), wIndex(0) {}

  template <typename T>
  constexpr _DescriptorTableEntry(const T& desc)
      : ptr(&desc),
        length(T::Size() + T::ExtendedSize()),
        type_index{desc.tag.index, T::kType},
        wIndex(desc.tag.wIndex) {}

  static bool Find(uint16_t wValue, uint16_t wIndex, const void*& ptr,
                   size_t& length);
};
#define USB_DESCRIPTOR(desc)                                                \
  constexpr __attribute__((                                                 \
      used, section(".rodata.keep.usbdescbbb"))) usb::_DescriptorTableEntry \
  desc##Entry(desc)

This causes each table entry to be linked into a section called .rodata.keep.usbdescbbb. There's nothing actually special about this name, aside from the bbb part: I just named it this way to be consistent with the conventions that already existed for read-only data.

The table entry itself contains a pointer to the location of the descriptor (ostensibly in flash), its length, and the wValue/wIndex parameters that will be passed during GET_DESCRIPTOR when this descriptor is requested. The descriptor size is determined by examining the members declared by DescriptorSize: Size and ExtendedSize. The wValue and wIndex are determined by examining a member of the passed descriptor which should be named tag and another which should be called kType. For kType, this is simply a static constant with the descriptor type and can be seen in the earlier examples in this article. The tag member is used as a metadata member of the descriptor as described in the previous section. The struct is very straightforward and simply supplies the index of the descriptor (index, part of wValue) and the language ID, if applicable (wIndex).

// As descriptors declare an explicit size for USB, extra data at the end of
// the struct is fine. It wastes some space, but not too much. Note that even
// for the extendable descriptors this is true: we only copy the portion up to
// Size into the larger-sized descriptor.
struct __attribute__((packed)) DescriptorTag {
  // Index of this descriptor, if applicable
  uint8_t index;
  // wIndex value for this descriptor
  uint16_t wIndex;

  constexpr DescriptorTag() : index(0), wIndex(0) {}
  constexpr DescriptorTag(uint8_t index, uint16_t wIndex = 0)
      : index(index), wIndex(wIndex) {}
};

...

// Device Descriptor
struct __attribute__((packed)) DeviceDescriptor : public DescriptorSize<18> {
  static constexpr uint8_t kType = 1;
  uint8_t bLength;
  uint8_t bDescriptorType = kType;
  ...

  DescriptorTag tag;

  constexpr DeviceDescriptor(uint8_t index = 0) : bLength(Size()), tag(index) {}
  ...
};

The next piece of this is a change to my linker script. I added a .rodata.keep definition within the flash segment:

...
/* Section Definitions */
SECTIONS
{
    .text :
    {
        . = ALIGN(4);
        _sfixed = .;
        KEEP(*(.vectors .vectors.*))
        *(.text .text.* .gnu.linkonce.t.*)
        *(.glue_7t) *(.glue_7)
        KEEP(*(.rodata.keep SORT(.rodata.keep.*))) /* <-- New section! */
        *(.rodata .rodata* .gnu.linkonce.r.*)
        *(.ARM.extab* .gnu.linkonce.armextab.*)
...

There are two important pieces here:

KEEP
The KEEP statement ensures that the sections placed there do not get pruned. I typically have -Wl,--gc-sections in my linker invocation which will prune sections that don't have a reference. This prevents that from occurring with these entries.
SORT
Section names can be referenced with a glob-like syntax using *. By putting SORT(.rodata.keep.*) I am specifying that sections whose names start with .rodata.keep. should be sorted by name in ascending order.

The last part to creating my table is the definition of some additional symbols in C++ source:

// This should go at the start of the usb descriptors
static const __attribute__((used, section(".rodata.keep.usbdescaaa"))) _DescriptorTableEntry _desc_begin;
static const __attribute__((used, section(".rodata.keep.usbdesczzz"))) _DescriptorTableEntry _desc_end;

Because of SORT, these two variables will be placed before and after the descriptors defined with the USB_DESCRIPTOR macro. When the program is compiled, I end up with a structure that starts with _desc_begin, has all the program descriptors, and then stops with _desc_end:

000013e0 <_ZL11_desc_begin>:
    ...

000013ec <_ZL20kHidInLedReportEntry>:
    13ec:   0000150d        andeq   r1, r0, sp, lsl #10
    13f0:   00000019        andeq   r0, r0, r9, lsl r0
    13f4:   00002200        andeq   r2, r0, r0, lsl #4

000013f8 <_ZL17kProductNameEntry>:
    13f8:   0000152c        andeq   r1, r0, ip, lsr #10
    13fc:   00000026        andeq   r0, r0, r6, lsr #32
    1400:   04090302        streq   r0, [r9], #-770 @ 0xfffffcfe

00001404 <_ZL22kManufacturerNameEntry>:
    1404:   00001555        andeq   r1, r0, r5, asr r5
    1408:   00000020        andeq   r0, r0, r0, lsr #32
    140c:   04090301        streq   r0, [r9], #-769 @ 0xfffffcff

00001410 <_ZL15kLanguagesEntry>:
    1410:   00001578        andeq   r1, r0, r8, ror r5
    1414:   00000004        andeq   r0, r0, r4
    1418:   00000300        andeq   r0, r0, r0, lsl #6

0000141c <_ZL22kConfigDescriptorEntry>:
    141c:   000014e1        andeq   r1, r0, r1, ror #9
    1420:   00000029        andeq   r0, r0, r9, lsr #32
    1424:   00000200        andeq   r0, r0, r0, lsl #4

00001428 <_ZL22kDeviceDescriptorEntry>:
    1428:   000014cc        andeq   r1, r0, ip, asr #9
    142c:   00000012        andeq   r0, r0, r2, lsl r0
    1430:   00000100        andeq   r0, r0, r0, lsl #2

00001434 <_ZL9_desc_end>:
    ...
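To make the listing a bit easier to read, here is a small sketch that decodes the third word of each entry. The Decode helper and the DecodedRequest struct are hypothetical names used only for illustration, and the sketch assumes a little-endian target where wValue sits in the low half-word and wIndex in the high half-word.

```cpp
#include <cstdint>

// Hypothetical helper type (not part of the descriptor library): the fields
// packed into the third 32-bit word of a table entry.
struct DecodedRequest {
  uint8_t index;    // low byte of wValue
  uint8_t type;     // high byte of wValue (descriptor type)
  uint16_t wIndex;  // language ID for strings, otherwise 0
};

// Assumes little-endian layout: wValue in bits 0-15, wIndex in bits 16-31.
constexpr DecodedRequest Decode(uint32_t word) {
  return DecodedRequest{
      static_cast<uint8_t>(word & 0xFF),
      static_cast<uint8_t>((word >> 8) & 0xFF),
      static_cast<uint16_t>(word >> 16),
  };
}

// kProductNameEntry: string descriptor (type 3), index 2, language 0x0409
static_assert(Decode(0x04090302).type == 3);
static_assert(Decode(0x04090302).index == 2);
static_assert(Decode(0x04090302).wIndex == 0x0409);

// kHidInLedReportEntry: HID report descriptor (type 0x22), index 0
static_assert(Decode(0x00002200).type == 0x22);

// kDeviceDescriptorEntry: device descriptor (type 1), index 0, wIndex 0
static_assert(Decode(0x00000100).type == 1);
static_assert(Decode(0x00000100).wIndex == 0);
```

The first two words of each entry are simply the descriptor's address and its length in bytes, so the device descriptor entry reads as "18 bytes at 0x14cc".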

This is a little wasteful, since each table entry is 12 bytes and both the begin and end entries are just null "markers". On the other hand, it didn't require anything specific to the USB descriptor table in my linker script. The same pattern, naming the sections so that SORT controls the order in which they appear in flash, can be reused for other table-like things.

With this table, I can now use a straightforward search to locate descriptors on demand:

bool _DescriptorTableEntry::Find(uint16_t wValue, uint16_t wIndex, const void* &ptr, size_t &length) {
  // Start just past the null begin marker so that a request with
  // wValue == 0 and wIndex == 0 cannot match the marker itself.
  const _DescriptorTableEntry* entry = &_desc_begin + 1;
  while (entry < &_desc_end) {
    if (entry->wValue == wValue && entry->wIndex == wIndex) {
      ptr = entry->ptr;
      length = entry->length;
      return true;
    }
    ++entry;
  }
  return false;
}
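The same search can be tried out on a host machine with an ordinary array standing in for the linker-built table. This is only a sketch: Entry, FindEntry, Lookup, and kTable are hypothetical stand-ins, and the lengths and wValue/wIndex pairs are copied from the listing above.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified, hypothetical stand-in for _DescriptorTableEntry.
struct Entry {
  const void* ptr;
  std::size_t length;
  uint16_t wValue;
  uint16_t wIndex;
};

// Linear search over [begin, end), mirroring the loop in Find.
constexpr const Entry* FindEntry(const Entry* begin, const Entry* end,
                                 uint16_t wValue, uint16_t wIndex) {
  for (const Entry* e = begin; e < end; ++e) {
    if (e->wValue == wValue && e->wIndex == wIndex) {
      return e;
    }
  }
  return nullptr;
}

// Values taken from the disassembly; the pointers are left null here.
constexpr Entry kTable[] = {
    {nullptr, 0x26, 0x0302, 0x0409},  // product name string
    {nullptr, 0x20, 0x0301, 0x0409},  // manufacturer name string
    {nullptr, 0x12, 0x0100, 0x0000},  // device descriptor
};

constexpr const Entry* Lookup(uint16_t wValue, uint16_t wIndex) {
  return FindEntry(kTable, kTable + 3, wValue, wIndex);
}

static_assert(Lookup(0x0100, 0)->length == 0x12);
static_assert(Lookup(0x0301, 0x0409)->length == 0x20);
static_assert(Lookup(0x0500, 0) == nullptr);  // unknown descriptor
```

A GET_DESCRIPTOR handler would typically respond with a STALL when the lookup fails, though how that is signalled depends on the USB stack in use.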

Descriptor Examples

Some of this is revealed above already, but I figured having some actual examples of the API used to define the descriptors followed by an example of those descriptors might be useful.

Extendable Descriptors (Configuration/Interface/Endpoint)

I ended up splitting each of these descriptors into three separate definitions:

  • The base data
  • The API
  • The top-level descriptor that marries the base data, API, and extendable descriptor.

I'll use the configuration descriptor as an example here. The data is declared as noted earlier:

struct __attribute__((packed)) ConfigurationDescriptorData
    : public DescriptorSize<9> {
  static constexpr uint8_t kType = 2;
  uint8_t bLength;
  uint8_t bDescriptorType = kType;
  uint16_t wTotalLength = 0;
  uint8_t bNumInterfaces = 0;
  uint8_t bConfigurationValue = 0;
  uint8_t iConfiguration = 0;
  uint8_t bmAttributes = 0;
  uint8_t bMaxPower = 0;

  constexpr ConfigurationDescriptorData()
      : bLength(Size()), wTotalLength(Size()) {}
};

The API portion of this uses a "fluent" or "builder" style, where each method modifies the descriptor and then returns it. This allows "chaining" calls on the descriptor. I probably could have included this API in the top-level class, but keeping it as a separate class didn't really seem to harm anything. One complication of having it as a separate class is that the fluent API methods need to return instances of the top-level class rather than of the API class. This is accomplished using CRTP (the curiously recurring template pattern), where the top-level class passes itself as a template argument to its own base class.

template <typename T>
struct __attribute__((packed)) ConfigurationDescriptorApi {
  constexpr T ConfigurationValue(uint8_t v) {
    auto me = static_cast<T*>(this);
    me->bConfigurationValue = v;
    return *me;
  }

  constexpr T Attributes(uint8_t a) {
    auto me = static_cast<T*>(this);
    me->bmAttributes = a;
    return *me;
  }

  constexpr T MaxPower(uint8_t p) {
    auto me = static_cast<T*>(this);
    me->bMaxPower = p;
    return *me;
  }
};

Again, I'm not a C++ expert and the above probably could have been done more succinctly and probably more simply if I just included these methods in the top-level class.
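For readers who haven't run into CRTP before, here is a stripped-down sketch of the same idea that compiles on its own. EndpointData, EndpointApi, and Endpoint are hypothetical names invented for this example and are much simpler than the real descriptor classes.

```cpp
#include <cstdint>

// Hypothetical stand-in for the "base data" portion of a descriptor.
struct EndpointData {
  uint8_t address = 0;
  uint8_t interval = 0;
};

// CRTP: the API template is handed the derived type T, so each setter can
// downcast `this` and return a T instead of an EndpointApi.
template <typename T>
struct EndpointApi {
  constexpr T Address(uint8_t a) {
    auto* me = static_cast<T*>(this);
    me->address = a;
    return *me;  // returns a copy of the whole derived object
  }

  constexpr T Interval(uint8_t i) {
    auto* me = static_cast<T*>(this);
    me->interval = i;
    return *me;
  }
};

// The top-level type marries the data and the API together.
struct Endpoint : EndpointData, EndpointApi<Endpoint> {};

constexpr auto kEndpoint = Endpoint().Address(0x81).Interval(1);
static_assert(kEndpoint.address == 0x81);
static_assert(kEndpoint.interval == 1);
```

Because every call returns the derived type by value, the whole chain can be evaluated at compile time and the result stored in a constexpr variable.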

The top-level class marries all of these pieces together. It provides the constructor (which you'll notice does require an index to be passed, since devices can have multiple configurations) and the WithInterface method, which permits extending the descriptor. The N template parameter, which defines the extended size, defaults to 0 to cut down on verbosity when the class is actually used.

template <size_t N = 0>
class __attribute__((packed)) ConfigurationDescriptor
    : public ExtendableDescriptor<N, ConfigurationDescriptorData>,
      public ConfigurationDescriptorApi<ConfigurationDescriptor<N>> {
  using Base = ExtendableDescriptor<N, ConfigurationDescriptorData>;

 public:
  // Constructor for an empty configuration. Configurations require an index.
  constexpr ConfigurationDescriptor(uint8_t index) : Base(index) {}

  template <size_t M, typename T>
  constexpr ConfigurationDescriptor(const ConfigurationDescriptor<M>& other,
                                    T t)
      : Base(other, t) {}

  // Adds an interface
  template <typename T>
  constexpr auto WithInterface(T t) {
    auto other = ConfigurationDescriptor<T::Size() + T::ExtendedSize() +
                                         Base::ExtendedSize()>(*this, t);
    other.wTotalLength =
        decltype(other)::Size() + decltype(other)::ExtendedSize();
    other.bNumInterfaces += 1;
    return other;
  }
};

One thing to note here is that the WithInterface method doesn't actually enforce that the object itself is an interface, it just assumes. Maybe that could be improved one day, but for flexibility's sake I left it as-is.

The interface and endpoint descriptors are eerily similar to this one in their structure.

String Descriptors

String descriptors were a little tricky. I wanted to be able to have an API like StringDescriptor("My String", kMyIndex, kMyLanguage). The complication is that USB represents its characters as 16-bit wide characters. What is most unfortunate is that on Linux, the wchar_t "wide char" type is actually 4 bytes long, so it's not suitable for this usage. I ended up compromising by using char16_t which represents a UTF-16 code unit. If I actually used non-ASCII characters I'd probably run into a problem, but so long as I stick to ASCII this shouldn't cause unexpected string values.

The other difficulty is with appropriately sizing the descriptor for a passed literal string. It turns out that if I receive the string as a reference to an array of char16_t and use a function to construct the descriptor (rather than just calling the constructor), the template arguments can be deduced.

// String Descriptor
template <size_t N>
struct __attribute__((packed)) StringDescriptorBase
    : public DescriptorSize<N * 2> {
  static constexpr uint8_t kType = 3;
  uint8_t bLength;
  uint8_t bDescriptorType = kType;
  char16_t data[N - 1];

  DescriptorTag tag;

  constexpr StringDescriptorBase(const char16_t (&data)[N], uint8_t index,
                                 uint16_t wIndex = 0)
      : bLength(N * 2), tag(index, wIndex) {
    for (size_t i = 0; i < N - 1; i++) {
      this->data[i] = data[i];
    }
  }
};

// Deduce how large a string descriptor we need based on a literal
template <size_t N>
consteval StringDescriptorBase<N> StringDescriptor(const char16_t (&data)[N],
                                                   uint8_t index,
                                                   uint16_t wIndex = 0) {
  return StringDescriptorBase<N>(data, index, wIndex);
}

The resulting syntax is awfully close to what I had wanted:

constexpr auto kLanguage = 0x0409;
constexpr auto kLanguages = usb::StringDescriptor({kLanguage, 0x0}, 0);
constexpr auto kManufacturerName =
    usb::StringDescriptor(u"kevincuzner.com", 1, kLanguage);

USB_DESCRIPTOR(kLanguages);
USB_DESCRIPTOR(kManufacturerName);
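The size arithmetic is worth spelling out: N counts the literal's characters plus its null terminator, so the two header bytes and the N - 1 stored code units add up to exactly N * 2 bytes. The sketch below checks that with a simplified, hypothetical SimpleString type that leaves out DescriptorSize, DescriptorTag, and the packed attribute (the layout here needs no padding anyway).

```cpp
#include <cstddef>
#include <cstdint>

// Simplified stand-in for StringDescriptorBase, for illustration only.
template <std::size_t N>
struct SimpleString {
  uint8_t bLength = static_cast<uint8_t>(N * 2);
  uint8_t bDescriptorType = 3;
  char16_t data[N - 1] = {};  // the null terminator is not stored

  constexpr SimpleString(const char16_t (&s)[N]) {
    for (std::size_t i = 0; i < N - 1; i++) {
      data[i] = s[i];
    }
  }
};

// N is deduced from the array reference, as in StringDescriptor.
template <std::size_t N>
constexpr SimpleString<N> MakeString(const char16_t (&s)[N]) {
  return SimpleString<N>(s);
}

constexpr auto kAbc = MakeString(u"abc");  // N == 4 (3 characters + null)
static_assert(kAbc.bLength == 8);          // 2 header bytes + 3 * 2 data bytes
static_assert(sizeof(kAbc) == 8);
static_assert(kAbc.data[0] == u'a');
static_assert(kAbc.data[2] == u'c');
```

The same arithmetic matches the earlier listing: u"kevincuzner.com" has 15 characters, so N is 16 and the descriptor length is 32 bytes (0x20).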

HID Report Descriptors

HID Report Descriptors are a little different from other descriptors in that they don't have a fixed layout. Instead, they're parsed! They are defined in terms of "items". There are "short" items, which consist of a one-byte prefix followed by 0, 1, 2, or 4 bytes of data, and "long" items, which can be up to 258 bytes. I tend to only use short items, so what I ended up doing was defining the descriptor to be just an expandable descriptor which is expanded item-by-item. The item itself is defined as something that implements DescriptorSize. Constructors and convenience functions are provided for constructing the item from an array of bytes:

enum class HidItemType {
  kMain = 0,
  kGlobal = 1,
  kLocal = 2,
};

// This is a base type which declares one required byte of a short item
template <size_t N>
struct __attribute__((packed)) ShortItemBase : public DescriptorSize<N + 1> {
  uint8_t bSize : 2 = 0;
  uint8_t bType : 2;
  uint8_t bTag : 4;
  std::array<uint8_t, N> data;

  static_assert(N <= 4, "ShortItem data must be <= 4 bytes");
  static_assert(N != 3, "ShortItem data must be 0, 1, 2, or 4 bytes");

  constexpr ShortItemBase(HidItemType bType, uint8_t bTag)
      : bType(static_cast<uint8_t>(bType)), bTag(bTag) {}
  constexpr ShortItemBase(HidItemType bType, uint8_t bTag,
                          const uint8_t (&data)[N])
      : bType(static_cast<uint8_t>(bType)), bTag(bTag) {
    bSize = N == 0 ? 0 : N == 1 ? 1 : N == 2 ? 2 : 3;
    std::copy(data, data + N, this->data.begin());
  }
};

// Convenience function for constructing short items with data of some length
template <size_t N>
constexpr auto ShortItem(HidItemType bType, uint8_t bTag,
                         const uint8_t (&data)[N]) {
  return ShortItemBase<N>(bType, bTag, data);
}
constexpr auto ShortItem(HidItemType bType, uint8_t bTag) {
  return ShortItemBase<0>(bType, bTag);
}
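The bit-fields above end up as a single prefix byte: bSize in bits 0-1, bType in bits 2-3, and bTag in bits 4-7. That relies on the compiler allocating bit-fields starting from the least significant bit, which is what GCC does on little-endian ARM. As a sketch, the same byte can be computed with plain shifts; ItemType and ShortItemPrefix are hypothetical names used only for this example.

```cpp
#include <cstdint>

enum class ItemType : uint8_t { kMain = 0, kGlobal = 1, kLocal = 2 };

// Computes the prefix byte of a short item. data_len is the number of data
// bytes and is assumed to be 0, 1, 2, or 4; a length of 4 is encoded as 3.
constexpr uint8_t ShortItemPrefix(ItemType type, uint8_t tag,
                                  unsigned data_len) {
  const unsigned size_code = (data_len == 4) ? 3u : data_len;
  return static_cast<uint8_t>((tag << 4) |
                              (static_cast<unsigned>(type) << 2) | size_code);
}

// Input (main item, tag 0x8) with one data byte
static_assert(ShortItemPrefix(ItemType::kMain, 0x8, 1) == 0x81);
// Output (main item, tag 0x9) with one data byte
static_assert(ShortItemPrefix(ItemType::kMain, 0x9, 1) == 0x91);
// Usage Page (global item, tag 0x0) with two data bytes
static_assert(ShortItemPrefix(ItemType::kGlobal, 0x0, 2) == 0x06);
// End Collection (main item, tag 0xC) with no data
static_assert(ShortItemPrefix(ItemType::kMain, 0xC, 0) == 0xC0);
```

These are the familiar first bytes seen in hex dumps of report descriptors, such as 0x81 for Input and 0xC0 for End Collection.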

The report descriptor then expands itself short item by short item. I ended up providing convenience functions for each short item type defined in the specification (Input, Output, Usage, Collection, End Collection, etc):

// The HID report descriptors don't have any extra data before the extendable
// portion. Declare this untemplated base so we have a kType and a base Size()
// to call.
struct __attribute__((packed)) HidReportDescriptorData
    : public DescriptorSize<0> {
  static constexpr uint8_t kType = 0x22;
};

// HID report descriptor builder
template <size_t N = 0>
class __attribute__((packed)) HidReportDescriptor
    : public ExtendableDescriptor<N, HidReportDescriptorData> {
  using Base = ExtendableDescriptor<N, HidReportDescriptorData>;

 public:
  // This tag is used in the HID descriptor to reference this descriptor
  HidReportDescriptorTag hid_tag;

  constexpr HidReportDescriptor()
      : Base(), hid_tag(Base::Size() + Base::ExtendedSize()) {}
  template <size_t M, typename T>
  constexpr HidReportDescriptor(const HidReportDescriptor<M>& other, T t)
      : Base(other, t), hid_tag(Base::Size() + Base::ExtendedSize()) {}

  // Adds a short item
  constexpr auto ShortItem(HidItemType bType, uint8_t bTag) {
    return HidReportDescriptor<N + 1>(*this,
                                      ::usb::hid::ShortItem(bType, bTag));
  }
  constexpr auto ShortItem(HidItemType bType, uint8_t bTag, uint8_t data0) {
    return HidReportDescriptor<N + 2>(
        *this, ::usb::hid::ShortItem(bType, bTag, {data0}));
  }
  constexpr auto ShortItem(HidItemType bType, uint8_t bTag, uint8_t data0,
                           uint8_t data1) {
    return HidReportDescriptor<N + 3>(
        *this, ::usb::hid::ShortItem(bType, bTag, {data0, data1}));
  }
  constexpr auto ShortItem(HidItemType bType, uint8_t bTag, uint8_t data0,
                           uint8_t data1, uint8_t data2, uint8_t data3) {
    return HidReportDescriptor<N + 5>(
        *this,
        ::usb::hid::ShortItem(bType, bTag, {data0, data1, data2, data3}));
  }

  // Main Items
  constexpr auto Input(uint8_t lsb_flags) {
    return ShortItem(HidItemType::kMain, 0x8, lsb_flags);
  }
  constexpr auto Input(uint8_t lsb_flags, uint8_t msb_flags) {
    return ShortItem(HidItemType::kMain, 0x8, lsb_flags, msb_flags);
  }
  constexpr auto Output(uint8_t lsb_flags) {
    return ShortItem(HidItemType::kMain, 0x9, lsb_flags);
  }
  constexpr auto Output(uint8_t lsb_flags, uint8_t msb_flags) {
    return ShortItem(HidItemType::kMain, 0x9, lsb_flags, msb_flags);
  }
  ...
};
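The key trick in the builder is that every call returns a new type whose size parameter has grown by the size of the appended item. Here is a minimal standalone sketch of that idea using a plain byte array. ByteList and Append are hypothetical names, and this is not how ExtendableDescriptor is actually implemented.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// A byte list whose length is part of its type.
template <std::size_t N = 0>
struct ByteList {
  std::array<uint8_t, N> bytes{};

  // Returns a new, larger list: the existing bytes followed by the item.
  template <std::size_t M>
  constexpr ByteList<N + M> Append(const std::array<uint8_t, M>& item) const {
    ByteList<N + M> out;
    for (std::size_t i = 0; i < N; i++) {
      out.bytes[i] = bytes[i];
    }
    for (std::size_t i = 0; i < M; i++) {
      out.bytes[N + i] = item[i];
    }
    return out;
  }
};

constexpr auto kReport =
    ByteList<>()
        .Append(std::array<uint8_t, 2>{0x81, 0x02})  // Input (Data, Var, Abs)
        .Append(std::array<uint8_t, 1>{0xC0});       // End Collection

static_assert(kReport.bytes.size() == 3);
static_assert(kReport.bytes[0] == 0x81);
static_assert(kReport.bytes[1] == 0x02);
static_assert(kReport.bytes[2] == 0xC0);
```

Each intermediate object in the chain has a different type (ByteList<0>, ByteList<2>, ByteList<3>), which is why the builder methods have to return auto.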

This allows for what I think is a pretty convenient API for defining a report descriptor. Here's a before/after comparison between my XML comment method (which used a macro) and this new method:

// Before
/*
 * <descriptor id="hid_report" childof="hid" top="top" type="0x22" order="1" wIndexType="0x04">
 *  <hidden name="bDescriptorType" size="1">0x22</hidden>
 *  <hidden name="wLength" size="2">sizeof(hid_report)</hidden>
 *  <raw>
 *  HID_SHORT(0x04, 0x00, 0xFF), //USAGE_PAGE (Vendor Defined)
 *  HID_SHORT(0x08, 0x01), //USAGE (Vendor 1)
 *  HID_SHORT(0xa0, 0x01), //COLLECTION (Application)
 *  HID_SHORT(0x08, 0x01), //  USAGE (Vendor 1)
 *  HID_SHORT(0x14, 0x00), //  LOGICAL_MINIMUM (0)
 *  HID_SHORT(0x24, 0xFF, 0x00), //LOGICAL_MAXIMUM (0x00FF)
 *  HID_SHORT(0x74, 0x08), //  REPORT_SIZE (8)
 *  HID_SHORT(0x94, 64), //  REPORT_COUNT(64)
 *  HID_SHORT(0x80, 0x02), //  INPUT (Data, Var, Abs)
 *  HID_SHORT(0x08, 0x01), //  USAGE (Vendor 1)
 *  HID_SHORT(0x90, 0x02), //  OUTPUT (Data, Var, Abs)
 *  HID_SHORT(0xc0),       //END_COLLECTION
 *  </raw>
 * </descriptor>
 */

// After
constexpr auto kHidInReport = usb::hid::HidReportDescriptor()
                                  .UsagePage16(0xFF00)  // Vendor
                                  .Usage(0x1)           // Vendor 1
                                  .Collection(0x1)      // Application
                                  .Usage(0x1)           // Vendor 1
                                  .LogicalMin8(0)
                                  .LogicalMax16(0xFF)
                                  .ReportSize8(8)
                                  .ReportCount8(64)
                                  .Input(0x2)   // Data, var, abs
                                  .Usage(0x1)   // Vendor 1
                                  .Output(0x2)  // Data, var, abs
                                  .EndCollection();

I, for one, find that to be much more descriptive and maintainable.

Conclusion

Defining USB descriptors and other constant data using constexpr is a maintainable and expressive way to include constant data in programs. Even complex structures can be defined this way. My prior incarnations of this used a code generator, which meant defining what effectively ended up being a domain-specific language, or at a minimum some kind of extract-transform-load framework. Using the C++ language itself eliminates that need. Nearly all of the effort goes into defining the API that structures the data, rather than into the infrastructure behind that API.

