Electronics, Embedded Systems, and Software are my breakfast, lunch, and dinner.
Feb 02, 2018
During my LED Wristwatch project, I decided early on that I wanted to do something different with the way my USB stuff was implemented. In the past, I have almost exclusively used libusb to talk to my devices in terms of raw bulk packets or raw setup requests. While this is ok, it isn't quite as easy to do once you cross out of the fruited plains of Linux-land into the barren desert of Windows. This project instead made the watch identify itself (enumerate) as a USB Human Interface Device (HID).
What I would like to do in this post is a step-by-step tutorial for modifying a USB device to enumerate as a human interface device. I'll start with an overview of HID, then move on to modifying the USB descriptors and setting up your device endpoints so that it sends reports, followed by a few notes on writing host software for Windows and Linux that communicates with devices using raw reports. With a little bit of work, you should be able to replace many things done exclusively with libusb with a cross-platform system that requires no custom drivers. Example code for this post can be found here:
**https://github.com/kcuzner/led-watch**
One thing to note is that since I'm using my LED Watch as an example, I'm going to be extending my own USB API, which I describe a little bit here. The main source code files for this can be found in common/src/usb.c and common/src/usb_hid.c.
Step 1: Extending Setup Requests
Step 2: Descriptors
Modifying the configuration descriptor
Before doing anything HID, you need two pieces of documentation:
This is one of the few times I have read an industry specification and it hasn't required an almost lawyer-style analysis to comprehend. It is surprisingly readable and I highly recommend at least scrolling through it somewhat since it makes a halfway decent reference.
HID communicates by way of "Reports". The basic idea is that a "Report" tells the host something going on with the human interface device, be it a mouse movement, a keystroke, or whatever. There are not only reports going to the host, but also those that go from the host to the device. These reports can do things like turn on the caps lock LED and so forth. The reports are named as follows:
All the report naming is quite host-centric as you can see. Now, when a device declares itself as a human interface device, it has to implement the following endpoints:
So in total, a HID has either one or two extra endpoints beyond the basic control endpoint. These endpoints are used to either send IN Reports of events happening to the device to the host or receive OUT Reports from the host commanding the device to do things.
One of the key features, and a major motivation behind writing HID devices, is that most operating systems require no drivers for these devices to work. For most other USB devices, the operating system requires a driver which interacts with the device on a low level and provides an interface either back into userspace or kernelspace which can be used by regular programs to interact with the device. However, for human interface devices the OS actually provides a driver which translates whatever reports a custom device may send back to the host into an API usable by programs. For example, if you plug a USB joystick or gamepad which enumerates as a HID into the computer, other programs can call OS methods that allow enumerating the analog joysticks and pushbuttons that the gamepad provides, without needing to use a manufacturer-specific driver to translate the reports into input actions.
This is possible by the use of "Report Descriptors". These serve as a way for the device to self-describe the format of the reports it is going to send back. A joystick from manufacturer A might send four analog values followed by 16 button values, but a joystick from manufacturer B may instead send 16 button values followed by only two analog values. The OS driver makes sense of the report formatting by reading the report descriptors returned by the device when it enumerates. Report descriptors are represented as a series of tokens which are parsed one after another to build up the description of the report. Tokens that may appear include:
Building cross-platform report descriptors is one of the more challenging parts of creating a human interface device. Some operating systems, such as Linux, are extremely permissive and will still enumerate the device with a badly formatted report descriptor. Other operating systems, such as Windows, are extremely strict in terms of what they accept and will not enumerate your device if the report descriptor doesn't conform to their exacting standards (you'll get the dreaded "Device failed to start" error in Device Manager).
The general USB specification defines a setup request command GET_DESCRIPTOR. The spec defines the high byte of wValue to be the "descriptor type". The HID specification defines the following class-specific descriptors:
In general, hosts won't issue requests for descriptor type 0x21, but type 0x22 will be seen as part of the enumeration process. You'll need to extend your GET_DESCRIPTOR request handler so that it responds to 0x22 descriptor requests at index 0 and returns your report descriptor (or even at multiple indexes if you have multiple report descriptors).
In my LED watch with its API, I just have a read-only table of descriptors that has the expected wValue, wIndex, and a pointer to the data. My descriptor table looks like so:
const USBDescriptorEntry usb_descriptors[] = {
    { 0x0100, 0x0000, sizeof(dev_descriptor), dev_descriptor },
    { 0x0200, 0x0000, sizeof(cfg_descriptor), cfg_descriptor },
    { 0x0300, 0x0000, sizeof(lang_descriptor), lang_descriptor },
    { 0x0301, 0x0409, sizeof(manuf_descriptor), manuf_descriptor },
    { 0x0302, 0x0409, sizeof(product_descriptor), product_descriptor },
    { 0x2200, 0x0000, sizeof(hid_report_descriptor), hid_report_descriptor }, //new descriptor for HID
    { 0x0000, 0x0000, 0x00, NULL }
};
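A lookup over a table like this can be sketched roughly as follows. The struct layout and the `find_descriptor` name are my assumptions for illustration (the real definitions live in the watch's USB driver), and the descriptor contents are placeholder bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout mirroring the table above: wValue, wIndex, length, data */
typedef struct {
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t length;
    const uint8_t *data;
} DescriptorEntry;

/* Placeholder contents, only here so the example is self-contained */
static const uint8_t demo_dev_descriptor[] = { 18, 1 };
static const uint8_t demo_report_descriptor[] = { 0x06, 0x00, 0xFF, 0xC0 };

static const DescriptorEntry demo_descriptors[] = {
    { 0x0100, 0x0000, sizeof(demo_dev_descriptor), demo_dev_descriptor },
    { 0x2200, 0x0000, sizeof(demo_report_descriptor), demo_report_descriptor },
    { 0x0000, 0x0000, 0x00, NULL } /* terminator: NULL data pointer */
};

/* Returns the matching entry, or NULL so the caller can STALL the request */
const DescriptorEntry *find_descriptor(const DescriptorEntry *table,
                                       uint16_t wValue, uint16_t wIndex)
{
    assert(table != NULL);
    for (const DescriptorEntry *e = table; e->data != NULL; e++) {
        if (e->wValue == wValue && e->wIndex == wIndex) {
            return e;
        }
    }
    return NULL;
}
```

With this shape, supporting the report descriptor is just one more row in the table: a GET_DESCRIPTOR with wValue 0x2200 finds the new entry, and anything not listed falls through to a STALL.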
Now, in addition to extending GET_DESCRIPTOR, the HID specification requires one new setup request be supported: Class-specific request 0x01 (bRequest = 0x01, bmRequestType = 0xA1), known as GET_REPORT. This provides a control-request way to get HID reports. I've actually found that both Windows and Linux don't mind if this isn't implemented, but it may be good to implement anyway. It has the following arguments:
In my LED Watch, the USB setup request handler will call hook_usb_handle_setup_request when it receives a request that the base driver can't handle. Here is my implementation:
/**
 * Implementation of hook_usb_handle_setup_request which implements HID class
 * requests
 */
USBControlResult hook_usb_handle_setup_request(USBSetupPacket const *setup, USBTransferData *nextTransfer)
{
    uint8_t *report_ptr;
    uint16_t report_len;
    switch (setup->wRequestAndType)
    {
    case USB_REQ(0x01, USB_REQ_DIR_IN | USB_REQ_TYPE_CLS | USB_REQ_RCP_IFACE):
        //Get report request
        //...determine which report is needed and get a pointer to it...
        //(this elided step is what sets report_ptr and report_len)
        nextTransfer->addr = report_ptr;
        nextTransfer->len = report_len;
        return USB_CTL_OK;
    }
    return USB_CTL_STALL;
}
And with that, your device is now prepared to handle the host setup requests. The next step is going to be actually writing the descriptors.
Every USB device has a configuration descriptor. In reality, what I'm calling the "configuration descriptor" here is actually the configuration descriptor itself concatenated with every descriptor that follows it. Here are the parts of a configuration descriptor, as they appear in order:
This is usually just a byte array. When making a device into a HID, the descriptor needs to change. Two new descriptor types are introduced by the HID class specification that we will use: 0x21 (HID Descriptor) and 0x22 (Report Descriptor). The HID Descriptor declares the version of the HID spec that the device follows along with a country code. It also contains one or more entries that declare report descriptors. Each entry contains only the length of a report descriptor (along with its bDescriptorType), not the report descriptor itself. These will be used later when the host makes a special HID setup request to load these descriptors.
The configuration descriptor of something that has an HID interface looks like so (changes in bold, see HID specification section 7.1, very first paragraph):
In addition, the device descriptor must change so that bDeviceClass = 0 to signal that the device's class is defined by its interfaces.
If you want to implement multiple separate HID devices in the same device (making a composite HID device), it is as simple as adding more interfaces. The only restriction is that the endpoint addresses need to be unique so that the host can talk to a specific HID implementation. This is one way to build things like mouse/keyboard combo devices.
Here is an example of a completed configuration descriptor that declares a single HID interface with both IN and OUT endpoints:
/**
 * Configuration descriptor
 */
static const uint8_t cfg_descriptor[] = {
    9, //bLength
    2, //bDescriptorType
    9 + 9 + 9 + 7 + 7, 0x00, //wTotalLength
    1, //bNumInterfaces
    1, //bConfigurationValue
    0, //iConfiguration
    0x80, //bmAttributes
    250, //bMaxPower
    /* INTERFACE 0 BEGIN */
    9, //bLength
    4, //bDescriptorType
    0, //bInterfaceNumber
    0, //bAlternateSetting
    2, //bNumEndpoints
    0x03, //bInterfaceClass (HID)
    0x00, //bInterfaceSubClass (0: no boot)
    0x00, //bInterfaceProtocol (0: none)
    0, //iInterface
    /* HID Descriptor */
    9, //bLength
    0x21, //bDescriptorType (HID)
    0x11, 0x01, //bcdHID
    0x00, //bCountryCode
    1, //bNumDescriptors
    0x22, //bDescriptorType (Report)
    sizeof(hid_report_descriptor), 0x00, //wDescriptorLength
    /* INTERFACE 0, ENDPOINT 1 BEGIN */
    7, //bLength
    5, //bDescriptorType
    0x81, //bEndpointAddress (endpoint 1 IN)
    0x03, //bmAttributes, interrupt endpoint
    USB_HID_ENDPOINT_SIZE, 0x00, //wMaxPacketSize
    10, //bInterval (10 frames)
    /* INTERFACE 0, ENDPOINT 1 END */
    /* INTERFACE 0, ENDPOINT 2 BEGIN */
    7, //bLength
    5, //bDescriptorType
    0x02, //bEndpointAddress (endpoint 2 OUT)
    0x03, //bmAttributes, interrupt endpoint
    USB_HID_ENDPOINT_SIZE, 0x00, //wMaxPacketSize
    10, //bInterval (10 frames)
    /* INTERFACE 0, ENDPOINT 2 END */
    /* INTERFACE 0 END */
};
One thing to note here: The HID Descriptor declares how many Report Descriptors will appear in relation to the USB device (bNumDescriptors + (bDescriptorType + wDescriptorLength)*<number of descriptors>). In general, HID devices don't usually need more than one report descriptor since you can describe multiple reports in a single descriptor. However, there's nothing stopping you from implementing multiple report descriptors.
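Since a single wrong length byte is enough to break enumeration, it can be worth checking the byte array on a PC before flashing it. The following is only a sketch of such a check: it walks the bLength chain, confirms it adds up to wTotalLength, and pulls the wDescriptorLength out of the HID Descriptor. The function name is made up, and the 64-byte endpoint size and 25-byte report descriptor length are assumed values standing in for USB_HID_ENDPOINT_SIZE and sizeof(hid_report_descriptor).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_TYPE_CONFIG 0x02
#define DESC_TYPE_HID    0x21

/* Assumed report descriptor length, used only for this example */
#define DEMO_REPORT_DESC_LEN 25

static const uint8_t demo_cfg[] = {
    9, 2, 9 + 9 + 9 + 7 + 7, 0x00, 1, 1, 0, 0x80, 250, /* configuration */
    9, 4, 0, 0, 2, 0x03, 0x00, 0x00, 0,                /* interface 0 (HID) */
    9, 0x21, 0x11, 0x01, 0x00, 1, 0x22,                /* HID descriptor... */
    DEMO_REPORT_DESC_LEN & 0xFF, DEMO_REPORT_DESC_LEN >> 8, /* ...wDescriptorLength */
    7, 5, 0x81, 0x03, 64, 0x00, 10,                    /* endpoint 1 IN */
    7, 5, 0x02, 0x03, 64, 0x00, 10,                    /* endpoint 2 OUT */
};

/*
 * Walks the bLength chain. Returns the wDescriptorLength declared by the
 * first HID descriptor, or -1 if the blob is malformed or has no HID
 * descriptor.
 */
int cfg_report_descriptor_length(const uint8_t *cfg, size_t size)
{
    assert(cfg != NULL);
    if (size < 9 || cfg[1] != DESC_TYPE_CONFIG) {
        return -1;
    }
    size_t total = (size_t)cfg[2] | ((size_t)cfg[3] << 8); /* wTotalLength */
    if (total != size) {
        return -1; /* wTotalLength must cover every concatenated descriptor */
    }
    int result = -1;
    size_t pos = 0;
    while (pos < total) {
        uint8_t len = cfg[pos];
        if (len < 2 || len > total - pos) {
            return -1; /* a bad bLength breaks the chain */
        }
        if (cfg[pos + 1] == DESC_TYPE_HID && len >= 9 && result < 0) {
            result = cfg[pos + 7] | (cfg[pos + 8] << 8);
        }
        pos += len;
    }
    return result;
}
```

Comparing the returned value against the real size of the report descriptor catches the case where the two have drifted apart.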
The HID class extends the GET_DESCRIPTOR setup request with a class-specific descriptor type which can be used to read Report Descriptors. When this setup request is sent by the host, the device should return the Report Descriptor requested. Report Descriptors are fairly unique compared to the other descriptors used in USB. One major difference is that they read more like an XML document than a key-value array. There is no set order and no set length. In fact, the only way the host knows how many bytes to read for this setup request is from the HID Descriptor found inside the Configuration Descriptor that says how many bytes to expect. With the configuration descriptor, for example, the host usually reads the descriptor twice: once reading only the first 9 bytes to get the wTotalLength and a second time reading wTotalLength bytes. With the Report Descriptor the host will read exactly as many bytes as were declared by the HID Descriptor. This of course means that if that length value is not set up correctly, then the host will get a truncated report descriptor and will have a hard time parsing it.
The most difficult part about writing report descriptors is that they are not easy to debug. On Windows, the device manager will simply say "Device failed to start". On Linux, a similar error appears in the system log. You'll get no help figuring out what went wrong. Here are my tips for writing report descriptors:
The first thing I'm going to describe are my helper macros, actually:
/**
 * HID Descriptor Helpers
 *
 * NUMARGS (not shown here) is expected to evaluate to the number of
 * arguments it is given.
 */
#define HID_SHORT_ZERO(TAGTYPE) (TAGTYPE | 0)
#define HID_SHORT_MANY(TAGTYPE, ...) (TAGTYPE | (NUMARGS(__VA_ARGS__) & 0x3)), __VA_ARGS__
#define GET_HID_SHORT(_1, _2, _3, _4, _5, NAME, ...) NAME
#define HID_SHORT(...) GET_HID_SHORT(__VA_ARGS__, HID_SHORT_MANY, HID_SHORT_MANY, HID_SHORT_MANY, HID_SHORT_MANY, HID_SHORT_ZERO)(__VA_ARGS__)
All HID tokens have a common format. They are a sequence of bytes with the first byte describing how many of the bytes following are part of the token, up to five bytes total. The first byte has the following format:
These helper macros are a little complex, and to be honest I based them off something I found on Stack Overflow somewhere. I'm not even sure if they work with any compiler other than GCC. Here's how they work:
Here are some examples of what happens when this is evaluated:
With this macro we can define our HID tokens without having to worry about making a mistake encoding the length in the first byte.
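If the variadic trick doesn't work with your compiler, a more portable (if more verbose) alternative is to put the number of data bytes in the macro name. The sketch below is my own variation, not what the watch uses, and all of the names in it are made up. It also includes small decoders for the first byte, following the short item layout in the HID spec: bits 0-1 are the size code, bits 2-3 the type, and bits 4-7 the tag. Note that four data bytes are encoded as size code 3, which a plain argument count masked with 0x3 would not produce; none of the descriptors in this post need four-byte items.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-arity variants: the data byte count is explicit in the name */
#define HID_ITEM0(TAGTYPE)                 ((TAGTYPE) | 0)
#define HID_ITEM1(TAGTYPE, B0)             ((TAGTYPE) | 1), (B0)
#define HID_ITEM2(TAGTYPE, B0, B1)         ((TAGTYPE) | 2), (B0), (B1)
/* Four data bytes are encoded as size code 3, not 4 */
#define HID_ITEM4(TAGTYPE, B0, B1, B2, B3) ((TAGTYPE) | 3), (B0), (B1), (B2), (B3)

static const uint8_t demo_items[] = {
    HID_ITEM2(0x04, 0x00, 0xFF), /* USAGE_PAGE (Vendor Defined) */
    HID_ITEM1(0x08, 0x01),       /* USAGE (Vendor 1) */
    HID_ITEM0(0xC0),             /* END_COLLECTION */
};

/* Each item occupies one prefix byte plus its data bytes */
static_assert(sizeof(demo_items) == 3 + 2 + 1, "unexpected item encoding");

/* Number of data bytes that follow a short item prefix byte */
unsigned hid_item_data_size(uint8_t prefix)
{
    static const uint8_t sizes[4] = { 0, 1, 2, 4 };
    return sizes[prefix & 0x03];
}

/* Item type: 0 = Main, 1 = Global, 2 = Local */
unsigned hid_item_type(uint8_t prefix)
{
    return (prefix >> 2) & 0x03;
}

/* Item tag within its type */
unsigned hid_item_tag(uint8_t prefix)
{
    return (prefix >> 4) & 0x0F;
}
```

For example, the USAGE_PAGE item above starts with the byte 0x06: size code 2, type 1 (Global), tag 0.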
I'm not going to go through the token types exhaustively since those are in the spec, but here are a couple of common ones:
Since the easiest way to get started with these is with some examples, let's start off with a report descriptor that describes two reports: an IN report that is 64 bytes long and an OUT report that is 64 bytes long. The 64 bytes in both of these reports have a "vendor defined" usage and thus can be used for general buffers. The OS won't try to hook them into any input system.
static const uint8_t hid_report_descriptor[] = {
    HID_SHORT(0x04, 0x00, 0xFF), //USAGE_PAGE (Vendor Defined)
    HID_SHORT(0x08, 0x01), //USAGE (Vendor 1)
    HID_SHORT(0xa0, 0x01), //COLLECTION (Application)
    HID_SHORT(0x08, 0x01), //  USAGE (Vendor 1)
    HID_SHORT(0x14, 0x00), //  LOGICAL_MINIMUM (0)
    HID_SHORT(0x24, 0xFF, 0x00), //  LOGICAL_MAXIMUM (0x00FF)
    HID_SHORT(0x74, 0x08), //  REPORT_SIZE (8)
    HID_SHORT(0x94, 64), //  REPORT_COUNT (64)
    HID_SHORT(0x80, 0x02), //  INPUT (Data, Var, Abs)
    HID_SHORT(0x08, 0x01), //  USAGE (Vendor 1)
    HID_SHORT(0x90, 0x02), //  OUTPUT (Data, Var, Abs)
    HID_SHORT(0xc0), //END_COLLECTION
};
Let's dig into this report descriptor a little:
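One way to sanity-check a descriptor like this before plugging the device in is to walk its items on a PC and total up the report sizes. What follows is a deliberately minimal sketch with made-up names: it only tracks REPORT_SIZE and REPORT_COUNT, and it ignores report IDs, push/pop, and long items. The byte array is what I would expect the macros above to expand to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Raw bytes of the vendor-defined descriptor above, after macro expansion */
static const uint8_t demo_report_descriptor[] = {
    0x06, 0x00, 0xFF, /* USAGE_PAGE (Vendor Defined) */
    0x09, 0x01,       /* USAGE (Vendor 1) */
    0xA1, 0x01,       /* COLLECTION (Application) */
    0x09, 0x01,       /*   USAGE (Vendor 1) */
    0x15, 0x00,       /*   LOGICAL_MINIMUM (0) */
    0x26, 0xFF, 0x00, /*   LOGICAL_MAXIMUM (255) */
    0x75, 0x08,       /*   REPORT_SIZE (8) */
    0x95, 0x40,       /*   REPORT_COUNT (64) */
    0x81, 0x02,       /*   INPUT (Data, Var, Abs) */
    0x09, 0x01,       /*   USAGE (Vendor 1) */
    0x91, 0x02,       /*   OUTPUT (Data, Var, Abs) */
    0xC0              /* END_COLLECTION */
};

typedef struct {
    uint32_t input_bits;
    uint32_t output_bits;
} ReportSizes;

/* Returns 0 on success, -1 if an item runs past the end or is a long item */
int hid_report_sizes(const uint8_t *desc, size_t len, ReportSizes *out)
{
    static const uint8_t data_sizes[4] = { 0, 1, 2, 4 };
    uint32_t report_size = 0, report_count = 0; /* global items persist */
    assert(desc != NULL && out != NULL);
    out->input_bits = 0;
    out->output_bits = 0;
    for (size_t pos = 0; pos < len; ) {
        uint8_t prefix = desc[pos++];
        if (prefix == 0xFE) {
            return -1; /* long items are not handled in this sketch */
        }
        size_t n = data_sizes[prefix & 0x03];
        if (n > len - pos) {
            return -1; /* item data would run past the end */
        }
        uint32_t value = 0;
        for (size_t i = 0; i < n; i++) {
            value |= (uint32_t)desc[pos + i] << (8 * i); /* little-endian */
        }
        pos += n;
        switch (prefix & 0xFC) { /* mask off the size bits */
        case 0x74: report_size = value; break;
        case 0x94: report_count = value; break;
        case 0x80: out->input_bits += report_size * report_count; break;
        case 0x90: out->output_bits += report_size * report_count; break;
        default: break; /* usages, collections, logical ranges, ... */
        }
    }
    return 0;
}
```

Because REPORT_SIZE and REPORT_COUNT are global items, the OUTPUT item picks up the same 8 x 64 layout as the INPUT item without repeating them, so both reports come out at 512 bits (64 bytes).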
Now let's move on to another kind of report descriptor: Defining multiple reports in one descriptor. This requires some discussion of "Report IDs".
When a REPORT_ID token appears in a report descriptor, it changes how reports are sent and received by the host and device:
Here's an example descriptor that declares three reports:
static const USB_DATA_ALIGN uint8_t hid_report_descriptor[] = {
    HID_SHORT(0x04, 0x01), //USAGE_PAGE (Generic Desktop)
    HID_SHORT(0x08, 0x05), //USAGE (Game Pad)
    HID_SHORT(0xa0, 0x01), //COLLECTION (Application)
    HID_SHORT(0x84, 0x01), //  REPORT_ID (1)
    HID_SHORT(0x14, 0x00), //  LOGICAL_MINIMUM (0)
    HID_SHORT(0x24, 0x01), //  LOGICAL_MAXIMUM (1)
    HID_SHORT(0x74, 0x01), //  REPORT_SIZE (1)
    HID_SHORT(0x94, 4), //  REPORT_COUNT (4)
    HID_SHORT(0x18, 0x90), //  USAGE_MINIMUM (D-pad up)
    HID_SHORT(0x28, 0x93), //  USAGE_MAXIMUM (D-pad left)
    HID_SHORT(0x80, 0x02), //  INPUT (Data, Var, Abs)
    HID_SHORT(0x80, 0x03), //  INPUT (Const, Var, Abs)
    HID_SHORT(0x04, 0x08), //  USAGE_PAGE (LED)
    HID_SHORT(0x08, 0x4B), //  USAGE (Generic Indicator)
    HID_SHORT(0x94, 8), //  REPORT_COUNT (8)
    HID_SHORT(0x90, 0x02), //  OUTPUT (Data, Var, Abs)
    HID_SHORT(0x84, 0x02), //  REPORT_ID (2)
    HID_SHORT(0x14, 0x80), //  LOGICAL_MINIMUM (-128)
    HID_SHORT(0x24, 0x7F), //  LOGICAL_MAXIMUM (127)
    HID_SHORT(0x74, 0x08), //  REPORT_SIZE (8)
    HID_SHORT(0x94, 2), //  REPORT_COUNT (2)
    HID_SHORT(0x04, 0x01), //  USAGE_PAGE (Generic Desktop)
    HID_SHORT(0x08, 0x38), //  USAGE (Wheel)
    HID_SHORT(0x80, 0x06), //  INPUT (Data, Var, Rel)
    HID_SHORT(0xc0), //END_COLLECTION
};
The three reports defined here are:
Some more interesting things that this example brings up:
Note that in the HID Usage Tables document, there are more examples in Appendix A!
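To make the report layout concrete, here is roughly how the two IN reports from the gamepad descriptor above could be assembled on the device. This is a sketch with hypothetical names. It assumes fields are packed starting at the least significant bit, so the four D-pad usages land in bits 0 through 3 in the order up, down, right, left, and that the report ID is sent as the first byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPORT_ID_DPAD  1
#define REPORT_ID_WHEEL 2

/* Bit positions follow the usage order D-pad up, down, right, left */
#define DPAD_UP    0x01
#define DPAD_DOWN  0x02
#define DPAD_RIGHT 0x04
#define DPAD_LEFT  0x08

/* Report 1 IN: ID byte, then 4 button bits and 4 constant padding bits.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t build_dpad_report(uint8_t *buf, size_t cap, uint8_t dpad_bits)
{
    assert(buf != NULL);
    if (cap < 2) {
        return 0;
    }
    buf[0] = REPORT_ID_DPAD;
    buf[1] = dpad_bits & 0x0F; /* upper nibble is the Const padding */
    return 2;
}

/* Report 2 IN: ID byte, then two signed 8-bit relative values.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t build_wheel_report(uint8_t *buf, size_t cap, int8_t first, int8_t second)
{
    assert(buf != NULL);
    if (cap < 3) {
        return 0;
    }
    buf[0] = REPORT_ID_WHEEL;
    buf[1] = (uint8_t)first;
    buf[2] = (uint8_t)second;
    return 3;
}
```

The two reports have different lengths (2 and 3 bytes including the ID), which is fine: the host uses the ID byte to decide how to interpret whatever follows.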
Now that you've got your report descriptors all figured out, you need to actually send the data. This is not complicated.
In your configuration descriptor, you gave a polling rate for the endpoint. This polling rate does not imply that the host expects you to transfer a report at that rate. It only means that the host will attempt to start an IN transfer that often. When you have no report to send, make your endpoint NAK (don't STALL).
In my LED Watch project I wrote a USB API which takes care of packetizing for me. When I want to send data, I just point it towards a byte array and it sends it using as many or as few packets as needed. For HID reports, I only send them as needed. The only complicated part is constructing the report itself. Follow these simple steps to send an IN report:
You'll probably want to set up some system for notifying the program that the report was sent. Note that most microcontroller USB peripherals should set an endpoint to NAK once a report has been sent, so the host will not see another report to read until you explicitly tell your peripheral to send again.
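A simple way to do that notification is a busy flag that is set when a report is queued and cleared from the transfer-complete hook. The sketch below is not my watch's actual API: `demo_ep_arm_in` is a stand-in for whatever call arms the IN endpoint on your microcontroller, and the other names are made up as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the peripheral: records the last transfer that was armed */
static const uint8_t *armed_data;
static size_t armed_len;
static volatile bool in_busy;

/* Hypothetical driver call: arm the IN endpoint with one report */
static void demo_ep_arm_in(const uint8_t *data, size_t len)
{
    armed_data = data;
    armed_len = len;
}

/* Queue a report; returns false if the previous one is still in flight */
bool hid_send_in_report(const uint8_t *report, size_t len)
{
    assert(report != NULL && len > 0);
    if (in_busy) {
        return false; /* the endpoint still owns the previous buffer */
    }
    in_busy = true;
    demo_ep_arm_in(report, len);
    return true;
}

/* Call from the transfer-complete interrupt or hook for the IN endpoint */
void hid_in_report_sent(void)
{
    in_busy = false; /* endpoint goes back to NAKing until the next send */
}
```

The report buffer has to stay valid until the completion hook runs, since the peripheral may still be reading from it while the packets go out.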
This is the exact same story as IN reports, except this time you don't construct a report. Instead, you allocate space for it and wait for the host to send. Here are the steps for an OUT report:
Remember again that if you used the REPORT_ID token, the first byte will be the report ID and all bytes that follow will be the report.
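On the device side, handling that usually boils down to a switch on the first byte. Here is a minimal sketch assuming a single hypothetical OUT report with ID 1 that carries one byte of LED bits; the names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPORT_ID_LEDS 1 /* hypothetical OUT report: one byte of LED bits */

static uint8_t led_state; /* stand-in for whatever the device drives */

/* Returns 0 if the report was consumed, -1 if it is unknown or malformed */
int hid_handle_out_report(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 1) {
        return -1;
    }
    const uint8_t id = buf[0];        /* first byte is the report ID */
    const uint8_t *payload = buf + 1; /* the report itself follows */
    const size_t payload_len = len - 1;
    switch (id) {
    case REPORT_ID_LEDS:
        if (payload_len != 1) {
            return -1; /* length must match what the descriptor declared */
        }
        led_state = payload[0];
        return 0;
    default:
        return -1;
    }
}
```

Checking the length against what the report descriptor declared keeps a misbehaving host program from feeding the device a short or oversized report.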
Writing host software for HID devices is not complicated, but there are some gotchas to keep in mind. In general, the operating system will expose USB devices as a file of some kind. On Linux you can use the parsed hid driver or the unparsed hidraw driver (I've only used hidraw). hidraw will let you send raw reports. A similar system exists for Windows. HID devices are exposed as files which can be manipulated either with raw reports (using read and write on the file) or with the hid report parser (via calls to hid.dll).
When choosing how to write your host software you can choose to either use the OS's input system which will parse HID reports for you (abstracting away the reports themselves) or you can talk to the device in terms of reports ("raw"). I can't give much guidance for using the host's report parser, but for talking raw in terms of reports I do have some suggestions:
If your application is going to be written in C or C++, then there is a fairly convenient cross-platform option available: https://github.com/signal11/hidapi
This library will take care of all the stuff that is required to enumerate the HID devices attached to the computer. It will also handle reading and writing to the device using raw reports.
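One detail that tends to trip people up with hidapi: the buffer passed to `hid_write` always starts with the report ID, and that byte must be 0 when the device doesn't use REPORT_ID. The helper below is a sketch of preparing such a buffer for the 64-byte vendor-defined OUT report from earlier. It has a made-up name, the payload size is an assumption, and it deliberately doesn't call hidapi itself so that it stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_SIZE 64 /* assumed: matches the 64-byte vendor reports */

/*
 * Fills `frame` (PAYLOAD_SIZE + 1 bytes) with what would be handed to
 * hidapi's hid_write(). Byte 0 is the report ID, which is 0 for a device
 * that does not use REPORT_ID; the payload is zero padded to full length.
 * Returns the number of bytes to write, or 0 if the payload is too long.
 */
size_t frame_out_report(uint8_t *frame, const uint8_t *payload, size_t len)
{
    assert(frame != NULL && (payload != NULL || len == 0));
    if (len > PAYLOAD_SIZE) {
        return 0;
    }
    memset(frame, 0, PAYLOAD_SIZE + 1);
    if (len > 0) {
        memcpy(frame + 1, payload, len);
    }
    return PAYLOAD_SIZE + 1;
}
```

The returned length (65 here) is what you would pass as the length argument to `hid_write` along with the frame.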
For Python, I highly recommend using the "hid" module: https://pypi.python.org/pypi/hid
An example of using this can be found in the "host" directory in my LED watch repository.
The enumeration of human interface devices and communication with them happens using some methods in hid.dll and kernel32.dll. Using P/Invoke you can talk to these using C#. There are several libraries for this, but the lightest weight one I can find is here: https://github.com/MightyDevices/MightyHID
I don't actually recommend using the library itself. Rather, I would recommend reading through it and seeing how it does things and implementing that in your application directly. Sadly, although I have written an application in C# that talked pretty well to HID devices I do not have the source code available. Instead, I can give some tips:
At this point, I hope that I've armed you with enough information that you can implement a human interface device with any microcontroller that you have a working USB implementation for. We've gone through modifying the configuration descriptor, writing a report descriptor, sending and receiving reports, and briefly touched on writing host software to talk to the HID devices.
As always, if you have any suggestions, ideas, or questions feel free to comment below.