Reverse Engineering Keys from Firmware. A how-to

TL;DR

It is possible to reverse engineer keys from firmware with some tips:

Always look for strings/constants.
Make guesses about the original source.
Find a function you can recognise and work backwards to identify other functions.
It helps if they use open-source code so you can crib from it.

Introduction

I’ve recently been testing a setup where two devices talk to each other over a UART connection. Whilst sniffing the connection it was obvious that the data was being encrypted and, furthermore, that a static encryption key was in use: I could control the data being input, and entering the same data always produced the same encrypted text.

Fortunately (for me), both devices had SWD enabled. This is a simple low-level debugging protocol that allows direct access to the processor core and memory. I used this to dump the RAM of the device.

The device was running on an STM32L151 processor – this runs a single ARM Cortex-M3 core. It only has a small amount of storage (128 KB Flash with 4 KB EEPROM) and a smaller amount of RAM (16 KB SRAM). These small numbers put restrictions on any software, meaning that it is unlikely to be running something like Linux and will usually run raw compiled machine code.

Note that the IDA disassembler has a feature called FLIRT, which may automatically detect a load of standard functions so you don’t have to. I made life difficult by using Ghidra.

The first step was to load this into Ghidra, where I had to manually set the processor details to ARM Cortex, Little Endian, with a base address of 0x08080000.

Always look for Strings

Let me give you a reverse engineer’s secret: we always look for strings to make life easy. If we’re lucky, there’s a full selection of debug strings which can help identify functions or allow us to ignore swathes of code.

Unfortunately, in this case, the strings that could be seen were enough to show me that it was a decent dump of code, but not enough to help identify what I wanted.

At this stage, you realise how much code 128 KB actually is when it’s in 16- and 32-bit Thumb mnemonics. As I was only really interested in the encryption section of this, I ought to concentrate on that. The best way to approach this is to think how you would do it yourself.

There are a number of encryption algorithms out there that have a tight code density, such as any of the TEA family of ciphers, or some smaller implementations of standard algorithms, such as the TinyAES implementation of AES. These all have constants which are used as part of the state of the main encryption calculations.

Normally these will be stored directly in memory, which means we can search for them. My first supposition was that the code would be using TinyAES, as that’s a common implementation amongst embedded systems.

If we look at the TinyAES source, it defines three constant arrays that should be sequential in memory: sbox[256], rsbox[256] and Rcon[11].

So the first step is to see whether we can find these tables in the exact order that they are in the source. Ghidra has a Search Memory dialogue that can be used for this. Searching for 63 7c 77 soon brings up sbox at 0x0809ddca, with the right values for rsbox and Rcon immediately following it. I renamed them in Ghidra to make this obvious.
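
The same search can be done outside Ghidra on the raw dump. The following is a minimal sketch, assuming the dump has been read into a byte buffer; the helper names find_bytes and offset_to_address are made up for illustration, and the load base passed in is whatever was used when loading the image (0x08080000 in this case).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the offset of the first occurrence of needle in hay, or -1 if absent.
 * Hypothetical helper, not part of Ghidra or TinyAES. */
static long find_bytes(const uint8_t *hay, size_t hay_len,
                       const uint8_t *needle, size_t needle_len)
{
    assert(hay != NULL && needle != NULL);
    if (needle_len == 0 || needle_len > hay_len)
        return -1;
    for (size_t i = 0; i + needle_len <= hay_len; ++i) {
        if (memcmp(hay + i, needle, needle_len) == 0)
            return (long)i;
    }
    return -1;
}

/* Convert an offset within the dump into a memory address, given the
 * base address the image was loaded at. */
static uint32_t offset_to_address(uint32_t base, long offset)
{
    return base + (uint32_t)offset;
}
```

Searching for the three bytes 63 7c 77 and adding the base to the returned offset gives the address to jump to in the disassembler. If the layout matches the source, rsbox should start 256 bytes after sbox and Rcon 256 bytes after that.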

This is enough to tell me that the code is likely to be TinyAES. It could of course be a modified version, but that’s something that will need to be checked later.

Unfortunately, there are no immediate references to these tables, which means that either Ghidra has not disassembled some function or they are being accessed via a memory offset, so this is a bit of a dead end.
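
One way to check for a modified table is to rebuild the standard S-box from its mathematical definition (the multiplicative inverse in GF(2^8) followed by the affine transform) and compare all 256 bytes rather than just the first three. This is a sketch of that idea using a well-known compact construction; build_aes_sbox and is_standard_sbox are illustrative names, not TinyAES functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Rotate an 8-bit value left by n bits. */
#define ROTL8(x, n) ((uint8_t)(((x) << (n)) | ((x) >> (8 - (n)))))

/* Fill sbox with the standard AES S-box, computed rather than copied. */
static void build_aes_sbox(uint8_t sbox[256])
{
    uint8_t p = 1, q = 1;
    assert(sbox != NULL);
    do {
        /* p steps through every non-zero field element (multiply by 3) */
        p = (uint8_t)(p ^ (p << 1) ^ ((p & 0x80) ? 0x1b : 0));
        /* q is kept as the inverse of p (divide by 3) */
        q ^= (uint8_t)(q << 1);
        q ^= (uint8_t)(q << 2);
        q ^= (uint8_t)(q << 4);
        if (q & 0x80)
            q ^= 0x09;
        /* affine transform of the inverse gives the S-box entry */
        sbox[p] = (uint8_t)(q ^ ROTL8(q, 1) ^ ROTL8(q, 2) ^
                            ROTL8(q, 3) ^ ROTL8(q, 4) ^ 0x63);
    } while (p != 1);
    sbox[0] = 0x63; /* zero has no inverse, so it is a special case */
}

/* Return 1 if the 256 bytes at candidate are the standard S-box, else 0. */
static int is_standard_sbox(const uint8_t *candidate)
{
    uint8_t expected[256];
    build_aes_sbox(expected);
    return memcmp(candidate, expected, sizeof expected) == 0;
}
```

Pointing is_standard_sbox at the bytes found in the dump would show whether the table has been tampered with. It says nothing about whether the surrounding code has been modified.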

Looking for functions

As we’ve had no real luck with the constants, we can look at the functions themselves. One advantage of Ghidra is that it has a built-in decompiler which can return C-like functions. It’s not perfect, but it can make it quicker to review and pinpoint code and understand the logic flow.

Another resource that can be used when looking at existing code is Compiler Explorer. This will compile pasted code using a number of different compilers and for multiple architectures.

When compiling C code, the different source files will usually be compiled into object files, which are then joined together by a linker. This means that you can have some expectation that data and code from one source file will be in roughly the same area. As seen above, sbox is at memory location 0x0809ddca, which allows an assumption to be made that the memory around this point should be from the tinyAES.c source file.

This means that we should start looking at these functions first:

The last function, FUN_08099ffc is small (20 octets), but large enough that it’s not just calling another function. This disassembles as:

Looking at that in the decompiler shows:

A quick search through TinyAES’s source code (by searching for 0x1b) shows that the function xtime is very similar – once you allow for different compiler choices it performs exactly the same operation:

static uint8_t xtime(uint8_t x)
{
   return ((x<<1) ^ (((x>>7) & 1) * 0x1b));
}

We could try to prove this further by compiling it, but there are a lot of compilers that could be used for embedded systems, and some playing in Compiler Explorer shows that it’s not gcc or clang. For now, I’m just going to assume that this function is xtime and label it as such.

From the disassembly in the previous screenshot we can see four references to the function, so the next step is to work my way backwards, with the source code of tinyAES on one monitor and Ghidra on the other.
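
Short of matching the compiler, a cheap sanity check is to confirm that a differently shaped implementation is still the same function. The sketch below compares the source form of xtime against a branching form over all 256 inputs. The branching form is an assumed example of the kind of shape a decompiler might show, not the actual Ghidra output, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* xtime as written in the TinyAES source: multiply by 2 in GF(2^8). */
static uint8_t xtime_source(uint8_t x)
{
    return (uint8_t)((x << 1) ^ (((x >> 7) & 1) * 0x1b));
}

/* Hypothetical branching form: shift, then reduce if the top bit was set. */
static uint8_t xtime_branching(uint8_t x)
{
    uint8_t r = (uint8_t)(x << 1);
    if (x & 0x80)
        r ^= 0x1b;
    return r;
}

/* Abort via assert if the two forms ever disagree. */
static void check_xtime_forms_agree(void)
{
    for (unsigned v = 0; v < 256; ++v)
        assert(xtime_source((uint8_t)v) == xtime_branching((uint8_t)v));
}
```

With only 256 possible inputs, an exhaustive comparison like this is practical for any candidate rewrite of the decompiled logic.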

The process is simple: follow a reference, then look at the decompiled code (or disassembled mnemonics, whichever you prefer) to match a function. We don’t need to be precise, as we can match on function calls within structures or on constants. Then repeat until we get to where the key is set up.

The first reference is at address 0808f886, which is this messy function:

This has four references to aes_xtime, which matches closely to MixColumns from within TinyAES, and this covers all the references to xtime (the only other place it is used in the source is in a defined macro). Now we can assume that this function is MixColumns and can look for references to MixColumns.

Fortunately there’s only one call to MixColumns which is the Cipher function; this matches well with the disassembled code.

Cipher is only called from a few functions in the source: AES_ECB_encrypt, AES_CBC_encrypt_buffer and AES_CTR_xcrypt_buffer, but in the disassembly there is only one calling function. This maps most closely to AES_CBC_encrypt_buffer. We can compare them directly (decompiled on the left, source on the right):

As AES_CBC_encrypt_buffer is mostly function calls it is easy to map a number of functions: XorWithIv and memcpy.

That function encrypts a buffer and takes a ctx structure, i.e. a context structure. This structure contains the state of the encryption, so when initialised it will contain the secret key and initialisation vector. The ctx structure is normally set up by a call to AES_init_ctx or AES_init_ctx_iv (depending on whether it’s using ECB or CBC/CTR mode), so that’s the function we need to find. This function isn’t used internally within tinyAES, so it would be in the program flow from whatever is using that library.

If we look at the reference to AES_CBC_encrypt_buffer we get this function:

This is looking very suspiciously like an encrypt function. FUN_08084330 looks like it is setting up a context. Following this leads to a function that maps very closely to AES_init_ctx_iv, so I’m going to assume that this is AES_init_ctx_iv.

I can now rewrite this in C, to ensure I’m getting the types right:

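void encrypt(uint8_t *buffer, size_t length)
{
   struct AES_ctx ctx;   /* occupies 224 bytes in the decompiled output */

   /* arguments are the context, the key pointer, then the IV pointer */
   AES_init_ctx_iv(&ctx, DAT_0808432c, DAT_08084328);
   AES_CBC_encrypt_buffer(&ctx, buffer, length);
}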

As they’ve defined the ctx block as fixed-size memory rather than on the heap, we can use this to identify the size of the AES_ctx structure in use and therefore the key size. The AES_ctx structure, taken from tinyAES/aes.h is:

So, assuming they’re using CBC mode (which we know as the AES_init_ctx_iv function is in use), the structure consists of a RoundKey of AES_keyExpSize bytes, which depends on the size of the key, and an Iv of the size of AES_BLOCKLEN. AES has a 128-bit block size irrespective of the size of the key, so we know that AES_BLOCKLEN will always be 16 bytes. The size of AES_keyExpSize is defined a wee bit further up in aes.h:

As the size of the ctx block (224) minus AES_BLOCKLEN (16) is 208, and an AES_keyExpSize of 208 corresponds to AES192, the program is using a key size of 192 bits (24 bytes).
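
The arithmetic generalises. The expanded key holds one 16-byte round key per round plus the initial one, and AES uses 10, 12 or 14 rounds for 128-, 192- and 256-bit keys. The sketch below works backwards from an observed context size; key_bits_from_ctx_size and ctx_mirror_192 are illustrative names, and the mirror structure is an assumed layout for the AES-192 CBC case rather than a copy of aes.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AES_BLOCKLEN 16 /* AES block size in bytes, the same for every key size */

/* Return the key size in bits implied by a context of ctx_size bytes,
 * or 0 if the size does not fit any AES variant. has_iv is non-zero
 * when the context also carries a 16-byte IV (CBC/CTR builds). */
static int key_bits_from_ctx_size(size_t ctx_size, int has_iv)
{
    size_t iv_len = has_iv ? AES_BLOCKLEN : 0;
    if (ctx_size <= iv_len)
        return 0;
    size_t key_exp = ctx_size - iv_len;          /* size of the RoundKey array */
    if (key_exp % AES_BLOCKLEN != 0)
        return 0;
    size_t rounds = key_exp / AES_BLOCKLEN - 1;  /* initial round key excluded */
    if (rounds != 10 && rounds != 12 && rounds != 14)
        return 0;
    return (int)((rounds - 6) * 32);             /* key words = rounds - 6 */
}

/* Assumed layout of the context for AES-192 in CBC mode. */
struct ctx_mirror_192 {
    uint8_t RoundKey[208];
    uint8_t Iv[AES_BLOCKLEN];
};

/* Confirm the mirror has the size and field offset seen in the firmware. */
static void check_mirror_layout(void)
{
    assert(sizeof(struct ctx_mirror_192) == 224);
    assert(offsetof(struct ctx_mirror_192, Iv) == 208);
}
```

For the 224-byte block seen here, with an IV present, this gives 208 bytes of round keys, 12 rounds and therefore a 192-bit key.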

If we go back to the function I call encrypt above, it passes two undefined values to AES_init_ctx_iv:

DAT_0808432c

DAT_08084328

If we look at these we can see that they store another memory address, so they’ll be mapped to a pointer to char (char *) type within C.

There are 24 bytes between the two pointers, which is the size of the key. This area of memory (0x0801a061) is outside the realm of the internal flash I grabbed and is in external flash, of which I had also grabbed a copy.

This did contain the key and a blank IV (i.e. 16 NUL bytes, 0x00), which was consistent with the analysis above. Unfortunately, I can’t show you this as it was customer specific.
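
Following a pointer like this by hand means reading a 32-bit little-endian word from the dump and working out which image the resulting address lives in. A minimal sketch, assuming the dump is held in a byte buffer; read_le32 and address_to_offset are illustrative names, and the image length passed in is whatever size was actually dumped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value, as stored by the Cortex-M3. */
static uint32_t read_le32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Map a memory address to an offset inside an image loaded at base.
 * Returns 0 and sets *offset on success, or -1 if the address falls
 * outside the image (in which case it belongs to some other dump). */
static int address_to_offset(uint32_t addr, uint32_t base,
                             size_t image_len, size_t *offset)
{
    if (addr < base || (size_t)(addr - base) >= image_len)
        return -1;
    *offset = (size_t)(addr - base);
    return 0;
}
```

A failure from address_to_offset is the cue to look in a different image, which is what happened here with the external flash.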

David Lodge