This is my first ever post about vulnerability research. Admittedly, I’m still a novice in the field, but I hope this article gives you a few ideas and/or the motivation to move on with your own work.
Let’s get into it!
Preliminary research
The story begins a few years ago, when I bought a TP-Link WR741ND router for the low (discounted) price of around $10 to play around with.
I noticed that some work had already been conducted on the device and that people had found a few vulnerabilities as well. Truth be told, I didn’t start working on it until January 2023, and I was really curious whether I would be able to find anything interesting.
The most important information regarding the device (quoting the webpage):
- 150Mbps wireless data rates ideal for video streaming, online gaming and internet calling
- Wireless security encryption easily at a push of WPS button
- IP based bandwidth control allows administrators to determine how much bandwidth is allotted to each PC
- WDS wireless bridge provides seamless bridging to expand your wireless network
It’s important to mention that the device is already EOL (End of Life). However, this only raised my spirits, as a large part of the research project was to practice and gather experience.
What could be a better target for this than a device already checked by others? I wanted to rise to the challenge.
Let’s see how I performed.
Unpacking the firmware
As a small tangent: at the time of writing, my device’s hardware and firmware versions are the following:
- Hardware version: Ver.5.0 (latest)
- Firmware version: 3.16.9 Build 150605
I downloaded the firmware from the vendor’s website and unpacked it using unblob.
Unpacking worked like a charm, but not perfectly. More on this towards the end of this “chapter”.
On the surface, the firmware package contains some utility files like the License Agreement and Upgrade Instructions, along with a .bin file, which is the actual firmware:
➜ tre -l 2
├── GPL License Terms.pdf
├── How to upgrade TP-LINK Wireless N Router.pdf
├── wr740nv5_wr741ndv5_en_ipv6_3_16_9_up_boot(150605).bin
└── wr740nv5_wr741ndv5_en_ipv6_3_16_9_up_boot(150605).bin_extract
    ├── 0-15632.unknown
    ├── 1115135-1180160.unknown
    ├── 1180160-3809792.squashfs_v4_le_extract
    ├── 132096-1115135.lzma_extract
    ├── 15632-49029.lzma_extract
    ├── 3809792-4063744.unknown
    └── 49029-132096.unknown
Directories: 4, Files: 7, Symbolic Links: 0, Lines: 0
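The chunk names in the listing look like byte ranges inside the .bin file, which makes it easy to reconstruct the firmware layout. Below is a minimal sketch in Python 3, assuming the names follow a `<start>-<end>.<type>` pattern (the contiguous ranges suggest so, but that is my reading of the output, not something taken from the unblob documentation). The function names are my own.

```python
import re

# Chunk names copied from the listing above. The "<start>-<end>.<type>"
# interpretation is an assumption based on the ranges being contiguous.
CHUNKS = [
    "0-15632.unknown",
    "1115135-1180160.unknown",
    "1180160-3809792.squashfs_v4_le_extract",
    "132096-1115135.lzma_extract",
    "15632-49029.lzma_extract",
    "3809792-4063744.unknown",
    "49029-132096.unknown",
]

CHUNK_RE = re.compile(r"^(\d+)-(\d+)\.(.+)$")


def parse_chunk(name):
    """Split a chunk name into (start offset, end offset, type)."""
    match = CHUNK_RE.match(name)
    if match is None:
        raise ValueError("unexpected chunk name: %r" % name)
    return int(match.group(1)), int(match.group(2)), match.group(3)


def firmware_layout(names):
    """Return the chunks sorted by their start offset."""
    return sorted(parse_chunk(name) for name in names)


if __name__ == "__main__":
    for start, end, kind in firmware_layout(CHUNKS):
        print("0x%07x  %8d bytes  %s" % (start, end - start, kind))
```

Sorting by offset shows at a glance where the squashfs sits relative to the LZMA streams and the chunks that were not recognized.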
Looking at the output, it’s obvious the extracted binary contained a squashfs file system as well:
➜ tre -l 1 1180160-3809792.squashfs_v4_le_extract/
1180160-3809792.squashfs_v4_le_extract
├── bin
├── dev
├── etc
├── lib
├── mnt
├── proc
├── root
├── sbin
├── sys
├── tmp
├── usr
├── var
└── web
I first looked at how the init process runs, what services are started at boot, etc. The /etc/inittab and /etc/rc.d/rcS files turned out to be my good friends. Their contents:
➜ cat etc/inittab
::sysinit:/etc/rc.d/rcS
::respawn:/sbin/getty ttyS0 115200
::shutdown:/bin/umount -a
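For readers unfamiliar with the format, here is a small Python 3 sketch that splits these entries into fields. It assumes the BusyBox-style `id:runlevels:action:process` layout, which these three lines appear to follow; the helper names are made up for this example.

```python
from collections import namedtuple

InittabEntry = namedtuple("InittabEntry", "id runlevels action process")

# The three entries shown above.
INITTAB = """\
::sysinit:/etc/rc.d/rcS
::respawn:/sbin/getty ttyS0 115200
::shutdown:/bin/umount -a
"""


def parse_inittab(text):
    """Parse inittab text into entries, skipping blank lines and comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split at most three times so the command keeps any colons of its own.
        ident, runlevels, action, process = line.split(":", 3)
        entries.append(InittabEntry(ident, runlevels, action, process))
    return entries


if __name__ == "__main__":
    for entry in parse_inittab(INITTAB):
        print("%-8s -> %s" % (entry.action, entry.process))
```

Read this way, the sysinit entry is the one that hands control to /etc/rc.d/rcS at boot.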
The /etc/rc.d/rcS file is a lot more interesting:
➜ cat etc/rc.d/rcS
#!/bin/sh
# This script runs when init it run during the boot process.
# Mounts everything in the fstab
mount -a
#mount -o remount +w /
#
# Mount the RAM filesystem to /tmp
#
mount -t ramfs -n none /tmp
mount -t ramfs -n none /var
export PATH=$PATH:/etc/ath
... SNIP ...
/usr/bin/httpd &
... SNIP ...
Judging from this, I had a suspicion that the /usr/bin/httpd program does basically everything. I wasn’t sure at the time whether that was all there was to it, but I’m gonna spoil it for you: it is …
As a small side tangent, the file wasn’t unpacked completely. Unblob carved the squashfs out of the binary file, but it probably doesn’t recognize the file format entirely. I’ll add support for this file format to unblob at a later date, along with a small writeup on how to create unblob handlers.
Understanding the httpd binary
This was the bread and butter of the project and the reason I started working on it: to hone my reverse engineering and analysis skills.
I conducted the reverse engineering process using Ghidra and did some light scripting to practice plugin development as well, although I tried doing it manually for the most part.
The following chapters will describe my approach, leading to the vulnerability discovery.
Dynamic analysis
As I mentioned at the beginning, I had physical access to the device. With a background in web pen-testing, I decided to put those skills to use and did some dynamic analysis to get an idea of how the app does certain things, like how requests are routed.
I clicked through the web UI to see how endpoints are handled and what data flows where.
It seems endpoints are mapped under /userRpm/ and requests are routed to separate, aptly named endpoints such as /userRpm/SoftwareUpgradeRpm.htm.
I also noticed some interesting, random-looking path segments injected into the middle of every link on the page. I wasn’t able to figure out what they are beyond a collection of random bytes specific to each session.
I collected potential endpoints from the httpd binary by looking for strings matching /userRpm/*, under which all endpoints belonged.
➜ strings httpd | grep '/userRpm/'
/userRpm/DMZRpm.htm
/userRpm/UpnpCfgRpm.htm
/userRpm/AccessCtrlAccessRulesRpm.htm
/userRpm/AccessCtrlAccessRuleModifyRpm.htm
/userRpm/AccessCtrlAccessRulesAdvRpm.htm
/userRpm/AccessCtrlAccessTargetsRpm.htm
/userRpm/AccessCtrlAccessTargetsAdvRpm.htm
../userRpm/AccessCtrlAccessTargetsRpm.htm
/userRpm/AccessCtrlHostsListsRpm.htm
/userRpm/AccessCtrlHostsListsAdvRpm.htm
../userRpm/AccessCtrlHostsListsRpm.htm
... SNIP ...
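The same collection step can be done in a script, which also makes it easy to deduplicate entries such as the `../userRpm/` variants. This is a rough Python 3 sketch of the idea, not a replacement for `strings`: it only looks at printable ASCII runs of four or more bytes, and the sample bytes at the bottom are a stand-in for the real binary.

```python
import re

# Printable ASCII runs of at least four bytes, roughly what `strings` reports.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_endpoints(blob, marker="/userRpm/"):
    """Return the sorted, deduplicated endpoint paths found in a byte blob."""
    endpoints = set()
    for match in PRINTABLE_RUN.finditer(blob):
        text = match.group().decode("ascii")
        index = text.find(marker)
        if index != -1:
            # Drop anything before the marker, e.g. a leading "..".
            endpoints.add(text[index:])
    return sorted(endpoints)


if __name__ == "__main__":
    # Stand-in data; in practice this would be open("httpd", "rb").read().
    sample = (
        b"\x00\x01/userRpm/DMZRpm.htm\x00"
        b"../userRpm/AccessCtrlHostsListsRpm.htm\x00"
        b"/userRpm/AccessCtrlHostsListsRpm.htm\x00\xff"
    )
    for endpoint in extract_endpoints(sample):
        print(endpoint)
```

The resulting list can be written to a file and used directly as a wordlist.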
I then tried to use ffuf to check which of the endpoints might be accessible without authentication.
The result: not much. Unauthenticated requests are redirected to “/login” via a redirection page yielding a status code of 200, so I used the size filter to ignore these redirections.
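Since every redirect comes back as a 200, the status code is useless for telling responses apart, and the body size has to do the job. The sketch below shows the filtering idea in Python 3 on made-up data: all paths and sizes are hypothetical, and guessing the redirect size as the most frequent one is my own shortcut (with ffuf you pass the size explicitly via `-fs`).

```python
from collections import Counter

# Hypothetical scan results; the paths and all sizes are made up for illustration.
RESPONSES = [
    {"path": "/userRpm/ExampleOneRpm.htm", "status": 200, "size": 512},
    {"path": "/userRpm/ExampleTwoRpm.htm", "status": 200, "size": 512},
    {"path": "/userRpm/ExampleThreeRpm.htm", "status": 200, "size": 512},
    {"path": "/userRpm/ExampleFourRpm.htm", "status": 200, "size": 2048},
]


def baseline_size(responses):
    """Return the most frequent body size, assumed to be the redirect page."""
    return Counter(r["size"] for r in responses).most_common(1)[0][0]


def filter_by_size(responses, ignored_size=None):
    """Drop responses whose body size equals the ignored size."""
    if ignored_size is None:
        ignored_size = baseline_size(responses)
    return [r for r in responses if r["size"] != ignored_size]


if __name__ == "__main__":
    for response in filter_by_size(RESPONSES):
        print(response["path"], response["size"])
```

Anything that survives the filter is worth a manual look, since it returned something other than the redirection page.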
I also checked whether the server provides any endpoints for executing commands, but aside from some very limited functionality like ping or trace, I couldn’t find any obvious culprit.
I decided to jump into reversing.
Reverse engineering
As shown below, the httpd binary is a simple ELF file:
➜ file httpd
httpd: ELF 32-bit MSB executable, MIPS, MIPS32 rel2 version 1 (SYSV), dynamically linked, interpreter /lib/ld-uClibc.so.0, no section header
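The first three facts in that output (32-bit, MSB, MIPS) come straight from the ELF header. As a sketch, the following Python 3 function reads them from the first 20 bytes of a file. The field offsets and the value 8 for MIPS follow the ELF specification; the function name and the synthetic header are my own.

```python
import struct

ELF_CLASS = {1: "32-bit", 2: "64-bit"}
ELF_DATA = {1: "LSB", 2: "MSB"}
EM_MIPS = 8  # e_machine value for MIPS


def describe_elf(header):
    """Return (class, byte order, e_machine) from the start of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    endian = ">" if ei_data == 2 else "<"
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    _e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return (
        ELF_CLASS.get(ei_class, "unknown"),
        ELF_DATA.get(ei_data, "unknown"),
        e_machine,
    )


if __name__ == "__main__":
    # Synthetic big-endian 32-bit MIPS header, standing in for the real file.
    fake = b"\x7fELF" + bytes([1, 2, 1]) + b"\x00" * 9 + struct.pack(">HH", 2, EM_MIPS)
    print(describe_elf(fake))
```

Knowing the byte order and architecture up front matters when importing the binary, since Ghidra needs the matching big-endian MIPS language to disassemble it correctly.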
I just plopped it into Ghidra. As the binary still has symbol information associated with it, I felt I probably wouldn’t have a difficult time understanding its inner workings.
Since the binary is fairly large, I stuck to the endpoints implemented by the Web server, especially ones that accepted user-supplied data.
I needed to find which function maps endpoints to their handler functions, so I enumerated potential paths via the strings utility and checked where they’re referenced in Ghidra.
From here, looking for calls to httpRpmConfAdd, I got to practice some scripting.
Scripting Ghidra
IMO, the Ghidra API docs suck. They’re exhaustive, but do not explain anything. To get around this problem, I recommend you try this git repo. It’s way better than the rather minimalistic API docs provided by Ghidra and actually contains examples.
My original idea was the following:
- Get a list of references made to the httpRpmConfAdd function (used to map endpoints to handler functions)
- Automatically parse the arguments as methods and their names and print them to the console
- Work my way down from there manually
What I ended up doing was making a list of a family of functions to collect references to. The reasoning behind this was that the family of ConfAdd functions is used to add different types of functionality to the server. I won’t go too deep into them and will instead stick to how I looked for httpRpmConfAdd calls and found the vulnerability.
Without wasting time to go through how I developed my simple script, I’ll just dump it here for you:
from ghidra.program.flatapi import FlatProgramAPI
from ghidra.app.decompiler import DecompInterface
from ghidra.util.task import ConsoleTaskMonitor
import collections

NAME_PATTERN = "ConfAdd"

# Contains the name of functions and which parameters are interesting for us
FUNC_TYPES = {
    "httpAliasConfAdd": [0, 1],
    "httpPwdConfAdd": [0],
    "httpCtrlConfAdd": [0, 1, 2],
    "httpFsConfAdd": [0, 1],
    "httpUploadConfAdd": [0, 1],
    "httpRpmConfAdd": [0],
}


# Reads a NUL-terminated string from program memory, one byte at a time
def getString(addr):
    mem = currentProgram.getMemory()
    core_name_str = ""
    while True:
        byte = mem.getByte(addr.add(len(core_name_str)))
        if byte == 0:
            return core_name_str
        # Java bytes are signed, so mask before converting to a character
        core_name_str += chr(byte & 0xff)


# Collects every function whose name contains NAME_PATTERN (minus two exceptions)
def find_functions_by_name(fm):
    functions = []
    funcs = fm.getFunctions(True)  # True means 'forward'
    for func in funcs:
        name = func.getName()
        if NAME_PATTERN in name:
            if name != "httpSysRpmConfAdd" and name != 'httpMimeParseFnConfAdd':
                functions.append(func)
    return functions


# Checks whether `address` is one of the places `func` is referenced from
def check_xrefs(func, address):
    for ref in getReferencesTo(func.getEntryPoint()):
        if ref.getFromAddress() == address:
            return True
    return False


# This is honestly a bit hacky, I'd probably need to look at this more
def get_call_args(func, decompiled_caller, caller):
    high_func = decompiled_caller.getHighFunction()
    opiter = high_func.getPcodeOps()
    # Keep the last five COPY ops; they hold the values passed to the call
    ops = collections.deque([], 5)
    res = []
    called = 0
    while opiter.hasNext():
        op = opiter.next()
        mnemonic = str(op.getMnemonic())
        if "COPY" in mnemonic:
            ops.appendleft(op)
        elif "CAST" in mnemonic:
            called = op.getInput(0).getAddress()
        elif "CALL" in mnemonic:
            if called != 0:
                if type(called) == int:
                    called = func.getEntryPoint().getAddress(str(called)).getAddress()
                if check_xrefs(func, called):
                    for idx in FUNC_TYPES[func.getName()]:
                        val = toAddr(ops[idx].getInput(0).getOffset())
                        print "\t\t" + getString(val) + " - ",
                    print(" ")
    return res


def main():
    monitor = ConsoleTaskMonitor()
    program = getCurrentProgram()
    fm = program.getFunctionManager()
    di = DecompInterface()
    di.openProgram(program)
    functions = find_functions_by_name(fm)
    for func in functions:
        print("{} >> ".format(func.getName()))
        callers = func.getCallingFunctions(monitor)
        for caller in callers:
            print("\t{} >>".format(caller.getName()))
            decomp_caller = di.decompileFunction(caller, 100, monitor)
            args = get_call_args(func, decomp_caller, caller)


main()
It might not be the cleanest script one could conjure up, but I learned a lot while working on it, and thus the goals of the project were already accomplished.
I’ll probably come back to this and make it more generic. Maybe by allowing the user to pick function names to look for in a dialog and set which arguments are interesting?
As a small side tangent, I know I could’ve used the decompiled_function.getDecompiledFunction().getC() call to get a string representation of the decompiled, C-like pseudo code and used Python string magic to get a list of functions to look for, but I wanted to be able to click through them from within the console. For this (if I understand correctly), I needed the more complex objects.
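For comparison, the string-based alternative could look something like the sketch below. It is plain Python 3 operating on a hard-coded sample rather than on real Ghidra output: the pseudo code, the function name initExampleRpm, the argument order of httpRpmConfAdd and the httpAliasConfAdd arguments are all assumptions made up for illustration.

```python
import re

# Hypothetical pseudo code standing in for the text returned by getC().
# The argument order shown here is an assumption, not taken from the binary.
SAMPLE_C = '''
void initExampleRpm(void)
{
  httpRpmConfAdd(2,"/userRpm/WlanNetworkRpm.htm",WlanNetwork_APC);
  httpAliasConfAdd("/example","/tmp/example");
  return;
}
'''

# A call to any http...ConfAdd function, up to the closing ");".
CALL_RE = re.compile(r'\b(http\w*ConfAdd)\s*\(([^;]*)\)\s*;')
# A double-quoted C string literal, allowing backslash escapes.
STRING_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def conf_add_calls(c_text):
    """Return (function name, [string literal arguments]) for each call found."""
    calls = []
    for match in CALL_RE.finditer(c_text):
        name, args = match.group(1), match.group(2)
        calls.append((name, STRING_RE.findall(args)))
    return calls


if __name__ == "__main__":
    for name, strings in conf_add_calls(SAMPLE_C):
        print(name, strings)
```

This is quick to write, but it only yields text: there are no addresses to click on in the console, and a string literal containing a semicolon would confuse the pattern.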
Vulnerability discovery
The manual reverse engineering process took me around four or five afternoons. Instead of using any specific approach, I just brute forced it: I tried to go over every endpoint and get a feel for how they work and what they do.
I’d have a hard time describing the process and what I did. That probably belongs in a book or a course more than in a short article. Instead, I’ll focus on how I uncovered a command injection vulnerability using the tools available at my disposal.
Wireless settings
As mentioned already, I started out with endpoints accepting user-supplied data. One such “class” of endpoints is the settings. Having checked several other endpoints already, I eventually moved on to the Wireless Settings menu item.
The specific interface can be accessed as shown below:
Making changes here results in a request being made to the WlanNetworkRpm.htm endpoint:
Looking at the decompiled code in Ghidra, we can see which handler functions are mapped to this endpoint:
Using call trees to find problematic function calls
I used the Function Call Trees panel to observe incoming and outgoing calls to and from the handler functions mapped to the Wireless Settings page. Specifically, I used a filter to check whether any call tree starting from the current function ends in execFormatCmd, a function the server uses to execute OS commands on the device. Essentially, it is a wrapper around system.
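Conceptually, the filter answers a reachability question on the call graph: is there a path from a handler to the sink? Here is a minimal Python 3 sketch of that search on a hand-written graph. The chain through wlanBasicDynSet and wlanEnable is the one this article goes on to describe; the `example*` entries are made-up placeholders, and none of this uses the Ghidra API.

```python
# Caller -> callees. Only the chain ending in execFormatCmd comes from the
# article; the "example*" functions are placeholders added for illustration.
CALL_GRAPH = {
    "WlanNetwork_APC": ["wlanBasicDynSet", "exampleRenderPage"],
    "wlanBasicDynSet": ["wlanEnable", "exampleSaveConfig"],
    "wlanEnable": ["execFormatCmd"],
    "exampleRenderPage": [],
    "exampleSaveConfig": [],
}


def paths_to_sink(graph, start, sink, _path=None):
    """Return every call path from `start` that ends in `sink` (depth-first)."""
    path = (_path or []) + [start]
    if start == sink:
        return [path]
    found = []
    for callee in graph.get(start, []):
        if callee in path:  # skip recursion cycles
            continue
        found.extend(paths_to_sink(graph, callee, sink, path))
    return found


if __name__ == "__main__":
    for path in paths_to_sink(CALL_GRAPH, "WlanNetwork_APC", "execFormatCmd"):
        print(" -> ".join(path))
```

A path existing does not prove a vulnerability by itself. It only tells you which functions are worth reading for missing input validation.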
I traveled down the call trees ending in the execFormatCmd function and found the following snippet:
Remembering that we can change the ESSID of the wireless network via the user interface, this snippet activated my Spidey sense, so I started moving upwards in the call tree.
Tracing the call tree
From here, I simply did the following:
- Step 1 - Move one function upwards in the call tree
- Step 2 - Check whether the caller validates data received from the web UI
- Step 3 - If everything checks out (no validation in the way), go back to Step 1
With the approach cleared up, here’s an outline of the problematic code path:
- wlanEnable - The function responsible for calling execFormatCmd with unfiltered, user-supplied data while setting up (enabling) the wireless network:
- The call to wlanEnable is done by wlanBasicDynSet, which essentially checks whether the config for the network has been changed and applies the changes if it has:
- From here up, the WlanNetwork_APC function is responsible for invoking wlanBasicDynSet. This one’s tied to the Wireless Settings interface based on the request and the endpoint map shown way up above.
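To illustrate why unfiltered data in a formatted shell command is a problem, here is a small Python 3 sketch. Nothing is executed: it only shows how a POSIX-style shell would tokenize the resulting command line. The command template is hypothetical (the real format string from the firmware is not reproduced here), and the quoting variant is shown purely as a contrast, not as the vendor's fix.

```python
import shlex

# Hypothetical command template; the real format string used by the
# firmware is not reproduced here.
TEMPLATE = "iwconfig ath0 essid %s"


def build_naive(essid):
    """Insert the ESSID as-is, the way an unfiltered format call would."""
    return TEMPLATE % essid


def build_quoted(essid):
    """Insert the ESSID as a single shell-quoted word."""
    return TEMPLATE % shlex.quote(essid)


def shell_tokens(command):
    """Tokenize roughly like a POSIX shell, keeping ';', '|', '&' separate."""
    return list(shlex.shlex(command, posix=True, punctuation_chars=True))


if __name__ == "__main__":
    payload = "x; id"
    print(shell_tokens(build_naive(payload)))
    print(shell_tokens(build_quoted(payload)))
```

In the naive case the semicolon becomes a token of its own, so the shell treats everything after it as a second command. In the quoted case the whole ESSID stays a single argument.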
Exploiting the vulnerability
The following is an example of the vulnerability being exploited:
Updates from Vendor
Due to the public disclosure, TP-Link released a firmware version containing a fix. If you want to stay on the secure side, I recommend you navigate to TP-Link’s site and get the latest firmware update.
IMPORTANT
TP-Link (and I too, for that matter) recommends that you refrain from using third-party firmware on your devices. You should only ever use firmware from the original vendor; otherwise, there’s no telling what could be in these third-party firmware images …
Final words
I had a ton of fun while conducting the research and picked up quite a few tricks and experiences. I also believe my approach could’ve been automated using Ghidra scripts so that’s one direction to improve towards.
Hope this read was interesting and I recommend you check back in a few weeks as I’ll have more content available related to vulnerability research.
~ r4bbit
Vulnerability disclosure timeline
- May 7, 2023 - First contact with TP-Link
- May 9, 2023 - Response from TP-Link, agreement to share details
- May 10, 2023 - Vulnerability details shared with TP-Link via secure channel
- July 12, 2023 - TP-Link confirms the findings, no fix due to the device being EOL
- July 26, 2023 - Contact with MITRE about a potential CVE
- Aug 10, 2023 - TP-Link initiates a fixed firmware release
- Aug 15, 2023 - Coordinated public disclosure