This is my first ever post about vulnerability research. Admittedly, I’m still a novice in the field, but I hope this article gives you a few ideas and/or the motivation to move on with your own work.

Let’s get into it!

Preliminary research

The story begins a few years ago, when I bought a TP-Link WR741ND router for the low price of around 10$ (discounted) to play around with.

I noticed some work had already been conducted on the device and some people had found a few vulnerabilities as well. Truth be told, I didn’t start working on it until January 2023, and I was really curious whether I would be able to find anything interesting.

The most important information regarding the device (quoting the webpage):

  • 150Mbps wireless data rates ideal for video streaming, online gaming and internet calling
  • Wireless security encryption easily at a push of WPS button
  • IP based bandwidth control allows administrators to determine how much bandwidth is allotted to each PC
  • WDS wireless bridge provides seamless bridging to expand your wireless network

It’s important to mention that the device is already EOL (End of Life). However, this only raised my spirits, as a large part of the research project was to practice and gather experience.

What could be a better target for this than a device already checked by others? I wanted to rise to the challenge.

Let’s see how I performed.

Unpacking the firmware

As a small tangent: at the time of writing, my device’s hardware and firmware versions are the following:

  • Hardware version: Ver.5.0 (latest)
  • Firmware version: 3.16.9 Build 150605

I downloaded the firmware from the vendor’s website and unpacked it using unblob.

unpacking

Unpacking worked like a charm, but not perfectly. More on this towards the end of this “chapter”.

On the surface, the firmware contains some utility files like the License Agreement and Upgrade Instructions along with a .bin file which is the actual firmware:

➜ tre -l 2

├── GPL License Terms.pdf
├── How to upgrade TP-LINK Wireless  N Router.pdf
├── wr740nv5_wr741ndv5_en_ipv6_3_16_9_up_boot(150605).bin
└── wr740nv5_wr741ndv5_en_ipv6_3_16_9_up_boot(150605).bin_extract
    ├── 0-15632.unknown
    ├── 1115135-1180160.unknown
    ├── 1180160-3809792.squashfs_v4_le_extract
    ├── 132096-1115135.lzma_extract
    ├── 15632-49029.lzma_extract
    ├── 3809792-4063744.unknown
    └── 49029-132096.unknown

Directories: 4, Files: 7, Symbolic Links: 0, Lines: 0
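The names in the listing above look like byte ranges followed by the detected format. As a minimal sketch, assuming unblob names carved chunks as `<start>-<end>.<format>` with decimal offsets (the `Chunk` type and helper names are my own, not part of unblob), the layout of the image can be rebuilt like this:

```python
import re
from typing import List, NamedTuple, Optional


class Chunk(NamedTuple):
    start: int  # first byte of the chunk inside the .bin file
    end: int    # offset one past the last byte of the chunk
    kind: str   # format label taken from the file name

    @property
    def size(self) -> int:
        return self.end - self.start


# Assumed naming scheme: "<start>-<end>.<format>", optionally suffixed
# with "_extract" for chunks that were unpacked into a directory.
_CHUNK_RE = re.compile(r"^(\d+)-(\d+)\.(\w+?)(?:_extract)?$")


def parse_chunk_name(name: str) -> Optional[Chunk]:
    """Return the chunk described by a file name, or None if it does not match."""
    m = _CHUNK_RE.match(name)
    if m is None:
        return None
    return Chunk(int(m.group(1)), int(m.group(2)), m.group(3))


def sorted_chunks(names: List[str]) -> List[Chunk]:
    """Parse every matching name and order the chunks by start offset."""
    chunks = [c for c in map(parse_chunk_name, names) if c is not None]
    return sorted(chunks, key=lambda c: c.start)


if __name__ == "__main__":
    names = [
        "0-15632.unknown",
        "1115135-1180160.unknown",
        "1180160-3809792.squashfs_v4_le_extract",
        "132096-1115135.lzma_extract",
        "15632-49029.lzma_extract",
        "3809792-4063744.unknown",
        "49029-132096.unknown",
    ]
    for chunk in sorted_chunks(names):
        print("%8d  %8d  %s" % (chunk.start, chunk.size, chunk.kind))
```

Sorted this way, the chunks line up back to back, which makes it easier to see which regions of the image were recognized and which were left as unknown.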

Looking at the output, it’s clear the firmware image contains a squashfs file system as well:

➜ tre -l 1 1180160-3809792.squashfs_v4_le_extract/

1180160-3809792.squashfs_v4_le_extract
├── bin
├── dev
├── etc
├── lib
├── mnt
├── proc
├── root
├── sbin
├── sys
├── tmp
├── usr
├── var
└── web

I first looked at how the init process runs and what services are started at boot etc. The /etc/inittab and /etc/rc.d/rcS files turned out to be my good friends. Their contents:

➜ cat etc/inittab

::sysinit:/etc/rc.d/rcS
::respawn:/sbin/getty ttyS0 115200
::shutdown:/bin/umount -a
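Each of these lines has four colon-separated fields. As a small sketch, assuming the BusyBox-style layout `<id>:<runlevels>:<action>:<process>` (the `InittabEntry` type is just an illustration), the file can be parsed to list what runs at which stage:

```python
from typing import List, NamedTuple


class InittabEntry(NamedTuple):
    ident: str      # often empty, or a tty name
    runlevels: str  # empty in the file above
    action: str     # e.g. sysinit, respawn, shutdown
    process: str    # command line that init executes


def parse_inittab(text: str) -> List[InittabEntry]:
    """Parse inittab text, skipping blank lines, comments and malformed lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first three colons only, the command may contain more.
        parts = line.split(":", 3)
        if len(parts) != 4:
            continue
        entries.append(InittabEntry(*parts))
    return entries


if __name__ == "__main__":
    sample = (
        "::sysinit:/etc/rc.d/rcS\n"
        "::respawn:/sbin/getty ttyS0 115200\n"
        "::shutdown:/bin/umount -a\n"
    )
    for entry in parse_inittab(sample):
        print("%-10s %s" % (entry.action, entry.process))
```

The sysinit entry is the one worth following, since it points at the script that starts everything else.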

The /etc/rc.d/rcS file is a lot more interesting:

➜ cat etc/rc.d/rcS

#!/bin/sh

# This script runs when init it run during the boot process.
# Mounts everything in the fstab

mount -a
#mount -o remount +w /

#
# Mount the RAM filesystem to /tmp
#

mount -t ramfs -n none /tmp
mount -t ramfs -n none /var

export PATH=$PATH:/etc/ath

... SNIP ...

/usr/bin/httpd &

... SNIP ...

Judging from this, I had a suspicion that the /usr/bin/httpd program does basically everything. I wasn’t sure at the time whether that was all there was to it, but I’ll spoil it for you: it is.

As a small side tangent, the file wasn’t unpacked completely. Unblob carved the squashfs out of the binary file, but it probably doesn’t recognize the container format as a whole. I’ll add support for this file format to unblob at a later date, along with a small writeup on how to create unblob handlers.

Understanding the httpd binary

This is the bread and butter of the project and the reason I started working on it: to hone my reverse engineering and analysis skills.

I conducted the reverse engineering process using Ghidra and did some light scripting to practice plugin development as well, although I tried doing it manually for the most part.

The following chapters will describe my approach, leading to the vulnerability discovery.

Dynamic analysis

As I mentioned at the beginning already, I had physical access to the device. With a background in web pen-testing, I decided to put those skills to use and did some dynamic analysis to get an idea of how the app does certain things, like how requests are routed.

I clicked through the web UI to see how endpoints are handled and what data flows where.

webui

Seems like endpoints are mapped under /userRpm/ and requests are routed to separate, aptly named endpoints such as: /userRpm/SoftwareUpgradeRpm.htm.

I also noticed some interesting, random looking paths being injected in the middle of every link on the page:

vpath

I wasn’t able to figure out what this is, beyond it being a collection of random-looking bytes specific to each session.

I collected potential endpoints from the httpd binary by looking for strings containing /userRpm/, the prefix under which all endpoints live.

➜ strings httpd | grep '/userRpm/'

/userRpm/DMZRpm.htm
/userRpm/UpnpCfgRpm.htm
/userRpm/AccessCtrlAccessRulesRpm.htm
/userRpm/AccessCtrlAccessRuleModifyRpm.htm
/userRpm/AccessCtrlAccessRulesAdvRpm.htm
/userRpm/AccessCtrlAccessTargetsRpm.htm
/userRpm/AccessCtrlAccessTargetsAdvRpm.htm
../userRpm/AccessCtrlAccessTargetsRpm.htm
/userRpm/AccessCtrlHostsListsRpm.htm
/userRpm/AccessCtrlHostsListsAdvRpm.htm
../userRpm/AccessCtrlHostsListsRpm.htm
... SNIP ...
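The listing contains both `/userRpm/...` and `../userRpm/...` variants of the same path. A rough Python equivalent of the `strings | grep` pipeline that also normalizes and deduplicates the results could look like the sketch below (the function name and the minimum run length of 4 are my own choices, mirroring the default of `strings`):

```python
import re
from typing import List

# Runs of at least four printable ASCII characters, similar to `strings`.
_PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_endpoints(blob: bytes, marker: str = "/userRpm/") -> List[str]:
    """Return the sorted, unique strings in blob, cut to start at the marker."""
    found = set()
    for match in _PRINTABLE_RUN.finditer(blob):
        text = match.group().decode("ascii")
        idx = text.find(marker)
        if idx == -1:
            continue
        # Dropping everything before the marker turns "../userRpm/X.htm"
        # into "/userRpm/X.htm", so both spellings collapse into one entry.
        found.add(text[idx:])
    return sorted(found)


if __name__ == "__main__":
    with open("httpd", "rb") as fh:
        for endpoint in extract_endpoints(fh.read()):
            print(endpoint)
```

The resulting list can be written to a file and used directly as a wordlist.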

I then tried to use ffuf to check which of the endpoints might be accessible without being authenticated:

ffuf

Not much. (Unauthenticated requests are redirected to “/login” via a redirection page that yields a status code of 200. The size filter is used to ignore these redirections.)

I also checked whether or not there are endpoints provided by the server to execute commands, but aside from some very limited functionality like ping or trace I couldn’t find any obvious culprit:
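Because the redirection page comes back with a 200, the status code alone cannot separate protected endpoints from open ones, so the response size has to do the job. The sketch below shows the idea on already collected results. It assumes the most common body size belongs to the redirection page, and the `Probe` type and the sizes in the example are made up for illustration:

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class Probe(NamedTuple):
    path: str
    status: int
    size: int  # response body size in bytes


def drop_redirect_pages(probes: Iterable[Probe]) -> List[Probe]:
    """Keep only the probes whose size differs from the most common size."""
    probes = list(probes)
    if not probes:
        return []
    # Assumption: nearly every endpoint answers with the same redirection
    # page, so the most frequent size identifies it.
    baseline_size, _ = Counter(p.size for p in probes).most_common(1)[0]
    return [p for p in probes if p.size != baseline_size]


if __name__ == "__main__":
    results = [
        Probe("/userRpm/DMZRpm.htm", 200, 512),
        Probe("/userRpm/UpnpCfgRpm.htm", 200, 512),
        Probe("/userRpm/AccessCtrlAccessRulesRpm.htm", 200, 512),
        Probe("/hypothetical/open.htm", 200, 1830),
    ]
    for probe in drop_redirect_pages(results):
        print(probe.path, probe.status, probe.size)
```

This only works if the redirection page has a stable size. If it embeds the requested path, a size range or a content match would be needed instead.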

pingtrace

I decided to jump into reversing.

Reverse engineering

As shown below, the httpd binary is a simple ELF file:

➜ file httpd

httpd: ELF 32-bit MSB executable, MIPS, MIPS32 rel2 version 1 (SYSV), dynamically linked, interpreter /lib/ld-uClibc.so.0, no section header
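Most of what `file` reports here comes from the first few bytes of the ELF header. As a small sketch (the lookup tables only cover a handful of values and `describe_elf` is a name I picked), the same fields can be read by hand:

```python
import struct

ELF_MAGIC = b"\x7fELF"
EI_CLASS = {1: "32-bit", 2: "64-bit"}   # e_ident[4]
EI_DATA = {1: "LSB", 2: "MSB"}          # e_ident[5]
E_MACHINE = {3: "x86", 8: "MIPS", 40: "ARM", 62: "x86-64"}  # partial table


def describe_elf(header: bytes) -> str:
    """Summarize word size, byte order and architecture of an ELF header."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF header")
    bits = EI_CLASS.get(header[4], "unknown class")
    order = EI_DATA.get(header[5], "unknown byte order")
    # e_machine is a 16-bit field at offset 18, stored in the file's byte order.
    endian = ">" if header[5] == 2 else "<"
    (e_machine,) = struct.unpack_from(endian + "H", header, 18)
    machine = E_MACHINE.get(e_machine, "machine %d" % e_machine)
    return "ELF %s %s, %s" % (bits, order, machine)


if __name__ == "__main__":
    with open("httpd", "rb") as fh:
        print(describe_elf(fh.read(20)))
```

Knowing that the target is big-endian 32-bit MIPS up front helps when picking the right language and processor settings during import.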

I just plopped it into Ghidra. As the binary still has symbol information associated with it, I felt like I probably won’t have a difficult time understanding its inner workings.

Since the binary is fairly large, I stuck to the endpoints implemented by the Web server, especially ones that accepted user-supplied data.

I needed to find which function maps endpoints to their handler functions so I enumerated potential paths via the strings utility and checked where they’re referenced with Ghidra.

endpointmaps

From here, looking for calls to httpRpmConfAdd I got to practice some scripting.

Scripting Ghidra

IMO, the Ghidra API docs suck. They’re exhaustive, but do not explain anything. To work around this problem, I recommend you try this git repo. It’s way better than the rather minimalistic API docs provided by Ghidra and actually contains examples.

My original idea was the following:

  1. Get a list of references made to the httpRpmConfAdd function (used to map endpoints to handler functions)
  2. Automatically parse the arguments as methods and their names and print them to the console
  3. Work my way down from there manually

What I ended up doing was making a list of a family of functions to collect references to. The reasoning behind this was that the family of ConfAdd functions is used to add different types of functionality to the server. I’ll not go too deep into them and instead stick to how I looked for httpRpmConfAdd functions and found the vulnerability.

Without wasting time to go through how I developed my simple script, I’ll just dump it here for you:

from ghidra.program.flatapi import FlatProgramAPI
from ghidra.app.decompiler import DecompInterface
from ghidra.util.task import ConsoleTaskMonitor

import collections

NAME_PATTERN = "ConfAdd"

# Contains the name of functions and which parameters are interesting for us
FUNC_TYPES = {
    "httpAliasConfAdd": [0, 1],
    "httpPwdConfAdd": [0],
    "httpCtrlConfAdd": [0, 1, 2],
    "httpFsConfAdd": [0, 1],
    "httpUploadConfAdd": [0, 1],
    "httpRpmConfAdd": [0],
}

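def getString(addr):
    # Read a NUL-terminated string byte by byte, starting at addr
    mem = currentProgram.getMemory()
    core_name_str = ""
    while True:
        byte = mem.getByte(addr.add(len(core_name_str)))
        if byte == 0:
            return core_name_str
        # getByte returns a signed Java byte; mask it so chr() accepts values >= 0x80
        core_name_str += chr(byte & 0xff)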

def find_functions_by_name(fm):
    functions = []
    funcs = fm.getFunctions(True) # True means 'forward'
    for func in funcs:
        name =  func.getName()
        if NAME_PATTERN in name:
            if name != "httpSysRpmConfAdd" and name != 'httpMimeParseFnConfAdd':
                functions.append(func)
    return functions

def check_xrefs(func, address):
    for ref in getReferencesTo(func.getEntryPoint()):
        if ref.getFromAddress() == address:
            return True
    return False

# This is honestly a bit hacky, I'd probably need to look at this more
def get_call_args(func, decompiled_caller, caller):
    high_func = decompiled_caller.getHighFunction()
    opiter = high_func.getPcodeOps()
    ops = collections.deque([], 5)
    res = []
    called = 0
    while opiter.hasNext():
        op = opiter.next()
        mnemonic = str(op.getMnemonic())
        if "COPY" in mnemonic:
            ops.appendleft(op)
        elif "CAST" in mnemonic:
            called = op.getInput(0).getAddress()
        elif "CALL" in mnemonic:
            if called != 0:
                if type(called) == int:
                    called = func.getEntryPoint().getAddress(str(called)).getAddress()
                if check_xrefs(func, called):
                    for idx in FUNC_TYPES[func.getName()]:
                        val = toAddr(ops[idx].getInput(0).getOffset())
                        print "\t\t" + getString(val) + " - ",
                    print(" ")
    return res
                                                


def main():
    monitor = ConsoleTaskMonitor()
    program = getCurrentProgram()
    fm = program.getFunctionManager()
    di = DecompInterface()
    di.openProgram(program)

    functions = find_functions_by_name(fm)
    for func in functions:
        print("{} >> ".format(func.getName()))
        callers = func.getCallingFunctions(monitor)
        for caller in callers:
            print("\t{} >>".format(caller.getName()))
            decomp_caller = di.decompileFunction(caller, 100, monitor)
            args = get_call_args(func, decomp_caller, caller)

main()

It might not be the cleanest script one could conjure up, but I learned a lot while working on it and thus, the goals of the project were already accomplished.

I’ll probably come back to this and make it more generic. Maybe by allowing the user to pick function names to look for in a dialog and set which arguments are interesting?

As a small side tangent, I know I could’ve used the decompiled_function.getDecompiledFunction().getC() call to get a string representation of the decompiled, C-like pseudo code and use python string magic to get a list of functions to look for, but I wanted to be able to click through them from within the console. For this (if I understand correctly) I needed the more complex objects.
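For comparison, the string-based route could look roughly like the sketch below. It runs on plain text, so it works outside Ghidra as well. The sample pseudo code in it is made up for illustration (the real argument order of the ConfAdd functions is not shown here), and the regular expressions are deliberately naive:

```python
import re
from typing import List, Tuple

# A call to any http*ConfAdd function, up to the closing ");".
_CALL_RE = re.compile(r"\b(http\w*ConfAdd)\s*\(([^;]*?)\)\s*;")
# A double-quoted C string literal, allowing backslash escapes.
_STRING_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def find_conf_add_calls(c_source: str) -> List[Tuple[str, List[str]]]:
    """Return (function name, string literal arguments) for every matching call."""
    calls = []
    for match in _CALL_RE.finditer(c_source):
        name, args = match.group(1), match.group(2)
        calls.append((name, _STRING_RE.findall(args)))
    return calls


if __name__ == "__main__":
    # Hypothetical decompiler output, only meant to exercise the parser.
    sample = """
    void hypothetical_init(void) {
      httpRpmConfAdd(2,"/userRpm/WlanNetworkRpm.htm",WlanNetwork_APC);
      httpRpmConfAdd(2,"/userRpm/DMZRpm.htm",hypothetical_handler);
    }
    """
    for name, strings in find_conf_add_calls(sample):
        print(name, strings)
```

The trade-off is the one mentioned above: this is quick to write, but it only yields text. A string literal containing `);` would also confuse the naive pattern.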

Vulnerability discovery

The manual reverse engineering process took me around four or five afternoons. Instead of using any kind of specific approach, I just brute forced it and tried to go over every endpoint to get a feel for how they work and what they do.

I’d have a hard time describing the process and what I did. That probably belongs in a book or a course more than in a short article. Instead, I’ll focus on how I uncovered a command injection vulnerability using the tools available at my disposal.

Wireless settings

As mentioned already, I started out with endpoints accepting user-supplied data. One such “class” of endpoints is the settings pages. Having checked several other endpoints already, I eventually moved on to the Wireless Settings menu item.

The specific interface can be accessed as shown below:

wlanui

Making changes here results in a request being made to the WlanNetworkRpm.htm endpoint:

wlanreq

Looking at the decompiled code in Ghidra, we can see which handler functions are mapped to this endpoint:

wlanmap

Using call trees to find problematic function calls

I used the Function Call Trees panel to observe incoming and outgoing calls to and from the handler functions mapped to the Wireless Settings page. Specifically, I used a filter to check whether any call tree starting from the current function ends in execFormatCmd, a function the server uses to execute OS commands on the device. Essentially, it’s a wrapper around system.

execfilter
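Conceptually, the filter answers a graph question: is there a path from this handler to the sink? A minimal sketch of that search is shown below. The call graph is a hand-written stand-in with placeholder function names, not data exported from Ghidra, and only `execFormatCmd` is a real name from the binary:

```python
from typing import Dict, List

CallGraph = Dict[str, List[str]]  # caller name -> names of called functions


def paths_to_sink(graph: CallGraph, start: str, sink: str) -> List[List[str]]:
    """Return every cycle-free call path leading from start to sink."""
    results: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        if node == sink:
            results.append(path)
            return
        for callee in graph.get(node, []):
            if callee not in path:  # do not walk around recursion loops
                walk(callee, path + [callee])

    walk(start, [start])
    return results


if __name__ == "__main__":
    # Placeholder call graph for illustration only.
    graph = {
        "handler": ["apply_settings", "render_page"],
        "apply_settings": ["enable_feature"],
        "enable_feature": ["execFormatCmd"],
        "render_page": ["format_html"],
    }
    for path in paths_to_sink(graph, "handler", "execFormatCmd"):
        print(" -> ".join(path))
```

Every path returned this way is a candidate to review by hand, starting from the call closest to the sink.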

I traveled down the call trees ending in the execFormatCmd function and found the following snippet:

execcal

Remembering that we can change the ESSID of the wireless network via the user interface, this snippet activated my Spidey sense, so I started moving upwards in the call tree.

Tracing the call tree

From here, I simply did the following:

  • Step 1 - Move one function upwards in the call tree
  • Step 2 - Check if the caller validates data received from the web UI
  • Step 3 - If no validation is found, go back to Step 1

With the approach cleared up, here’s an outline of the problematic code path:

  • wlanEnable - The function responsible for calling execFormatCmd with unfiltered, user-supplied data while setting up (enabling) the wireless network:

execcal

  • The call to wlanEnable is done by wlanBasicDynSet, which essentially checks if the config for the network has been changed and applies the changes if it has:

wlanenablecall

  • From here up, the WlanNetwork_APC function is responsible for invoking wlanBasicDynSet. This one’s tied to the Wireless Settings interface based on the request and the endpoint map shown way up above.

basicdynsetcall
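To illustrate why this code path is dangerous, here is a small Python sketch of the general pattern. The command template is hypothetical and is not the format string used by the firmware, and nothing is executed. The sketch only compares how a shell would tokenize the resulting command line:

```python
import shlex

# Hypothetical template, standing in for a format string passed to a
# system()-like wrapper together with a user-controlled ESSID.
TEMPLATE = "iwconfig ath0 essid {essid}"


def build_unsafe(essid: str) -> str:
    """Paste the user-supplied value into the command line as-is."""
    return TEMPLATE.format(essid=essid)


def build_quoted(essid: str) -> str:
    """Quote the value so a POSIX shell treats it as a single argument."""
    return TEMPLATE.format(essid=shlex.quote(essid))


if __name__ == "__main__":
    payload = "x; echo injected"
    print(build_unsafe(payload))  # the ';' would end the first command
    print(build_quoted(payload))  # the payload stays one quoted argument
```

Quoting is only one possible mitigation and I do not know how the vendor fixed the issue. Validating the ESSID or avoiding the shell altogether would address the same problem.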

Exploiting the vulnerability

The following is an example of the vulnerability being exploited:

Updates from Vendor

Due to the public disclosure, TP-Link released a firmware version containing a fix. If you want to stay on the secure side, I recommend you navigate to TP-Link’s site and get the latest firmware update.

IMPORTANT
TP-Link (and I, for that matter) recommend you refrain from using third-party firmware for devices. You should only ever use firmware from the original vendor. Otherwise, there’s no telling what could be on these third-party firmware images.

Final words

I had a ton of fun while conducting the research and picked up quite a few tricks and experiences. I also believe my approach could’ve been automated using Ghidra scripts so that’s one direction to improve towards.

Hope this read was interesting and I recommend you check back in a few weeks as I’ll have more content available related to vulnerability research.

~ r4bbit

Vulnerability disclosure timeline

  • May 7, 2023 - First contact with TP-Link
  • May 9, 2023 - Response from TP-Link, agreement to share details
  • May 10, 2023 - Vulnerability details shared with TP-Link via secure channel
  • July 12, 2023 - TP-Link confirms the findings, no fix due to the device being EOL
  • July 26, 2023 - Contact with MITRE about a potential CVE
  • Aug 10, 2023 - TP-Link initiates a fixed firmware release
  • Aug 15, 2023 - Coordinated public disclosure