Wow, time does fly. More than a year ago, on October 23rd, 2008, Nintendo finally released an update that fixed the strncmp (fakesigning) exploit in all forks of IOS. This disabled any direct methods to install unofficial content on all updated Wii consoles. At the time, version beta9 of The Homebrew Channel had been in the making for a while, so we decided to take the opportunity to use one of our stockpiled IOS exploits to work around the update and release beta9. These exploits differ from fakesigning in that they directly exploit the IOS runtime, injecting code that lets us take control and disable signatures altogether. Therefore, this was our first released IOS code execution exploit. HBC beta9 was released and worked great on all Wiis, as always.
In order to hinder Nintendo’s attempts at fixing it, and to avoid misuse by warez kiddies, sven and I had a lot of fun obfuscating the exploit over a couple afternoons. We decided not to release information about it, hoping it would last long enough to be useful for future installers and BootMii. Later we kind of forgot about this, but on a few occasions people have asked us to document it, and we proposed a challenge: we would document the exploit as soon as someone “broke” our obfuscation and figured out how the exploit works. The intent was to promote reverse engineering and also see just how long it would take people to crack it. Apparently, either people weren’t very interested or we did a pretty good obfuscation job, because it took pretty long :)
Well, I’m happy to say that today I received an e-mail from an anonymous hacker who successfully reverse engineered our layers of obfuscation. They discovered the inner workings of the STM Release Exploit, as I will be calling it, and did so after three weekends of reverse engineering. Hats off to you, and thank you for taking the challenge!
This bug was discovered by accident, and in fact it is a real honest-to-goodness software bug that is not only exploitable, but a nuisance during regular use. To understand it, you need to understand how STM works.
STM is the IOS module in charge of random hardware functions such as handling the fan, “idle” (WC24) mode, the front slot LED (including the blink patterns), and the buttons. I have no clue what STM means, but I’ve seen it called “State-TM” somewhere on the Wii. One of the main functions of STM is to provide a way for PowerPC software to get notifications when either the Reset or the Power buttons are pressed. It’s worth noting that I have no clue why they did this (the PowerPC already knows about Reset via the legacy GameCube interface, and can be given direct access to Power including IRQ via the shared GPIO system, and IOS doesn’t use these buttons at all), but they did. It works like this: STM creates two devices, an “immediate” device, and an “event” device. The immediate device is used to issue commands to STM that take effect immediately, while the event device is the callback mechanism. The PowerPC code issues an `IOS_IoctlAsync()` call on the “event” device, and this call blocks (asynchronously) until there is an event (such as a button press). When this happens, the call returns with the event code, and the PowerPC code reissues it to listen for further events.
One problem with this approach is that the PowerPC needs a way to shut down the event callback. The IOS IPC mechanism doesn’t provide a way for the PowerPC to cancel an ongoing request; it must wait until its completion. When PowerPC code needs to hand off execution, it needs to clean up all references and file descriptors to IOS, so it needs a way to get rid of the event call. STM implements this by having a call on the immediate interface that forces the event call to return with a zero event code. So far so good. If you’re interested, check out stm.c on libogc (particularly the functions with `EventHook` in the name).
In order to better understand the mechanism, it’s worth looking at the individual messages as they are exchanged with IOS. Here’s what it might look like:
| PowerPC | IOS |
|---|---|
| *Initializing STM* | |
| open(path="/dev/stm/immediate") | |
| | open() fd = 1 |
| open(path="/dev/stm/eventhook") | |
| | open() fd = 2 |
| ioctl(fd=2, num=EVENTHOOK, evbuf=0x12345600) | |
| *Time passes, user presses button* | |
| | Write event code to 0x12345600 |
| | ioctl(fd=2) result = 0 |
| Read event code from 0x12345600 | |
| ioctl(fd=2, num=EVENTHOOK, evbuf=0x12345600) | |
| *Time passes, software decides to shut down STM* | |
| ioctl(fd=1, num=RELEASE) | |
| | Write 0 event code to 0x12345600 |
| | ioctl(fd=2) result = 0 |
| | ioctl(fd=1) result = 0 |
| close(2) | |
| | close(2) result = 0 |
| close(1) | |
| | close(1) result = 0 |
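The exchange above can be modelled on a host machine as a tiny state machine. This is only a sketch of the protocol as described here, not real IOS or libogc code: `stm_model`, the `stm_*` functions and the `STM_EBUSY` value are hypothetical names made up for illustration, and the only number taken from the post is the -6 returned when there is nothing to release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model of the message exchange; not real STM code. */
enum { STM_OK = 0, STM_EBUSY = -1 /* made-up value */, STM_ENOHOOK = -6 };

struct stm_model {
    uint32_t *pending_evbuf;  /* output buffer of the parked eventhook ioctl, or NULL */
};

/* ioctl(fd=2, EVENTHOOK): park the request. Only one may be outstanding. */
int stm_register_hook(struct stm_model *stm, uint32_t *evbuf)
{
    assert(stm != NULL && evbuf != NULL);
    if (stm->pending_evbuf != NULL)
        return STM_EBUSY;
    stm->pending_evbuf = evbuf;
    return STM_OK;
}

/* Complete the parked request with an event code. Returns 1 if one was parked. */
static int stm_complete_hook(struct stm_model *stm, uint32_t code)
{
    if (stm->pending_evbuf == NULL)
        return 0;
    *stm->pending_evbuf = code;
    stm->pending_evbuf = NULL;  /* the PowerPC has to re-issue the ioctl */
    return 1;
}

/* Button press: the parked ioctl returns with the event code. */
int stm_button_event(struct stm_model *stm, uint32_t code)
{
    return stm_complete_hook(stm, code);
}

/* ioctl(fd=1, RELEASE): force the parked ioctl to return with event code 0. */
int stm_release_hook(struct stm_model *stm)
{
    return stm_complete_hook(stm, 0) ? STM_OK : STM_ENOHOOK;
}
```

Registering, firing an event, re-registering and then releasing walks through the same steps as the table. A second release with nothing parked is the interesting case.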
Now, when I was reverse engineering STM, I noticed that things didn’t work well when using the Twilight Hack. This is because Zelda’s STM eventhook is still active, and STM won’t let you register a new one. So I added an STM eventhook release to the Twilight Hack code. One slight issue is that we can’t know if there was an old eventhook or not, depending on what the state of the machine was (since the Twilight Hack can be relaunched from software, as an SD loader of sorts, and this was popular in the early days), so we just make it attempt to release the eventhook always. This is fine, as the release function will return an error if there is no eventhook active.
Then IOS started crashing sometimes.
Looking closely at the release function in STM, here’s what I found:
    release_eventhook
        MOV     R12, SP
        STMFD   SP!, {R4-R6,R11,R12,LR,PC}
        LDR     R4, =hook_msg
        MOV     R6, R0
        SUB     R11, R12, #4
        LDR     R0, =aRelease
        BL      printf_disabled
        LDR     R3, [R4]
        MOV     R5, #0
        CMP     R3, R5
        MOVL    R1, -6
        MOV     R0, R6
        BEQ     loc_20300C04
    loc_20300BD8
        STR     R5, [R4]
        MOV     R0, R3
        LDR     R3, [R3,#0x18]
        MOV     R1, R5
        STR     R5, [R3]
        BL      AckMessage
        MOV     R0, R6
        MOV     R1, R5
        BL      AckMessage
        LDMFD   SP, {R4-R6,R11,SP,LR}
        BX      LR
    loc_20300C04
        BL      AckMessage
        LDR     R3, [R4]
        B       loc_20300BD8
This translates to the following C code:
    struct ios_message {
        // this isn't exactly right on the IOS side, but it doesn't matter here
        u32 command;        // 0x00 = 6 for ioctl
        s32 result;         // 0x04
        s32 fd;             // 0x08
        // arguments for ioctl
        u32 ioctl_number;   // 0x0c
        void *buffer_in;    // 0x10
        u32 in_size;        // 0x14
        void *buffer_out;   // 0x18
        u32 out_size;       // 0x1c
    };

    struct ios_message *hook_msg;

    void release_eventhook(ios_message *imm_msg)
    {
        struct ios_message *the_hook_msg = hook_msg;
        printf_disabled("Release\n");
        if (!the_hook_msg) {
            AckMessage(imm_msg, -6);
        }
        hook_msg = NULL;
        *(u32*)the_hook_msg->buffer_out = 0;
        AckMessage(the_hook_msg, 0);
        AckMessage(imm_msg, 0);
    }
Notice anything wrong? They forgot a `return;` statement right at the end of the `if(!the_hook_msg)` block! This means that if there is no callback registered, it will try to ack the immediate message twice (which does nothing), it will try to ack a NULL message (which the kernel catches and does nothing), but most importantly, it will dereference a NULL structure, get a pointer from it, and write 0 to the address pointed to by that pointer. In other words, that line of code becomes `**(u32**)0x18 = 0;`, as 0x18 is the offset of `buffer_out` inside the structure. And 0x18 is an address in low MEM1 that we completely control from the PowerPC. Whoops.
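To see the fall-through in action without a Wii, here is a small host-side model. It is a sketch under loud assumptions: "memory" is just a byte array with addresses as offsets into it (so address 0 is readable, as low MEM1 is for STM), and `release_model`, `read32`, `write32` and the `with_return_fix` switch are hypothetical names, not anything from IOS. The returned value models the first ack only, since the post says the second ack of the same message does nothing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: addresses are offsets into this array. */
#define MEM_SIZE          0x100u
#define BUFFER_OUT_OFFSET 0x18u   /* offset of buffer_out in struct ios_message */

static uint8_t mem[MEM_SIZE];

static uint32_t read32(uint32_t addr)
{
    uint32_t v;
    assert(addr <= MEM_SIZE - sizeof v);
    memcpy(&v, &mem[addr], sizeof v);
    return v;
}

static void write32(uint32_t addr, uint32_t v)
{
    assert(addr <= MEM_SIZE - sizeof v);
    memcpy(&mem[addr], &v, sizeof v);
}

/* Mirrors the control flow of release_eventhook. *hook_msg is the address of
 * the parked message, 0 meaning "none". Returns the result of the first ack
 * sent to the caller of RELEASE. */
int release_model(uint32_t *hook_msg, int with_return_fix)
{
    uint32_t the_hook_msg = *hook_msg;
    int result = 0;

    if (the_hook_msg == 0) {
        result = -6;
        if (with_return_fix)
            return result;        /* the statement STM was missing */
    }
    *hook_msg = 0;
    /* With the_hook_msg == 0 this loads the pointer stored at address 0x18
     * and writes 0 through it. */
    write32(read32(the_hook_msg + BUFFER_OUT_OFFSET), 0);
    return result;
}
```

With the fix enabled, a release with no hook leaves memory alone. Without it, whatever address was planted at 0x18 gets zeroed, even though the caller still just sees -6.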
In the Twilight Hack, this location usually contained some odd value, which caused IOS to crash with an unaligned access exception. We added a workaround in a later release of the Twilight Hack so IOS will no longer crash. It looks like this:
    // STM bug workaround
    // On attempt to release callback when it's already released
    // or when it has fired and auto-released, STM dereferences a
    // member of a NULL IPC structure and then tries to write 0
    // to outbuf. End result, STM tries to:
    // **((u32**)0x18) = 0;
    // so we set 0x18 here to a valid address (0x14) to prevent
    // a crash.
    *((u32*)0x80000018) = 0x00000014;
    sync_after_write((void*)0x80000014, 8);
    printf("Releasing STM callback...");
    /* ... */
The comment was removed from the Twilight Hack public source code release ( :) ), but the code is still there.
I chose IOS34 for the exploit because it is not used for homebrew or any games that I own (so I can patch it for debugging with impunity), and it shares the same STM binary with IOS35, which is mostly what I’ve been reverse engineering. The exploit is quite simple: we find the address of the stack location that contains the return address for the function (LR), and write it to 0x18. Then we release the STM callback twice. The first release gets rid of any real eventhook; the second time around there is none, so STM zeroes out the return address and the function returns to execute code at address 0. We place our own code there, and clean up afterwards by jumping to the real return location, so STM keeps on running happily.
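The return address trick can be modelled the same way. This is a hypothetical host-side sketch, not the actual exploit: `SAVED_LR_SLOT` and `REAL_RETURN` are made-up addresses inside a fake memory array, and `release_without_hook` only imitates the prologue, the stray write and the epilogue of the second release call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: addresses are offsets into this array. */
#define MEM_SIZE      0x200u
#define SAVED_LR_SLOT 0x1F0u  /* made-up stack address holding the saved LR */
#define REAL_RETURN   0x120u  /* made-up address where the caller would resume */

static uint8_t mem[MEM_SIZE];

static uint32_t read32(uint32_t addr)
{
    uint32_t v;
    assert(addr <= MEM_SIZE - sizeof v);
    memcpy(&v, &mem[addr], sizeof v);
    return v;
}

static void write32(uint32_t addr, uint32_t v)
{
    assert(addr <= MEM_SIZE - sizeof v);
    memcpy(&mem[addr], &v, sizeof v);
}

/* Models a release call made while no eventhook is registered and returns
 * the address at which execution continues afterwards. */
uint32_t release_without_hook(uint32_t word_at_0x18)
{
    write32(0x18, word_at_0x18);          /* PowerPC-controlled low memory */
    write32(SAVED_LR_SLOT, REAL_RETURN);  /* prologue saves LR on the stack */
    write32(read32(0x18), 0);             /* *(NULL->buffer_out) = 0 */
    return read32(SAVED_LR_SLOT);         /* epilogue reloads LR and branches */
}
```

Pointing 0x18 at some harmless scratch word lets the call return normally. Pointing it at the saved LR slot makes the function "return" to address 0, which is where the payload would be waiting.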
But wait, we need to somehow break into the kernel to disable the signature check. How can we do that? Well, it turns out that Nintendo left behind some useful IOS syscalls. They look like this:
    wtf1
        MOVS    R3, #3
        STR     R3, [R0]
        MOVS    R3, #0
        STRH    R3, [R1]
        BX      LR
    wtf2
        MOVS    R3, #1
        STR     R3, [R0]
        MOVS    R3, #0
        STRH    R3, [R1]
        BX      LR
Which translates to:
    void wtf1(u32 *a, u16 *b)
    {
        *a = 3;
        *b = 0;
    }

    void wtf2(u32 *a, u16 *b)
    {
        *a = 1;
        *b = 0;
    }
These functions appear to be used as configuration for certain global settings, such as whether IOS is monolithic or modular, so they just return constant values by dereferencing their arguments. In any case, there are no permission checks and these calls happily write to any address that you want, with full kernel permissions. We just pass along an address inside the signature check function that we want patched out, and we win.
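For contrast, here is roughly what a checked version of such a call could look like. This is a sketch and not how IOS does (or later did) it: `struct region`, `region_contains` and `wtf1_checked` are hypothetical, and a real kernel would take the caller's writable range from its own bookkeeping rather than from an argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the memory a calling module may write to. */
struct region {
    uintptr_t base;
    size_t    size;
};

/* Returns 1 if [ptr, ptr + len) lies entirely inside r. Written so that the
 * arithmetic cannot wrap around. */
static int region_contains(const struct region *r, const void *ptr, size_t len)
{
    uintptr_t p = (uintptr_t)ptr;

    assert(r != NULL);
    if (len > r->size || p < r->base)
        return 0;
    return p - r->base <= r->size - len;
}

/* Same two stores as wtf1, but only after both destinations are validated. */
int wtf1_checked(const struct region *caller, uint32_t *a, uint16_t *b)
{
    if (!region_contains(caller, a, sizeof *a) ||
        !region_contains(caller, b, sizeof *b))
        return -1;  /* made-up error code */
    *a = 3;
    *b = 0;
    return 0;
}
```

A pointer into the caller's own data goes through, while a pointer to anything else (say, an instruction inside the signature check) is rejected before a single store happens.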
Now, this exploit isn’t just caused by the small bug in STM; it’s also a consequence of poor security in IOS in general:
- IOS should unmap the zero page and cause NULL dereferences to abort.
- IOS should NEVER allow or use execute permission for memory controlled by the PowerPC (!!!).
- IOS system calls should be code-reviewed and checked for validation of arguments, as they are critical to security.
- Nintendo needs to backport security fixes to all IOSes. They had found the syscall bug and fixed it in newer IOS forks, but this is useless without backporting it to all older IOSes. In fact, changes like that draw attention to the bugs.
- Which of course brings us to the fact that having dozens of forks of security-critical software is a maintenance nightmare and a really really bad idea.
Unfortunately, given later exploits and Nintendo’s changes to IOS, it seems they can’t be bothered to do any of the above. They fixed the STM bug and backported the syscall fix from other IOSes, but there are others with similar bugs.
Hope you had fun reading this latest episode of How To Pwn a Nintendo :)
22 responses so far ↓
1 Artik // Jan 27, 2010 at 4:50 pm
This is how you properly explain an exploit. “Egohot” should take a hint.
2 SquidMan // Jan 27, 2010 at 4:57 pm
Very nice writeup marcan. I always love reading these exploit writeups, they’re so informative :)
Also, First!
3 HyperHacker // Jan 27, 2010 at 5:40 pm
Nice. Agreed with SquidMan (except the “first” bit :p), these are a lot of fun to read.
I didn’t see this challenge posted though. Did you announce it here?
As soon as I saw “Write event code to 0x12345600” I knew this exploit was going to involve tricking IOS into writing to an arbitrary address. This method seems a little roundabout though. I guess you can’t just pass the address you want zero’d as evbuf, but why not use the NULL dereference to do the patch directly, instead of gaining kernel access to exploit a second bug to do it?
4 bushing // Jan 27, 2010 at 6:20 pm
@HyperHacker: code running in the STM module can’t overwrite code of other modules (unless you can jump into kernel mode).
5 funkamatic // Jan 27, 2010 at 8:57 pm
hey guys, cool post!
I have an important question you could hopefully answer. We’ve been waiting for dsibrew for a while, is there any way you could please give us any info?? I’m sure people would really like to know if things aren’t working out. I can understand if you guys want to keep it a secret, though, in case you’re getting close to a breakthrough.
6 pjs // Jan 27, 2010 at 10:06 pm
I will venture a guess that STM stands for state transition machine.
7 henke37 // Jan 28, 2010 at 12:54 am
So, the same disclosure rules applies for the next vulnerability I assume?
8 Sven // Jan 28, 2010 at 2:46 am
henke: yes, good luck with the Hackmii installer though :)
9 DCX2 // Jan 28, 2010 at 9:14 am
What a beautiful story. A missing “return;” in user-code allows a null pointer to be dereferenced and an offset to be zeroed, which allows the stack to be smashed so that the next return takes us to address zero and the software just so happens to control that address, which allows a subroutine to be written that makes syscalls that can be used to overwrite arbitrary memory addresses, which are then used to patch out an instruction in the signature check function.
..but why can’t we just call wtf1 or wtf2 ourselves? Why do we need to branch to address 0?
10 Sven // Jan 28, 2010 at 10:29 am
This is all IOS code. You cannot call IOS syscalls from the PowerPC so you first need some way to run some code in usermode on Starlet.
11 DCX2 // Jan 28, 2010 at 11:13 am
So the PowerPC can easily inject an ARM assembly subroutine to address 0 and that subroutine makes the IOS syscalls that patch the signature check, but then we need to trick Starlet into branching to the injected code. I guess the PPC can’t write directly into Starlet’s stack, so instead it calls into the STM, and the STM *can* write to the stack, so we over-write the return address and make Starlet jump to the injected code?
12 Sven // Jan 29, 2010 at 3:22 am
exactly.
13 HenshinMijin // Jan 29, 2010 at 9:25 pm
Splendid..!
Though I hate when your articles end.
They are such a pleasure to read!
~fsKDβ’
14 jpx92681 // Jan 31, 2010 at 10:57 pm
I have really found this reading absolutely pleasant. It is the kind of thing that we hope for from our generation. It would be great if someone went that far in terms of reverse engineering with the xbox 360, but it seems no one in the scene has such a level.
15 KingLewy // Jan 31, 2010 at 11:22 pm
Wow. Fascinating. Half of that went right over my head, but an interesting read nonetheless. Thanks again.
16 Sven // Feb 1, 2010 at 4:03 am
jpx92681: sorry, but you clearly do not know what you are talking about. tmbinc or mist for example are surely able to do what we are doing. the xbox360’s security system is just pretty sophisticated but they still managed to pull off the JTAG hack. We did not even have to resort to hardware hacks so far because IOS is full of bugs…
17 jpx92681 // Feb 1, 2010 at 8:42 am
hehe thanks Sven, jtag workaround is great no devalue at all, I don’t want to go so far with the details but just an example, there is no way to install homebrew if you have kernel 8955… (think about it..), not even using a hardware hack. That says a lot don’t you think?.
18 Sven // Feb 1, 2010 at 2:53 pm
yes, it says that microsoft knows a lot more about security than nintendo does. I know some of the xbox 360 hackers and I can definitely tell you that they are in no way inferior to us. We couldn’t do anything for the xbox 360 at all if we were working on it.
19 Chris // Feb 2, 2010 at 4:25 am
Very nice explanation… and a smart way to hack into IOS .
Now, I’m wondering, why using those “wtf” functions in your ARM routine ? Can’t you just directly patch IOS code in RAM since you are already running as IOS (something similar to self-modifying code) ?
20 Sven // Feb 2, 2010 at 9:42 am
Chris: no, you are only running in _usermode_ there. The STM module is neither allowed to patch its own code nor able to even read the code of the kernel and/or the ES module (which we want to modify because the sign check is in there).
21 hcs // Feb 5, 2010 at 4:04 pm
Heh, very cool. I’m just a little confused about one bit of terminology:
“this call blocks (asynchronously)”
It sounded like the ioctl blocks, not returning until the response from the IOS, so I’m not sure what the “asynchronously” refers to. Is the blocking ioctl done on a different thread than the one that posts a callback function, so it looks asynchronous to the caller?
22 marcan // Feb 7, 2010 at 10:01 am
The PowerPC uses the IoctlAsync call, which returns to the caller before getting a reply from IOS. The callback is called from interrupt context once the reply does arrive. To IOS async and sync calls look the same; the only difference is the way they are handled in the PPC.