Wednesday, November 3, 2010
Friday, September 10, 2010
AMAZING new-technology, 30 years old!
So, I just got the Tektronix 2236 oscilloscope. Within 30 minutes, I had a pretty decent understanding of how to get what I'm looking for (at least for the Sprint AIRAVE project). That's mostly thanks to the amazing tutorials available online; I'm really very grateful for all of the advice that people have posted. I'm able to trace circuits, and will hopefully find the JTAG lines off this chip soon. Once I find those, soldering in the actual pins to attach a JTAG emulator will be tricky. I might have to ask one of the technicians at work to do some quick "freelance" soldering work.
I'm also pretty amazed at a lot of the malware out there now. I started a debug session on some of the 0-day Adobe stuff that's floating around the internet (you can check out the Sep. 8th Metasploit blog post for more info) and was pretty impressed with all of the guards and techniques it uses. I actually wasn't able to get it to run at all. I tried the !hidedebug All_... commands right after starting the process in Immunity Debugger, but I must not understand them well enough. I'll have to read up more on that when I get some time. Otherwise, I'm stuck trying to decipher the asm.
Work has asked me to scope out the effort required to port Valgrind to the Cavium Octeon family of processors. I think it will take quite a bit of time, since there's a lot of architecture configuration going on. Perhaps not, though. I've gone through some of the other architectures already built into Valgrind (x86-linux, amd64-linux, ppc-linux), and it looks like mostly "grunt" work; oh well, better than not getting paid.
I'm pumped about this Tektronix unit though. I'll be playing with it some more later tonight. I'm brewing tomorrow, so I won't get very far on the AIRAVE decode, but.. meh. I'm not in a super huge rush.
Wednesday, September 8, 2010
Hacking (again)
So, I published an exploit for the nginx 0.6.38 and earlier heap corruption vulnerability. Apparently, some people were impressed enough with it that I got some really cool offers by mail (and some which seemed not so legitimate). I'm not able to act on the offers at this time (and believe me when I say, such a decision is freakin' tough); maybe someday in the future though.
I've gotta get back to hacking at sfuzz some more, but I can't seem to get myself excited about it. Or rather, there's a lot of mundane functionality that needs to be written (and rewritten) to get it to a state where I can add the cool stuff. Hopefully Ricky-Lee and Vasu will be able to help me get more excited about it. 0.6.3 should drop before the end of the year.
I just purchased a Tektronix 2236 multi-function scope, as well as an ARM DSO nano. Obviously, the Tektronix is for a bench unit, and the ARM DSO is for a portable hack-toy (and for $50 US, you can't go wrong).
I've been kind of itching to do some hardware hacking again. This time, I've cracked open a Samsung AIRAVE. The main ASIC/FPGA combo chip is labeled SBM1320, but it's really an Altera HC210. I have the data sheet on it, and maybe I'll post some pictures of the "guts" of the thing. Since I used to work on the Airvana femto (the AirHub or HubBub or whatever) I have some familiarity with the OTA stuff that's going on. Taking apart the shielded radio brought back some memories. I've found a number of I2C taps, and with the HC210 specs I should be able to locate the JTAG ports. I've got the Quantum II SDK, and hopefully I'll find all the assembler routines and write a small disassembler. From there, it'll be a long and arduous journey to complete control of the system. I haven't seen a TPM, but then again, I'm not an expert hardware hacker. Anyone who also has one of these and wants to void their warranty, let me know.
Oh - and beer. I'm thinking about getting a brew going in October for the December timeframe. Probably a stout. And probably a "Rock your face off" stout. Something dark, heavy, and warm for the winter.
Tuesday, December 1, 2009
Stack Canaries Part 3 - Sing birdie, sing!
So, time to look closer at the GNU Stack Canary generation. A brief bit of scrounging around with the GCC sources reveals that the stack protection is actually governed by the libssp code. Looking at the internals, we find that the following file contains some juicy bits:
http://gcc.gnu.org/viewcvs/trunk/libssp/ssp.c?view=log
Go ahead, look at it. Latest revision as of this writing is 146000. It may be updated by the time you look. Anyway, the important parts are reproduced below:
static void __attribute__ ((constructor))
__guard_setup (void)
{
  unsigned char *p;
  int fd;

  if (__stack_chk_guard != 0)
    return;

  fd = open ("/dev/urandom", O_RDONLY);
  if (fd != -1)
    {
      ssize_t size = read (fd, &__stack_chk_guard,
                           sizeof (__stack_chk_guard));
      close (fd);
      if (size == sizeof(__stack_chk_guard) && __stack_chk_guard != 0)
        return;
    }

  /* If a random generator can't be used, the protector switches the guard
     to the "terminator canary". */
  p = (unsigned char *) &__stack_chk_guard;
  p[sizeof(__stack_chk_guard)-1] = 255;
  p[sizeof(__stack_chk_guard)-2] = '\n';
  p[0] = 0;
}
So, the stack guard is set up from the output of /dev/urandom, and falls back to a fixed "terminator canary" if that read fails.
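To make the fallback path easier to poke at, here is a minimal sketch of the same logic as a standalone function. It is not the libssp code: init_guard and its path parameter are hypothetical names I made up so the random source can be swapped for a file that doesn't exist, and unlike libssp it zeroes the guard before building the terminator value so the result is deterministic.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper (not part of libssp): return a guard value read from
   `path`, or the terminator canary if the read cannot be completed. */
uintptr_t init_guard(const char *path)
{
    uintptr_t guard = 0;
    int fd = open(path, O_RDONLY);

    if (fd != -1) {
        ssize_t n = read(fd, &guard, sizeof guard);
        close(fd);
        if (n == (ssize_t)sizeof guard && guard != 0)
            return guard;               /* random guard obtained */
    }

    /* Fallback: same byte assignments as libssp's terminator canary.
       Assumption of this sketch: start from zero so a partial read
       cannot leak into the result. */
    guard = 0;
    unsigned char *p = (unsigned char *)&guard;
    p[sizeof guard - 1] = 255;
    p[sizeof guard - 2] = '\n';
    p[0] = 0;
    return guard;
}
```

Calling init_guard("/dev/urandom") should normally give a different value on every call, while a path that cannot be opened always yields the terminator pattern (0xFF0A0000 on 32-bit little-endian, ff0a000000000000 on 64-bit).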
Can we verify this? Of course! Let's just turn off read access to /dev/urandom and see if we get a static assignment:
[11:13:17][aconole@ssh:~]
$ ./sp32_t 1
gs:0x14[2786491977 / A6167E49]
gs:0x14[2786491977 / A6167E49]
gs:0x14[2786491977 / A6167E49]
gs:0x14[2786491977 / A6167E49]
gs:0x14[2786491977 / A6167E49]
gs:0x14[2786491977 / A6167E49]
[11:13:20][aconole@ssh:~]
$ sudo chmod go-r /dev/urandom
root's password:
[11:13:36][aconole@ssh:~]
$ ./sp32_t 1
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
[11:13:39][aconole@ssh:~]
$ ./sp32_t 1
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
[11:13:40][aconole@ssh:~]
$ ./sp32_t 1
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
gs:0x14[4278845440 / FF0A0000]
Voila! Disabling read access means that the canary value is static. Why is this important? It's possible that a system would be misconfigured and /dev/urandom read access would be disabled. If that's the case, then our canary is well known, and we can _possibly_ bypass the canary check.
NOTE: I say _possibly_ here. If we can't get 0xFF0A0000 into the stack (since \x00 is null terminator, \x0a is usually considered the line terminator) then we're hosed. However, if we _CAN_ inject that data, it's a clean way of bypass.
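To see why the terminator canary is awkward to inject, it helps to look at its bytes in memory order. The sketch below is only an illustration; count_terminator_bytes and strcpy_reach are hypothetical helper names, and the byte arrays in the usage note assume a little-endian x86 layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the bytes that string-oriented input routines tend to stop on:
   NUL (0x00) and newline (0x0a). */
size_t count_terminator_bytes(const void *canary, size_t len)
{
    const unsigned char *p = canary;
    size_t hits = 0;

    for (size_t i = 0; i < len; i++)
        if (p[i] == 0x00 || p[i] == 0x0a)
            hits++;
    return hits;
}

/* Number of leading canary bytes a strcpy-style copy could deliver
   before it stops at the first NUL. */
size_t strcpy_reach(const void *canary, size_t len)
{
    const unsigned char *p = canary;
    const unsigned char *nul = memchr(p, 0x00, len);

    return nul ? (size_t)(nul - p) : len;
}
```

On little-endian x86, 0xFF0A0000 is stored as the bytes 00 00 0a ff, so strcpy_reach reports 0: a strcpy-style overflow stops before writing a single canary byte. A length-based copy such as memcpy has no such limit, which is why the test case here uses memcpy.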
Let's set up a new test case to verify our stack overflow.
[11:33:06][aconole@ssh:~]
$ ./sp32_b 20 asdfasdfasdfasdfasdf 16
asdfasdfasdfasdf
*** stack smashing detected ***: ./sp32_b terminated
======= Backtrace: =========
/lib/libc.so.6(__fortify_fail+0x48)[0xf7635da8]
/lib/libc.so.6(__fortify_fail+0x0)[0xf7635d60]
./sp32_b[0x804869c]
./sp32_b[0x80488ab]
/lib/libc.so.6(__libc_start_main+0xe5)[0xf7565705]
./sp32_b[0x80485a1]
======= Memory map: ========
08048000-08049000 r-xp 00000000 08:11 7602732 /home/aconole/sp32_b
08049000-0804a000 r--p 00000000 08:11 7602732 /home/aconole/sp32_b
0804a000-0804b000 rw-p 00001000 08:11 7602732 /home/aconole/sp32_b
0804b000-0806c000 rw-p 0804b000 00:00 0 [heap]
f754e000-f754f000 rw-p f754e000 00:00 0
f754f000-f76a4000 r-xp 00000000 08:11 6381831 /lib/libc-2.9.so
f76a4000-f76a5000 ---p 00155000 08:11 6381831 /lib/libc-2.9.so
f76a5000-f76a7000 r--p 00155000 08:11 6381831 /lib/libc-2.9.so
f76a7000-f76a8000 rw-p 00157000 08:11 6381831 /lib/libc-2.9.so
f76a8000-f76ab000 rw-p f76a8000 00:00 0
f76ab000-f76c1000 r-xp 00000000 08:11 6381715 /lib/libpthread-2.9.so
f76c1000-f76c2000 r--p 00015000 08:11 6381715 /lib/libpthread-2.9.so
f76c2000-f76c3000 rw-p 00016000 08:11 6381715 /lib/libpthread-2.9.so
f76c3000-f76c5000 rw-p f76c3000 00:00 0
f76e5000-f76f2000 r-xp 00000000 08:11 6381883 /lib/libgcc_s.so.1
f76f2000-f76f3000 r--p 0000c000 08:11 6381883 /lib/libgcc_s.so.1
f76f3000-f76f4000 rw-p 0000d000 08:11 6381883 /lib/libgcc_s.so.1
f76f4000-f76f6000 rw-p f76f4000 00:00 0
f76f6000-f7714000 r-xp 00000000 08:11 6381973 /lib/ld-2.9.so
f7714000-f7715000 r--p 0001d000 08:11 6381973 /lib/ld-2.9.so
f7715000-f7716000 rw-p 0001e000 08:11 6381973 /lib/ld-2.9.so
ffe72000-ffe87000 rw-p 7ffffffea000 00:00 0 [stack]
ffffe000-fffff000 r-xp ffffe000 00:00 0 [vdso]
Aborted
[11:33:07][aconole@ssh:~]
$ sudo chmod go-r /dev/urandom
[11:33:16][aconole@ssh:~]
$ ./sp32_b 20 asdfasdfasdfasdfasdf 16
asdfasdfasdfasdf
[11:33:18][aconole@ssh:~]
$
Awesome! We have proven that it _IS_ possible to bypass the stack canary under severely contrived conditions. Those conditions are as follows:
1 - we use a memcpy
2 - we disable read access to /dev/urandom BEFORE the program is launched
3 - we have direct local access to the system
4 - it's 32-bits
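The idea behind the bypass can be sketched without touching a real stack. The struct below is only a stand-in for a 32-bit frame with a 16-byte buffer directly below the canary; sim_frame and overflow_keeps_canary are hypothetical names, and the real sp32_b rig may lay things out differently.

```c
#include <stdint.h>
#include <string.h>

/* Simulated frame: a 16-byte local buffer with the canary right after it. */
struct sim_frame {
    unsigned char buf[16];
    uint32_t canary;
};

/* Copy `len` bytes of input over the simulated frame, memcpy-style, and
   report whether the canary still matches the guard afterwards
   (1 = check passes, 0 = "stack smashing detected"). */
int overflow_keeps_canary(const unsigned char *input, size_t len, uint32_t guard)
{
    struct sim_frame f;

    memset(f.buf, 0, sizeof f.buf);
    f.canary = guard;                 /* what the prologue would store */

    if (len > sizeof f)
        len = sizeof f;               /* stay inside the simulated frame */
    memcpy(&f, input, len);           /* the overflowing copy */

    return f.canary == guard;         /* what the epilogue would check */
}
```

With a guard of 0xFF0A0000, a 20-byte payload of plain 'a' characters fails the check, while a payload whose last four bytes are the guard's own bytes passes even though it ran past the buffer.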
We haven't tested 64-bits yet, so let's check that out now.
[11:36:04][aconole@ssh:~]
$ ./sp64_b 2
fs:0x28[18377501229438730240 / ff0a000000000000]
AHA! It's the same deal - just make sure our offset is correct, and we should be able to successfully exploit it.
[11:42:46][aconole@ssh:~]
$ ./sp64_b 26 asdfasdfasdfasdfasdfasdfasdfasdfasdfasdf 24
asdfasdfasdfasdfasdfasdf
[11:42:48][aconole@ssh:~]
$ sudo chmod go+r /dev/urandom
[11:42:57][aconole@ssh:~]
$ ./sp64_b 26 asdfasdfasdfasdfasdfasdfasdfasdfasdfasdf 24
asdfasdfasdfasdfasdfasdf
*** stack smashing detected ***: ./sp64_b terminated
Voila! So, we can now contrive a situation where stack canary bypass works.
Next week - playing with the linker. As a preview:
/tmp/ccUOKFpY.o: In function `reassign_sc_32':
/home/aconole/stack-protect.c:44: undefined reference to `__stack_chk_guard'
/tmp/ccUOKFpY.o: In function `reassign_sc_64':
/home/aconole/stack-protect.c:58: undefined reference to `__stack_chk_guard'
collect2: ld returned 1 exit status
Wednesday, November 25, 2009
Stack Canaries - Part 2 (64-bit, bypass techniques, etc.)
So, back for more pain, I decided to try and model the canary generation without looking at either the kernel code which assigns the initial value to the registers or the libc code which might do some population. I want to treat this feature as a black box, at least initially.
For a little more in-depth view at what happens when we turn on stack protection (-fstack-protector-all), let's look at the values of the gs:0x14 memory location. We'll see if the assumption that it changes with each function call is correct (and if so, then we have a real conundrum on our hands involving memory in general). In order to measure this, the following C code can be slipped into a recursive function which just prints the value of gs:0x14.
For 64-bit code:
    unsigned long int a = 0;
    if(sizeof(unsigned int) != sizeof(unsigned long))
        asm ("movq %%fs:0x28, %0\n":"=r"(a));
And for 32-bit code:
    unsigned int a = 0;
    if(sizeof(unsigned int) == sizeof(unsigned long))
        asm ("movl %%gs:0x14, %0\n":"=r"(a));
These little snippets populate 'a' with the stack canary value. Note: 64-bit uses fs:0x28 instead of gs:0x14.
Running this reveals the following neat information:
[02:20:12][aconole@ssh:~]
$ ./sp32 4
gs:0x14[3829412976 / E4403470]
gs:0x14[3829412976 / E4403470]
gs:0x14[3829412976 / E4403470]
gs:0x14[3829412976 / E4403470]
gs:0x14[3829412976 / E4403470]
This tells us that the canary is global to the context in which the functions are running. Is this the same across threads? A simple change to the rig I used to print the values for 3 different threads reveals that yes - this canary is global to the process:
[02:25:47][aconole@ssh:~]
$ ./sp32_t 1
gs:0x14[1475995573 / 57F9E7B5]
gs:0x14[1475995573 / 57F9E7B5]
gs:0x14[1475995573 / 57F9E7B5]
gs:0x14[1475995573 / 57F9E7B5]
gs:0x14[1475995573 / 57F9E7B5]
gs:0x14[1475995573 / 57F9E7B5]
The same can be seen for 64-bit code:
[02:27:03][aconole@ssh:~]
$ ./sp64 4
fs:0x28[15354006399877715714 / d5146378bfc91702]
fs:0x28[15354006399877715714 / d5146378bfc91702]
fs:0x28[15354006399877715714 / d5146378bfc91702]
fs:0x28[15354006399877715714 / d5146378bfc91702]
fs:0x28[15354006399877715714 / d5146378bfc91702]
And threaded:
[02:27:31][aconole@ssh:~]
$ ./sp64_t 1
fs:0x28[15000647472177854910 / d02d01662c05b9be]
fs:0x28[15000647472177854910 / d02d01662c05b9be]
fs:0x28[15000647472177854910 / d02d01662c05b9be]
fs:0x28[15000647472177854910 / d02d01662c05b9be]
fs:0x28[15000647472177854910 / d02d01662c05b9be]
fs:0x28[15000647472177854910 / d02d01662c05b9be]
So, now we know some important things about the canary for 64-bit and 32-bit.
#1 - If we can obtain the value while the process is running, we can overwrite the stack without tripping the check, as long as we write the same value back.
#2 - We know from where this value is obtained.
#3 - We know how to retrieve this value in a fancy rig.
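The two read snippets can be folded into one helper that picks the right segment register at compile time. This is a sketch under assumptions: read_canary is a hypothetical name, and the fs:0x28 / gs:0x14 offsets are the ones observed in this post on x86 Linux with glibc. On any other platform the function just returns 0 rather than guessing.

```c
#include <stdint.h>

/* Return the current thread's stack canary on x86 Linux, or 0 elsewhere. */
uintptr_t read_canary(void)
{
    uintptr_t v = 0;

#if defined(__linux__) && defined(__x86_64__)
    /* 64-bit: canary lives at fs:0x28 */
    __asm__ volatile ("movq %%fs:0x28, %0" : "=r"(v));
#elif defined(__linux__) && defined(__i386__)
    /* 32-bit: canary lives at gs:0x14 */
    __asm__ volatile ("movl %%gs:0x14, %0" : "=r"(v));
#endif
    return v;
}
```

Using preprocessor checks instead of the sizeof comparison keeps the movq instruction out of 32-bit builds entirely, where the assembler would otherwise have to accept it even though it is never executed.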
The next question, before we jump headfirst into probabilities and statistics for the segment register offset value is: Can we generically modify this global canary?
Here are two generic routines to do so:
void reassign_sc_32()
{
    unsigned int a = 0;
    if(sizeof(unsigned int) == sizeof(unsigned long))
        asm ("movl %0, %%gs:0x14\n"::"r"(a));
}
void reassign_sc_64()
{
    unsigned long int a = 0;
    if(sizeof(unsigned int) != sizeof(unsigned long))
        asm ("movq %0, %%fs:0x28\n"::"r"(a));
}
We can simply call: reassign_sc_64(); reassign_sc_32(); and the code should modify the canary value to 0 on the correct platform. A test reveals:
[02:32:30][aconole@ssh:~]
$ ./sp32_c_t 1
gs:0x14[0 / 0]
gs:0x14[0 / 0]
gs:0x14[0 / 0]
gs:0x14[0 / 0]
gs:0x14[0 / 0]
gs:0x14[0 / 0]
64-bit also yields the same results:
[02:34:05][aconole@ssh:~]
$ ./sp64_c_t 1
fs:0x28[0 / 0]
fs:0x28[0 / 0]
fs:0x28[0 / 0]
fs:0x28[0 / 0]
fs:0x28[0 / 0]
fs:0x28[0 / 0]
So, we can modify the canary value - OUTSTANDING! Let's turn on the stack protector and see it in action:
[02:35:21][aconole@ssh:~]
$ ./sp32_c_t 1
*** stack smashing detected ***: ./sp32_c_t terminated
Aborted
Hrrm. Not quite what I had expected. Looks like we're going to have to delve into the internals of the stack protector after all, which is something I was hoping to avoid.
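One plausible explanation, before digging into the internals: with the protector enabled, the function that rewrites the guard is itself instrumented. Its prologue saved the old guard on the stack, so by the time its epilogue runs, the saved copy no longer matches. The sketch below simulates that sequence with an ordinary global standing in for the gs:0x14 slot; sim_guard, protected_call and the body functions are all hypothetical names, and this models the check only, not the real compiler output.

```c
#include <stdint.h>

/* Stand-in for the thread's guard slot (gs:0x14 / fs:0x28). */
static uintptr_t sim_guard = 0x57F9E7B5u;

/* Simulate an instrumented function: save the guard on entry, run the
   body, compare on exit. Returns 1 if the epilogue check would pass,
   0 if __stack_chk_fail would be called. */
int protected_call(void (*body)(void))
{
    uintptr_t saved = sim_guard;        /* prologue: copy guard to the frame */
    body();
    return (saved ^ sim_guard) == 0;    /* epilogue: xor against the guard */
}

/* A body that leaves the guard alone. */
void body_noop(void) {}

/* A body that does what reassign_sc_32/64 do: zero the guard. */
void body_zero_guard(void) { sim_guard = 0; }
```

Under this model, changing the guard from inside any protected frame trips the check for that frame and for every caller still on the stack, since each of them saved the old value.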
The code for stack-protector.c below:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

/* Recurse num_left times, printing the 64-bit canary (fs:0x28) in each frame. */
int stack_prot_64(int num_left)
{
    unsigned long int a = 0;
#ifdef __x86_64__   /* movq does not assemble in a 32-bit build */
    if(sizeof(unsigned int) != sizeof(unsigned long))
        asm ("movq %%fs:0x28, %0\n":"=r"(a));
#endif
    if(!num_left)
    {
        return printf("fs:0x28[%lu / %lx]\n", a, a);
    }
    stack_prot_64(num_left-1);
    return printf("fs:0x28[%lu / %lx]\n", a, a);
}

/* Recurse num_left times, printing the 32-bit canary (gs:0x14) in each frame. */
int stack_prot_32(int num_left)
{
    unsigned int a = 0;
    asm ("movl %%gs:0x14, %0\n":"=r"(a));
    if(!num_left)
    {
        return printf("gs:0x14[%u / %X]\n", a, a);
    }
    stack_prot_32(num_left-1);
    return printf("gs:0x14[%u / %X]\n", a, a);
}

/* Overwrite the 32-bit canary with 0. */
void reassign_sc_32()
{
    unsigned int a = 0;
    asm ("movl %0, %%gs:0x14\n"::"r"(a));
}

/* Overwrite the 64-bit canary with 0. */
void reassign_sc_64()
{
    unsigned long int a = 0;
#ifdef __x86_64__
    if(sizeof(unsigned int) != sizeof(unsigned long))
        asm ("movq %0, %%fs:0x28\n"::"r"(a));
#endif
}

/* Thread entry points; build with -DGAPING_SECURITY_HOLE to zero the canary first. */
void *run32(void *a)
{
#ifdef GAPING_SECURITY_HOLE
    reassign_sc_32();
#endif
    stack_prot_32(atoi(a));
    return NULL;
}

void *run64(void *a)
{
#ifdef GAPING_SECURITY_HOLE
    reassign_sc_64();
#endif
    stack_prot_64(atoi(a));
    return NULL;
}

int main(int argc, char *argv[])
{
    pthread_t t1,t2,t3;
    if(argc <= 1)
        return printf("error: give a number of functions.\n");
    /* Pick the 32-bit or 64-bit rig based on the size of long. */
    if(sizeof(unsigned int) == sizeof(unsigned long))
    {
        pthread_create(&t1, NULL, run32, argv[1]);
        pthread_create(&t2, NULL, run32, argv[1]);
        pthread_create(&t3, NULL, run32, argv[1]);
    } else
    {
        pthread_create(&t1, NULL, run64, argv[1]);
        pthread_create(&t2, NULL, run64, argv[1]);
        pthread_create(&t3, NULL, run64, argv[1]);
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_join(t3, NULL);
    return 0;
}
-Aaron
Tuesday, November 17, 2009
Stack Canaries - Part 1
So, for those of you who haven't paid much attention, starting...oh...3-4 years ago the GNU compiler folded in a stack guarding implementation to protect against the classic buffer overflow attack. What that means is that when you call a function in C, the compiled code tries to protect your stack by storing a canary value (taken, I think, from gs:0x14) on the stack during the function prologue, and then comparing it just before returning. An example is below:
0x080484e4: push %ebp
0x080484e5: mov %esp,%ebp
0x080484e7: sub $0x38,%esp
0x080484ea: mov 0x8(%ebp),%eax
0x080484ed: mov %eax,-0x24(%ebp)
0x080484f0: mov %gs:0x14,%eax
0x080484f6: mov %eax,-0x4(%ebp)
0x080484f9: xor %eax,%eax
0x080484fb: movl $0x0,-0x14(%ebp)
0x08048502: movl $0x0,-0x10(%ebp)
0x08048509: movl $0x0,-0xc(%ebp)
0x08048510: movl $0x0,-0x8(%ebp)
The above does the standard prologue: save the caller's frame pointer and make space on the stack. It then copies the canary from gs:0x14 to ebp-0x4 and zeros out the local buffer (vbuf[16] = {0}; see addrs 0x080484fb -> 0x08048510). So, what we see is that the canary sits directly above the local buffer, between it and the saved frame pointer and return address. If, say, we were to try and copy more than 16 bytes into vbuf, it would overwrite the canary value starting at ebp-4. The check comes at the very end of the function:
0x08048543: mov -0x4(%ebp),%eax
0x08048546: xor %gs:0x14,%eax
0x0804854d: je 0x8048554
0x0804854f: call 0x8048418 <__stack_chk_fail@plt>
0x08048554: leave
0x08048555: ret
So, the function cleanup routine first puts the saved canary value into register eax, then does an exclusive or of gs:0x14 with eax. If the values match, the exclusive or gives 0, and the je branch is taken, allowing execution to resume. However, if they differ, execution is transferred to the __stack_chk_fail@plt stub. This stub looks as follows:
0x08048418 <__stack_chk_fail@plt+0>: jmp *0x804a014
0x0804841e <__stack_chk_fail@plt+6>: push $0x28
0x08048423 <__stack_chk_fail@plt+11>: jmp 0x80483b8 <_init+48>
The very first thing that happens is we jmp through the global offset table and start executing __stack_chk_fail, which should print out some memory map information, as well as a debug message that stack corruption was detected, etc. The C code for the test program is as follows:
#include <stdio.h>
#include <string.h>

void foo(char *buf)
{
    char vbuf[16] = {0};
    memcpy(vbuf, buf, strlen(buf)); /* no bounds check: overflows past 16 bytes */
    printf(vbuf); /* might as well be really vulnerable :) */
}

int main(int argc, char *argv[])
{
    printf("%s: %s\n", argv[0], argv[1]);
    foo(argv[1]);
    printf("\n%s: %s\n", argv[0], argv[1]);
    return 0;
}
And execution with some different length arguments yields the following results:
[12:07:11][aconole@ssh:~]
$ gcc -fstack-protector -m32 -o bodud bodud.c
bodud.c: In function ‘foo’:
bodud.c:7: warning: incompatible implicit declaration of built-in function ‘memcpy’
bodud.c:7: warning: incompatible implicit declaration of built-in function ‘strlen’
[12:07:23][aconole@ssh:~]
$ ./bodud asdf
./bodud: asdf
asdf
./bodud: asdf
[12:07:25][aconole@ssh:~]
$ ./bodud asdfasdf
./bodud: asdfasdf
asdfasdf
./bodud: asdfasdf
[12:07:28][aconole@ssh:~]
$ ./bodud asdfasdfasdf
./bodud: asdfasdfasdf
asdfasdfasdf
./bodud: asdfasdfasdf
[12:07:29][aconole@ssh:~]
$ ./bodud asdfasdfasdfasdf
./bodud: asdfasdfasdfasdf
asdfasdfasdfasdföuóc
ÿ ¡e
ÿe
ÿ¡e
ÿù 0c
ÿôÿv÷c
؍bֈ 0
c
ÿçb÷
./bodud: asdfasdfasdfasdf
[12:07:32][aconole@ssh:~]
$ ./bodud asdfasdfasdfasdfasdf
./bodud: asdfasdfasdfasdfasdf
*** stack smashing detected ***: ./bodud terminated
======= Backtrace: =========
/lib/libc.so.6(__fortify_fail+0x48)[0xf7729da8]
/lib/libc.so.6(__fortify_fail+0x0)[0xf7729d60]
./bodud[0x8048554]
./bodud[0x804859b]
/lib/libc.so.6(__libc_start_main+0xe5)[0xf7659705]
./bodud[0x8048451]
======= Memory map: ========
08048000-08049000 r-xp 00000000 08:11 7602791 /home/aconole/bodud
08049000-0804a000 r--p 00000000 08:11 7602791 /home/aconole/bodud
0804a000-0804b000 rw-p 00001000 08:11 7602791 /home/aconole/bodud
0804b000-0806c000 rw-p 0804b000 00:00 0 [heap]
f7642000-f7643000 rw-p f7642000 00:00 0
f7643000-f7798000 r-xp 00000000 08:11 6381831 /lib/libc-2.9.so
f7798000-f7799000 ---p 00155000 08:11 6381831 /lib/libc-2.9.so
f7799000-f779b000 r--p 00155000 08:11 6381831 /lib/libc-2.9.so
f779b000-f779c000 rw-p 00157000 08:11 6381831 /lib/libc-2.9.so
f779c000-f779f000 rw-p f779c000 00:00 0
f77bf000-f77cc000 r-xp 00000000 08:11 6381883 /lib/libgcc_s.so.1
f77cc000-f77cd000 r--p 0000c000 08:11 6381883 /lib/libgcc_s.so.1
f77cd000-f77ce000 rw-p 0000d000 08:11 6381883 /lib/libgcc_s.so.1
f77ce000-f77d0000 rw-p f77ce000 00:00 0
f77d0000-f77ee000 r-xp 00000000 08:11 6381973 /lib/ld-2.9.so
f77ee000-f77ef000 r--p 0001d000 08:11 6381973 /lib/ld-2.9.so
f77ef000-f77f0000 rw-p 0001e000 08:11 6381973 /lib/ld-2.9.so
ffb6f000-ffb84000 rw-p 7ffffffea000 00:00 0 [stack]
ffffe000-fffff000 r-xp ffffe000 00:00 0 [vdso]
asdfasdfasdfasdfasdfÈ'¸ÿ
[12:07:36][aconole@ssh:~]
$
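So, the overflow behaves as expected. Up to 12 characters, everything fits. At 16 characters, vbuf is full and no longer NUL-terminated, so printf runs off the end of the buffer and dumps whatever follows it on the stack. At 20 characters, the copy overwrites the canary and the stack protector aborts the program. The big question is: can we reassign the stack protector's entry point to yield a different result, at least from a theoretical standpoint?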
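To make the 16-versus-20 byte boundary concrete, here is a rough model of the check in plain C. It is only a sketch: struct frame_model and canary_smashed are hypothetical names invented for illustration, the layout (16-byte buffer immediately followed by a 4-byte canary) is an assumption that matches the runs above, and the real check is emitted by gcc rather than written by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of foo()'s frame: 16-byte buffer, then the canary. */
struct frame_model {
    char     vbuf[16];
    uint32_t canary;
};

/*
 * Copy len bytes of input over the modelled frame the way the unbounded
 * memcpy would, then compare the stored canary with the reference value.
 * Returns 1 if the canary was clobbered, 0 if it survived.
 */
int canary_smashed(const char *input, size_t len, uint32_t reference)
{
    struct frame_model frame;

    /* The model only makes sense if the canary sits right after vbuf. */
    assert(offsetof(struct frame_model, canary) == sizeof frame.vbuf);

    memset(&frame, 0, sizeof frame);
    frame.canary = reference;            /* "prologue": stash the canary */

    if (len > sizeof frame)              /* keep the model itself in bounds */
        len = sizeof frame;
    memcpy(&frame, input, len);          /* the overflow: may spill past vbuf */

    return (frame.canary ^ reference) != 0;  /* "epilogue": xor and test */
}
```

With a reference value of, say, 0xdeadbeef, a 16-byte input leaves the canary alone and a 20-byte input trips the check, unless the last four bytes of the input happen to equal the reference value.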
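Let's try the good ol' brute-force method. We'll set the entry point of __stack_chk_fail@plt to 195 (decimal for 0xc3, the x86 opcode for ret), so that a failed check would simply return. The write is done from main in a modified bodud.c (line 16 in the session below).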
[01:27:32][aconole@ssh:~]
$ gdb ./bodud
GNU gdb (GDB; openSUSE 11.1) 6.8.50.20081120-cvs
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
For bug reporting instructions, please see:
...
(gdb) r asdf
Starting program: /home/aconole/bodud asdf
/home/aconole/bodud: asdf
Program received signal SIGSEGV, Segmentation fault.
0x08048595 in main (argc=2, argv=0xffffd484) at bodud.c:16
16 *f = 195;
(gdb)
As we can see, the write faults with SIGSEGV. The PLT sits in the program's text segment, which is mapped read-only and executable, as the following line from our original memory map confirms:
08048000-08049000 r-xp 00000000 08:11 7602791 /home/aconole/bodud
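The same conclusion can be reached without crashing first, by reading the permissions column of the map. Below is a small sketch of that check; addr_is_writable is a hypothetical helper written for this post's example, and it only understands the leading "start-end perms" fields of a maps line.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one line in /proc/<pid>/maps format and report whether addr falls
 * inside that mapping and whether the mapping is writable.
 * Returns 1 = inside and writable, 0 = inside but not writable,
 * -1 = addr is not in this mapping, or the line did not parse.
 */
int addr_is_writable(const char *maps_line, unsigned long addr)
{
    unsigned long start, end;
    char perms[5];

    assert(maps_line != NULL);

    if (sscanf(maps_line, "%lx-%lx %4s", &start, &end, perms) != 3)
        return -1;
    if (addr < start || addr >= end)
        return -1;
    return perms[1] == 'w';   /* perms looks like "r-xp" or "rw-p" */
}
```

Fed the line above and the address 0x08048418 (where __stack_chk_fail@plt starts, going by the +6 offset at 0x0804841e), it reports the mapping as not writable.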
So, it seems we'll need to find another way: either bypassing the call to __stack_chk_fail, or getting the canary value from gs:0x14 into the data we write onto the stack.
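As an aside, a process can usually change the protection of its own pages with mprotect() before writing to them. This is not something tried above, and it presumes you can already run code inside the process, so treat it purely as a theoretical note. The sketch below demonstrates the mechanism on a scratch page rather than on bodud's text segment; patch_after_mprotect is a hypothetical name, and whether the same trick would be allowed on a real text mapping depends on the system's policy.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map one anonymous read-only page, flip it to read+write with mprotect(),
 * store a single byte at offset 0, and read it back.
 * Returns the byte read back, or -1 if any call fails.
 */
int patch_after_mprotect(unsigned char value)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *p;
    int seen;

    if (page <= 0)
        return -1;

    p = mmap(NULL, (size_t)page, PROT_READ,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* A write to p[0] at this point would fault, like the one in gdb. */
    if (mprotect(p, (size_t)page, PROT_READ | PROT_WRITE) != 0) {
        munmap(p, (size_t)page);
        return -1;
    }

    p[0] = value;             /* now permitted */
    assert(p[0] == value);
    seen = p[0];

    munmap(p, (size_t)page);
    return seen;
}
```

Calling patch_after_mprotect(195) stores the ret opcode in the scratch page and hands it back, which is all the demonstration is meant to show.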
Friday, July 31, 2009
All's well that ends well
So, it looks like just a few tweaks were required to get my jffs2 filesystem working. With this phase of the project wrapped up, I'm now on to a really tricked-out set of things to do:
1) Build a "from-scratch" version of the system which contains only qte/qtopia and konqueror (plus the associated required stuff)
2) Build a version of ruby for the board which strips down to roughly 9 MB or so
3) Strip down metasploit to 15 MB
4) Get a working cheops-esque piece of software on the board
5) General debugging / testing
I'll post on each one as I do it. Expect this project to take months upon months.