An Analysis of Modified VeraCrypt binaries (Part 2)

Continuing the analysis of the fake VeraCrypt Windows installer distributed on httx://vera-crypt[.]com, I am now reverse-engineering the downloaded payloads. Before I can jump to the main functionality of the malware, I have to go through its obfuscation and anti-analysis techniques. This part goes into detail about these techniques, and is aimed at above-beginner reverse-engineers. I am also sharing IDAPython scripts to decrypt encrypted strings. For the real payload analysis, see Part 3.

Part 1 summary: The distributed binary is a modified VeraCrypt installer that installs a modified copy of VeraCrypt which fetches a first stage payload from a remote server, which in turn downloads a bunch of binaries and stores them on disk, some of them encrypted (XORed) with an ID generated by the server.

Second-stage payload

After the first-stage payload drops several files on disk, it runs the [ID].exe image:

Making the [ID].exe file path, the URL, downloading and writing the payload to disk, and executing it

Let’s dig into [ID].exe! Although I was analyzing the 64-bit modified VeraCrypt and the first-stage payload was 64-bit accordingly, this executable is 32-bit.

Obfuscated starting function

Complications arise.

The binary is designed to waste my time.

First, the entry point is a function that places data on the stack, an integer at a time, with hundreds of mov instructions. Sometimes, malware uses this strategy to reconstruct a code section on the stack and then jump to it. But not this time. This is just plain useless.

Lots of mov instructions to slowly put data on the stack

3,820 boring mov instructions and three useless functions later, we have something.

Reconstructed main function of [ID].exe

The waste_my_time functions (naming is mine) were designed to be useless, yet pretend that something normal is happening, to trick static-analysis tools into thinking that this is a legitimate program.

I’m not going to detail what they do, because actually, they are never executed… Indeed, the loop variable is incremented by 2 starting from 0, thus it remains even and therefore never takes the value 143. This is what we call dead code.
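To convince yourself that the branch is dead, here is a minimal sketch of the same loop shape. Only the step of 2 and the compared value 143 come from the disassembly; the loop bound and the function name are placeholders of mine.

```python
def guard_ever_true(limit=1000, step=2, magic=143):
    """Return True if a counter stepping from 0 ever equals `magic`."""
    i = 0
    while i < limit:        # `limit` is a hypothetical bound, not from the binary
        if i == magic:      # the guard in front of the waste_my_time calls
            return True
        i += step
    return False


print(guard_ever_true())    # False: an even counter never equals 143
```

With a step of 1 the guard would fire, which is probably what makes the code look plausible at first glance.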

The real start of this program is located in what I named the real_main function.

Start ok

I should have seen it coming. In previous payloads, the author(s) often made use of the OutputDebugStringA function, likely to help them (lazily) build the program. That’s the equivalent of putting printf("I'm here") everywhere to debug a program by observing the console output. I should have started looking for calls to this function first. Good to know for next time.

The function tries to load the DLL named after the md5 hash of the lowercased username (see Part 1 for how it’s been put to disk), and call the exported function data. If successful, it loads and decrypts the big_log file, and jumps to it somehow.

[ID].exe main payload

For the sake of digging into reverse-engineering details and getting exposed to obfuscation techniques, I will describe all these steps, although we sort of know that the next useful thing to look at is that big_log file.

data()

Let’s load the md5(lower(username)).dll first, and look at the data function.

Part of the data function

The decompiled code is a bit messy, so we will study the assembly directly.

Basically, the function gets the path of the %temp% folder (GetTempPathA) and appends “start3.txt” to it, to be used later. Then it gets information about the available memory through GlobalMemoryStatusEx and checks whether there’s more than 4GB of RAM, in which case it continues to the next check. If there’s an insane amount of memory (like, the higher 4 bytes of the 64-bit value being negative?), the function returns. If there is less than 4GB available, there should be at least 1GB to proceed further; otherwise the function returns as well.

The next check is about trying to figure out whether the program is being emulated or dynamically analyzed in a framework that skips Sleeps. It does so by measuring whether a Sleep has been executed and enough time has elapsed. Otherwise, this indicates the program isn’t running in a normal environment, is likely being analyzed, and therefore the program wants to terminate.
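The idea behind this check can be sketched in a few lines. This is not the malware's code: the timing source, the sleep duration and the tolerance are my assumptions, and `sleep_fn` is only injectable so that a "skipped" sleep can be simulated.

```python
import time


def sleep_was_skipped(sleep_fn=time.sleep, duration=0.05, tolerance=0.5):
    """Return True if sleep_fn(duration) came back much too early."""
    start = time.monotonic()
    sleep_fn(duration)
    elapsed = time.monotonic() - start
    # A sandbox that fast-forwards sleeps leaves almost no elapsed time.
    return elapsed < duration * tolerance


print(sleep_was_skipped())                          # normal machine
print(sleep_was_skipped(sleep_fn=lambda s: None))   # simulated sleep-skipping sandbox
```

An analysis framework that patches Sleep to return immediately behaves like the no-op lambda here, which is what gives it away.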

End of the data function

The function then tries to check for the presence of %temp%\start3.txt, which should not normally exist. If it does, this likely means that an analysis environment is faking the presence of the file to let the program continue to run (allegedly it needs that file, right?).

Finally, the program checks whether there are at least two CPU cores. Single-core/single-thread CPUs are too old to be a realistic environment nowadays.

In conclusion, data is just an anti-analysis function.

Loading, decrypting, overflowing, jumping

After making sure the program isn’t run in an analysis environment, the next payload is executed.

The interesting thing is how the payload is executed.

It is not a full-fledged EXE or DLL; it is just pure code. To be executed, it has to be copied to the stack at the right location and then jumped to. It also has to be PIC (Position-Independent Code), unless the hardcoded addresses are calculated accurately.

big_log is 4,257 bytes. Keep in mind this number.

Decrypting

To “decrypt” big_log the way [ID].exe does, you can simply do (PHP):

php > $id = "89D5ACAA6B4C4765CFD8F8";  //replace with your ID
php > $bl = file_get_contents($id.'\\big_log');
php > file_put_contents($id.'\\big_log.dll', $bl ^ str_repeat($id, ceil(strlen($bl)/strlen($id))));
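If you prefer Python, the same repeating-key XOR could look like the sketch below. The function names are mine; as in the PHP snippet, the key is the ASCII text of the ID (not its hex-decoded bytes), and the file layout is assumed to be [ID]\big_log.

```python
from itertools import cycle
from pathlib import Path


def xor_with_id(data: bytes, key: bytes) -> bytes:
    """XOR data with key, repeating the key as often as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def decrypt_file(src, dst, id_str: str) -> None:
    """Read src, XOR it with the ASCII bytes of id_str, write the result to dst."""
    Path(dst).write_bytes(xor_with_id(Path(src).read_bytes(), id_str.encode("ascii")))


sample_id = "89D5ACAA6B4C4765CFD8F8"  # replace with your ID
blob = xor_with_id(b"\x90" * 8, sample_id.encode("ascii"))
print(xor_with_id(blob, sample_id.encode("ascii")))  # XOR is its own inverse
```

Calling decrypt_file(sample_id + "\\big_log", sample_id + "\\big_log.dll", sample_id) would mirror the PHP one-liner.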

Looking at the decrypted content, I found the end of the file interesting.

End of decrypted big_log

Plenty of NOPs as you would see for padding, then 8 bytes and 0xFF. We’ll get back to that later.

Overflowing

The function, which I called copy_to_stack, allocates 4,244 bytes on the stack, then calls memmove(&var_1094, &biglog, 4257), with var_1094 located at EBP-0x1094, that is, just 0x1094 (4,244) bytes below EBP.

You see what’s going to happen?

Copy the decrypted payload to the stack

Buffer overflow!

The copy of the big_log code (4,257 bytes) to the stack is designed to overflow the allocated buffer (only 4,244 bytes) and overwrite part of the stack. Namely, the overflow is of 4257-4244 = 13 bytes. What is being overwritten?
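Before looking at the debugger, the arithmetic already tells us what gets clobbered. The sketch below assumes the usual 32-bit cdecl frame layout (saved EBP at EBP, return address at EBP+4, arguments from EBP+8); the slot names are mine.

```python
BUF_SIZE = 0x1094    # 4,244 bytes reserved for var_1094
COPY_SIZE = 4257     # size of big_log handed to memmove

# 32-bit slots that follow the buffer, as offsets from EBP.
SLOTS = [
    (0, "saved EBP"),
    (4, "return address"),
    (8, "arg 1 (buffer pointer)"),
    (12, "arg 2 (size)"),
]


def overwritten_slots(buf_size=BUF_SIZE, copy_size=COPY_SIZE):
    """Return the overflow size and how many bytes of each slot it covers."""
    overflow = copy_size - buf_size
    hit = []
    for offset, name in SLOTS:
        covered = max(0, min(4, overflow - offset))
        if covered:
            hit.append((name, covered))
    return overflow, hit


overflow, hit = overwritten_slots()
print(overflow)
for name, covered in hit:
    print("%s: %d byte(s) overwritten" % (name, covered))
```

So the 13 extra bytes cover the saved EBP, the return address, the first argument, and a single byte of the second one.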

In the stack at EBP, you are supposed to find the saved EBP (pushed in the function prologue at .text:004019F0 here) of the previous stack frame, to be restored during the epilogue (.text:00401A16 here).

At EBP+4, you find the return address, which is the address of the next instruction in the previous function, that’s where the EIP will go after the retn at 00401A17.

Then, at EBP+8 and EBP+C, there are the arguments to the function copy_to_stack, which were in this case the address of the buffer that contains big_log, and its size.

I debugged [ID].exe and put a breakpoint on this function. Before and after the memmove, you can see what’s changing around EBP (19EC7C here).

The previous return address (402E69) has changed to 402B39. The saved EBP has been NOP’d and the first and a bit of the second argument have been overwritten.

That means when copy_to_stack finishes, EBP becomes 90909090, and the flow continues at 402B39. And what do we find at this address, which is still located in [ID].exe’s code?

Code at 402B39

That’s a jmp esp.

And what is ESP here? Let’s count: at the beginning of copy_to_stack, after the initial push EBP, we have EBP=ESP, then there are three pushes that are compensated by the add esp, 12. Then, ESP=EBP, and the pop ebp consumes one DWORD, making ESP=EBP+4 (now pointing at the return address). After retn consumes another DWORD, ESP becomes EBP+8, pointing to the last 5 bytes of big_log at 19EC84: E95FEFFFFF.

E95FEFFFFF in turn is the machine code for a relative jmp with displacement FFFFEF5F, the two’s complement of 10A1, making it effectively go backward by 0x10A1 bytes from EIP after the jump. 0x10A1 is actually 4,257, the size of big_log. That means this jump goes back to the beginning of the payload.
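The jump target can be double-checked with a few lines of Python. The EBP value is the one from my debugging session (19EC7C); the helper name is mine, and it only handles the 5-byte E9 rel32 form.

```python
import struct


def decode_jmp_rel32(code: bytes, address: int) -> int:
    """Return the target of a 5-byte `E9 rel32` jmp located at `address`."""
    if len(code) != 5 or code[0] != 0xE9:
        raise ValueError("not an E9 rel32 jmp")
    (disp,) = struct.unpack("<i", code[1:5])   # signed little-endian displacement
    return (address + 5 + disp) & 0xFFFFFFFF   # relative to the next instruction


EBP = 0x19EC7C
jmp_addr = EBP + 8              # where ESP points after the retn
buffer_start = EBP - 0x1094     # address of var_1094

target = decode_jmp_rel32(bytes.fromhex("E95FEFFFFF"), jmp_addr)
print(hex(target), hex(buffer_start))   # both 0x19dbe8
```

The target matches the start of the stack buffer, so the jump indeed lands on the first byte of the decrypted big_log.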

Let’s summarize: [ID].exe loads and decrypts big_log to a buffer, which gets copied onto the stack while overflowing the saved EBP and return address. The flow eventually uses a gadget in [ID].exe’s code to jump to ESP, freshly overwritten with a relative jump back to the beginning of the decrypted big_log payload on the stack.

That’s the convoluted way this piece of malware executes big_log!

Executing the payload

big_log wants to load kernel32.dll and call VirtualAlloc from it, but does not want any mention of “kernel32.dll” in its code. What can it do instead?

Answer: Iterate over loaded DLL names, calculate a checksum on the name, and compare it to a hardcoded value! That’s what I understood after debugging the program for a while.

Finding kernel32.dll in big_log, breakpoint set on the comparison with a hardcoded checksum

The checksum algorithm is trivial, and can be translated into PHP as follows:

<?php
// implements the ror (rotate) instruction over a dword
function ror($data, $bits) {
	$tmp = str_pad(decbin($data), 32, "0", STR_PAD_LEFT);
	return bindec(substr($tmp, -$bits).substr($tmp, 0, -$bits));
}

function checksum($name) {
	// convert name to Unicode
	$name = mb_convert_encoding(strtoupper($name), 'UTF-16LE', 'UTF-8');

	$state = 0;
	for($i=0;$i<strlen($name);$i++) {
		$state = ror($state, 0xD);
		$state += ord($name[$i]);
	}
	return dechex($state);
}

echo checksum("KERNEL32.DLL");  //6a4abc5b

When “KERNEL32.DLL” gets hashed, the digest is 6a4abc5b, which matches the comparison. Then, each exported function of this DLL is iterated over to find a match with “VirtualAlloc”.
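For reference, here is the same rotate-right-by-13 checksum as a Python sketch. It hashes the UTF-16LE bytes of the uppercased name, like the PHP version; the explicit 32-bit masking is my addition, to mimic what a dword register does on overflow.

```python
def ror32(value: int, bits: int) -> int:
    """Rotate a 32-bit value right by `bits`."""
    value &= 0xFFFFFFFF
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF


def checksum(name: str) -> int:
    """Rotate-and-add checksum over the UTF-16LE bytes of the uppercased name."""
    state = 0
    for byte in name.upper().encode("utf-16-le"):
        state = (ror32(state, 0xD) + byte) & 0xFFFFFFFF
    return state


print(format(checksum("kernel32.dll"), "08x"))  # 6a4abc5b
```

Feeding it the names of the loaded modules is enough to find out which DLL a hardcoded checksum refers to.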

Finding VirtualAlloc in kernel32.dll exports by iterating on all exports

The address of VirtualAlloc is then calculated.

Calculating the address of VirtualAlloc

Then, the function is called as VirtualAlloc(NULL, 0xA68, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE).

Calling VirtualAlloc

Moving forward, the rest of big_log is copied to the newly allocated memory then executed. This is done by pointing to the location on the stack right after the jmp 19DCD0 at 19DD29, and copying 3,424 bytes until before the padding of NOPs we noticed before. The absolute offset in the decrypted big_log file is 324.

Copy the rest of big_log and call it

Second-stage big_log

After a very useful OutputDebugStringA("HelloShell"), the following code tries to locate the address of required functions from kernel32.dll, shell32.dll and user32.dll, by using similar techniques as we have covered before. In particular, it locates:

  • GetProcAddress
  • VirtualAlloc
  • LoadLibraryA
  • GetProcessHeap
  • HeapAlloc
  • HeapReAlloc
  • HeapFree
  • CreateFileA
  • GetFileSizeE
  • ReadFile
  • CloseHandle
  • lstrcatA
  • SHGetFolderPathA
  • wsprintfA

The next steps are as follows, as I reconstructed them after debugging the code:

VirtualAlloc(NULL, 260, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
SHGetFolderPathA(NULL, CSIDL_APPDATA, NULL, 0, &path);
lstrcatA(path, "id.txt");
h = CreateFileA(path, 1, FILE_SHARE_DELETE|FILE_SHARE_READ|FILE_SHARE_WRITE, NULL, OPEN_EXISTING /* 3 */, FILE_ATTRIBUTE_NORMAL, 0);
GetFileSize(h, &size);
VirtualAlloc(NULL, size, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
ReadFile(h, &f);
CloseHandle(h);
wsprintfA(&id, "%s", f);
VirtualAlloc(0, 258, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
SHGetFolderPathA(NULL, CSIDL_APPDATA, NULL, 0, &path2);
lstrcatA(path2, "\\");
lstrcatA(path2, id);
lstrcatA(path2, "\\");
lstrcatA(path2, "data");
OutputDebugStringA(path2);
h2 = CreateFileA(path2, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING /* 3 */, FILE_ATTRIBUTE_NORMAL, NULL);
GetFileSize(h2, &size2);
VirtualAlloc(NULL, size2, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
ReadFile(h2, &data);
CloseHandle(h2);

This basically gets the ID from id.txt, then loads [ID]\data.

data is then decrypted in memory by XORing it with id.

Next, the code is reminiscent of the PE loader found in the fake VeraCrypt binary (see Part 1). That is, making sure the DLL is a valid PE file, preparing the Import Address Table, copying the code section into a newly allocated region then calling the start function of data.

In summary, big_log simply loads and passes control over to the encrypted data DLL (hereafter named data.dll to differentiate it from the data function exported from md5(lower(username)).dll).

data.dll: String obfuscation

OutputDebugStringA("Start In Exe Nu");

After this greeting, a function is called that deals with decrypting a bunch of strings. The calls look like this:

dword_40A6E8 = (int)sub_40585A((int)"85&1DO<-!*<7-R<.3-&|R*U", "IBCC06IDNKOSI4FMIUER1E8", 23);
dword_40A198 = (int)sub_40585A((int)&unk_4060A0, "9S2L", 4);
dword_40A4D0 = (int)sub_40585A((int)&unk_4060B4, "XQI90QYP", 8);
dword_40A6E4 = (int)sub_40585A((int)&unk_4060D0, "9F3L7XE6ZK5N6", 13);
dword_40A4C4 = (int)sub_40585A((int)"b\a2!%,16\r\"9]\b", "9WSSQBTDCCT8U", 13);

The function sub_40585A simply XORs the first two arguments, the third one being the length. For the first instance, the result is: "85&1DO<-!*<7-R<.3-&|R*U" ^ "IBCC06IDNKOSI4FMIUER1E8", which gives “qwertyuioasddfzczxc.com”.

Oh, a domain name! It doesn’t seem to exist anymore, but apparently it was also used in another similar piece of malware in 2018 to exfiltrate data with POST requests to /p1.php. A previously known IP resolved from the domain is 176.114.6.101, located in Ukraine.
Spoiler for Part 3: this other piece of malware is from the same family!

The other string decryptions with “unk_” variables simply correspond to non-printable strings. For instance, unk_4060A0 is “\x17\x30\x5D\x21”. Once XORed with “9S2L”, it gives “.com”.
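Outside of IDA, these decryptions are easy to reproduce. The sketch below mimics what sub_40585A appears to do (a byte-wise XOR of two buffers over a given length); the Python function name is mine.

```python
def xor_decrypt(blob: bytes, key: bytes, length: int) -> str:
    """XOR the first `length` bytes of blob and key, like sub_40585A seems to."""
    out = bytes(b ^ k for b, k in zip(blob[:length], key[:length]))
    return out.decode("latin-1")


print(xor_decrypt(b"85&1DO<-!*<7-R<.3-&|R*U", b"IBCC06IDNKOSI4FMIUER1E8", 23))
print(xor_decrypt(b"\x17\x30\x5D\x21", b"9S2L", 4))
```

This prints qwertyuioasddfzczxc.com and .com, matching the values above.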

Sequential string decryption

I see some “CryptEncrypt” and “CryptHashData” names, which means we may have fun later on with some crypto! Unless the code just prepares the strings for all exported functions and in the end does not use all of them 😦

Automatic name decryption with IDAPython

Instead of debugging and renaming the variables manually in IDA, I decided to try to make my first IDAPython script. I followed this tutorial on automatically decrypting strings in a specific malware sample, then instead of adding a comment, I renamed the variable to the decrypted string.

The code goes as follows. It first gets all the Xrefs to the decryption function (located at 0x40585A), then identifies the last three pushes as the two strings to XOR and their length. I found cases where the length was pushed with a push 123, pop ebx, push ebx, so in case I find a push e** in place of the length, I instead calculate the string length, hoping it captures everything (there might be null bytes in some strings).

After XORing the two identified strings, the script identifies the next mov dword_ABCDEF, eax and renames the variable to the decrypted string.

# Written against the legacy (pre-7.4) IDAPython API names.
import idc
from idc import GetMnem, GetOpnd, GetOperandValue, GetString, Byte, MakeNameEx
from idautils import XrefsTo

def find_previous_push(addr):
  # walk backwards to the closest push instruction
  while True:
    addr = idc.PrevHead(addr)
    if GetMnem(addr) == "push":
      # an operand containing "e" is treated as a register (eax, ebx, ...)
      if "e" in GetOpnd(addr, 0):
        return [addr, -1]
      return [addr, GetOperandValue(addr, 0)]

def find_next_mov(addr):
  # walk forwards to the closest "mov X, eax"
  while True:
    addr = idc.NextHead(addr)
    if GetMnem(addr) == "mov" and "eax" in GetOpnd(addr, 1):
      return [addr, GetOperandValue(addr, 0)]

def decryptAtAddress(addr):
  # get last push (first string)
  [addr, arg1] = find_previous_push(addr)
  # get second to last push (second string)
  [addr, arg2] = find_previous_push(addr)
  # get third to last push (length)
  [addr, length] = find_previous_push(addr)

  # could not identify length (e.g., push ebx instead of push 10)
  if length < 0:
    s1 = GetString(arg1, -1)
    s2 = GetString(arg2, -1)
    ls1 = 0 if s1 is None else len(s1)
    ls2 = 0 if s2 is None else len(s2)
    length = max(ls1, ls2)

  # sanity check
  length = min(length, 500)

  out = ""
  # actually XOR the strings
  for i in range(0, length):
    out += chr(Byte(arg1 + i) ^ Byte(arg2 + i))

  return out

def renameNextVar(addr, name):
  # get next mov X, eax
  [addr, arg] = find_next_mov(addr)

  if MakeNameEx(arg, name, idc.SN_NOWARN | idc.SN_NOCHECK):
    print("Renamed 0x%x to %s" % (arg, name))
  else:
    print("FAILED to rename 0x%x to %s" % (arg, name))

# the helpers must be defined before this loop runs
for x in XrefsTo(0x40585A, flags=0):
  ref = x.frm
  dec = decryptAtAddress(ref)
  print("Ref Addr: 0x%x | Decrypted: %s" % (x.frm, dec))
  renameNextVar(ref, dec)

And this is the result:

After automatically renaming variables after their decrypted content

Beautiful! There are 350+ strings like this, so the script was quite useful.

Afterwards, strings that correspond to a library function name are loaded through GetProcAddress. To rename the corresponding variables that designate the function pointer, I also made a lazy script adapted from the pattern I’ve seen in the code: I noticed that there is a push for the string, then five instructions later, the function pointer is moved to a variable. So my script simply identifies the string being pushed and renames the variable 5 instructions later…

addr = 0x40331F
while addr < 0x403B67:
  if GetMnem(addr) == "push" and "[ebp+hModule]" not in GetOpnd(addr, 0):
    name = "_%s" % GetOpnd(addr, 0)
    print(name)
  else:
    addr = idc.NextHead(addr)
    continue
  # the function pointer is stored five instructions after the push
  nextMov = addr
  for _ in range(5):
    nextMov = idc.NextHead(nextMov)
  if GetMnem(nextMov) == "mov" and "eax" in GetOpnd(nextMov, 1):
    MakeNameEx(GetOperandValue(nextMov, 0), name, idc.SN_NOWARN | idc.SN_NOCHECK)
    print("Renamed 0x%x to %s" % (GetOperandValue(nextMov, 0), name))
  addr = idc.NextHead(addr)

This gives me:

After automatically renaming function pointer variables

The script doesn’t work perfectly, but the remaining variables can be corrected manually.

data.dll: Anti-analysis checks

After decrypting many strings, data.dll checks for the presence of an emulator or a virtual machine in a number of ways.

First, the Windows username is checked against known keywords such as “sandbox”, “virus”, “malware”, “nod”, “kas”, “av”, “kis”, “STRAZNJICA.GRUBUTT”, “esset”.
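As an illustration, such a check could be as simple as the sketch below. The keyword list is the one found in the binary; whether the comparison is a case-insensitive substring match is an assumption of mine, and the function name is hypothetical.

```python
KEYWORDS = ["sandbox", "virus", "malware", "nod", "kas", "av", "kis",
            "STRAZNJICA.GRUBUTT", "esset"]


def looks_like_sandbox_user(username: str, keywords=KEYWORDS) -> bool:
    """Assumed logic: case-insensitive substring match against each keyword."""
    name = username.lower()
    return any(keyword.lower() in name for keyword in keywords)


for user in ("sandbox-01", "alice", "Dave"):
    print(user, looks_like_sandbox_user(user))
```

Under that assumption, short keywords like “av” would also flag perfectly innocent usernames such as “Dave”.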

Checking for known malware sandbox Windows username

Other checks are reminiscent of our previous analysis in Part 1: the size of the memory is assessed (here, the malware stops if there is less than 2.33GiB of RAM), Sleep functions are verified to be executed and not skipped, the number of CPU cores is verified (at least two cores are needed), and a nonexistent file must not suddenly exist.

In addition, two new checks look for artifacts of VMware and VirtualBox by probing the registry for a known key, and by looking for known driver files.

Simple check for the presence of VMware Tools from the registry
Checking for VirtualBox from known driver files

Overall, the anti-analysis checks run as follows.

Overall anti-analysis checks

In the next part of this write-up, I will describe the main payload of data.dll. Stay tuned!

Update: Part 3 is available and deals with analyzing the main likely payloads.

3 thoughts on “An Analysis of Modified VeraCrypt binaries (Part 2)”

  1. >The checksum algorithm is trivial
    Iterating through modules and hashing looks similar to zeus malware (check getKernel32Handle in core.cpp)
