Malicious Python Applications: Creation, Examples, Analysis, and Detection Methods

For the past 30 years, the vast majority of serious malware has been written in assembler or compiled languages such as C, C++, and Delphi. Over the last decade, however, malware development has become more diverse, with an increasing number of threats written in interpreted languages like Python. A low entry barrier, ease of use, rapid development, and a vast library ecosystem have made Python attractive to programmers in general, and malware authors are no exception. Python is now a popular tool for creating trojans, exploits, information stealers, and more. As Python’s popularity continues to grow while the entry barrier for C-based malware remains high, Python is likely to be used in a growing share of cyberattacks.

Figure 1: Popularity trends of major programming languages over the last decade

Changing Times

Compared to standard compiled languages (like C), writing malware in Python presents several challenges. First, Python must be installed on the target operating system to interpret and execute the code. However, as will be shown, Python applications can be easily converted into standalone executables using various methods.

Second, Python-based malware is usually larger, consumes more memory, and requires more computing resources. Serious malware found in the wild is often small, stealthy, and light on resources. For example, a compiled C sample might be about 200 KB, while a comparable Python sample converted to an executable could be around 20 MB. The bundled interpreter accounts for most of that difference on disk, and running the code through an interpreter also costs more CPU and RAM than native code.

By 2020, however, these drawbacks mattered far less. Internet connections had become faster, computers had more RAM and larger hard drives, and processors were more powerful, so a few extra megabytes rarely stood out. As of 2020, Python also came pre-installed on macOS and most Linux distributions.

No Interpreter? No Problem!

Microsoft Windows remains the primary target for most malware attacks, but Python is not installed by default. To spread malware more effectively, Python scripts must be converted into executables. There are many ways to “compile” Python. Let’s look at the most popular tools.

PyInstaller

PyInstaller converts Python scripts into standalone executables for Windows, Linux, and macOS by “freezing” the code. This is one of the most popular methods for creating executables and is widely used for both legitimate and malicious purposes.

For example, let’s create a simple “Hello, world!” program and convert it to an executable using PyInstaller:

$ cat hello.py
print('Hello, world!')

$ pyinstaller --onefile hello.py
...

$ ./dist/hello
Hello, world!

$ file dist/hello
dist/hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=..., stripped

$ du -sh dist/hello
7.0M    dist/hello

Notice the size difference: 7 MB for the frozen Python script versus roughly 20 KB for an equivalent C program. This is a major drawback in terms of file size and memory usage. The Python executable is much larger because it must carry the interpreter with it (as a shared object on Linux) in order to run.
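Because the interpreter travels inside the bundle, a frozen script can tell at run time that it is not running under a normal Python installation. The sketch below relies on the `sys.frozen` and `sys._MEIPASS` attributes that PyInstaller documents for its bootloader; the helper name `bundle_info` is hypothetical, and other freezers may set different attributes.

```python
import sys


def bundle_info() -> dict:
    """Report whether this process looks like a frozen (bundled) application."""
    # PyInstaller's bootloader sets sys.frozen; a plain interpreter does not.
    frozen = bool(getattr(sys, "frozen", False))
    return {
        "frozen": frozen,
        # In a PyInstaller bundle, sys._MEIPASS is the directory holding the
        # unpacked files (a temporary directory in --onefile mode).
        "bundle_dir": getattr(sys, "_MEIPASS", None) if frozen else None,
        # Path of the launching binary: the interpreter, or the frozen executable.
        "executable": sys.executable,
    }


print(bundle_info())
```

Run under an ordinary interpreter, the `frozen` entry is false and `bundle_dir` is `None`. Analysts sometimes see the same check inside samples that behave differently depending on whether they were launched from a bundle.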

Py2exe

Py2exe is another popular method for converting Python code into standalone EXE files. Like PyInstaller, it bundles the interpreter to create a portable executable. However, at the time of writing, py2exe did not support Python versions after 3.4, reportedly because of significant changes to CPython bytecode in 3.6 and above, so it risked becoming obsolete.

Py2exe uses the distutils package and requires a small setup.py file. Here’s an example:

> type hello.py
print('Hello, world!')

> type setup.py
import py2exe
from distutils.core import setup
setup(
    console=['hello.py'],
    options={'py2exe': {'bundle_files': 1, 'compressed': True}},
    zipfile=None
)

> python setup.py py2exe
...

> dist\hello.exe
Hello, world!

The file size is similar to PyInstaller (about 6.83 MB).

Figure 2: File size of an executable created with py2exe

Nuitka

Nuitka is perhaps the most underrated yet advanced method for converting Python code into executables. It first translates Python code into C, then links it with the libpython library to execute code just like CPython. Nuitka supports various C compilers, including gcc, clang, MinGW64, Visual Studio 2019+, and clang-cl.

Let’s compile a simple “Hello, world!” program with Nuitka:

$ cat hello.py
print('Hello, world!')

$ nuitka3 hello.py
...

$ ./hello.bin
Hello, world!

$ file hello.bin
hello.bin: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=..., for GNU/Linux 3.2.0, stripped

$ du -sh hello.bin
432K    hello.bin

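This time, we get a binary of only 432 KB, much smaller than with PyInstaller or py2exe. The comparison is not quite like for like, though: by default, Nuitka produces a binary that still relies on the Python installation on the machine (the file output above shows that it is dynamically linked), whereas a self-contained build requires Nuitka’s --standalone option and is considerably larger.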

The result is impressive, and Nuitka is likely to continue developing as a “Python compiler.” For example, future features may include anti-reverse engineering protection. There are already tools that can easily analyze binaries created with PyInstaller and py2exe to recover the original Python code. If the executable is created with Nuitka and the code is converted to C, reverse engineering becomes much more difficult.

Other Useful Tools

The vast open-source Python package ecosystem is a huge advantage for malware authors. Almost any functionality can be found online, so malware authors rarely need to write complex features from scratch. Here are three categories of simple but powerful utilities:

  • Code obfuscation
  • Screenshot creation
  • Web requests

Code Obfuscation

Malware authors can choose from many obfuscation libraries that make code hard to read. Examples include pyminifier and pyarmor.

Example using pyarmor:

$ cat hello.py
print('Hello, world!')

$ pyarmor obfuscate hello.py
...

$ cat dist/hello.py
from pytransform import pyarmor_runtime
pyarmor_runtime()
__pyarmor__(...)

$ python dist/hello.py
Hello, world!
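For defenders, the wrapper that pyarmor leaves behind is itself a useful indicator. The following is a minimal sketch of a heuristic check, assuming the obfuscated script keeps the `pytransform` import and the `__pyarmor__` call shown in the output above; other pyarmor versions may emit different names, and the function name `looks_pyarmor_wrapped` is hypothetical.

```python
import ast


def looks_pyarmor_wrapped(source: str) -> bool:
    """Heuristic: True if the source imports pytransform and calls __pyarmor__."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False  # not parseable as Python, so nothing to conclude

    imports_runtime = False
    calls_loader = False
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and node.module == "pytransform":
            imports_runtime = True
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id == "__pyarmor__":
                calls_loader = True
    return imports_runtime and calls_loader


# The wrapper text mirrors the dist/hello.py output shown above, elision included.
wrapped = (
    "from pytransform import pyarmor_runtime\n"
    "pyarmor_runtime()\n"
    "__pyarmor__(...)\n"
)
print(looks_pyarmor_wrapped(wrapped))
print(looks_pyarmor_wrapped("print('Hello, world!')"))
```

A match only shows that the script was processed by an obfuscator. Legitimate software uses pyarmor as well, so this is a triage signal rather than a verdict.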

Screenshot Creation

Information-stealing malware often includes screenshot functionality. Python makes this easy with libraries like pyscreenshot and python-mss.

Example using python-mss:

from mss import mss

with mss() as sct:
    sct.shot()

Web Requests

Malware often uses web requests for various tasks, including command and control, retrieving external IP addresses, downloading payloads, and more. Python makes this easy with the standard library or open-source libraries like requests and httpx.

Example to get the external IP address using requests:

import requests

external_ip = requests.get('http://whatismyip.akamai.com/').text

The Power of eval()

The built-in eval() function is generally considered risky and a security concern, but it’s very useful for malware authors. eval() evaluates a Python expression supplied as a string, and its sibling exec() runs arbitrary statements the same way, allowing high-level scripts or “plugins” to be run on the fly. This is similar to how Lua engines are used in C-based malware. Such functionality was found in well-known malware like Flame.

For example, a hacker group interacting remotely with Python-based malware can use eval() to execute code directly on the target system, adding new features as needed to remain stealthy.
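Since dynamic execution is such a common building block, a quick static pass over a recovered script can point an analyst to the interesting lines. The sketch below walks the syntax tree and reports direct calls to a small set of builtins; the set `DYNAMIC_CALLS` and the function name `find_dynamic_exec` are illustrative choices, and calls reached through aliases or `getattr` will not be caught.

```python
import ast

# Builtins that run or load code from data; an illustrative, not exhaustive, set.
DYNAMIC_CALLS = {"eval", "exec", "compile", "__import__"}


def find_dynamic_exec(source: str) -> list[tuple[int, str]]:
    """Return (line number, function name) for each direct call to a risky builtin."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DYNAMIC_CALLS:
                hits.append((node.lineno, node.func.id))
    return sorted(hits)


# The sample is only parsed, never executed.
sample = (
    "import base64\n"
    "payload = base64.b64decode('cHJpbnQoMSk=')\n"
    "exec(payload)\n"
    "value = eval('1 + 1')\n"
)
for lineno, name in find_dynamic_exec(sample):
    print(f"line {lineno}: {name}()")
```

Working on the syntax tree rather than on raw text avoids false hits from comments and string literals that merely mention `eval`.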

Real-World Malware Examples

SeaDuke

SeaDuke is probably the most famous Python-based malware. In 2015-2016, the US Democratic National Committee (DNC) was compromised by two groups identified as APT 28 and APT 29; SeaDuke is attributed to the latter. The Unit 42 team at Palo Alto Networks conducted an impressive analysis of SeaDuke, and the decompiled source code is available. F-Secure also published a great document on SeaDuke and related malware.

SeaDuke is a trojan written in Python, converted to a Windows executable using PyInstaller and packed with UPX. The source code was obfuscated to hinder analysis. It has many features, including stealthy persistence in Windows, cross-platform execution, and web requests for command and control.

Figure 3: SeaDuke code sample

PWOBot

PWOBot is another well-known piece of malware, also compiled with PyInstaller. It was mainly active between 2013 and 2015, targeting several European organizations, mostly in Poland. PWOBot had many features, including keylogging, persistence, file download and execution, Python code execution, web requests, and cryptocurrency mining. Unit 42 at Palo Alto Networks conducted an excellent analysis of PWOBot.

PyLocky

PyLocky is ransomware compiled with PyInstaller. Its main activity was observed in the US, France, Italy, and Korea. It includes sandbox evasion, command and control, and file encryption using the 3DES algorithm. Trend Micro provided a good analysis, and Talos Intelligence created a file decryptor for victims.

PoetRAT

PoetRAT is a trojan that targeted the Azerbaijani government and energy sector in early 2020. It established persistence and stole information related to ICS/SCADA systems controlling wind turbines. The malware was delivered via Word documents and had many features, including FTP file downloads, webcam image capture, additional utility downloads, keylogging, browser data theft, and credential stealing. Talos Intelligence published a detailed article on the actor behind this malware.

Figure 4: Code snippet for webcam image capture

Open-Source Malware

In addition to malware found in the wild, there are open-source trojans like pupy and Stitch. These demonstrate how complex and multifunctional such applications can be. Pupy is cross-platform, runs entirely in memory, and leaves few traces. It supports multiple encrypted command channels and process migration via reflective injection, and it can remotely load Python code from memory.

Malware Analysis Tools

There are many tools for analyzing Python-based malware, even in compiled form. Here are some of the most useful:

uncompyle6

uncompyle6 is a cross-platform decompiler that converts Python bytecode back to source code. For example, a simple “Hello, world!” script compiled to a pyc file can be restored using uncompyle6:

$ xxd hello.cpython-38.pyc
...

$ uncompyle6 hello.cpython-38.pyc | grep -v '#'
print('Hello, world!')
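Before reaching for a decompiler, it can help to look at what a pyc file actually contains. The sketch below uses only the standard library and assumes the file was produced by Python 3.7 or later, where the header is 16 bytes long; the helper name `load_pyc_code` is hypothetical. It builds its own hello.pyc in a temporary directory so that it runs on its own.

```python
import dis
import importlib.util
import marshal
import py_compile
import tempfile
from pathlib import Path


def load_pyc_code(pyc_path):
    """Return (magic, code object) from a .pyc built by a Python 3.7+ interpreter."""
    data = Path(pyc_path).read_bytes()
    magic = data[:4]  # version-specific magic number
    # Bytes 4..15 hold flags plus either mtime and size or a source hash.
    code = marshal.loads(data[16:])
    return magic, code


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "hello.py"
    src.write_text("print('Hello, world!')\n")
    pyc = py_compile.compile(str(src), cfile=str(Path(tmp) / "hello.pyc"))
    magic, code = load_pyc_code(pyc)

# marshal data is only guaranteed to load on the interpreter version that wrote it.
same_version = magic == importlib.util.MAGIC_NUMBER
print("magic matches running interpreter:", same_version)
print("constants:", code.co_consts)
print("names:", code.co_names)
dis.dis(code)  # bytecode listing, the raw material a decompiler works from
```

Loading with `marshal` only works when the analyst’s interpreter matches the version that produced the file, which is one reason a cross-version tool such as uncompyle6 is more practical for real samples. Note that unmarshalling does not run the code, but untrusted samples should still be handled in an isolated environment.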

pyinstxtractor.py (PyInstaller Extractor)

PyInstaller Extractor can extract Python data from executables compiled with PyInstaller:

python pyinstxtractor.py hello.exe
...

This produces pyc files, which can be decompiled with uncompyle6.

python-exe-unpacker

The python_exe_unpack.py script can be used to unpack and decompile executables compiled with py2exe:

> python python_exe_unpack.py -i hello.exe
...

Detecting Compiled Files

During compilation, PyInstaller and py2exe add unique strings to the executable, making detection with YARA rules easier.

PyInstaller writes the string “pyi-windows-manifest-filename” near the end of the executable, visible in a hex editor (HxD):

Figure 6: Unique string added by PyInstaller during compilation

Below is a YARA rule for detecting executables compiled with PyInstaller (credited to Didier Stevens in the rule metadata):

import "pe"

rule PE_File_pyinstaller
{
    meta:
        author = "Didier Stevens (https://DidierStevens.com)"
        description = "Detect PE file produced by pyinstaller"
    strings:
        $a = "pyi-windows-manifest-filename"
    condition:
        pe.number_of_resources > 0 and $a
}

A second YARA rule, by the same author, detects executables compiled with py2exe:

import "pe"

rule py2exe
{
    meta:
        author = "Didier Stevens (https://www.nviso.be)"
        description = "Detect PE file produced by py2exe"
    condition:
        for any i in (0 .. pe.number_of_resources - 1):
            (pe.resources[i].type_string == "P\x00Y\x00T\x00H\x00O\x00N\x00S\x00C\x00R\x00I\x00P\x00T\x00")
}
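When YARA is not at hand, the same two indicators can be approximated with a plain byte search. The sketch below is a cruder stand-in for the rules above, not an equivalent: it does not parse the PE resource table, it simply looks for the marker bytes anywhere in the file. The function name `guess_python_packer` is hypothetical, and the demo input is synthetic data rather than a real executable.

```python
# Marker strings taken from the two YARA rules above.
PYINSTALLER_MARKER = b"pyi-windows-manifest-filename"
# The py2exe resource type name is stored as UTF-16LE, hence the \x00 padding.
PY2EXE_MARKER = "PYTHONSCRIPT".encode("utf-16-le")


def guess_python_packer(data: bytes) -> list[str]:
    """Return the Python packers whose marker bytes appear in a PE image."""
    found = []
    if data[:2] != b"MZ":  # not a DOS/PE executable, nothing to report
        return found
    if PYINSTALLER_MARKER in data:
        found.append("pyinstaller")
    if PY2EXE_MARKER in data:
        found.append("py2exe")
    return found


# Synthetic buffers standing in for file contents read with open(path, "rb").
fake_pyinstaller = b"MZ" + b"\x00" * 64 + PYINSTALLER_MARKER
fake_py2exe = b"MZ" + b"\x00" * 64 + PY2EXE_MARKER
print(guess_python_packer(fake_pyinstaller))
print(guess_python_packer(fake_py2exe))
print(guess_python_packer(b"MZ" + b"\x00" * 64))
```

Because the search ignores file structure, it can match marker bytes that merely appear as data inside an unrelated file, so the YARA rules remain the better choice for production scanning.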

Conclusion

This concludes our overview of Python-based malware. It’s fascinating to watch trends change as computer systems become more powerful and easier to use. As security professionals, we must closely monitor Python-based malware, or we may face problems when we least expect them.
