Welcome to the first experiment in the “Under the Scope” series, where we analyze the makeup and behavior of malicious software (malware). We are going to take a deep dive into malware within the safe confines of a virtual environment, or sandbox. We covered how to create a virtual environment in a past article, How to Install Virtualbox on a Mac; the process is similar on a Windows computer. Within our virtual environment, we will perform static analysis and dynamic (behavioral) analysis to get a better understanding of the malicious software. Malware can take the form of a virus, ransomware, trojan horse, spyware, worm, or any other type of code that performs a malicious act to destroy, steal, copy, or hold data for ransom.
Static analysis is the process of dissecting the malware and studying its components without launching or executing the software. Static analysis may reveal readable strings that give clues about what the malware will do or what it will try to connect to. We can take the analysis deeper by decompiling or reverse engineering the malware to get an in-depth view of the malicious code. By decompiling the software, we can analyze the code and determine what it is going to try to do when let loose in a live environment.
Dynamic or behavioral analysis is the process of executing the malware and observing it run on the host computer. Keep in mind that we are still in a virtual environment. With behavioral analysis, we can watch what the malware does as well as follow its network traffic. With a protocol analyzer, we can capture those packets for further analysis.
[ Under The Scope ]
The Portable Document Format, or PDF as it’s known, is one of the most used file types today, next to .doc, .xls, etc. PDFs are used to create digital documents for newsletters, brochures, resumes, books, interactive forms, etc. It is a flexible file format that can combine graphics and text to create digital documents that look like printed paper and are portable enough to be shared digitally. The widely accepted PDF document is used in all industries and, because of that, most individuals will double-click on a PDF document without a second thought.
People are the weakest link in any security infrastructure. Bad actors know that some networks are highly secured and difficult to compromise. However, with some social engineering, a bad actor could send a maliciously designed PDF document to an unsuspecting victim.
I obtained this CVE-2010-4091 Adobe Zero Day “printSeps()” malicious PDF from Lenny Zeltser’s website (zeltser.com) to examine for educational purposes only. This PDF exploits the EScript.api plugin that is available in Adobe Reader and Acrobat 10.x before 10.0.1, 9.x before 9.4.1, and 8.x before 8.2.6 on both Windows and OS X.
A Zero Day vulnerability is a security hole within software that is unknown to the software's vendor and to third-party antivirus vendors. A Zero Day vulnerability can go undetected for days, weeks, or months, until someone goes hunting for it.
The malicious PDF was downloaded as a password-protected zip file named after its MD5 sum. To confirm the integrity of the file, I unzipped it in a virtualized Kali Linux environment, which revealed the file.pdf_ file, and compared the file's actual MD5 sum with the zip filename.
The file was unzipped with a password, revealing the file.pdf_ PDF document.
The md5sum of file.pdf_ (d000e74163e34fc65914676674776284) matches the d000e74163e34fc65914676674776284.zip filename. I also compared the MD5 against an authoritative site such as VirusTotal.
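The same integrity check can be scripted. The following is a minimal sketch, assuming the sample sits next to the script; the helper names md5_of_file and matches_archive_name are made up for this illustration and are not part of any tool used in the article.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_archive_name(sample_path, archive_name):
    """Compare the sample's MD5 with the archive's base name (hypothetical helper)."""
    expected = Path(archive_name).stem.lower()
    return md5_of_file(sample_path) == expected
```

In the sandbox, a call such as matches_archive_name("file.pdf_", "d000e74163e34fc65914676674776284.zip") should return True if the extracted sample is intact.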
We cannot trust a file just because it has a .pdf extension. To determine what type of file it really is, I checked file.pdf_ with the file command. In the image below, we can see that it is a PDF document, version 1.4.
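The file command works from a database of signatures. As a rough illustration of the idea only, the sketch below looks for the %PDF-x.y header near the start of the data; the function name pdf_version and the 1024-byte search window are assumptions made for this example, not a description of how file itself is implemented.

```python
import re


def pdf_version(data: bytes, window: int = 1024):
    """Return the version from a %PDF-x.y header found in the first
    `window` bytes, or None if no such header is present."""
    match = re.search(rb"%PDF-(\d\.\d)", data[:window])
    return match.group(1).decode("ascii") if match else None


# Usage on in-memory data; in the sandbox you would pass the bytes of file.pdf_.
sample_header = b"%PDF-1.4\n1 0 obj\n"
version = pdf_version(sample_header)
```

A return value of None means the extension and the content disagree, which is a reason to look more closely before going further.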
Before we do a deep dive into the file, we can run some quick checks to see what this malware has to offer without launching the application or, in this case, opening the file in Adobe Reader. We can check whether the file reveals anything significant in the form of a string. Doing this could expose an IP address, a URL, etc.
The strings command did show something questionable.
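For readers who want to see what a strings-style scan does, here is a minimal sketch that pulls runs of printable ASCII out of raw bytes. The function name printable_strings is invented for this example, and the sample bytes are synthetic rather than taken from the real file; the default minimum length of 4 mirrors the usual default of the strings utility.

```python
import re


def printable_strings(data: bytes, min_length: int = 4):
    """Return runs of printable ASCII at least `min_length` characters long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Synthetic bytes for illustration only.
blob = b"\x00\x01this.exploit()\x00ab\x00%PDF-1.4\xff"
found = printable_strings(blob)
```

Scanning the result for things like URLs, IP addresses, or function names such as exploit is often the fastest first pass over a suspicious file.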
Checking the hex dump with the xxd command reveals the same information. The image below shows that this file is a PDF-1.4 document, and we can also see the questionable (this.exploit\(\)) string.
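A hex dump is simple to reproduce when xxd is not at hand. The sketch below prints an offset, the bytes in hex, and a printable-ASCII column; it is a simplified layout and does not match xxd's byte grouping exactly. The function name hex_dump is an assumption for this example.

```python
def hex_dump(data: bytes, width: int = 16):
    """Return a list of dump lines: offset, hex bytes, printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Non-printable bytes are shown as dots, as most hex viewers do.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}: {hex_part:<{width * 3 - 1}}  {text_part}")
    return lines


for line in hex_dump(b"%PDF-1.4\n"):
    print(line)
```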
With pdf-parser, we can now dig a little deeper into the file. We can see that obj 14 references obj 13 for its /JS entry, which means obj 13 holds the JavaScript.
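To illustrate the kind of relationship pdf-parser surfaces here, the sketch below scans raw PDF bytes for objects whose /JS entry is an indirect reference. This is not how pdf-parser is implemented; it is a simplified regex pass that assumes plain, uncompressed objects, and the function name find_js_references and the sample bytes are made up for the example.

```python
import re

# Matches "N G obj ... endobj" for plain (uncompressed) objects.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)
# Matches an indirect reference stored under the /JS key, e.g. "/JS 13 0 R".
JS_REF_RE = re.compile(rb"/JS\s+(\d+)\s+(\d+)\s+R")


def find_js_references(data: bytes):
    """Return (referencing_object, referenced_object) pairs for /JS entries."""
    pairs = []
    for obj in OBJ_RE.finditer(data):
        obj_number = int(obj.group(1))
        for ref in JS_REF_RE.finditer(obj.group(3)):
            pairs.append((obj_number, int(ref.group(1))))
    return pairs


# Synthetic two-object fragment shaped like the relationship described above.
fragment = (
    b"14 0 obj\n<< /S /JavaScript /JS 13 0 R >>\nendobj\n"
    b"13 0 obj\n<< /Length 5 >>\nstream\nhello\nendstream\nendobj\n"
)
js_pairs = find_js_references(fragment)
```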
In the code, we can see that there are two strings that are being obfuscated.
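Strings obfuscated this way are typically expanded with JavaScript's unescape() before they are used. As a minimal sketch of that decoding step, the Python function below handles the %uXXXX and %XX escape forms. The function name js_unescape is an assumption, and the input shown is a placeholder, not the actual strings from this sample.

```python
import re

ESCAPE_RE = re.compile(r"%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})")


def js_unescape(text: str) -> str:
    """Decode %uXXXX and %XX escapes, similar to JavaScript's unescape()."""
    def replace(match):
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))
    return ESCAPE_RE.sub(replace, text)


# Placeholder input: two %uXXXX escapes decode to a string of length 2.
decoded = js_unescape("%u4141%u4242")
decoded_length = len(decoded)
```

The point to take away is that the escaped text is much longer than what it decodes to, so string lengths should always be measured after unescaping.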
The number 32768 is significant because it is 2^15, half the range of a 16-bit integer. A 16-bit integer can hold 65536 (2^16) distinct values: 0 through 65535 when unsigned, or -32768 through 32767 when signed.
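The 16-bit figures can be checked with a few lines of arithmetic. The variable names below are chosen for this example only.

```python
BITS = 16

signed_min = -(2 ** (BITS - 1))     # lowest value of a signed 16-bit integer
signed_max = 2 ** (BITS - 1) - 1    # highest value of a signed 16-bit integer
unsigned_max = 2 ** BITS - 1        # highest value of an unsigned 16-bit integer
distinct_values = 2 ** BITS         # how many values 16 bits can represent

print(signed_min, signed_max, unsigned_max, distinct_values)
```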
The line var a=app.viewerVersion; retrieves the application's version, and the if-statements check whether that version is greater than version 8 or lower than version 10. If not, the script exits. If so, it calls the sdlfkasdfiasdflaksdflaf(number) function and passes it the value of number.
The string length of large_hahacode is 19, and the string length of large_heap is 2.
The while-loop concatenates the large_heap string with itself for as long as its length is less than or equal to 32768. Because each pass doubles the string, the length goes from 2 to 4 to 8 and so on, reaching 32768 after 14 passes. That still satisfies the condition, so a 15th pass brings the length to 65536. The while-loop is attempting to create a string whose length, 65536, equals the number of values a 16-bit unsigned integer can represent.
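The growth of the string is easy to verify by simulating the loop. This is a minimal sketch that assumes the loop condition is "length less than or equal to 32768" as described above; the function name grow_by_doubling and the two-character seed "AB" are placeholders for this example.

```python
def grow_by_doubling(seed: str, limit: int = 32768):
    """Concatenate the string with itself while its length is <= limit.

    Returns the final string and the number of passes the loop made.
    """
    passes = 0
    while len(seed) <= limit:
        seed += seed
        passes += 1
    return seed, passes


# A two-character placeholder stands in for the unescaped large_heap value.
grown, pass_count = grow_by_doubling("AB")
print(len(grown), pass_count)
```

Doubling is why so few passes are needed: the length grows exponentially rather than by two characters at a time.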
The variable large_heap is then assigned the value of large_heap.substring(0, 32768 - large_hahacode.length). Working out the math, 32768 - 19 (the length of the unescaped string) equals 32749.
A new Array() named memory is created, and then the for-loop stores the concatenation of large_heap and large_hahacode in each element of memory while the index i is less than 0x1024, which is 4132 in decimal. The index therefore runs from 0 through 4131, for a total of 4132 entries.
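The trimming and array-filling steps described above can be sized up without running the malicious script. The sketch below uses placeholder strings of the same lengths (19 and 65536 characters) and only records the size of each entry instead of allocating them all. The estimate of two bytes per character is an assumption based on JavaScript strings being made of 16-bit code units; the real memory footprint inside Adobe Reader may differ.

```python
BLOCK_CHARS = 32768      # target length of each array entry
SPRAY_COUNT = 0x1024     # loop bound from the script, 4132 in decimal

payload = "P" * 19       # placeholder with the length of large_hahacode
filler = "AB" * 32768    # placeholder with the post-loop length of large_heap

# Equivalent of large_heap.substring(0, 32768 - large_hahacode.length)
trimmed = filler[:BLOCK_CHARS - len(payload)]
block = trimmed + payload

# Record only the entry sizes rather than building 4132 large strings.
entry_sizes = [len(block) for i in range(SPRAY_COUNT)]
total_chars = sum(entry_sizes)
total_bytes_utf16 = total_chars * 2   # assumption: 2 bytes per character

print(len(trimmed), len(block), len(entry_sizes), total_bytes_utf16)
```

Every entry comes out at exactly 32768 characters, with the 19-character string at the end of each one. Filling memory with many identical blocks like this is the pattern commonly called heap spraying.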