A (quick) pathology of modern malware
In this post I’ll mainly be referencing Microsoft platforms, but much of the content applies to all operating systems. In the early days the huge adoption of Microsoft’s operating systems made them a natural target, but any system capable of executing code can be exploited by malware.
Malware advances at an astonishing rate, an advance that is not slowing down but rather speeding up. Early malware samples were basic constructs that exploited the workings of the Microsoft DOS operating system. Such basic subversions were easy to detect and led quickly to the birth of the first Anti-Virus programs. From this moment erupted a digital war between the malware coders and the Anti-Virus coders, a digital arms race that is fought ferociously across the internet, servers, desktops, laptops and mobile devices daily.
The malware lurking in the recesses of the digital world today owes a lot to its early DOS brethren. Many of the techniques developed in these ancient viruses are still valid, and in many instances they are core concepts used by malware to this day. For example, malware is often considered to be ‘mobile code’. This is not a reference to the platform it runs on; rather, the code often cannot predict where it will run in memory. This presents a problem: most compiled (legitimate) code knows where it is located and uses offsets to get data or control the flow of execution. If the code can be located at a different position each time it is run, then there needs to be a way of relocating the code and its offsets (independent of the system’s own relocation mechanism). The code that malware uses to do this has remained mainly unchanged since the days of DOS.
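The addressing problem can be illustrated with a small, harmless model in Python. This is only a sketch under stated assumptions: memory is modelled as a bytearray, and the names `IMAGE`, `DATA_OFFSET`, `load_at` and `read_data` are hypothetical and chosen for illustration. Real code has to discover its own base address at run time, whereas here the base is simply passed in.

```python
# Hypothetical layout: 4 bytes standing in for code, followed by 5 bytes of data.
DATA_OFFSET = 4
IMAGE = b"CODE" + b"hello"


def load_at(memory: bytearray, image: bytes, base: int) -> int:
    """Copy the image into memory at `base` and return that base address."""
    memory[base:base + len(image)] = image
    return base


def read_data(memory: bytearray, base: int) -> bytes:
    """Find the data relative to wherever the image was actually loaded."""
    start = base + DATA_OFFSET
    return bytes(memory[start:start + 5])


# The same image loaded at two different positions.
memory_a = bytearray(64)
base_a = load_at(memory_a, IMAGE, 10)
memory_b = bytearray(64)
base_b = load_at(memory_b, IMAGE, 37)

# An absolute address calculated for the first load position...
ABSOLUTE = 10 + DATA_OFFSET

print(read_data(memory_a, base_a))                 # base-relative lookup
print(read_data(memory_b, base_b))                 # still finds the data
print(bytes(memory_b[ABSOLUTE:ABSOLUTE + 5]))      # ...misses it at the second
```

The base-relative lookup finds the data wherever the image lands, while the hard-coded address is only valid for one load position.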
Whilst Windows (and in more recent years Mac OS and Linux) presented a new challenge for malware creators, it also presented new areas for development. Malware writers have found numerous ways to evade detection by Anti-Virus software by exploiting the structure of the Portable Executable (PE) format used by all modern Windows systems. The PE (32-bit) and PE+ (64-bit) formats allow for far more complex constructions than their 16-bit DOS predecessors, and therefore for more complex malware techniques. Basic concepts from the days of DOS malware were developed further and became invaluable strategies for creating effective malware.
If anything, it could be argued that modern malware would not have been possible were it not for the invention of modern operating systems and the need for more complex functionality. This is not to say that malware would not exist; rather, less complex systems limited the capabilities of malware authors, and the introduction of complex operating systems opened up new coding corridors that were once closed or even non-existent.
So where does this situation leave us today? To successfully evade detection on a modern operating system, malware must hide. In the past this led to strategies such as polymorphism and metamorphism, techniques that hide the malicious code through encryption and obfuscation. For example, the 1260 virus (1990) used a cipher mechanism and a randomised decryption routine to thwart detection, effectively making each copy of the virus different from the last. As Anti-Virus programs have become better at detecting this type of code, malware authors have adapted the strategy.
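To see why an encrypted body defeats a fixed byte signature, consider the following harmless sketch in Python. Everything in it is an assumption made for illustration: the payload is an inert string, the single-byte XOR is a stand-in for a real cipher, and the names `encode`, `decode`, `naive_scan` and `SIGNATURE` are hypothetical. It is not a description of how 1260 itself worked.

```python
import os

PAYLOAD = b"HARMLESS-DEMO-PAYLOAD"   # inert stand-in for a code body
SIGNATURE = b"DEMO-PAYLOAD"          # byte pattern a naive scanner searches for


def encode(body: bytes, key=None) -> bytes:
    """Return the key byte followed by the body XORed with that key."""
    if key is None:
        key = os.urandom(1)[0] or 1  # avoid key 0, which would change nothing
    return bytes([key]) + bytes(b ^ key for b in body)


def decode(blob: bytes) -> bytes:
    """Recover the original body using the key stored in the first byte."""
    key, body = blob[0], blob[1:]
    return bytes(b ^ key for b in body)


def naive_scan(blob: bytes) -> bool:
    """Report a match only if the fixed signature appears verbatim."""
    return SIGNATURE in blob


copy_one = encode(PAYLOAD, key=0x21)
copy_two = encode(PAYLOAD, key=0x5A)

print(copy_one != copy_two)                               # the copies differ byte for byte
print(decode(copy_one) == decode(copy_two) == PAYLOAD)    # but decode identically
print(naive_scan(copy_one), naive_scan(decode(copy_one)))
```

The two copies share no recognisable bytes with the payload, yet both decode to it, which is why scanners moved on to recognising the decryption routine and the code's behaviour rather than the stored bytes.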
Early detection attempts revolved around key system files or sacrificial executables, which were examined on a regular basis for changes that would indicate an infection was present. Later malware samples use the same infection techniques but are coded to avoid infecting key system files or sacrificial executables. Anti-Virus programs developed in turn, and heuristic and behaviour-based scanning became an effective tool for detecting these more advanced samples.
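The file-monitoring approach that early detection relied on can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular product worked: the use of SHA-256 and the function names `fingerprint`, `take_baseline` and `changed_files` are assumptions.

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Return the SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def take_baseline(paths):
    """Record a digest for each monitored (or sacrificial) file."""
    return {str(p): fingerprint(Path(p)) for p in paths}


def changed_files(baseline):
    """Return the monitored files that are missing or no longer match."""
    changed = []
    for name, digest in baseline.items():
        path = Path(name)
        if not path.exists() or fingerprint(path) != digest:
            changed.append(name)
    return changed
```

A scanner would call `take_baseline` once on a known-clean system and then run `changed_files` on a schedule. Any entry in the result warrants investigation, which is exactly why later malware learned to leave such files alone.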
This is a cat-and-mouse game played out over the decades. Malware authors have found novel ways to make their creations more difficult to detect and remove, while the Anti-Virus community develops new techniques for accurate detection. Malware has used packing (a type of compression applied to executable code) to great effect over the years, and new packing technologies have helped existing malware to evade detection. Today there are tools such as Hyperion and Veil which can embed malware code in legitimate executables, making detection difficult.
A lot of modern malware samples are designed to exist only in memory; no on-disk portion exists, encrypted or otherwise. This has a number of advantages (as well as a few disadvantages). The main ones are that, as no file containing the malware exists, there is nothing on disk for Anti-Virus programs to scan, and that a power reset of the machine will erase all forensic evidence of the malware from memory. This latter point is important, as Anti-Virus companies require samples of malware to disassemble and analyse. Without analysis of a sample, a detection method cannot be determined and the malware continues to evade detection.
The obvious drawback is the lack of persistence. Once the machine is rebooted, the malware is removed from memory and re-infection is necessary to maintain a presence on the target host. Often the malware is delivered via a dropper, a program responsible for getting a copy of the malware from a trusted source and then executing it in memory. The dropper code will be heavily armoured to prevent accidental execution while being examined by Anti-Virus programs, and will be difficult to classify as malicious in its own right.
“But surely it is just a case of working out all the possible malware code values …”. Whilst this sounds practical in theory, it falls apart in practice. In fact, Fred Cohen wrote a mathematical proof of why this is impossible in any complex system. It is rather technical, but the layman’s interpretation is:
“For every virus that is detectable, there exists a subset of this virus which is undetectable using the same criteria”
Whilst Cohen was referring to computer viruses, the same holds true for malware in general. We can understand this another way. Suppose you are asked to write a simple program that adds two numbers together. There are obvious solutions to this problem, but there is also a plethora of other ways to implement the program, all of which result in different code. The code is different, but the end effect is the same. In fact, on a modern computer the number of different ways this can be done is so huge that there are more possible solutions than atoms in the observable universe. Not all implementations will be efficient or even make much sense (for example, adding the number 200 times in a loop and then subtracting the number multiplied by 199), BUT they are variants of the same program and the end effect is the same. In fact, if memory is unlimited and time is not a factor, there are in theory an infinite number of ways the program can be implemented (all different from each other).
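The addition example can be made concrete in Python. The sketch below is one interpretation of the roundabout variant described above (add the second number 200 times, then subtract it multiplied by 199). The function names are hypothetical, and hashing the compiled bytecode is used here only as a stand-in for a byte-level signature.

```python
import hashlib


def add_direct(a, b):
    """The obvious implementation."""
    return a + b


def add_roundabout(a, b):
    """Add b to a 200 times in a loop, then subtract b multiplied by 199."""
    total = a
    for _ in range(200):
        total += b
    return total - b * 199


def code_fingerprint(func):
    """Hash the function's compiled bytecode, like a crude byte signature."""
    return hashlib.sha256(func.__code__.co_code).hexdigest()


print(add_direct(2, 3) == add_roundabout(2, 3))                        # same effect
print(code_fingerprint(add_direct) == code_fingerprint(add_roundabout))  # different code
```

Both functions return the same result for integer inputs, yet their fingerprints differ, so a signature written for one says nothing about the other.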
What does this all mean? Is the game over? Did the malware authors win? Well … no. But neither has the Anti-Virus community. When new varieties of malware are created, it often takes a paradigm shift to successfully detect them and prevent their execution. This shift is evident today in new platforms that aim to achieve the same results as traditional Anti-Virus but use technologies such as machine learning and expert systems to analyse running processes in memory. The really interesting question is what the future holds.
The increasing complexity of modern operating systems, in combination with the ‘natural selection’ caused by the available Anti-Virus programs, has led to a ‘manual evolution’ of malware code. The code seen today can be considered the current generation, which has leveraged the most successful strategies of its ancestors. Samples exist today that attempt to evolve in the field, although such code is rare. Samples also exist that leverage genetic algorithms to improve survivability; such malware can turn a 99.9% detection ratio into a 0.01% detection ratio within one generation of the malware code. Each new generation after that will attempt to lower the ratio further.
Fortunately, such malware is still relatively easy to detect, mainly because the evolution is constrained and contrived. The trend suggests that one day a truly evolving computer malware will be created, and that the equivalent of a digital immune system will be designed to combat it. Arguably, the first true digital artificial lifeforms will then emerge, and interesting epidemic effects will appear on our systems … seasonal flu of the digital variety! Will our mobile phones ‘take the day off work’ due to high chip temperatures and a mild memory corruption, using the extra time to run the immunity functions necessary to restore normal operation?
Only time will tell what the future landscape of this digital war will be littered with. What can be extrapolated from experience is that the malware problem is not going away, and that the current trend is pushing towards more virulent and capable code, perhaps one day with a direction of its own design.