Malware has been increasing in complexity year on year for the last 10 years or so. The more complex variants used by cyber criminals use internal algorithms to generate randomised domain names, which are then used to communicate with their Command and Control (C2) servers. These algorithms are called Domain Generating Algorithms, or DGAs.
DGAs first appeared in malware around 2008. One of the most famous examples from that time was Conficker, which surprisingly is still infecting machines today.
Why Domain Generating Algorithms?
DGAs are being used more and more because the lifeline of the malware is its ability to reach domain names and/or IP addresses, from which it obtains its instructions and payload from the command and control servers. However, these domain names are regularly taken down or taken over by the authorities, which ends up stopping the malware infection.
This has led malware authors to design their software so that it automatically moves to new domain names on a regular basis, making it harder to detect and take down their critical services.
Additionally, generating domain names automatically makes it harder for the authorities to shut down botnets, due to the sheer number of domain names in use as well as the way the malware communicates.
How does it all work?
DGAs have several features that make the whole thing work, and they also include steps intended to make reverse engineering as hard as possible for anti-malware vendors and security researchers.
- The routines must generate domain names automatically, in a way that is predictable to both sides of the communication chain, these being the malware and the C2 servers.
- The cost of generating domain names must be low. Over time thousands of domain names will be generated, and the domain name registration fees soon add up.
- The generation routines need to be unpredictable to outsiders, such as anti-malware companies. This ensures that their networks can’t be predicted and intercepted.
- The registration of domain names must remain untraceable. The last thing that the malware authors need is the authorities tracing their address.
An example of a DGA routine is shown below:
[Image: example DGA routine, from the Cisco Blog]
In the above routine, you can see the following features which help define the requirements of generating a domain:
- It defines the top level domains (TLDs) that can be used, such as ‘.com’, ‘.co.uk’ etc.
- The output changes with time, because a time-based element feeds into the calculation.
- The element variable is seeded.
When the algorithm is executed, the seed is populated with some value, whether this is a number or a word. The seed is usually updated with each new version of the malware. The seed and the time-based element (the part of the algorithm that tries to make reverse engineering hard) are then combined to create the domain name that is going to be used, and a TLD chosen from the defined list of valid TLDs is appended as the suffix.
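To make the description above concrete, here is a minimal sketch in Python. It is not the routine from the Cisco Blog or from any real malware family. The seed string, the TLD list, the use of SHA-256 and the function names are all assumptions chosen for illustration. The TLD is picked from the hash rather than truly at random, so that both sides arrive at the same name.

```python
import hashlib
from datetime import date
from typing import List

# Hypothetical values for illustration only; not taken from any real malware.
TLDS = [".com", ".net", ".org", ".co.uk"]
SEED = "example-seed-v1"


def generate_domain(seed: str, day: date, index: int = 0, length: int = 12) -> str:
    """Derive one pseudo-random domain from a seed, a date and a counter."""
    # Combine the seed with the time-based element and a counter.
    material = f"{seed}|{day.isoformat()}|{index}".encode("utf-8")
    digest = hashlib.sha256(material).digest()

    # Map the first bytes of the digest to lowercase letters for the label.
    label = "".join(chr(ord("a") + b % 26) for b in digest[:length])

    # Pick the TLD from the defined list using the next digest byte.
    tld = TLDS[digest[length] % len(TLDS)]
    return label + tld


def domains_for_day(seed: str, day: date, count: int = 5) -> List[str]:
    """Return the list of candidate domains for a given day."""
    return [generate_domain(seed, day, i) for i in range(count)]


if __name__ == "__main__":
    for name in domains_for_day(SEED, date(2024, 1, 1)):
        print(name)
```

Running the script twice with the same seed and date prints the same list, while changing either the seed or the date gives a different one. That is the property the text relies on.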
Domain Generating Algorithms (DGAs) are used by malicious actors to help prevent their malware and other services from being taken down. Their software automatically generates and uses random-looking domain names, which are hard for outsiders to predict.
This works because the malware and the command and control servers use the same algorithm, so both know which domains are going to be used at any given time, allowing for reliable communication.
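The shared-algorithm idea can be simulated without touching the network. In the sketch below, which is an illustration under stated assumptions and not a description of any real botnet, the operator side registers one name from the day's candidate list, and the client side walks the same list until it finds a registered name. The function names, the seed, the reserved ‘.example’ TLD and the set that stands in for DNS are all hypothetical.

```python
import hashlib
from datetime import date
from typing import List, Optional, Set


def candidates(seed: str, day: date, count: int = 10) -> List[str]:
    """Deterministically derive candidate domains from a shared seed and date."""
    names = []
    for i in range(count):
        material = f"{seed}|{day.isoformat()}|{i}".encode("utf-8")
        digest = hashlib.sha256(material).hexdigest()
        # '.example' is a reserved TLD, so these names can never resolve.
        names.append(digest[:16] + ".example")
    return names


def find_rendezvous(seed: str, day: date, registered: Set[str]) -> Optional[str]:
    """Client side: return the first candidate that is registered, if any."""
    for name in candidates(seed, day):
        if name in registered:  # stands in for a DNS lookup
            return name
    return None


if __name__ == "__main__":
    shared_seed = "example-seed-v1"
    today = date(2024, 1, 1)

    # Operator side: run the same algorithm and register a single candidate.
    registered = {candidates(shared_seed, today)[3]}

    print(find_rendezvous(shared_seed, today, registered))   # the registered name
    print(find_rendezvous("wrong-seed", today, registered))  # None: no shared seed
```

Only one of the ten candidates needs to be registered for the two sides to meet, which keeps registration costs low, while anyone without the seed sees no overlap at all.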
More information on DGA can be found below: