Worms, Bots and Holy Grails
This post may be long overdue, but I hope it still has value. It comes out of thinking a lot about the relationship between bots and worms, and about the role of Wormblog going forward.
The Worm Problem
The "worm problem" has a couple of facets. First and foremost, a worm is likely to propagate so rapidly and so aggressively that it disrupts normal network operations. Secondly, it can introduce new points of unauthorized access or control in systems that are under your administrative domain. In the wake of Code Red and Nimda, the world woke up and saw the threat that network-aware malware can pose to the Internet's use. Since then, we've seen a dramatic upsurge in interest in worm "solutions" and research. The two facets listed above seem to be the main concerns of anyone with an interest in worm research who hopes to apply it to the Internet at large. They are also the concerns that have led to various commercial worm solutions.
Now that we have a clear problem statement, we can state the goals of worm protection clearly and succinctly. Any solution to "the worm problem" must be able to detect a new piece of autonomously propagating network malware (i.e. a worm) as fast as possible and be able to selectively identify the traffic caused by it. To be a complete solution, it should also enable the operator to selectively kill the malicious agent's traffic. In short, it must identify the worm and stop its propagation before it gets too far. Because network-centric views give you the biggest area of monitoring, and because we often have to think of worms as "zero day" threats (i.e. using an exploit specifically coded for the worm's use, even if it uses a well known vulnerability), I tend to focus Wormblog on naive, network-centric detection of novel worms.
I've spent a lot of time looking at how to detect worms reliably based on network traffic, and I've even helped implement such a solution in a commercial product (which I'll name in a few minutes). I've gone looking for other such products or techniques that are available, and what I've found is only a few classes of tools to help you detect a worm on the net.
The first, and most common, kind of tool is a host-based agent that relies on signatures, such as a traditional antivirus product. This is probably still the most common method of malware defense, good old desktop AV. In an enterprise setting, this can be a managed solution, so that when new signatures get published they can get pushed out as rapidly as possible to all of the desktop agents. Still, this is a reactive solution.
The second solution, and probably nearly as popular (in terms of coverage) as the first, is network IDS-based detection of malware, usually based on either the exploit payload of a worm or the worm executable's payload. IDS engines like Snort have rulesets for worms, such as Nimda, that you can load. But again, they're reactive. Someone has to see the worm to develop a signature.
The third type of detection that is used is commonly referred to as Dark IP or Blackhole monitoring, DarkNet techniques, or a Network Telescope. However, they're all basically the same thing: take some unused address space and have it hit a collector. Run some statistics and some analytics on the data collected and you may be able to detect a worm. You can even use this data for signature generation. Bear in mind that a lot of the Internet is monitored in this way. There isn't a stretch of the net that someone isn't watching.
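As a rough illustration of the "run some statistics" step, here is a minimal Python sketch that counts how many distinct sources probe each destination port inside an unused prefix. Everything specific in it is an assumption made for the example: the (source, destination, port) record format, the function names, the threshold, and the dark prefix itself (192.0.2.0/24 is a documentation range, not a real telescope).

```python
from collections import defaultdict
import ipaddress

# Hypothetical unused ("dark") prefix; 192.0.2.0/24 is a documentation range.
DARK_NET = ipaddress.ip_network("192.0.2.0/24")


def darknet_port_summary(packets, dark_net=DARK_NET):
    """Count distinct sources probing each destination port in dark space.

    `packets` is an iterable of (src_ip, dst_ip, dst_port) tuples, an assumed
    record format standing in for whatever the collector really logs.
    """
    sources_by_port = defaultdict(set)
    for src_ip, dst_ip, dst_port in packets:
        # Only traffic aimed at the unused space is of interest.
        if ipaddress.ip_address(dst_ip) in dark_net:
            sources_by_port[dst_port].add(src_ip)
    return {port: len(srcs) for port, srcs in sources_by_port.items()}


def ports_over_threshold(summary, min_sources=3):
    """Return (port, source_count) pairs at or above min_sources, busiest first."""
    hot = [(port, n) for port, n in summary.items() if n >= min_sources]
    return sorted(hot, key=lambda item: item[1], reverse=True)


# Synthetic sample records, invented purely for the example.
sample = [
    ("198.51.100.1", "192.0.2.10", 445),
    ("198.51.100.2", "192.0.2.77", 445),
    ("198.51.100.3", "192.0.2.5", 445),
    ("198.51.100.1", "192.0.2.11", 445),   # repeat source, counted once
    ("198.51.100.9", "192.0.2.20", 80),
    ("198.51.100.9", "203.0.113.4", 445),  # not in dark space, ignored
]
summary = darknet_port_summary(sample)
print(ports_over_threshold(summary))
```

A real telescope would track these counts over time rather than in one batch; the point here is only that nothing legitimate should be talking to dark space, so even simple per-port source counts carry signal.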
The type of detection I most favor is based on anomaly detection: for example, a set of heuristics and rules that look for a growing pattern of alerts that could be due to a worm's propagation attempts. Products that apply this sort of analysis include Arbor Networks' Peakflow X, Mazu's Profiler, and Therminator (based on the Lancope engine).
Finally, there's a honeypot-based approach favored by Forescout and their Wormscout product. I haven't seen a lot of this in the field, to be honest, and in most cases a honeypot isn't the most sensible approach to detecting worms quickly. You're relying on the honeypot to selectively see attacks that you may not be able to anticipate (if the exploit or vulnerability isn't known), and to be able to distinguish that from generic background attacks (more on that later). Even then, you still have to honeypot a lot of addresses, either with real or virtual honeypots.
While I tend to focus on a network-centric view of things, I do recognize the value of a combined approach. You can't observe everything from the network, and so host-based approaches will always be needed. You need to capture a sample to analyze, and so a honeypot will be vital. However, I still think that because worms have such a significant network footprint, detecting them by using network traffic and patterns in that traffic makes a lot of sense.
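To make "a growing pattern of alerts" concrete, here is a minimal Python sketch of one possible heuristic: flag the point at which the number of hosts raising scan alerts has grown by some factor for several intervals in a row. This is not how any of the products named above work; the function name, the growth factor and the run length are all assumptions chosen for illustration, and the two series are synthetic.

```python
def sustained_growth(counts, factor=1.5, runs=3):
    """Return the index at which `runs` consecutive intervals each grew by at
    least `factor` over the previous interval, or None if that never happens.

    `counts` is a list of per-interval counts, e.g. distinct hosts that
    triggered a scan alert in each five-minute bucket (an assumed input).
    """
    streak = 0
    for i in range(1, len(counts)):
        prev, cur = counts[i - 1], counts[i]
        if prev > 0 and cur >= factor * prev:
            streak += 1
            if streak >= runs:
                return i
        else:
            # Growth was interrupted, so start counting again.
            streak = 0
    return None


# Synthetic series: ordinary fluctuation versus worm-like multiplicative growth.
steady = [10, 12, 9, 11, 10, 13, 11]
outbreak = [10, 11, 10, 16, 25, 40, 70]

print(sustained_growth(steady))
print(sustained_growth(outbreak))
```

Requiring several consecutive intervals of multiplicative growth is a crude stand-in for the exponential phase of a worm outbreak; a single spike resets nothing and triggers nothing on its own.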
The Bot Problem
Now, the bot problem is somewhat different. Because bots usually don't propagate nearly as aggressively as worms have in the past, their threat is not that they'll knock out your network with too much traffic. The big problem is that somewhere, someone else has access to your network and your assets, and that someone is unauthorized. The primary, tractable solution to this problem is quite clear, then: detect bots and block them as fast as possible. Stopping the bots from getting a foothold is a "nice to have", but because they allow for an external party to control the infected machine, most people want to block that communication and then worry about more bots coming online from within their networks.
However, bots will usually rally at some central point for instructions, and that's often how people detect their presence, through monitoring the communications channels. Some people do use IDS signatures for the well known exploits that most bots use, but that's not as unique as looking for a bot's specific actions. There are some Snort signatures to detect bots and their backdoors, but this effort isn't yielding a significant bumper crop of new signatures.
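The rallying behavior can also be looked for without signatures. The following Python sketch is a flow-based heuristic of my own framing, not a description of any existing tool: it lists external endpoints that several distinct internal hosts have all connected to, minus a list of destinations already known to be legitimate. The flow tuple format, the 10.0.0.0/8 internal range, the function name and the threshold are all assumptions, and the addresses are documentation ranges.

```python
from collections import defaultdict
import ipaddress

# Assumed internal address range for the example.
INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")


def rally_candidates(flows, min_hosts=3, known_good=frozenset()):
    """Map each suspicious external (ip, port) to the internal hosts using it.

    `flows` is an iterable of (src_ip, dst_ip, dst_port) tuples (an assumed
    record format). An endpoint is reported when at least `min_hosts` distinct
    internal hosts contacted it and it is not listed in `known_good`.
    """
    clients = defaultdict(set)
    for src_ip, dst_ip, dst_port in flows:
        src = ipaddress.ip_address(src_ip)
        dst = ipaddress.ip_address(dst_ip)
        # Only internal-to-external connections are candidates.
        if src in INTERNAL_NET and dst not in INTERNAL_NET:
            endpoint = (dst_ip, dst_port)
            if endpoint not in known_good:
                clients[endpoint].add(src_ip)
    return {
        endpoint: sorted(hosts)
        for endpoint, hosts in clients.items()
        if len(hosts) >= min_hosts
    }


# Synthetic flows, invented for the example.
flows = [
    ("10.1.1.5", "203.0.113.50", 6667),
    ("10.1.2.9", "203.0.113.50", 6667),
    ("10.3.0.7", "203.0.113.50", 6667),
    ("10.1.1.5", "198.51.100.80", 80),
    ("10.1.2.9", "198.51.100.80", 80),
    ("10.3.0.7", "198.51.100.80", 80),
    ("10.1.1.5", "10.0.0.53", 53),      # internal to internal, ignored
]
known_good = frozenset({("198.51.100.80", 80)})
print(rally_candidates(flows, known_good=known_good))
```

On a real network the list of legitimate shared destinations is long, so the output is a list of candidates for a human to look at, not a verdict.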
Ultimately, that's still reactive. People have to collect and analyze the bots to identify where they rally (even if it's sped up with sandboxing), and it's still signature-based. The first bot that comes around using a new exploit will be missed by network signatures. Desktop AV is still the most popular way to detect bots.
The Importance of the Differences
There are a number of key differences between worms and bots, and their associated problems, that have a significant impact on the future of malware detection techniques. These may impact funding choices for future research, and will probably be visible as changes in the literature.
I don't think that anyone disagrees that a proactive solution to the problem is preferred. After all, shortening the gap between a malware outbreak and its detection by customers still leaves you with a reactive solution. To that end, I think that I'll keep focusing on naive approaches to malware detection.
Most bots out there are derivatives of common malware families, and why detection isn't based on those common characteristics is beyond me. I can tell a Spybot from an Agobot in IDA in just a few moments, yet most AV engines can't detect a freshly repackaged Spybot, and why that is remains a mystery to me. Given the pace of bot authorship, and that it won't be slowing down, this sort of lag in detection seems unacceptable to me.
However, I think that the biggest challenge to detecting malware on the network (and using only network traces) comes from the dramatic increase in the amount of background "radiation" on the Internet today compared to 2000 or so. The volume of scans due to infected boxes means that scan alarms on the open Internet are always going off. Most worm detection techniques rely on a clean, "happy" Internet, one where the noise generated by a worm would be unmistakable. Yet this isn't the case, and rarely do we see people testing their worm detection tools (such as tools that look for TCP RST storms as an indicator of a worm) on the modern, polluted Internet. This is an important aspect of research that needs more attention, especially as many grant-funded worm-detection tools are being completed around this time.
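The effect of background radiation on a simple detector can be shown with a small Python sketch. It compares a fixed alarm threshold, of the sort you might tune on a quiet network, with a threshold set relative to a trailing baseline. The per-interval RST counts are synthetic, and the function names, window size and multiplier are assumptions for the example, not parameters from any real tool.

```python
from statistics import mean, pstdev


def fixed_threshold_alarms(counts, threshold):
    """Indices of intervals whose count exceeds a fixed threshold."""
    return [i for i, c in enumerate(counts) if c > threshold]


def baseline_alarms(counts, window=5, k=3.0):
    """Indices of intervals whose count exceeds the trailing mean by more
    than k standard deviations, computed over the previous `window` intervals."""
    alarms = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        # Floor sigma at 1 so a perfectly flat history doesn't alarm on noise.
        sigma = max(pstdev(history), 1.0)
        if counts[i] > mu + k * sigma:
            alarms.append(i)
    return alarms


# Synthetic TCP RST counts per interval: a noisy but stable background of
# roughly 500, followed by a sharp rise in the last two intervals.
rst_counts = [480, 510, 495, 520, 505, 490, 515, 500, 1400, 2600]

print(fixed_threshold_alarms(rst_counts, threshold=100))
print(baseline_alarms(rst_counts))
```

With these numbers the fixed threshold fires in every interval, which is the "always going off" situation, while the baseline-relative check fires only on the last two. Note that the second of those alarms is evaluated against a baseline the outbreak has already contaminated, which is one reason this kind of detector needs testing on realistic, polluted traffic.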
This question, "can your worm detection tool work in the face of a thousand live botnets?" may be moot if the worm problem has really been diminished or gone away. However, naive detection of network threats is still just as important as ever. The good news is that humans will still be needed for the task, and probably more than ever before. And, if you're attracted to grand challenges in computer security, making sense of this mess is a great problem to tackle.
September 13, 2006 in botnets, detection, editorial, new trends
Comments
How honeypots CAN be effective worm detectors (even in the face of background radiation).
Our tech report on our Honeyfarm. Weidong did stellar work with this:
http://www.icsi.berkeley.edu/~nweaver/gq-techreport.pdf
The key for detecting with honeyfarms is that you NEED demand allocation (so you detect propagation between honeypots) and you must have some radiation filtering of some sort, because the background hostility is just too high otherwise.
Posted by: Nicholas weaver | Sep 13, 2006 11:05:30 AM
Note also that GQ doesn't really distinguish between "bot" and "worm", the goal is to detect self propagation, be it a bot or a worm.
Personally, I prefer the taxonomy:
Worm describes propagation: self propagation from system to system (regardless of speed).
Bot describes payload: a control network of some sort allowing the attacker to utilize the compromised machines in an efficient manner.
Thus I don't really see too much of a conflict between the "worm problem" and "bot problem"; it's the "large scale subversion problem".
Posted by: Nicholas weaver | Sep 13, 2006 11:07:34 AM
Also, you don't need too many addresses for a honeyfarm. We LIKE having a /14, we do see more, but..
IF the worm/bot is uniform, and you have K vulnerable honeypot addresses, and you have some way of excluding all the known crud, you will actually detect the worm very early (roughly when 1/K of the vulnerable systems on the net are infected).
Thus with even 100 addresses, you are doing OK, and with 1000-10,000 addresses, your sensitivity can start getting really good.
Posted by: Nicholas weaver | Sep 13, 2006 11:09:36 AM
Jose,
First of all, great post on the evolution of worms into botnets, etc.
Secondly, the problems you outline are real. Signature-based approaches are reactive. Anomaly-based systems are not reactive, but have lots of false alarms. To overcome these problems, FireEye has developed technology that uses virtual machines in a network-based appliance to identify and protect against zero-day threats, botnets, targeted attacks, etc. FireEye's approach validates threats in a virtual environment, eliminating false positives caused by "funny-looking but benign" traffic. You can learn more from these recent press articles, our web site (http://www.fireeye.com) or I'd be happy to talk with you personally.
http://www.networkworld.com/news/2006/080106-fireeye-nac.html?page=1
http://www.computerwire.com/industries/research/?pid=2A0F7ADB%2D0EBF%2D453C%2DBB4B%2DA21390B532ED
http://www.darkreading.com/document.asp?doc_id=93643
Thanks for the good work on this blog. Keep it up!
Chad
Posted by: Chad Harrington | Sep 13, 2006 7:08:43 PM