Is BGP Update Storm a Sign of Trouble: Observing the Internet Control and Data Planes During Internet Worms
I recall doing some analysis when SQLSlammer hit, looking over the routing data I had available to me. It was a pretty disruptive 36 hours or so, honestly, and looking over the changes in the network, I saw an average of three BGP messages for every network that was affected. This apparently mapped to a BGP withdraw message, an announcement, and then a path update message. What's interesting in this work is that they line up such BGP data with reachability measurements, which I did not have access to at the time.
There are considerable reasons to wish to understand the relationship between the Internet's control and data planes in times of stress. For example, the much publicized Internet worms (Code Red, Nimda and SQL Slammer) caused BGP storms, but there has been comparatively little study of whether the storms impacted network performance. In this paper, we study these worm events and see whether the BGP storms observed during the worms actually corresponded to problems in the Internet's data plane. By processing and analyzing two datasets from RIPE, we have found that while BGP update storms occurred in all three worms, the performance of the data plane degraded during the Slammer worm but did not during the Code Red and the Nimda. No direct correlation should be drawn between the degradation of the Internet data plane and the occurrence of a BGP update storm; it may not be a sign of trouble but a sign of the Internet control plane doing its job.
Source: Is BGP Update Storm a Sign of Trouble: Observing the Internet Control and Data Planes During Internet Worms, by Matthew Roughan, Jun Li, Randy Bush, Zhuoqing Mao and Timothy Griffin, Proceedings of SPECTS 2006.
September 15, 2006 in Nimda, papers, routing, SQLSlammer | Permalink | Comments (2)
Early Detection of BGP Instabilities Resulting from Internet Worm Attacks
This is an interesting proposal, but I'm not sure that routing disruptions are the right place to detect the spread of a worm. After all, the preceding days' worth of posts showed how large the routing disruptions can be, but there's always some BGP disruption going on. What's more, only a small number of worms have truly impacted BGP routing tables.
The increasing incidences of worm attacks in the Internet and the resulting instabilities in the global routing properties of the Border Gateway Protocol (BGP) routers pose a serious threat to the connectivity and the ability of the Internet to deliver data correctly. In this paper we propose a mechanism to detect/predict the onset of such instabilities which can then enable the timely execution of preventive strategies in order to minimize the damage caused by the worm. Our technique is based on online statistical methods relying on sequential change-point and persistence filter based detection algorithms. Our technique is validated using a year's worth of real traces collected from BGP routers in the Internet that we use to detect/predict the global routing instabilities corresponding to the Code Red II, Nimda and SQL Slammer worms.
Source: Early Detection of BGP Instabilities Resulting from Internet Worm Attacks, S. Deshpande, M. Thottan, B. Sikdar.
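To make the "sequential change-point plus persistence filter" idea concrete, here is a minimal sketch in Python. It is a generic one-sided CUSUM over BGP update counts per time bin, not the authors' algorithm; the function name, the parameters (`drift`, `threshold`, `persistence`) and the sample counts are all assumptions made up for illustration.

```python
from statistics import mean, pstdev


def cusum_alarms(counts, baseline_len=10, drift=0.5, threshold=5.0, persistence=3):
    """Return the bin indices at which an alarm is raised.

    A one-sided CUSUM accumulates evidence of an upward shift in the
    update rate. The persistence filter (an illustrative choice) only
    raises an alarm once the statistic is above `threshold` AND the raw
    samples have stayed elevated for `persistence` consecutive bins.
    """
    baseline = counts[:baseline_len]
    mu = mean(baseline)
    sigma = pstdev(baseline) or 1.0  # guard against a perfectly flat baseline
    s = 0.0      # CUSUM statistic
    run = 0      # consecutive bins that look anomalous
    alarms = []
    for i, x in enumerate(counts):
        z = (x - mu) / sigma                 # normalized deviation from baseline
        s = max(0.0, s + z - drift)          # accumulate only upward drift
        run = run + 1 if (s > threshold and z > drift) else 0
        if run == persistence:
            alarms.append(i)
    return alarms


# Hypothetical update counts per bin: a quiet period, then a storm.
quiet = [100, 104, 98, 101, 97, 103, 99, 102, 100, 96]
storm = [400, 900, 1500, 1400, 1300]
updates_per_bin = quiet + [101, 99] + storm
single_spike = quiet + [2000] + quiet

print(cusum_alarms(updates_per_bin))
print(cusum_alarms(single_spike))
```

The second call illustrates the concern raised above: a one-bin burst of ordinary BGP noise should not trip the detector, which is what the persistence filter is for.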
September 30, 2005 in Code Red, detection, Nimda, papers, routing, SQLSlammer | Permalink | Comments (0)
Observation and Analysis of BGP Behavior Under Stress
Continuing with the theme of a worm outbreak's effect on routing, here is a NANOG presentation on the effect of Code Red and Nimda on routing in September 2001.
Despite BGP's critical importance as the de-facto Internet inter-domain routing protocol, there is little understanding of how BGP actually performs under stressful conditions when dependable routing is most needed. In this paper, we examine BGP's behavior during one stressful period, the Code Red/Nimda attack on September 18, 2001.
The attack was correlated with a 30-fold increase in BGP update messages at a monitoring point that peers with a number of Internet service providers. Our examination of BGP's behavior during the event concludes that BGP exhibited no significant abnormality, and that over 40% of the observed updates can be attributed to the monitoring artifact in current BGP measurement settings.
Our analysis, however, does reveal several weak points in both the protocol and its implementation, such as BGP's sensitivity to transport session reliability, its inability to avoid the global propagation of small local changes, and certain implementation features whose otherwise benign effects are only amplified under stressful conditions. We also identify areas for improvement in the current network measurement and monitoring effort.
Source: Observation and Analysis of BGP Behavior Under Stress, Lan Wang, Xiaoliang Zhao, Dan Pei, Randy Bush, Daniel Massey, Allison Mankin, Felix Wu, Lixia Zhang.
September 29, 2005 in Code Red, Nimda, routing, slides | Permalink | Comments (0)
A first look at Saturday’s MS-SQL worm as seen by BGP activity recorded by the RIS project
Here's a very short (10 slides) slide deck presented at the RIPE 44 meeting a couple of years ago: A first look at Saturday's MS-SQL worm as seen by BGP activity recorded by the RIS project, James Aldridge, Arife Vural, RIPE NCC New Projects group. It's some more routing information on the effect of SQLSlammer, showing that while the disruptions were very real and widespread, their impact was minimal when the larger Internet topology is taken into account.
September 28, 2005 in routing, slides, SQLSlammer | Permalink | Comments (0)
Distributed Worm Simulation with a Realistic Internet Model
Another PADS paper, this one by researchers at the University of Delaware. I like this paper because they have developed a simulation framework much like the one I was hoping to have, which models topology and bandwidth together to try to understand worm propagation dynamics.
Internet worm spread is a phenomenon involving millions of hosts, which interact in a complex and diverse environment. Scanning speed of each infected host depends on its resources and the defenses at work in its network. Aggressive worms further interact with the underlying Internet topology: the dynamics of the spread is constrained by the limited bandwidth of network links, and high-volume scan traffic leads to BGP router failure, thus affecting global routing. Worm traffic also interacts with legitimate background traffic competing for (and often winning) the limited bandwidth resources. To faithfully simulate worm spread and other Internet-wide events such as DDoS, flash crowds and spam we need a detailed Internet model, a packet-level simulation of relevant event features, and a realistic model of background traffic on the whole Internet. The memory and CPU requirements of such simulation exceed a single machine's resources, creating a need for distributed simulation. We propose a design and present an implementation of a distributed worm simulator, called PAWS. PAWS runs on the Emulab testbed, which facilitates its use by other researchers. We validate PAWS in a variety of scenarios, and evaluate costs and benefits of distributed worm simulation.
Source: Distributed Worm Simulation with a Realistic Internet Model, Songjie Wei, Jelena Mirkovic, Martin Swany.
April 24, 2005 in modeling, papers, routing | Permalink | Comments (0)
Software Diversity as a Defense against Viral Propagation: Models and Simulations
Several months ago, at the suggestion of David Nicol, we posted the call for papers for the PADS workshop. Nicol wrote me again to say that papers are up, and several worm papers are included in the lineup, and they look very good. The conference, 19th ACM/IEEE/SCS Workshop on Principles of Advanced and Distributed Simulation (PADS 2005), will be held on June 1-3, 2005, in Monterey, CA, USA. Nicol also notes, "We've made the registration costs as low as we can, particularly for students, in order to make it attractive for those in the Monterey, Santa Cruz, and Bay Areas. Important dates to remember are
Over the next few days I'll be posting several of the abstracts and links to the papers. The first one is from two researchers examining a very fundamental question in computer and network security, diversity. O'Donnell is a friend of mine, and I've read various drafts of his work in other areas. It's a rigorous treatment of a relatively simple question with a complex answer.
The use of software diversity has often been discussed in the research literature as an effective means to break up the software monoculture present on the Internet and to thus prevent malcode propagation. However, there have been no quantitative studies that examine the effectiveness of software diversity on viral propagation. In this paper, we study both real (an IPv6 BGP topology) and synthetically generated (an Erdos-Renyi random graph) network topologies and employ a popular metric called the epidemic threshold to measure resistance to viral propagation in the presence of software diversity. We show that one can increase the epidemic threshold of a network even with a naive, random distribution of diverse software on the nodes of a network. We also show that an algorithm-driven diversity assignment further increases the epidemic threshold. These results confirm the value of strategic topology-sensitive assignment of diversity to improving the tolerance of a network to malcode propagation.
Source: Software Diversity as a Defense against Viral Propagation: Models and Simulations, Adam O'Donnell, Harish Sethu. Interested readers will want to see this paper for more background information.
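For readers who want a feel for the metric, here is a rough Python sketch. It uses the commonly cited result that the epidemic threshold of a graph is the reciprocal of the largest eigenvalue of its adjacency matrix, and it assumes (my simplification, not necessarily the paper's model) that malcode only crosses links between nodes running the same software variant. The graph size, edge probability, number of variants and all function names are made up for illustration.

```python
import random


def erdos_renyi(n, p, rng):
    """Undirected Erdos-Renyi random graph as {node: set(neighbors)}."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def same_variant_subgraph(adj, variant):
    """Keep only the edges whose endpoints run the same software variant."""
    return {v: {u for u in nbrs if variant[u] == variant[v]}
            for v, nbrs in adj.items()}


def largest_eigenvalue(adj, iterations=300):
    """Power iteration on A + I; the shift avoids oscillation on bipartite graphs."""
    x = {v: 1.0 for v in adj}
    lam = 0.0
    for _ in range(iterations):
        y = {v: x[v] + sum(x[u] for u in adj[v]) for v in adj}
        top = max(y.values())
        x = {v: val / top for v, val in y.items()}
        lam = top - 1.0  # undo the +I shift
    return lam


def epidemic_threshold(adj):
    lam = largest_eigenvalue(adj)
    return float("inf") if lam <= 0 else 1.0 / lam


rng = random.Random(42)
graph = erdos_renyi(200, 0.05, rng)
# Naive random assignment of three hypothetical software variants.
variant = {v: rng.randrange(3) for v in graph}

monoculture = epidemic_threshold(graph)
diverse = epidemic_threshold(same_variant_subgraph(graph, variant))
print(f"monoculture threshold: {monoculture:.3f}")
print(f"random 3-variant threshold: {diverse:.3f}")
```

Removing edges can never increase the largest eigenvalue, so under this simplification even a random assignment cannot lower the threshold; how much it raises it, and how much better a topology-aware assignment does, is what the paper quantifies.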
April 23, 2005 in modeling, papers, routing | Permalink | Comments (1)
The Routing Worm
Most well-known Internet worms, such as Code Red, Slammer, and Blaster, infected vulnerable computers by scanning the entire Internet IPv4 space. In this paper, we present a new scan-based worm called "routing worm", which can use information provided by BGP routing tables to reduce its scanning space without ignoring any potential vulnerable computer. In this way, a routing worm can propagate twice to more than three times faster than a traditional worm. In addition, the geographic information of allocated IP addresses, especially BGP routing prefixes, enables a routing worm to conduct fine-grained selective attacks: hackers or terrorists can selectively impose heavy damage to vulnerable computers in a specific country, an Internet Service Provider, or an Autonomous System, without much collateral damage done to others. Routing worms can be easily implemented by attackers and they could cause considerable damage to our Internet. Since routing worms are scan-based worms, we believe that an effective way to defend against them and all other scan-based worms is to upgrade IPv4 to IPv6: the vast address space of IPv6 (2^64 IP addresses for a single subnetwork) can prevent a worm from spreading through scanning.
Source: Routing Worm: A Fast, Selective Attack Worm based on IP Address Information, Cliff C. Zou, Don Towsley, Weibo Gong, and Songlin Cai.
Unlike many worst case analyses, this one actually performs some analysis and is a worthwhile read. What typically bothers me about "worst case worms" is that they are based on conjecture, untested proposals, and untenable assumptions (i.e. infinite bandwidth, complete knowledge of the topology). This paper doesn't suffer greatly from these problems, and is better for its efforts to test the conjecture.
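The IPv6 argument in the abstract is easy to check with back-of-the-envelope arithmetic. The snippet below is only that: the aggregate scan rate is a number I picked for illustration, not a figure from the paper.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 365 * 24 * 3600


def seconds_to_scan(address_count, scans_per_second):
    """Time to probe every address once at a fixed aggregate scan rate."""
    return address_count / scans_per_second


# Hypothetical aggregate scan rate across all infected hosts (an assumption).
rate = 1_000_000  # probes per second

ipv4_space = 2 ** 32    # the entire IPv4 address space
ipv6_subnet = 2 ** 64   # a single IPv6 subnetwork, as in the abstract

ipv4_hours = seconds_to_scan(ipv4_space, rate) / SECONDS_PER_HOUR
ipv6_years = seconds_to_scan(ipv6_subnet, rate) / SECONDS_PER_YEAR

print(f"IPv4, whole space: {ipv4_hours:.2f} hours")
print(f"IPv6, one subnet:  {ipv6_years:,.0f} years")
```

At that assumed rate the whole IPv4 space is covered in a bit over an hour, while a single IPv6 subnet takes on the order of half a million years, which is the gap the authors are relying on.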
January 14, 2005 in papers, routing | Permalink | Comments (0)
Routing Instabilities and Worms
Several studies have been performed on global routing instabilities caused by worm outbreaks.
Instability can be induced from without as well as within. In July 2001 the Internet worm known as Code Red 2 (CRv2) spread across the globe. It exploited holes in the Microsoft IIS server to gain access and port itself to a machine. Once on-board it randomly chose IP addresses looking for other computers running IIS with the same access vulnerability, and continued the propagation. The intensity of the attack could be monitored as a function of time by sniffing network traffic in a given domain for scans (i.e. a particular type of connection request) that were characteristic of the worm. Another worm, Nimda, struck in September 2001. It propagated in a similar fashion, using a larger arsenal of vulnerabilities and a faster means of spreading across the Internet. Temporal correlation between these two worm attacks and routing instability was noticed by researchers at Renesys Corporation[2]. Figure 1 illustrates an example that plots the number of route withdrawals per 30 second time interval observed at one router, against the rate of "probes" generated by Nimda in the same time period, observed in one network. It turns out that this spike in withdrawals occurs in a wave, across routers situated all over the globe. Recall that a router withdraws a route to an AS only when it cannot reach that AS. The global wave shows that somehow the traffic induced by the worm caused a wave of distress among the routers.
Source: Challenges In Using Simulation to Explain Global Routing Instabilities, David M. Nicol.
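The comparison described above (withdrawals per 30 second interval against worm probe rate) is simple to reproduce if you have timestamped data. Here is a minimal sketch; the timestamps are invented, and the function names are mine rather than anything from the paper.

```python
from collections import Counter
from math import sqrt


def bin_counts(timestamps, start, end, width=30):
    """Count events per fixed-width bin over [start, end); times in seconds."""
    n_bins = -(-(end - start) // width)  # ceiling division
    counts = Counter(int((t - start) // width)
                     for t in timestamps if start <= t < end)
    return [counts.get(i, 0) for i in range(n_bins)]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


# Invented event times (seconds since the start of the observation window).
withdrawal_times = [5, 12, 31, 35, 40, 61, 62, 63, 64, 95]
probe_times = [3, 33, 36, 38, 60, 65, 70, 75, 80, 100]

withdrawals = bin_counts(withdrawal_times, start=0, end=120)
probes = bin_counts(probe_times, start=0, end=120)
print(withdrawals, probes, round(pearson(withdrawals, probes), 3))
```

A high coefficient only says the two series rise and fall together in time; as the excerpt notes, the mechanism behind it still has to be explained.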
Additional measurements have been made for specific worm outbreaks:
This paper examines the surge in BGP updates that coincide with events such as the recent Internet worm attacks. Although the Internet routing infrastructure was not a direct target of the January 2003 Slammer worm attack, the worm attack coincided in time with a large increase in the number of BGP routing update messages observed globally. Our analysis shows that the current global routing protocol BGP allows local connectivity dynamics to propagate globally. As a result, any small number of edge networks can potentially cause wide-scale routing overload. For example, two small edge ASes, which announced less than 0.25% of BGP routing table entries, contributed over 6% of total update messages during the worm attack as observed at the major monitoring points. Although BGP route flap damping has been proposed to eliminate such undesirable global consequences of edge instability, our analysis shows that damping has not been fully deployed even within the Internet core. Our simulation further reveals that partial deployment of BGP damping not only has limited effect but may also worsen the routing performance under certain topological conditions. The results show that it remains a research challenge to design a routing protocol that can prevent local dynamics from triggering global messages in order to scale well in a large, dynamic environment.
Source: Analysis of BGP Update Surge during Slammer Worm Attack, Mohit Lad, Xiaoliang Zhao, Beichuan Zhang, Dan Massey, and Lixia Zhang.
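Since route flap damping comes up in this abstract, here is a small sketch of the basic mechanism as I understand it: each flap adds a penalty, the penalty decays exponentially, and the route is suppressed between a suppress limit and a reuse limit. The class name and the numeric values are illustrative assumptions, not parameters from the paper or from any particular router.

```python
class DampedRoute:
    """Penalty bookkeeping for one route under flap damping (illustrative)."""

    def __init__(self, penalty_per_flap=1000.0, suppress=2000.0,
                 reuse=750.0, half_life=900.0):
        self.penalty_per_flap = penalty_per_flap
        self.suppress = suppress      # suppress the route above this penalty
        self.reuse = reuse            # re-advertise once decayed below this
        self.half_life = half_life    # seconds for the penalty to halve
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # Exponential decay since the last time the penalty was touched.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False

    def flap(self, now):
        """Record a flap at time `now` (seconds); return the suppression state."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty > self.suppress:
            self.suppressed = True
        return self.suppressed

    def is_suppressed(self, now):
        self._decay(now)
        return self.suppressed


route = DampedRoute()
states = [route.flap(t) for t in (0, 60, 120)]   # three flaps, a minute apart
later = [route.is_suppressed(t) for t in (1020, 1920)]
print(states, later)
```

With these made-up numbers the third flap in two minutes suppresses the route, and it stays suppressed for roughly half an hour afterwards, which hints at why partial deployment can interact badly with legitimate convergence.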
We speculate that, although most of the traffic in the Internet continued to flow normally through the small fraction of links that make up the global backbones, most of the links at the Internet edge had serious performance problems during the worms' probing and propagation phases. A complete list of reasons still needs to be documented, but we suspect i) congestion-induced failures of BGP sessions due to timeouts; ii) flow-diversity induced failures of BGP sessions due to router CPU overloads; iii) proactive disconnection of certain networks; and iv) failures of other equipment at the Internet edge such as DSL routers and other devices.
Source: Global Routing Instabilities during Code Red II and Nimda Worm Propagation, James Cowie, Andy Ogielski, BJ Premore and Yougu Yuan.
Given the number of worm outbreaks in the past four years, global routing instability rarely ensues. However, when it does happen, several thousand BGP routes are dropped and up to 36 hours pass before stability is fully restored. An interesting phenomenon.
December 10, 2004 in Nimda, papers, routing | Permalink | Comments (0)
More Papers on Worm Outbreak Simulations and Modeling
Some more on the topic of simulating worm outbreaks that we posted on Saturday. Here are a few papers on the subject covering the two tools I listed (SSFNet, ns2) and others, as well as theoretical modeling of worm propagation. What this provides researchers with is a way to understand the likely impact of a new worm, and how we can change our network topologies and infrastructures to be resistant to worm outbreaks and even integrate defensive tools into the networks.
Abstract modeling, such as using epidemic models, has been the general method of choice for understanding and analyzing the high-level effects of worms. However, high-fidelity models, such as packet-level models, are indispensable for moving beyond aggregate effects, to capture finer nuances and complexities associated with known and future worms in realistic network environments. Here, we first identify the spectrum of available alternatives for worm modeling, and classify them according to their scalability and fidelity. Among them, we focus on three high-fidelity methods for modeling of worms, and study their effectiveness with respect to scalability. Employing these methods, we are then able to, respectively, achieve some of the largest packet-level simulations of worm models to date; implant and attack actual worm monitoring/defense installations inside large simulated networks; and identify a workaround for real-time requirement that fundamentally constrains worm modeling at the highest fidelity levels.
Source: High-Fidelity Modeling of Computer Network Worms, Kalyan S. Perumalla, Srikanth Sundaragopalan.
In this paper, we investigate epidemiological models to reason about computer viral propagation. We extend the classical homogeneous models to incorporate two timing parameters: Infection delay and user vigilance. We show that these timing parameters greatly influence the propagation of viral epidemics, and that the explicit treatment of these parameters gives rise to a more realistic and accurate propagation model. We validate the new model with simulation analysis.
Source: Modeling the Effects of Timing Parameters on Virus Propagation, Yang Wang, Chenxi Wang.
Email viruses constitute one of the major Internet security problems. In this paper we present an email virus model that accounts for the behaviors of email users, such as email checking frequency and the probability of opening an email attachment. Email viruses spread over a logical network defined by email address books. The topology of email network plays an important role in determining the behavior of an email virus spreading. Our observations suggest that the node degrees in an email network are heavy-tailed distributed and we model it as a power law network. We compare email virus propagation on three topologies: power law, small world and random graph topologies. The impact of the power law topology on the spread of email viruses is mixed: email viruses spread more quickly than on a small world or a random graph topology but immunization defense against viruses is more effective on a power law topology.
Source: Email Virus Propagation Modeling and Analysis, Cliff C. Zou, Don Towsley, Weibo Gong.
An unexpected consequence of recent worm attacks on the Internet was that the routing infrastructure showed evidence of increased BGP announcement churn. As worm propagation dynamics are a function of the topology of a very large-scale network, a faithful simulation model must capture salient features at a variety of resolution scales. This paper describes our efforts to model worm propagation and its effect on routers and application traffic. Using our implementations of the Scalable Simulation Framework (SSF) API, we model worm propagation, its effect on the routing infrastructure and its effect on application traffic using multiscale traffic models.
Source: Multiscale modeling and simulation of worm effects on the internet routing infrastructure, David M. Nicol, Michael Liljenstam, and Jason Liu.
Reproducing the effects of large-scale worm attacks in a laboratory setup in a realistic and reproducible manner is an important issue for the development of worm detection and defense systems. In this paper, we describe a worm simulation model we are developing to accurately model the large-scale spread dynamics of a worm and many aspects of its detailed effects on the network. We can model slow or fast worms with realistic scan rates on realistic IP address spaces and selectively model local detailed network behavior. We show how it can be used to generate realistic input traffic for a working prototype worm detection and tracking system, the Dartmouth ICMP BCC: System/Tracking and Fusion Engine (DIB:S/TRAFEN), allowing performance evaluation of the system under realistic conditions. Thus, we can answer important design questions relating to necessary detector coverage and noise filtering without deploying and operating a full system. Our experiments indicate that the tracking algorithms currently implemented in the DIB:S/TRAFEN system could detect attacks such as Code Red v2 and Sapphire/Slammer very early, even when monitoring a quite limited portion of the address space, but more sophisticated algorithms are being constructed to reduce the risk of false positives in the presence of significant "background noise" scanning.
Source: Simulating realistic network worm traffic for worm warning system design and testing, Michael Liljenstam, David M. Nicol, Vincent H. Berk, Robert S. Gray. (Note that this article requires payment to access.)
Large-scale worm infestations, such as last year's Code Red, Code Red II, and Nimda, have led to increased interest in modeling these events to assess threat levels, evaluate countermeasures and investigate possible influence on the Internet infrastructure. However, the inherently large scale of these phenomena poses significant challenges for models that include infrastructure detail. We explore the use of selective abstraction through epidemiological models in conjunction with detailed protocol models as a means to scale up simulations to a point where we can ask meaningful questions regarding a hypothesized link between worms and inter-domain routing instability. We find that this approach shows significant promise, in contrast to some of our early attempts using all-out packet level models. We also describe some approaches we are taking to collect the underlying data for our models.
Source: A Mixed Abstraction Level Simulation Model of Large-Scale Internet Worm Infestations, Michael Liljenstam, Yougu Yuan, BJ Premore, David Nicol.
The Code Red worm incident of July 2001 has stimulated activities to model and analyze Internet worm propagation. In this paper we provide a careful analysis of Code Red propagation by accounting for two factors: one is the dynamic countermeasures taken by ISPs and users; the other is the slowed down worm infection rate because Code Red's rampant propagation caused congestion and troubles to some routers. Based on the classical epidemic Kermack-McKendrick model, we derive a general Internet worm model called the two-factor worm model. Simulations and numerical solutions of the two-factor worm model match the observed data of Code Red worm better than previous models do. This model leads to a better understanding and prediction of the scale and speed of Internet worm spreading.
Source: Code red worm propagation modeling and analysis, Cliff Changchun Zou, Weibo Gong, Don Towsley.
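As a reference point for the abstract above, here is a minimal sketch of the classical Kermack-McKendrick (susceptible, infected, removed) model that the paper starts from. It is not the two-factor model itself, and the population size, rates and step size are hypothetical values chosen only to produce a recognizable curve.

```python
def kermack_mckendrick(n_hosts, infected0, beta, gamma, dt, steps):
    """Euler integration of dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I,
    dR/dt = gamma*I. Returns a list of (S, I, R) tuples, one per step."""
    s, i, r = float(n_hosts - infected0), float(infected0), 0.0
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt   # susceptible hosts that get infected
        new_removals = gamma * i * dt        # infected hosts patched or taken offline
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


# Hypothetical parameters: 100,000 vulnerable hosts, 10 initially infected,
# time measured in hours, 40 hours simulated.
N = 100_000
history = kermack_mckendrick(n_hosts=N, infected0=10,
                             beta=1.8 / N, gamma=0.1, dt=0.01, steps=4000)

peak_infected = max(i for _, i, _ in history)
final_s, final_i, final_r = history[-1]
print(f"peak infected: {peak_infected:,.0f}")
print(f"still infected after 40 h: {final_i:,.0f}")
```

The removal term is the part that makes the infected count fall off after the peak; the paper's two factors (countermeasures and a congestion-slowed infection rate) refine this basic shape further.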
In recent years, fast spreading worms have become one of the major threats to the security of the Internet. In order to defend against future worms, it is important to understand how worms propagate and how different scanning strategies affect their propagation. In this paper, we model and analyze worm propagation under various scanning strategies, such as idealized scan, uniform scan, divide-and-conquer scan, local preference scan, sequential scan, target scan, etc. We also analyze and discuss how attackers could optimize their scanning strategies, and provide some guidelines for building up a monitoring infrastructure to defend against future worms.
Source: On the Performance of Internet Worm Scanning Strategies, Cliff Changchun Zou, Don Towsley, Weibo Gong.
November 18, 2004 in mass mailers, modeling, papers, routing | Permalink | Comments (3)