On the Impact of Dynamic Addressing on Malware
Propagation
Moheeb Abu Rajab, Fabian Monrose, Andreas Terzis
Computer Science Department
Johns Hopkins University
{moheeb,fabian,terzis}@cs.jhu.edu
ABSTRACT
While malware models have become increasingly accurate over the past few years, none of the existing proposals accounts for the use of Network Address Translation (NAT). This oversight is problematic since many network customers use NAT in their local networks. In fact, measurements we collected from a distributed honeynet show that approximately 19% of the infected hosts reside in NATted domains. To account for this fact, we present a model that can be used to understand the impact of varying levels of NAT deployment on malware that spreads by preferentially scanning the IP space. Using this model, we show that NATting impedes malware propagation in several ways and can have a significant impact on non-uniform scanning worms, as it invalidates the implicit assumption that vulnerable hosts reside in densely populated subnets.
Categories and Subject Descriptors
D.4.6 [Operating Systems]: Security and Protection - Invasive Software
General Terms
Security, Measurement
Keywords
Network Security, Internet worms, Network Address Translation, Private Address Space
1. INTRODUCTION
The research community has been on a quest over the past several years to discover ways to accurately capture the spreading behavior of malware on the Internet. Understanding the intricacies of such behavior continues to be an important problem because the resulting insights are invaluable when designing and evaluating malware countermeasures. Indeed, analysis of past outbreaks has led to a deeper understanding of malware dynamics, and the findings have already been incorporated in a number of analytical models (e.g., [8, 15, 20]).
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
WORM'06, November 3, 2006, Alexandria, Virginia, USA.
Copyright 2006 ACM 1-59593-551-7/$5.00.
However, all of the models that have been presented thus far assume that the infection views on both sides of a network boundary are identical. Unfortunately, the widespread deployment of firewalls, coupled with the use of Network Address Translation (NAT), severely distorts the two views and can lead to inaccurate model predictions. In this paper, we explore the influence of NAT on the spreading of malware that uses non-uniform and localized scanning to spread. Our exposition is based on a refined model that incorporates the fact that many vulnerable hosts are deployed in private address spaces.

To gauge the impact of address translation, we first estimate the number of infected sources located in private address spaces by analyzing traces collected from a conglomeration of network telescopes. As we show, dynamic addressing is a fairly common practice: approximately 19% of the sources in our trace reside in NATted domains. The model we develop shows that, at this level of usage, address translation techniques introduce significant skew in the prediction capabilities of existing malware spreading models. The predictions will increasingly depart from reality as NAT usage grows.

The rest of the paper is organized as follows: In Section 2 we elaborate on the impact of NAT on malware infections and the challenges it creates for accurate forensic analysis. Section 3 presents our data collection efforts and the methodology we use to infer the prevalence of NATted sources. In Section 4 we provide the analytical model, and in Section 5 we use it to examine the impact of varying levels of NAT deployment on malware spreading. We present related work in Section 6 and conclude in Section 7.
2. OVERVIEW
It should come as no surprise that the use of private address space and network address translation techniques influences how malware spreads. First, NAT devices reduce the percentage of vulnerable hosts that are globally reachable. The reason is that these devices block connection attempts that originate from the outside by default, thus protecting internal vulnerable hosts from external infections. Even when port forwarding is enabled (usually to allow specific services to be accessible from the global Internet), only a subset of the potentially vulnerable hosts is visible to external malware scans. Second, when a new host inside a private address space is compromised, NATting affects how efficiently this host can find other vulnerable hosts. This is especially true for malware that spreads through preferential scanning, including non-uniform scanning (e.g., CodeRed-II [6], Nimda [7], and MSBlaster [11]) and localized scanning, in which infected hosts (predominantly) scan their local address prefix. Recently, Rajab et al. [17] showed that localized scanning is widely used by botnets, and hence models that capture localized behavior may become increasingly important in the near term.

[Figure 1: A multi-stage malware infection. Labels: source, victim, Stage I, Stage II.]
The mere fact that NATted hosts are usually located in large address spaces (e.g., 10/8, 192.168/16) causes preferential scanning malware to divert the majority of its scans towards the NATted space rather than the globally routable IP space. While the infected machine can still contaminate other vulnerable hosts within the private address space, locating these hosts can take a prohibitively long time. This slowdown in infection speed arises because the density of active hosts within private address spaces is orders of magnitude lower than the host density in the global address space. This is certainly the case when a private /8 address prefix (e.g., 10/8) is used. Networks that use /16 private address spaces induce another interesting behavior: preferential scans from infected hosts in those networks will not only target the NATted space, but also contact the encompassing routable /8 prefix. The net effect is that these parts of the IP space will receive a disproportionate percentage of scans by several kinds of malware¹. While this creates an attractive measurement hot-spot, as reported in [4, 9], the increased traffic is an annoyance to the networks operating in that prefix.

The use of NAT poses another obstacle to malware that employs a multi-stage infection process. This multi-stage infection process, shown in Figure 1, is a common occurrence in botnets [17]. In the first stage, a remotely exploitable vulnerability is used to transfer shellcode that instructs the victim to initiate a connection back to the infector's IP address to download the actual malware binary. The download constitutes the second stage of the exploit and usually occurs through a file transfer protocol such as TFTP. If, however, the infector is located behind a NAT device, the provided address points to a globally unreachable IP address, thereby causing the second-stage transfer to fail.
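The second-stage failure described above can be illustrated with a minimal Python sketch. It assumes that the shellcode embeds the infector's own interface address as the connect-back target; the helper name `connect_back_reachable` and the sample addresses are hypothetical, and the check only tests membership in the RFC 1918 private ranges rather than modeling an actual NAT device.

```python
import ipaddress

# RFC 1918 private ranges, i.e. the address spaces typically used behind NAT.
PRIVATE_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def connect_back_reachable(infector_addr: str) -> bool:
    """Return True if a victim outside the infector's network could
    connect back to infector_addr (hypothetical helper)."""
    addr = ipaddress.ip_address(infector_addr)
    return not any(addr in net for net in PRIVATE_NETS)


if __name__ == "__main__":
    # Sample addresses are illustrative only.
    for candidate in ["192.168.1.20", "10.4.5.6", "198.51.100.7"]:
        print(candidate, connect_back_reachable(candidate))
```

Under this assumption, a connect-back address drawn from any of the three private ranges is unusable by a victim on the public Internet, which is the condition under which the second stage fails.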
Aside from slowing the spread of malware, NATting poses several challenges to forensic analysis of malware [13, 16]. These challenges are related to the difficulty of uniquely identifying NATted hosts in the absence of explicit information (e.g., [3, 4]). On one hand, a group of infected hosts behind a NAT device with a single public address will appear at a network monitor as a single infected host, leading the monitor to underestimate the number of infected hosts. Conversely, a few hosts behind a NAT device with a large number of external addresses can inflate the estimate because subsequent scans from the same infected host will most likely be mapped to several source addresses as they are rewritten by the NAT device. Shannon et al. conjectured that this was indeed the case for a set of addresses observed in the Witty worm outbreak [18].
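The two counting errors can be made concrete with a small Python sketch. It assumes a NAT device that rewrites each outgoing scan to the next address of its public pool in round-robin order; this mapping policy, the function name `distinct_sources_seen`, and the addresses are illustrative assumptions, not a description of any particular device.

```python
from itertools import cycle


def distinct_sources_seen(num_hosts, public_pool, scans_per_host):
    """Count the distinct source addresses a monitor would observe when
    every scan is rewritten to the next pool address (assumed policy)."""
    pool = cycle(public_pool)
    seen = set()
    for _ in range(num_hosts * scans_per_host):
        seen.add(next(pool))
    return len(seen)


# Five infected hosts behind one public address look like a single source.
single = distinct_sources_seen(5, ["203.0.113.1"], 10)

# Two infected hosts behind a pool of eight addresses look like eight sources.
pool8 = [f"203.0.113.{k}" for k in range(1, 9)]
many = distinct_sources_seen(2, pool8, 10)

print(single, many)  # 1 8
```

In the first case the monitor undercounts (five hosts, one source); in the second it overcounts (two hosts, eight sources).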
In the next section we derive an initial estimate of the prevalence of NAT in malware traces. In Section 4 we quantitatively analyze the impact of NAT on the spreading of different classes of malware.
² Casado et al. themselves acknowledged that the inferred NAT ratio did not include hosts that use 10/8 and 172.16/12 addresses and is not generalizable to the overall NAT usage on the Internet.
4. ANALYSIS OF THE IMPACT OF NAT ON MALWARE PROPAGATION
We present a model that predicts the evolution of malware infections, accounting for the effect of NAT deployment in the Internet. The proposal is an extension to a model we previously developed to study the impact of vulnerable population distributions on Internet infections [15]. We consider the general case in which malware instances apply preferential scanning, using different probabilities to locate and exploit victims in their surrounding /16 and /8 prefixes, as well as random scanning to find victims in the global Internet.

We account for the effect of NATting by dividing the vulnerable population into two categories: (i) the publicly reachable vulnerable population, which includes vulnerable hosts with public IP addresses in addition to NATted vulnerable hosts that are nevertheless publicly reachable (e.g., due to port forwarding), and (ii) the vulnerable population that resides behind NAT devices and is inaccessible from the public Internet.

[Figure 2: The incoming scanning activity to a single /16 prefix with NATted domains.]
Figure 2 illustrates the scanning activity that reaches a routable /16 prefix containing a number of NATted domains. The number of incoming scans in this case is simply the sum of the scans from infected hosts within that prefix (indicated as P16 in Figure 2), from infected hosts within the encompassing /8 prefix (indicated as P8), and from the entire infected population (the P0 component). Observe that in the case of the NATted infectees, the encompassing prefixes will be those of the private addresses rather than their external routable space. As a result, preferential scans from these hosts will be diverted towards private (un-routable) space. For this reason, the number of incoming scans into each routable /16 prefix excludes any preferential scanning activity originating from NATted hosts. Using the notation from Table 1, the sum of the three scanning components above can be written as:
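\[
C_{i,j} = P_{16}\, s\, (I_{i,j} - N_{i,j}) + \frac{P_{8}\, s\, \bigl(I^{(/8)}_{i,j} - N^{(/8)}_{i,j}\bigr)}{2^{8}} + \frac{P_{0}\, s\, I_{i}}{2^{16}} \tag{1}
\]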
In this case, $P_{16}$, $P_{8}$, and $P_{0}$ are the probabilities that an infected host will send a scan to the encompassing /16 prefix, the encompassing /8 prefix, and the entire Internet, respectively³. $I_{i,j}$ is the number of infected hosts within the $j$th /16 prefix at time $i$; $I^{(/8)}_{i,j}$ is defined similarly for the surrounding /8 prefix. $N_{i,j}$ is the total number of infected NATted hosts that are publicly reachable within the $j$th /16 prefix, and $N^{(/8)}_{i,j}$ is the total number of infected NATted and reachable hosts in the /8 prefix surrounding the $j$th /16 prefix.
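As a sanity check on the three scanning components, the following Python sketch evaluates the incoming scan count for one routable /16 prefix. It assumes the three-term form in which the /8 component is spread evenly over the 2^8 /16 prefixes of the surrounding /8 and the random component over the 2^16 /16 prefixes of the IPv4 space; the function name and all numeric inputs are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def incoming_scans(p16, p8, p0, s, I_ij, N_ij, I8_ij, N8_ij, I_i):
    """Expected scans entering one routable /16 prefix in a time tick.

    Assumed form: local /16 scans land entirely in the prefix, /8 scans
    are spread over 2**8 /16 prefixes, random scans over 2**16 of them.
    NATted infected hosts are excluded from the two preferential terms.
    """
    from_slash16 = p16 * s * (I_ij - N_ij)
    from_slash8 = p8 * s * (I8_ij - N8_ij) / 2**8
    from_random = p0 * s * I_i / 2**16
    return from_slash16 + from_slash8 + from_random


# Hypothetical inputs: 12 infected hosts in the /16 (4 of them NATted),
# 516 in the /8 (4 NATted), and 65,536 infected hosts overall.
example = incoming_scans(p16=0.5, p8=0.25, p0=0.25, s=10,
                         I_ij=12, N_ij=4, I8_ij=516, N8_ij=4, I_i=65536)
print(example)  # 40 + 5 + 2.5 = 47.5
```

Even in this toy setting the local /16 term dominates, which is why removing NATted hosts from the preferential terms has a visible effect on the scan volume a prefix receives.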
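Table 1: Notation.

  Symbol      Description
  I_i         Total number of infected hosts at time i
  s           Scanning rate of an infected host
  P_0         Probability of scanning a random address
  P_8         Probability of scanning an address within the same /8 prefix as the infectee
  P_16        Probability of scanning an address within the same /16 prefix as the infectee
  V_j         Initial number of vulnerable hosts in the jth /16 prefix
  I_{i,j}     Number of infected hosts in the jth /16 prefix at time i
  C_{i,j}     Total number of incoming scans into the jth /16 prefix
  T_j         Total number of NATted networks within the jth /16 prefix
  L_i         Total number of scans within a particular NATted network
  f           Initial number of vulnerable hosts in a particular NATted network
  d_i         Number of infected hosts in a particular NATted network

The $C_{i,j}$ scans will infect members of the first population category. The expected number of infected hosts in the $j$th /16 prefix at time $i+1$ is then:

\[
I_{i+1,j} = I_{i,j} + (V_{j} - I_{i,j}) \left[ 1 - \left( 1 - \frac{1}{2^{16}} \right)^{C_{i,j}} \right] \tag{2}
\]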
in which $V_j$ is the initial number of vulnerable hosts in the $j$th /16 prefix.
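The per-prefix update can be sketched in a few lines of Python. The sketch assumes the AAWP-style form in which each of the incoming scans hits a given address of the /16 prefix with probability 1/2^16; the function name `next_infected` and the inputs are hypothetical.

```python
def next_infected(I_ij, V_j, C_ij, prefix_size=2**16):
    """One time tick of the assumed per-prefix update.

    hit is the probability that a given address in the prefix is touched
    by at least one of the C_ij incoming scans during the tick.
    """
    hit = 1.0 - (1.0 - 1.0 / prefix_size) ** C_ij
    return I_ij + (V_j - I_ij) * hit


# Hypothetical prefix: 100 vulnerable hosts, 10 already infected,
# and a constant 1,000 incoming scans per tick.
infected = 10.0
for tick in range(5):
    infected = next_infected(infected, V_j=100, C_ij=1000)
    print(tick + 1, round(infected, 3))
```

With no incoming scans the infected count stays where it is, and it can never exceed the vulnerable population of the prefix, which are the two boundary behaviors one would expect from the update.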
In addition to the infections due to the scanning activity in the public IP space, infected hosts within NATted domains will infect other vulnerable hosts within the same private space, including vulnerable hosts from the second population category (i.e., publicly inaccessible vulnerable hosts), assuming, of course, that no internal countermeasures, such as “hard-LANs” [19], are locally deployed.

If we consider NATted domains that use /8 private addresses⁵ and assume, for simplicity, that hosts in the private spaces are co-located in the same /16 private address prefix, then the number of scans within that network domain can be written as:
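\[
L_{i} = s\, d_{i} \left( P_{16} + \frac{P_{8}}{2^{8}} \right)
\]

in which $d_i$ is the number of infected hosts within a given NATted network. Therefore, the number of additional infections within a single private address space can be expressed as:

\[
d_{i+1} = d_{i} + (f - d_{i}) \left[ 1 - \left( 1 - \frac{1}{2^{16}} \right)^{L_{i}} \right] \tag{3}
\]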
⁴ To isolate the impact of NAT we do not consider the node removal rate due to patching or failure.
⁵ As we show in Section 5, using NAT with /16 addresses (e.g., 192.168/16) causes the $P_8$ scan component from all NATted infected hosts to target a single prefix (i.e., 192/8).
[Table 2: Analysis parameters. Entries in extraction order: Number of vulnerable hosts; 10 scans/sec per infected host; 100; Local domain size; 1 host per NATted domain; 10 per experiment.]
\[
I_{i+1} = \sum_{j=1}^{2^{16}} \left( I_{i+1,j} + \sum_{l=1}^{T_{j}} d_{i+1,l} \right) \tag{4}
\]
in which $T_j$ is the number of NAT domains in the $j$th /16 prefix. Notice that private space is usually sparsely populated, so the infection rate within that space is substantially slower than the global infection rate. In the next section, we show that this effect reduces the overall propagation speed of malware and could have a considerable impact as NAT deployment increases.
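The private-space update and the final aggregation can be sketched as follows. The sketch rests on two assumptions that should be read as such: that only the fraction 1/2^8 of the /8-directed scans falls into the /16 private prefix where the hosts are co-located, and that each scan hits a given private address with probability 1/2^16. The function names and the container layout (dictionaries keyed by prefix index) are hypothetical.

```python
def local_scans(s, d_i, p16, p8):
    """Scans that stay inside one NATted network's /16 private prefix
    (assumes 1/2**8 of the /8-directed scans land in that /16)."""
    return s * d_i * (p16 + p8 / 2**8)


def next_private_infected(d_i, f, L_i):
    """Assumed update for one NATted network holding f vulnerable hosts."""
    hit = 1.0 - (1.0 - 1.0 / 2**16) ** L_i
    return d_i + (f - d_i) * hit


def total_infected(public_by_prefix, private_by_prefix):
    """Sum public infections per /16 prefix plus the infections inside
    every NATted network located in that prefix."""
    total = 0.0
    for j, public in public_by_prefix.items():
        total += public + sum(private_by_prefix.get(j, []))
    return total


# Hypothetical NATted network: 10 vulnerable hosts, 1 infected, 10 scans/sec.
L = local_scans(s=10, d_i=1, p16=0.5, p8=0.5)
print(next_private_infected(1, 10, L))  # barely above 1: sparse space, slow spread
print(total_infected({0: 5.0, 1: 3.0}, {0: [1.0, 2.0]}))  # 11.0
```

Because a handful of hosts share 2^16 private addresses, the per-tick gain in the first printed value is tiny, which is the slowdown the paragraph above attributes to sparsely populated private space.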
Finally, the observant reader will note that for malware that uniformly scans the entire IP space, the address of any particular infected machine does not impact its scanning behavior. The only impact of NATting in this case is that it decreases the reachable vulnerable population. Therefore, for the remainder of the paper we only evaluate the impact of address translation on the evolution of malware that uses preferential scanning.
5. EVALUATION
We now make use of the model presented in Section 4 to evaluate the impact that NAT has on malware spreading. As we have previously shown in [15], analytical models must use realistic vulnerable population distributions if they are to accurately model the behavior of worm outbreaks. For this reason, we drive our evaluation with a vulnerable population distribution extracted from a real dataset. In particular, the dataset is provided by DShield [12] and contains intrusion traces collected over a period of three months from over 1,600 intrusion detection systems distributed around the globe. Given that the logs were obtained from IDS reports, it is safe to assume that they represent unwanted traffic originating either from compromised hosts or active scanners. We construct a vulnerable host set by extracting the sources that attempt connections to port 80⁶. Overall, the data contains 632,472 such sources.

We emulate the impact of NAT by segmenting the set of vulnerable hosts into different network domains. For simplicity, we assume that all domains reside in equal-sized /24 public address prefixes. We acknowledge that this is not necessarily the case in the Internet today and different domain sizes can alter the rate of malware evolution⁷. Incorporating more realistic domain size distributions is part of our ongoing work.
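The segmentation step can be illustrated with a short Python sketch that groups source addresses by their enclosing /24 prefix. The function name `group_by_slash24` and the sample addresses (taken from documentation ranges) are hypothetical; the sketch shows only the grouping, not how the actual trace was processed.

```python
import ipaddress
from collections import defaultdict


def group_by_slash24(addresses):
    """Map each /24 prefix to the list of addresses that fall inside it."""
    domains = defaultdict(list)
    for addr in addresses:
        # strict=False lets ip_network mask off the host bits for us.
        net = ipaddress.ip_network(f"{addr}/24", strict=False)
        domains[str(net)].append(addr)
    return dict(domains)


sources = ["192.0.2.17", "192.0.2.200", "198.51.100.5"]
print(group_by_slash24(sources))
```

Each resulting group plays the role of one equal-sized network domain, which can then be marked as NATted or public when emulating a given deployment level.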
We assume that each NATted network has one vulnerable host that is publicly accessible. This reflects common network administration practices in which a number of hosts behind a NAT device are made accessible so that certain services (e.g., web servers) are publicly available. The remaining vulnerable hosts are unreachable by external scans and can therefore only be contaminated by scans
[Figure 4: Non-uniform scanning worm spreading for different levels of NAT deployment. Axes: percentage of infected hosts vs. time (sec.). Curves: no NAT, 20%, 40%, 60%, 80%.]
[Figure 5: Non-uniform scanning worm spreading in the 192/8 prefix in the case of zero and 50% NATted domains. Axes: percentage of infected hosts vs. time (sec.). Curves: 192/8 with 50% NAT, 192/8 with no NAT.]
because all these scans will be directed at a single /8 prefix (namely, 192/8), the overall increase in the speed of the worm is minimal. That said, a disproportionate percentage of scans originating from all NATted infected hosts will consequently target the 192/8 prefix. The outcome is that the worm propagates much faster in that prefix compared to the rate of spread observed in other parts of the IP address space (see Figure 5).
5.2 Localized Malware Spreading
As mentioned earlier, localized scanning, in which each infected host scans its local address prefix, represents an important infection vector in botnets [17]. Therefore, it is also important to understand the impact of network address translation on the spread of these malware strains.
Figure 6 represents the spreading behavior of botnets that scan the encompassing /8 prefix of each infected host, using the parameters listed in Table 2. It is evident from the graph that the infection spreads more slowly than in the non-uniform case. This can be explained by the fact that, unlike non-uniform scanning worms, localized scanning malware has no “island hopping” component that allows the infection to move across different prefixes. As a result, malware instances uselessly scan the same prefix after all its vulnerable hosts have been infected. More importantly, the impact of address translation is amplified in this case since NAT devices completely contain the scan activity within the perimeter of the sparsely populated private networks.

[Figure 6: Spreading behavior of malware using /8 prefix localized scanning under varying degrees of infected NATted hosts. Axes: percentage of infected hosts vs. time (sec.). Curves: no NAT, 20%, 40%, 60%, 80%.]
Finally, for malware classes that spread via the multi-stage infection process illustrated in Figure 1 (e.g., botnets), NAT poses another obstacle: regardless of the scanning technique used by the malware, an infected host behind a NAT device will not succeed in transferring the malware binary to a new infectee outside the network perimeter. Therefore, we conjecture that increasing NAT deployment will impede botnets that spread by active scanning.
6. RELATED WORK
Worm models have undergone a series of refinements over the past few years, leading to increasingly accurate representations of worm behavior in the wild. For example, Zou et al. presented a “two-factor” worm model that extended the classic epidemic model to account for the removal of infected hosts (due to patching or failure) and demonstrated how accounting for that factor more accurately reflects the infection dynamics of Code Red I [20]. Chen et al. subsequently presented the “AAWP” model, which was the first attempt to model non-uniform scanning worms [8]. More recently, Rajab et al. demonstrated the significance that the distribution of vulnerable hosts has on the spreading of non-uniform scanning worms and presented an extended model that accounts for this factor [15]. However, none of these models accounts for the skew introduced by NAT or evaluates its impact on malware spreading.

The development of techniques for reliably detecting hosts behind a NAT device remains an open problem. Bellovin [3] presented a technique to count the number of hosts behind a NAT device by exploiting the evolution sequence of the IP
distributed network monitors. Similar insights were also noted by Cooke et al., who showed that non-uniformity in the scanning behavior of infected hosts, due in this case to flaws in the worm's random number generator and side-effects of NATting, can create worm “hot-spots” [9]. Our work complements these efforts by exploring another avenue for estimating NAT usage: examining malicious traces and studying the failed-connection rate for multi-stage infections.

Finally, some distantly related work is that of Antonatos et al., who illustrated the potential of address space randomization to protect against hit-list worms by continuously changing the IP addresses of active hosts [1]. Our work, on the other hand, is focused on illustrating the overall impact of NATting as an impediment to malware spreading, and we argue that it is an important factor that must be considered in modeling non-uniform malware spreading.
7. SUMMARY
In this paper, we show that the widespread use of network address translation has significant implications for how different families of malware spread on the Internet. Using analytical modeling, we quantitatively show that NATting acts as an impediment to the propagation of malware that spreads by preferentially scanning the Internet. This effect is due to the fact that NAT effectively increases the address space that active scanners must explore. Moreover, NATting decreases the density of the vulnerable host population residing in network domains that use private address space and, in doing so, negates the advantage that non-uniform scanning provides. Finally, we note that the use of NAT causes multi-stage infections to fail at a high rate since the URLs transmitted in these infections hold private network addresses that are unreachable from the public Internet.
Acknowledgments
This work is supported in part by National Science Foundation grant CNS-0627611. We thank DShield and CAIDA for graciously providing access to their IDS logs and Witty Worm dataset, respectively. We also extend our gratitude to the anonymous reviewers for their insightful comments.
8. REFERENCES
[1] S. Antonatos, P. Akritidis, E. P. Markatos, and K. G. Anagnostakis. Defending Against Hitlist Worms Using Network Address Space Randomization. In WORM'05: Proceedings of the 2005 ACM Workshop on Rapid Malcode, pages 30–40, 2005.
[2] Paul Baecher, Markus Koetter, Thorsten Holz, Maximillian Dornseif, and Felix Freiling. The Nepenthes Platform: An Efficient Approach to Collect Malware. In Proceedings of the 9th International Symposium on Recent Advances in Intrusion Detection (RAID), to appear Sept. 2006.
[3] Steven M. Bellovin. A Technique for Counting NATted Hosts. In Proceedings of the 2nd ACM SIGCOMM Workshop on Internet Measurement (IMW), pages 267–272, 2002.
[4] Martin Casado, Tal Garfinkel, Weidong Cui, Vern Paxson, and Stefan Savage. Opportunistic Measurement: Extracting Insight from Spurious Traffic. In Proceedings of the 4th ACM Workshop on Hot Topics in Networks (HotNets-IV), College Park, MD, November 2005.
[5] Colleen Shannon and David Moore. The CAIDA Dataset on the Witty Worm, March 19-24, 2004. See www. caida/passive/witty/.
[6] CERT. Code Red II: Another Worm Exploiting Buffer Overflow in IIS Indexing Service DLL. See www. cert/incident\