A Survey of Visualization Systems
for Network Security
Hadi Shiravi, Ali Shiravi, and Ali A. Ghorbani, Member, IEEE
Abstract—Security visualization is a very young term. It expresses the idea that common visualization techniques have been designed for use cases that are not supportive of security-related data, demanding novel techniques fine-tuned for the purpose of thorough analysis. A significant amount of work has been published in this area, but little work has been done to study this emerging visualization discipline. We offer a comprehensive review of network security visualization and provide a taxonomy in the form of five use-case classes encompassing nearly all recent works in this area. We outline the incorporated visualization techniques and data sources and provide an informative table to display our findings. From the analysis of these systems, we examine issues and concerns regarding network security visualization and provide guidelines and directions for future researchers and visual system developers.
Index Terms—Information visualization, network security visualization, visualization techniques.
1 INTRODUCTION
ALTHOUGH the visualization of network security events is the subject of this survey, this paper does not focus on designing and developing a specific visualization system. Instead, we consider network security with respect to information visualization and introduce a collection of use-case classes. In this study, we provide an overview of the increasing relevance of security visualization. We explore a novel classification approach and review the artifacts most commonly associated with security visualization systems. We provide a historical context for this emerging practice and outline its surrounding concerns while providing design guidelines for future developments.
Visual data analysis helps to perceive patterns, trends, structures, and exceptions in even the most complex data sources. As the quantity of network audit traces produced each day grows exponentially, communicating with visuals allows for comprehension of these large quantities of data. Visualization allows the audience to identify concepts and relationships that they had not previously realized, thereby explicitly revealing properties and relationships inherent and implicit in the underlying data. Identifying patterns and anomalies enlightens the user, provides new knowledge and insight, and provokes further exploration. It is these fascinating capabilities that motivate the use of information visualization for network security. Visualization is not only efficient but also very effective at communicating information [1]. A single graph or picture can potentially summarize a month’s worth of intrusion alerts (depending on the type of network), possibly showing trends and exceptions, as opposed to scrolling through multiple pages of raw audit data with little sense of the underlying events.
Security visualization is a very young term [2], [3]. It expresses the idea that common visualization techniques have been designed for use cases that are not supportive of security-related data, demanding novel techniques fine-tuned for the purpose of thorough analysis. It may not always be possible to fully predict how an end user will perceive and interpret a design due to the varying nature of the audience’s cognitive characteristics. Yet careful consideration of the user’s needs, cognitive skills, and abilities can determine the appropriate content and design. Often associated with human-computer interaction, the philosophy of user-centered design places the end user at the center of the design process. Network security is a highly specialized and technical discipline and operation. It deals with packets and flows, intrusion detection and prevention systems, vulnerabilities, exploits, malware, honeypots, and risk management and threat mitigation. The complex, dynamic, and interdependent nature of network security demands extensive research during the development process. Without an in-depth understanding of security operations and extensive hands-on experience, developing a security visualization system will not be possible. A design process centered on the needs, behaviors, and expectations of security analysts can greatly influence the usability and practicality of such systems. For best results, security experts and visual designers must therefore collaborate, complementing each other’s skills and expertise, to create informative, interactive, and exploratory systems that are technically accurate and aesthetically pleasing.
In this survey, we begin by looking into different categories of data sources incorporated in the design of security visualizations and provide an informative list of sources accessible to the research community. We continue in Section 3 by presenting our main contribution, the classification of network security visualization systems. We provide a detailed description of the proposed taxonomy together with an analysis of the derived use-case classes. We follow by giving a thorough description of each system as we outline its strengths and weaknesses. An overall assessment of the systems in each use-case class, in addition to guidelines and directions for future systems, is also provided. We summarize the multiple attributes of recent network security visualization systems in a table for future reference. We continue in Section 4 by outlining issues and concerns surrounding security visualization, elaborating on seven potential pitfalls. We conclude in Section 5 by summarizing our findings.
The authors are with the Information Security Centre of Excellence, Faculty of Computer Science, University of New Brunswick, 540 Windsor Street, Gillin Hall, Room E128, Fredericton, NB E3B 5A3, Canada. E-mail: {hadi.shiravi,ali.shiravi,ghorbani}@unb.ca.
Manuscript received 30 Aug. 2010; revised 26 June 2011; accepted 12 Aug. 2011; published online 23 Aug. 2011.
Recommended for acceptance by K.-L. Ma.
For information on obtaining reprints of this article, please send e-mail to: tvcg@computer, and reference IEEECS Log Number TVCG-2010-08-0203.
Digital Object Identifier no. 10.1109/TVCG.2011.144.
1077-2626/12/$31.00 © 2012 IEEE. Published by the IEEE Computer Society
Papers studied in this survey were selected based on the following metrics:
1. Relevance to network security: As the title of the paper indicates, this study focuses specifically on network security visualization systems. Visualizations of code security, binary files, or visual cryptanalysis are subjects that could span another volume of similar size and are therefore not considered in this study.
2. Contribution of system and visual techniques: Because the papers were studied chronologically, systems that have utilized a specific visualization technique or method with highly similar characteristics to those of previous systems have not been selected for this survey. Similarly, visualization systems that lack contextual, perceptive, and cognitive considerations are also not considered.
3. Satisfactoriness of evaluation: Although most systems surveyed in this paper lack formal evaluation, many have been validated through ad hoc use-case attack scenarios. Systems that lack even this basic validation strategy are also not considered in this survey.
We believe these three metrics shape the quantity and quality of the papers surveyed in this work, so that the surveyed systems are focused explicitly on network security, are novel in their incorporated visual techniques, and are validated on at least one use-case scenario. Systems that do not adhere to these metrics are therefore not considered in this study.
2 DATA SOURCES
Visualization cannot happen without data or information. Many of the systems surveyed in this paper have been built on a single source of data. Looking at network events from multiple perspectives by incorporating different data sources into a system can provide an analyst with richer insight into the underlying events. Therefore, a nonexhaustive list of potential data sources that are available to the research community and may be incorporated in the design of network security visualization systems is given in Table 1. The decision on the type and number of incorporated data sources, and on the set of features extracted from each data source, is a critical one. The data sources mentioned in Table 1 are very generic, and in some cases, such as network traces, hundreds of features can be extracted from them.
TABLE 1
Potential Data Sources for Security Visualizations
The importance of selecting the appropriate features, as a first step in designing a visualization system, has been extensively studied in the fields of statistics, pattern recognition, machine learning, and data mining, and the resulting efforts have been applied to the fields of artificial intelligence, text categorization, and also intrusion detection. These studies are of great benefit to security visualization researchers, as the required steps of selecting an optimal subset of features (subset generation, subset evaluation, stopping criterion, and result validation) have already been examined extensively. Based on the particular problem a researcher is facing and the data sources available to him or her, a subset of features may be extracted and incrementally validated until a desired optimality is achieved.
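The feature selection steps named above (subset generation, subset evaluation, and a stopping criterion) can be illustrated with a greedy forward search. This is only a minimal sketch and is not taken from any surveyed system: the function `forward_select`, the scoring function `toy_score`, the feature names, and the `min_gain` threshold are all assumptions made for illustration. Result validation, the fourth step, would be carried out separately on held-out data and is not shown.

```python
from typing import Callable, Iterable, List, Set

def forward_select(
    features: Iterable[str],
    score_subset: Callable[[Set[str]], float],
    min_gain: float = 0.01,
) -> List[str]:
    """Greedy forward selection over candidate feature names (illustrative only)."""
    remaining = set(features)
    selected: List[str] = []
    best_score = score_subset(set())
    while remaining:
        # Subset generation: extend the current subset by one candidate feature.
        candidates = {
            f: score_subset(set(selected) | {f}) for f in sorted(remaining)
        }
        # Subset evaluation: keep the extension with the highest score.
        best_feature = max(candidates, key=candidates.get)
        gain = candidates[best_feature] - best_score
        # Stopping criterion: stop once the improvement becomes negligible.
        if gain < min_gain:
            break
        selected.append(best_feature)
        remaining.remove(best_feature)
        best_score = candidates[best_feature]
    return selected

# Hypothetical per-feature usefulness, standing in for a real evaluation measure.
TOY_WEIGHTS = {"dst_port": 0.5, "bytes": 0.3, "duration": 0.005, "ttl": 0.0}

def toy_score(subset: Set[str]) -> float:
    return sum(TOY_WEIGHTS[f] for f in subset)

chosen = forward_select(TOY_WEIGHTS, toy_score)
```

In practice the scoring function would be replaced by a measure suited to the use case at hand, for example the accuracy of a detector trained on the candidate subset.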
3 CLASSIFICATION APPROACH
The approach taken in many visualization systems is data driven. In network security, for instance, one may take a single data source, such as packet traces, and try to develop a visualization system based on that. The methodology behind the design of visualization systems should instead be use-case driven: a visualization system should be built to support answering specific questions. In this approach, the system may incorporate one or multiple data sources.
Based on this mindset, we have classified recent works in network security visualization into five use-case classes. We provide a detailed description of each class, discuss several recent examples of each approach, specify the incorporated visualization techniques of each system, and challenge the applicability of each use-case class with regard to modern-day networks. Guidelines for future research and directions for informative and efficient visuals are also provided for each use-case class at the end of each section.
3.1 Host/Server Monitoring
In this class of visualization, the main display is devoted to the representation of hosts and servers. The intent is to display the current state of a network by visualizing the number of users, system load, status, and unusual or unexpected host or server activities. Systems of this class should also be able to correlate the communicating processes of a single host or server with the network traffic. This feature enhances the ability of a user to identify malware, which often manifests itself in irregular and often anonymous system processes.
The work of Erbacher et al. [4], [5] constitutes one of the earlier works in this class. As illustrated in Fig. 1, hosts are arranged around five concentric circles with the monitored server placed in the center. The ring of a node depicts the difference between its IP address and that of the monitored system, so that hosts residing inside the local subnet appear closest to the monitored system. The position of a host on the circular ring is also recorded to ensure that a specific host always appears in the same position. Multiple visual attributes are assigned to each node, as nodes are depicted using glyphs. For the monitored server, for example, spokes extending from its perimeter represent the number of connected users. As connections are made from hosts to the monitored server, communication links are shown with different line patterns based on the connection type. These visual illustrations give an analyst an exploratory framework to work with, as they strengthen her ability to detect unknown relationships within the underlying data.
Tudumi [6] is also one of the earlier systems belonging to this category, aimed at monitoring and auditing user behavior on a server. In a 3D visualization, Tudumi visualizes connections using lines and system nodes using 3D glyphs, displayed on multilayered concentric disks. Similar to Erbacher’s system, Tudumi uses line patterns to encode different access methods, including coarse dashed lines to represent a terminal service and thin dashed lines to represent file transfer.
The previous two systems are more concerned with the activities of a single host or server, or a limited number of them, than with incorporating a larger portion of the network. NVisionIP [7], [8] takes a different approach. It represents an entire class B IP network on a single 256 × 256 matrix grid, with each cell of the matrix representing interactions between the corresponding network hosts. In the galaxy view of the system, all network subnets are listed along the horizontal axis while the hosts of each subnet are listed along the vertical axis. As the number of visualized elements increases, the portion of the screen allocated to each object inevitably decreases. NVisionIP uses a magnifier function to allow the user to hover over the display screen. If an analyst is interested in a particular part of the display, she can select it using the magnifier function. A bar graph is then displayed for each host, depicting its activity over common and uncommon port numbers.
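The galaxy view layout described above, with subnets along the horizontal axis and hosts along the vertical axis, can be sketched as a mapping from an address to a grid cell. The sketch below is an illustration under stated assumptions rather than NVisionIP's actual code: it assumes the monitored class B network is a /16 whose third octet identifies the subnet and whose fourth octet identifies the host, and the function name `galaxy_cell` and the example network `172.16.0.0/16` are hypothetical.

```python
import ipaddress
from collections import Counter
from typing import Tuple

def galaxy_cell(ip: str, network: str = "172.16.0.0/16") -> Tuple[int, int]:
    """Map an address inside a /16 network to a (column, row) cell.

    Assumption: column = subnet (third octet), row = host (fourth octet).
    """
    addr = ipaddress.ip_address(ip)
    net = ipaddress.ip_network(network)
    if addr not in net:
        raise ValueError(f"{ip} is outside {network}")
    offset = int(addr) - int(net.network_address)
    return offset >> 8, offset & 0xFF

# Count observed events per cell; the counts could then drive cell color.
events = ["172.16.3.7", "172.16.3.7", "172.16.200.1"]
activity = Counter(galaxy_cell(ip) for ip in events)
```

Because every one of the 65,536 addresses has a fixed cell, a given host always appears at the same place on the grid, which is what makes magnification of a region meaningful.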
SHIRAVI ET AL.: A SURVEY OF VISUALIZATION SYSTEMS FOR NETWORK SECURITY 1315
Fig. 1. Basic visual representation of network and system activity in [4].
Portall [9] digs deeper into the monitored hosts and correlates TCP connections with the host processes that generate them, allowing an end-to-end visualization of communications between distributed processes. As displayed in Fig. 2, the main display consists of two parallel axes with the left side representing clients and the right side representing servers and their respective processes. A line is drawn from a client to a server to depict a TCP connection. Portall is one of the first systems that visually correlates network traffic to host processes, allowing spyware and adware to be easily detected.
Similar in nature to Portall is the Host Network (HoNe) [10] visualization system, which also correlates the communicating processes of a host with network traffic. The authors argue that the difficulty of correlating processes with network traffic is inherent in the design of the TCP/IP networking model of modern operating systems. The system visualizes client-side hosts and their respective processes and port numbers on the left side of the display, while external sources and their respective port numbers are displayed on the right side. Unlike Portall, HoNe uses splines rather than simple straight lines to connect the processes of a client to external servers.
Perlman and Rheingans [11] extend existing approaches in host/server monitoring by adding and encoding service and temporal information inside the visualized node itself. Each host inside the network is illustrated using a circular glyph node much like a pie chart. Each glyph represents the existence and amount of activity for a particular service. The size of the glyph represents the total amount of activity of the node, measured by the number of packets. Wedge sizes identify what percentage of the total activity belongs to a particular service. A collection of different colors is used to distinguish between the different services of a host. The system also incorporates time by using a stacked pie chart approach where the outermost ring represents the most recent time slice. Hosts are laid out in a simple node-link layout with straight lines connecting the communicating hosts.
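The glyph encoding described above (overall size from total packets, wedge size from each service's share) can be sketched for a single time slice as follows. This is a hedged illustration, not the authors' implementation: the function name `service_glyph` is hypothetical, and the choice to make the glyph area (rather than its radius) proportional to the packet count is an assumption, since the text only states that size represents total activity.

```python
import math
from typing import Dict, Tuple

def service_glyph(
    packets_by_service: Dict[str, int], unit_radius: float = 1.0
) -> Tuple[float, Dict[str, Tuple[float, float]]]:
    """Return the glyph radius and the (start, end) angle of each wedge in degrees."""
    total = sum(packets_by_service.values())
    if total == 0:
        return 0.0, {}
    # Assumption: area grows linearly with packets, so radius grows with sqrt.
    radius = unit_radius * math.sqrt(total)
    wedges: Dict[str, Tuple[float, float]] = {}
    start = 0.0
    for service, count in packets_by_service.items():
        sweep = 360.0 * count / total  # share of the host's total activity
        wedges[service] = (start, start + sweep)
        start += sweep
    return radius, wedges

radius, wedges = service_glyph({"http": 50, "ssh": 25, "dns": 25})
```

A stacked version for several time slices could call the same function once per slice and draw the results as concentric rings, with the most recent slice outermost.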
The Radial Traffic Analyzer [12] visualizes the distribution of network traffic of a particular host using a radial representation. The system is composed of four concentric circles, each mapped to an attribute of the underlying data. In its default setting, the innermost ring is assigned to source IP addresses, the second ring to destination IP addresses, and the third and fourth rings are mapped to source and destination port numbers, respectively. The notion of assigning port numbers to services and applications, devised in this system, is no longer accurate in modern networks, as many applications tend to piggyback on or tunnel through common port numbers such as HTTP (80) and HTTPS (443).
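One way to read the four-ring arrangement is as a hierarchical grouping in which each ring refines the grouping of the ring inside it. The sketch below follows that reading. It is an assumption-laden illustration rather than the published algorithm: the record field names, the function `ring_segments`, and the choice to size each arc by its share of transferred bytes are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Default ring order described in the text, innermost first.
RING_ATTRIBUTES = ("src_ip", "dst_ip", "src_port", "dst_port")

def ring_segments(flows: List[dict], ring: int) -> Dict[Tuple, float]:
    """Return the arc sweep (degrees) of every segment on the given ring.

    A segment on ring k is identified by the values of the first k + 1
    attributes, so outer rings subdivide the segments of inner rings.
    """
    keys = RING_ATTRIBUTES[: ring + 1]
    totals: Dict[Tuple, int] = defaultdict(int)
    for flow in flows:
        totals[tuple(flow[k] for k in keys)] += flow["bytes"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {
        prefix: 360.0 * volume / grand_total
        for prefix, volume in sorted(totals.items())
    }

flows = [
    {"src_ip": "10.0.0.1", "dst_ip": "8.8.8.8", "src_port": 5000, "dst_port": 53, "bytes": 100},
    {"src_ip": "10.0.0.1", "dst_ip": "1.1.1.1", "src_port": 5001, "dst_port": 443, "bytes": 300},
    {"src_ip": "10.0.0.2", "dst_ip": "1.1.1.1", "src_port": 6000, "dst_port": 443, "bytes": 400},
]
inner = ring_segments(flows, 0)
second = ring_segments(flows, 1)
```

Sorting the prefixes keeps each outer segment adjacent to its siblings, so the arcs of an outer ring line up under the inner segment they refine.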
The work of Mansmann et al. [13] is one of the recent works in this class. In their proposed visual analytics tool, host behavior is monitored using a force-directed graph layout, and irregular positional changes are flagged as suspicious. The authors believe that change in network traffic over time is well suited for detecting uncommon system behaviors. As illustrated in Fig. 3, in the first step of the visualization, a set of dimension nodes, each representing a network service, is laid out using a circular force-directed layout. In the second step, the observation nodes representing a particular host are placed on the display and are connected to their corresponding dimension nodes through virtual springs. Node size is calculated based on the sum of transferred bytes using a logarithmic scale. Since the state of a monitored host is displayed at multiple time stamps, and due to the large number of visualized elements on the display, depicting multiple hosts without overlap is a challenge for this visual.
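The two-step placement described above can be approximated in closed form under a simplifying assumption: if every virtual spring is linear with zero rest length and a stiffness proportional to the bytes exchanged over that service, the equilibrium position of an observation node is the traffic-weighted centroid of its dimension nodes. The sketch below uses that assumption. The even spacing of dimension nodes on a unit circle, the function names, the size constant, and the displacement threshold are all hypothetical and are not taken from [13].

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]

def dimension_anchors(services: List[str]) -> Dict[str, Point]:
    """Step 1 (simplified): place one dimension node per service on a unit circle."""
    n = len(services)
    return {
        s: (math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
        for i, s in enumerate(services)
    }

def host_position(bytes_by_service: Dict[str, int], anchors: Dict[str, Point]) -> Point:
    """Step 2 (simplified): spring equilibrium as a byte-weighted centroid."""
    total = sum(bytes_by_service.values())
    if total == 0:
        return (0.0, 0.0)
    x = sum(b * anchors[s][0] for s, b in bytes_by_service.items()) / total
    y = sum(b * anchors[s][1] for s, b in bytes_by_service.items()) / total
    return (x, y)

def node_size(total_bytes: int, base: float = 2.0) -> float:
    """Logarithmic scaling of node size by transferred bytes."""
    return base * math.log10(1 + total_bytes)

def displacement(a: Point, b: Point) -> float:
    return math.hypot(b[0] - a[0], b[1] - a[1])

anchors = dimension_anchors(["http", "dns", "ssh", "smtp"])
before = host_position({"http": 900, "dns": 100}, anchors)
after = host_position({"ssh": 1000}, anchors)
# Hypothetical rule: flag a host whose position jumps by more than 1.0 units.
suspicious = displacement(before, after) > 1.0
```

In this example a host that mostly served web traffic suddenly exchanges only SSH traffic, so its node moves across the layout and the jump is flagged.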
1316 IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 18, NO. 8, AUGUST 2012
Fig. 2. A screen shot of Portall [9] with monitored hosts and servers stacked on the left and right sides.
Fig. 3. A sketch showing the coordinate calculation of a host position at a particular point of time as depicted in [13].
Overall assessment of the host/server monitoring class. The limited number of hosts or servers that the visualization systems of this class can display within the monitored network is a noticeable issue. Most, if not all, of the systems of this class are constrained by their incorporated visualization techniques. As networks tend to grow in size and complexity at an exponential rate, there is an unprecedented need to create meaningful contexts. Even the smallest of university campus networks can consist of thousands of hosts, which the aforementioned systems are less than capable of displaying in a clear and perceivable manner. For an analyst, simpler graphics are easier to understand and interpret than complex ones, since complexity can often influence the ability of the viewer to perceive and decode a visual. The overwhelming number of hosts in a monitored network, accompanied by the many hundreds of events generated for each, and the complexity of relations between events limit the cognitive process of situation awareness for analysts. For visualizations of this class to be effective and to clearly convey meaning, it is essential for them to devise an automated process that prioritizes situations and projects critical events. If a visualization system, due to its incorporated visualization technique, is limited to displaying a comprehensible range of hosts, a situation assessment process becomes unavoidable. In this case, the process of identifying hosts with anomalous behavior and the mechanism of correlating events are partly undertaken in a separate background component, and the processed results are projected to the visualization system. In this manner, the load on the visualization system is reduced considerably, allowing for a near-real-time analysis of events and a more responsive system. Packet traces, server logs, and network flows constitute the primary data sources for this class of visualizations. Node-link graphs, glyphs, and scatter plots are the primary visualization techniques incorporated in this class.
3.2 Internal/External Monitoring
Visualizations of this class are concerned with the interaction of internal hosts with external IPs. Similar to the class above, this class of visualization also incorporates a display of internal hosts, but in relation to communicating external IPs. Since displaying internal hosts in a nonoccluding and meaningful manner is by itself a delicate act, the added burden of displaying hundreds or thousands of external IPs makes the task nontrivial for systems of this class.
VISUAL [14] is one of the earliest systems of this class. It is a security visualization system with the goal of allowing an analyst to see communication patterns between an internal network and external sources. As displayed in Fig. 4, the internal network is represented by a grid with each cell depicting one of the internal hosts. External sources are represented as squares outside the internal grid, with the square size denoting the level of activity. Simple straight lines are used to represent a connection between internal and external hosts. Multiple filtering mechanisms can be used to filter out internal or external hosts, leading to a less cluttered display. Detailed information regarding a host can also be displayed upon user request.
VizFlowConnect [15] uses parallel axes to display network traffic between internal and external hosts. The goal of the system is to display relationships between communicating machines of a network. The main display consists of three distinct parallel axes. The center axis represents internal hosts. The left axis corresponds to machines originating network traffic to the internal network, while the right axis represents the destination machines of internal traffic. Each point on an axis represents an IP address, and connections between points on parallel axes represent network communication. Time is incorporated in the system through animation, and multiple views allow for further exploratory analysis. VizFlowConnect also shows individual host statistics, but further drill-down depth is desired.
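The three-axis arrangement described for VizFlowConnect can be sketched as a function that turns one flow into a line segment between two axes. This is a hypothetical illustration only: the internal network `10.0.0.0/8`, the axis coordinates, the function names, and the rule of ordering addresses along an axis by their numeric value are assumptions, as the text does not state how points are ordered.

```python
import ipaddress
from typing import Optional, Tuple

Point = Tuple[float, float]

# Horizontal placement of the three parallel axes (assumed coordinates).
AXIS_X = {"source": 0.0, "internal": 0.5, "destination": 1.0}

def axis_point(ip: str, axis: str) -> Point:
    """Place an IPv4 address on an axis, ordered by numeric value in [0, 1]."""
    y = int(ipaddress.IPv4Address(ip)) / (2**32 - 1)
    return (AXIS_X[axis], y)

def flow_segment(
    src_ip: str, dst_ip: str, internal_net: str = "10.0.0.0/8"
) -> Optional[Tuple[Point, Point]]:
    """Return the line segment for a flow, or None if it does not cross the border."""
    net = ipaddress.ip_network(internal_net)
    src_internal = ipaddress.ip_address(src_ip) in net
    dst_internal = ipaddress.ip_address(dst_ip) in net
    if dst_internal and not src_internal:
        # Inbound traffic: left (external source) axis to center axis.
        return axis_point(src_ip, "source"), axis_point(dst_ip, "internal")
    if src_internal and not dst_internal:
        # Outbound traffic: center axis to right (external destination) axis.
        return axis_point(src_ip, "internal"), axis_point(dst_ip, "destination")
    return None

inbound = flow_segment("0.0.0.0", "10.0.0.1")
outbound = flow_segment("10.0.0.1", "255.255.255.255")
```

An external host that fans out to many internal addresses would then appear as a bundle of lines diverging from a single point on the left axis.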
Erbacher et al. [16] have developed a second visualization system, this time aimed at internal/external host monitoring and geared toward filtering unwanted data, allowing focus on more critical events. The visual system incorporates a radial panel design consisting of multiple concentric disks, each showing a constant period of time. Local IP addresses are placed around the radial disks while remote hosts are located at the top and bottom of the display. In order to avoid overlapping lines, an IP address located on the top half of the circle is connected to remote hosts located along the top of the display, while hosts located on the bottom half of the circle are connected to remote hosts located along the bottom of the display. In the same manner, port numbers are allocated on the left and right sides. The outer ring of the display shows the most recent period, with interior rings displaying previous periods. This feature allows an analyst to see trends and patterns within the communicating hosts. Hosts are identified by dots on the circular rings, which makes user interaction difficult.
Fig. 4. VISUAL [14] displaying 80 hours of network data on a network of 1,020 hosts.
Fig. 5. TNV [17] showing 50,000 network packets in a 90 minute time span.
In describing TNV [17], a visual network traffic analysis system, Goodall et al. argue that analysts often lose sight of the big picture while examining low-level details of attacks. In order to prevent this loss of context, they propose TNV with the goal of providing a focused view on packet-level data in the high-level network traffic context. As illustrated in Fig. 5, the main visual component of TNV is a matrix displaying network activity of hosts over time, with connections between hosts overlaid on the matrix. TNV is designed around a focus and context paradigm where the center of the display, the focal area, shows communicating hosts within wider columns. In order to preserve continuity throughout the display, the context area, located on the left and right sides of the display, has gradually decreasing width. Each host inside the matrix is colored according to its level of activity, and multiple linked views are used to illustrate port activity and details of raw packets. TNV is one of the few security visualization systems that has been fully implemented and is freely available for download.
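The focus and context layout described above, with wide focal columns and gradually narrowing context columns on both sides, can be sketched as a width computation. The geometric decay used here is an assumption chosen for illustration and is not claimed to be TNV's actual formula; the function name `column_widths` and its parameters are hypothetical.

```python
from typing import List

def column_widths(
    n_columns: int,
    focus_start: int,
    focus_end: int,
    total_width: float = 1000.0,
    decay: float = 0.8,
) -> List[float]:
    """Widths of time columns: full width in the focal area, shrinking outside it.

    Assumption: each step away from the focal area multiplies the width by `decay`.
    """
    raw = []
    for i in range(n_columns):
        if focus_start <= i <= focus_end:
            distance = 0
        elif i < focus_start:
            distance = focus_start - i
        else:
            distance = i - focus_end
        raw.append(decay ** distance)
    # Rescale so the columns exactly fill the available display width.
    scale = total_width / sum(raw)
    return [w * scale for w in raw]

widths = column_widths(n_columns=7, focus_start=2, focus_end=4)
```

Moving the focal range along the time axis and recomputing the widths would give the effect of sliding the detailed view across the full capture while keeping the surrounding periods visible.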
Overall assessment of the internal/external monitoring class. Similar to the recommendations mentioned for the host/server monitoring class, the visualizations of this class can greatly benefit from a situation assessment component. This component can be defined in two different styles: first, as a process that automatically identifies and evaluates the impact of underlying events and relates them to assets of the monitored network; or second, as an exploratory system that provides the facility for an analyst to validate various hypotheses. In the first style, because events are processed in a background component, the visual component can focus on richer and more responsive user interfaces, a necessity that is lacking and often overlooked in security visualization systems. In the second style, it is the analyst’s job to pose queries, correlate disparate events, and derive insightful meanings from the visualization. These activities impose the need for visual exploration and filtering mechanisms to be implemented. Dynamic queries, details-on-demand techniques, and linking and brushing interaction techniques are essential concepts that need to be addressed and considered in this class of visualizations. Color maps, radial panels, scatter plots, and parallel coordinates are common visual techniques used in this class. Packet traces and network flows are the main data sources for visualizations of this class.
3.3 Port Activity
Designers of this class of visualization argue that various malicious programs, such as viruses, Trojans, and worms, manifest themselves through unusual and irregular port activity. Visualizations of this class can aid in the detection of malicious software running inside a network. Scaling techniques must be incorporated in the design of visualizations of this class, due to the amount of traffic as well as the large range of possible port numbers and IP addresses.
One of the earlier visualization systems designed specifically for this class is the work of Abdullah et al. [18]. In their system, a port-based overview of network activity is presented through stacked histograms of aggregated port activity. The authors believe that port activity can be used to detect zero-day exploits that are not detectable by conventional methods. As displayed in Fig. 6, port numbers are aggregated into multiple groups based on the services provided in the network. Well-known ports (< 1,024) are assigned to major services on a system, making them more vulnerable to attacks. For this reason, they are placed into bins of 100; registered ports (< 50,000) are placed into bins of 10,000; and the remaining private/dynamic ports (50,000-65,535) are placed into a single bin. Color and scaling methods are also used effectively to distinguish between the aggregated port groups. The user has the ability to drill down in order to view finer details of the visualization. Displaying data over time also helps to highlight any patterns or trends appearing in irregular activities. The visualization is intuitive, easy to work with, and meets its intended design goals.
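The port aggregation described above can be sketched as a binning function. The thresholds (1,024 and 50,000) and the bin sizes (100 and 10,000) follow the text; the exact bin edges, the label format, and the function names are assumptions. The cube root scaling is included because the caption of Fig. 6 mentions a cube root scale histogram, but how [18] applies it in detail is not specified here.

```python
from collections import Counter

def port_bin(port: int) -> str:
    """Return the label of the aggregation bin that a port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    if port < 1024:
        # Well-known ports: bins of 100 (the last bin is cut off at 1023).
        low = (port // 100) * 100
        return f"{low}-{min(low + 99, 1023)}"
    if port < 50000:
        # Registered ports: bins of 10,000 (the first bin starts at 1024).
        low = (port // 10000) * 10000
        return f"{max(low, 1024)}-{low + 9999}"
    # Remaining private/dynamic ports share a single bin.
    return "50000-65535"

def bar_height(count: int) -> float:
    """Cube root scaling, so that very busy bins do not dwarf quiet ones."""
    return count ** (1.0 / 3.0)

observed_ports = [80, 443, 8080, 8081, 51000]
histogram = Counter(port_bin(p) for p in observed_ports)
```

The finer bins for low port numbers reflect the point made in the text that well-known ports carry the major services and therefore deserve more resolution.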
The Spinning Cube of Potential Doom [19] is an interesting example of security visualization: a system that visualizes real-time port and IP data in a three-dimensional cube, displayed as a rotating scatter plot. Each axis of the 3D display represents a component of a TCP connection. Destination IP addresses are mapped to the X-axis, port numbers to the Y-axis, and source IP addresses to the Z-axis. TCP connections are displayed as individual dots, with color used to distinguish a successful connection from an unsuccessful one. Time is displayed through the use of animation. While quite useful for seeing coarse trends in large-scale networks, it lacks drill-down mechanisms, multiple views, and interactive capabilities. The system is good for solo attacks and can only be used for port scan detection.
PortVis [20] employs a color-based grid visualization to map network activity to cells of a grid. As depicted in Fig. 7, the main display contains a 256 × 256 grid where each point represents one of the 65,536 possible port numbers. The location of a port on the grid is determined by breaking the port number into a 2-byte (X, Y) location, X being the high byte of the port number and Y being the low byte. Changes and variations of each point with respect to time are depicted using color. Black portrays no variation or change, blue depicts a small level of variance, red refers to a larger level of variance, and white denotes the most variance. The grid can be magnified to provide further detailed information about specific ports. A drawback of the system is seen when a port with suspicious activity is located among a collection of ports with a high, legitimate level of activity. In this case, identifying and focusing on that region is not an easy task.
NetBytes Viewer [21] allows a detailed inspection of the behavior of an individual host over time. It facilitates identifying behavioral changes that manifest themselves as unusual port usage or traffic volume regarding a single host. NetBytes offers multiple views in both two and three dimensions, making it possible for an analyst to view the
Fig. 6. Botnet traffic capture displayed using a cube root scale histogram in [18].