Towards measuring anonymity
Claudia Díaz, Stefaan Seys, Joris Claessens, and Bart Preneel
K.U.Leuven ESAT-COSIC
Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium
claudia.diaz@esat.kuleuven.ac.be
www.esat.kuleuven.ac.be/cosic/
Abstract. This paper introduces an information theoretic model that allows one to quantify the degree of anonymity provided by schemes for anonymous connections. It considers attackers that obtain probabilistic information about users. The degree is based on the probabilities an attacker, after observing the system, assigns to the different users of the system as being the originators of a message. As a proof of concept, the model is applied to some existing systems. The model is shown to be very useful for evaluating the level of privacy a system provides under various attack scenarios, for measuring the amount of information an attacker gets with a particular attack, and for comparing different systems amongst each other.
1 Introduction
In today's expanding on-line world, there is an increasing concern about the protection of anonymity and privacy in electronic services. In the past, many technical solutions have been proposed that hide a user's identity in various applications and services. Anonymity is an important issue in electronic payments, electronic voting and electronic auctions, but also for email and web browsing.
A distinction can be made between connection anonymity and data anonymity. Data anonymity is about filtering any identifying information out of the data that is exchanged in a particular application. Connection anonymity is about hiding the identities of source and destination during the actual data transfer. The model presented in this paper focuses on the level of connection anonymity a system can provide, and does not indicate any level of data anonymity.
Information theory has proven to be a useful tool to measure the amount of information (for an introduction, see Cover and Thomas [4]). We try to measure the information obtained by the attacker. In this paper, a model is proposed, based on Shannon's definition of entropy [11], that allows us to quantify the degree of anonymity of an electronic system. This degree depends on the power of the attacker. The model is shown to be very useful to evaluate the anonymity a system provides under different circumstances, to compare different systems, and to understand how a system can be improved.
Appeared in Proceedings of PET 2002, April 14-15, 2002, San Francisco. In Hannes Federrath (Ed.), Designing Privacy Enhancing Technologies, Lecture Notes in Computer Science, 2002.
1.1 Related work
To our knowledge, there have been several attempts to quantify the degree of anonymity of a user provided by an anonymous connection system.
Reiter and Rubin [9] define the degree of anonymity as 1 − p, where p is the probability assigned to a particular user by the attacker. We believe that this degree is useful to get an idea of the anonymity provided by the system to the user who is in the worst case, but it does not give information on how distinguishable the user is within the anonymity set. For a system with a large number of possible senders, the user who is in the worst case may have an assigned probability that is less than 1/2 but still be distinguishable by the attacker, because the rest of the users have very low associated probabilities.
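A small numeric sketch makes this limitation concrete. The distribution below is invented for illustration (it does not come from the paper): the worst-case user scores reasonably well under the 1 − p metric, yet remains far more likely than everyone else in the set.

```python
# Hypothetical skewed distribution over 10 senders: one user is assigned 0.4,
# the remaining nine share the rest equally (illustrative numbers only).
probs = [0.4] + [0.6 / 9] * 9

# Reiter-Rubin degree for the worst-case user: 1 - p
worst = max(probs)
rr_degree = 1 - worst
print(rr_degree)  # 0.6 -> suggests a fair amount of anonymity

# Yet this user is six times more likely than any other member of the
# anonymity set, so the attacker can still single him out.
print(worst / probs[1])  # 6.0
```

The 1 − p value alone hides how concentrated the rest of the distribution is, which is exactly the information an entropy-based degree captures.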
Berthold et al. [2] define the degree of anonymity as A = log2(N), where N is the number of users of the system. This degree only depends on the number of users of the system, and does not take into account the information the attacker may obtain by observing the system. Therefore, it is not useful to measure the robustness of the system towards attacks. The degree we propose in this paper measures the information the attacker gets, taking into account the whole set of users and the probabilistic information the attacker obtains about them.
Wright et al. analyze the degradation of anonymous protocols in [12]. They assume that there is a recurring connection between the sender of a message and the receiver.
An anonymity measurement model similar to the one proposed in this paper has been independently proposed by Serjantov and Danezis in [10]. The main difference between the two models is that their system does not normalize the degree in order to get a value relative to the anonymity level of the ideal system for the same number of users.
1.2 Outline of the paper
This paper is organized as follows: Section 2 describes the system and attack model; the actual measurement model is then proposed in Section 3. As a proof of concept, this model is applied to some existing systems in Section 4. Finally, our conclusions and some open problems are presented.
2 System model
In this paper we focus on systems that provide anonymity through mixes. The system model we consider thus consists of the following entities:
Senders. These are users who send (or have the ability to send) messages to recipients. The messages can be emails, queries to a database, requests of web pages, or any other stream of data. The senders can be grouped into the set of senders, which is also called the anonymity set. These are the entities of the system whose anonymity we want to protect.
During the attack, we consider the number of senders constant, and senders behaving as independent, identical, stationary stochastic Poisson processes. This is a standard assumption for modeling the behavior of users making phone calls [5]. This means that all users send, on average, the same amount of messages, and that the interval of time between one message and the next one follows an exponential distribution (the inter-arrival distribution of a Poisson process).
Recipients. These are the entities that receive the messages from the senders. Recipients can be active (if they send back answers to the senders) or passive (if they do not react to the received message). Depending on the system, there is a large variety of recipients. Some examples are web servers, databases, email accounts or bulletin boards where users can post their messages. The attacker may use the reply messages to gain information.
Mixes. These are the nodes that are typically present in solutions for anonymous connections. They take messages as input, and output them so that the correlation with the corresponding input messages is hidden. There are many different ways to implement a mix; if more than a single mix is used (which is usually done in order to achieve better security), there are several methods to route the message through a chain of mixes; a summary can be found in [2, 7]. In some of these systems, e.g., Crowds, the nodes do not have mixing properties as the ones described by Chaum [3]. In these cases the actual properties of the intermediate nodes will be mentioned.
Note that in some systems the intersection between the different sets might be non-empty (e.g., a sender could be at the same time a recipient or a mix).
Examples of systems that provide anonymous connections are Crowds [9] and Onion Routing [8]. The proposed measurement model is shown to be suitable for these systems. It is, however, generally applicable to any kind of system.
2.1 Attack model
The degree of anonymity depends on the probabilities that the users have sent a particular message; these probabilities are assigned by the attacker. The degree is therefore measured with respect to a particular attack: the results obtained for a system are no longer valid if the attack model changes. Concrete assumptions about the attacker have to be clearly specified when measuring the degree of anonymity.
We briefly describe the attacker properties we consider:
– Internal-External: An internal attacker controls one or several entities that are part of the system (e.g., the attacker can prevent the entity from sending messages, or he may have access to the internal information of the entity); an external attacker can only compromise communication channels (e.g., he can eavesdrop or tamper with messages).
– Passive-Active: A passive attacker only listens to the communication or reads internal information; an active attacker is able to add, remove and modify messages, or adapt internal information.
– Local-Global: A global attacker has access to the whole communication system, while a local attacker can only control part of the resources.
Different combinations of the previous properties are possible; for instance, a global passive external attacker is able to listen to all the channels, while a local internal active attacker can control, for example, a particular mix, but is unable to get any other information.
In our model, an attacker will carry out a probabilistic attack. It has been pointed out by Raymond in [7] that these attacks have not been thoroughly addressed so far. With such an attack, the adversary obtains probabilistic information of the form "with probability p, A is the sender of the message".
3 Proposed measurement model
First of all, we should give a precise definition of anonymity. In this paper we adopt the definition given by Pfitzmann and Köhntopp in [6]: anonymity is the state of being not identifiable within a set of subjects, the anonymity set. A sender is identifiable when we get information that can be linked to him, e.g., the IP address of the machine the sender is using.
In this paper we only consider sender anonymity. This means that for a particular message the attacker wants to find out which subject in the anonymity set is the originator of the message. The anonymity set in this case is defined as the set of honest¹ users who might send a message. It is clear that the minimum size of the anonymity set is 2 (if there is only one user in the anonymity set it is not possible to protect his identity).
Our definition of the degree of anonymity is based on probabilities: after observing the system, an attacker will assign to each user a probability of being the sender.
3.1 Degree of anonymity provided by the system
According to the previous definitions, in a system with N users, the maximum degree of anonymity is achieved when an attacker sees all subjects in the anonymity set as equally probable of being the originator of a message. Therefore, in our model the degree of anonymity depends on the distribution of probabilities and not on the size of the anonymity set, in contrast with previous work [1, 2]. This way, we are able to measure the quality of the system with respect to the anonymity it provides, independently of the number of users who are actually using it. Nevertheless, note that the size of the anonymity set is used to calculate the distribution of probabilities, given that the sum of all probabilities must be 1.
The proposed model compares the information obtained by the attacker after observing the system against the optimal situation, in which all honest users¹ seem to be equally probable as being the originator of the message; that is, in a system with N users, the situation where the attacker sees all users as being the originator with probability 1/N.

¹ Users controlled by the attacker are not considered as part of the anonymity set, even if they are not aware of this control.
After observing the system for a while, an attacker may assign some probabilities to each sender as being the originator of a message, based on the information the system is leaking, by means of traffic analysis, timing attacks, message length attacks, or more sophisticated attacks.
For a given distribution of probabilities, the concept of entropy in information theory provides a measure of the information contained in that distribution [4]. We use entropy as a tool to calculate the degree of anonymity achieved by the users of a system towards a particular attacker. The entropy of the system after the attack is compared against the maximum entropy (for the same number of users). This way we get an idea of how much information the attacker has gained, or, in other words, we compare how distinguishable the sender is within the set of possible senders after the attack.
Let X be the discrete random variable with probability mass function p_i = Pr(X = i), where i represents each possible value that X may take. In this case, each i corresponds to an element of the anonymity set (a sender). We denote by H(X) the entropy of the system after the attack has taken place. For each sender belonging to the set of senders of size N, the attacker assigns a probability p_i. H(X) can be calculated as:

H(X) = − Σ_{i=1}^{N} p_i log2(p_i).
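This entropy is straightforward to compute; the following minimal sketch (the function name and the example distributions are ours) evaluates H(X) for a uniform and a skewed distribution:

```python
import math

def entropy(probs):
    """H(X) = -sum_i p_i * log2(p_i); terms with p_i = 0 contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Uniform over N = 8 senders gives the maximum, log2(8) = 3 bits;
# any skew in the distribution lowers the entropy.
print(entropy([1/8] * 8))          # 3.0
print(entropy([0.5, 0.25, 0.25]))  # 1.5
```

The `if p > 0` guard reflects the usual convention that 0 · log2(0) = 0, so senders the attacker has fully excluded simply drop out of the sum.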
Let H_M be the maximum entropy of the system we want to measure, for the actual size of the anonymity set:

H_M = log2(N),

where N is the number of honest senders (the size of the anonymity set).
The information the attacker has learned with the attack can be expressed as H_M − H(X). We divide by H_M to normalize the value. We then define the degree of anonymity provided by the system as:

d = 1 − (H_M − H(X)) / H_M = H(X) / H_M.
For the particular case of one user we assume d to be zero.
This degree of anonymity provided by the system quantifies the amount of information the system is leaking. If in a particular system a user or a small group of users are shown as originators with a high probability with respect to the others, this system is not providing a high degree of anonymity.² It follows immediately that 0 ≤ d ≤ 1:
² On the other hand, note that any system with an equiprobable distribution will provide a degree of anonymity of one; therefore a system with two senders will have d = 1 if both of them are assigned probability 1/2. This is because the definition of anonymity we are using is independent of the number of senders.
– d = 0 when a user appears as being the originator of a message with probability 1.
– d = 1 when all users appear as being the originator with the same probability (p_i = 1/N).
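The definition and both boundary cases above can be checked with a short sketch (the function name and the mixed distribution are our own, chosen for illustration):

```python
import math

def degree_of_anonymity(probs):
    """d = H(X) / H_M with H_M = log2(N); d is defined as 0 for N = 1."""
    n = len(probs)
    if n <= 1:
        return 0.0
    h = -sum(p * math.log2(p) for p in probs if p > 0)
    return h / math.log2(n)

print(degree_of_anonymity([1.0, 0.0, 0.0]))   # 0.0: sender fully exposed
print(degree_of_anonymity([0.25] * 4))        # 1.0: uniform, maximal anonymity
print(degree_of_anonymity([0.5, 0.25, 0.25])) # strictly between 0 and 1
```

Normalizing by log2(N) is what makes the degree comparable across systems with different numbers of users, which is the main difference from the unnormalized model of Serjantov and Danezis mentioned in Section 1.1.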
4 Measuring the degree of anonymity provided by some systems
In this section we apply our proposed measurement model in order to analyze the degree of anonymity provided by some existing systems, in particular Crowds and Onion Routing.
4.1 A simple example: mix-based email
As a first example, let us consider the system shown in Fig. 1. Here we have a system that provides anonymous email with 10 potential senders, a mix network and a recipient. The attacker wants to find out which of the senders sent an email to this particular recipient. By means of timing attacks and traffic analysis, the attacker assigns a certain probability to each user as being the sender. The aim of this example is to give an idea of the values of the degree of anonymity for different distributions of probabilities.
Active attack. We first consider an active internal attacker who is able to control eight of the senders (which means that these eight users have to be excluded from the anonymity set). He is also able to perform traffic analysis in the whole mix network and assign probabilities to the two remaining senders. Let p be the probability assigned to user 1 and 1 − p the probability assigned to user 2.
The distribution of probabilities is:

p_1 = p;  p_2 = 1 − p,

and the maximum entropy for two honest users is:

H_M = log2(2) = 1.
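Since H_M = 1, the degree in this two-user case reduces to the binary entropy of p. A quick sketch (the helper name and the sample values of p are ours) tabulates how d falls as the attacker grows more confident that user 1 is the sender:

```python
import math

def binary_degree(p):
    """Degree for two honest senders with probabilities p and 1 - p;
    since H_M = log2(2) = 1, d equals the binary entropy of p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# d = 1 at p = 0.5 and drops towards 0 as p approaches 0 or 1.
for p in (0.5, 0.75, 0.9, 0.99):
    print(f"p = {p}: d = {binary_degree(p):.3f}")
```

This matches the boundary cases of Section 3.1: a confident attacker (p near 1) drives d towards 0, while total uncertainty (p = 1/2) keeps d at its maximum of 1.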