fc2010_name_guessing

更新时间:2023-05-07 17:06:35 阅读: 评论:0

What's in a Name?
Evaluating Statistical Attacks on
Personal Knowledge Questions
Joph Bonneau,1Mike Just,2Greg Matthews2
1 niversity of g m ridge
2 niversity of idin urgh
Abstract. e study the e0 ien y of st tisti  l tt  ks on hum n uE
thenti  tion systems relying on person l knowledge questionsF e d pt
te hniques from guessing theory to me sure urity g inst tr wlE
ing tt  ker ttempting to ompromi l rge num er of str ngers9  E
ountsF e then ex mine diver orpus of re lEworld st tisti  l disE
tri utions for likely nswer  tegories su h s the n mes of peopleD petsD
nd pl  es nd(nd th t person l knowledge questions re signi(  ntly
less ure th n gr phi  l or textu l p sswordsF e lso demonstr te
th t st tisti s  n e ud to in re urity y pro  tively sh ping
the nswer distri ution to lower the prev len e of ommon responsF
1Introduction
Secret knowledge stored in human memory remains the most widely deployed means of human-computer authentication.It is often referred to as something you know in contrast to biometrics(something you are)or hardware tokens (something you have).While human memory is limited,the high deployment costs of alternatives mean we will continue to rely on it for the foreeable future.
The most common human-memory systems require recalling data speci cally remembered for authentication.Passwords and PINs are the most well-known, but there exist a variety of graphical a
nd textual schemes to aid in recalling cret data[31,29,22,6].Among other problems,passwords are forgotten frequently enough[31]that many deployed systems also u personal knowledge for backup authentication.In contrast to passwords,personal knowledge questions such as  who was my rst-grade teacher? query facts remembered independently of the system so they are hoped to be recalled successfully when passwords fail.
In the majority of online banking,e-commerce,webmail and social network-ing websites,urs register a question-answer pair on enrolment which can later be ud to authori a password ret.The systems can be no more cure than the di culty of guessing the answers to the questions.This risk was highlighted in the past year as hackers exploited personal knowledge questions to compromi accounts of politician Sarah Palin and top executives at Twitter.
Despite their ubiquity,personal knowledge questions have received relatively little attention from the curity community until recently.Ur studies have
P toph fonne uD wike tustD qreg w tthews
demonstrated the ability of friends,family,and acquaintances to guess answers correctly[26,13,24],while other rearch has found some questions ud in prac-tice have a tiny t
of possible answers[15,25].Many common questions have also been shown to have answers readily available in public databas or on-line social networks[18].For example,at least30%of Texas residents'mothers' maiden names can be deduced from birth and marriage records[12].
Designers may be able to avoid easily looked-up questions,but it remains an open question as to how cure typical questions are against a statistical attacker that attempts to break into a small fraction of anonymous accounts by guessing the most likely answers.While this threat has been brie y touched on in previous rearch[15,26],we contribute a formal curity model bad on the information-theoretic model of guessing developed over the past decade.We then examine a range of public statistics that we collected to bound the e ciency of statistical attacks.Our results show most questions to be highly incure,calling into rious question the continued u of personal knowledge questions.
2Security Model
2.1Authentication Protocol
Most deployed systems u personal knowledge questions in a simple challenge-respon protocol.The party eking access,called the prover or claimant, rst nds its identity i to the veri er.T
he veri er then responds with a challenge question q,to which the prover nds back an answer x.Unlike most challenge-respon protocols,the prover's cret knowledge x is usually revealed to the veri er.Replay attacks can be partially addresd by having the veri er include a nonce r along with q,and having the prover respond with H(x,q,r)for some one-way function H.However,an eavesdropper still gains the ability to perform o ine arch for likely values of x using H as an oracle(and as we shall e,few personal-knowledge questions are resistant to o ine arch).
Additionally,while the challenge from a veri er is typically a fresh random nonce for cryptographic challenge-respon,the t Q of personal knowledge questions registered with the veri er is often very small or even a single question. Some non-traditional question types may increa|Q|,such as preference-bad authentication [14],but the upper limit appears low due to the fundamental requirement of human e ort to lect and answer questions on enrolment.
Finally,unlike many challenge-respon protocols,the veri er must maintain a counter t i of failed authentication attempts from each prover i to limit the number of guess an attacker can make.Such a protocol is said to be online,in contrast to stateless protocols in which the attacker can make as many guess as bandwidth allows.O ine protocols rarely u personal knowledge questions due to the di culty of preventing brute-force attacks,though systems have been propod f
or personal password-backup which require simultaneously answering many questions[8,11].
h t9s in x mec Q
2.2Threat Model
Our attacker's goal is to impersonate some legitimate prover i and successfully complete the protocol.The attacker may only desire to gain access on behalf of one speci c ur in a targeted attack,or may be content to gain access on behalf of any ur in a trawling attack.In the former ca,the attacker knows the ac-count i reprents some real-world person Peggy,enabling rearch attacks using arch engines,online social networks,or public records.An active attacker could conduct more advanced rearch by dumpster diving,burgling Peggy's home,or social engineering to trick Peggy into revealing her answer.Targeted attacks may also be performed by somebody who knows Peggy personally.Schechter et al. explored this attack in a laboratory tting and found a high rate of success by acquaintances at guessing personal knowledge questions[26].
Targeted attacks are powerful but do not scale.Trawling attacks,in contrast, require little per-ur work and can be ud to simultaneously attack many accounts.We assume that a trawling attacker,although they must provide a value for i when initiating the protocol,has no information abou
t the real-world person behind i and must guess answers bad on population-wide statistics.
A blind attacker guess without even understanding the question q[15]. This scenario aris if the question is either not transmitted in the clear[21], is transmitted in a CAPTCHA-id form,or is ur-generated and di cult to automatically process.3We argue that a more successful attack strategy is to u a weighted combination of answers to likely questions.
An attacker who is able to correctly understand q but not i is a statistical attacker(called a focud attacker in[15]),who strategy is to guess the most likely answers to q.Our main goal is to evaluate the curity of common ques-tions against statistical attack.While some , What is my favourite colour? )obviously have too few plausible answers to be cure,the most com-mon class of answer found repeatedly in practice are the proper names of people,pets,and places,who curity against guessing is not obvious.
The attackers we have identi ed are not exclusive.While there is a gen-eral hierarchy between blind,statistical,and rearch attacks,an attacker may combine statistics and targeted rearch.For example,partial knowledge of an account-holder's identity may enable an attacker to re ne her statistical tables (e Section5).
3Quantifying Resistance to Guessing
3.1Mathematical Formulation of Guessing
We now turn to the mathematical problem of quantifying how cure a personal knowledge question q is against guessing.This problem has been previously 3 ome urs m y even purpofully o fus  te their questionsD su h s  h t do s w nt to doc  IS F
R toph fonne uD wike tustD qreg w tthews
considered abstractly[4,23,3,20]and in the ca of PINs[2],graphical pass-words[6,29,22],and biometrics[1];we synthesi previous analysis and de ne new metrics most applicable to trawling attackers.
Becau a statistical attacker will respond equally to what is my boss'last name? or who was my kindergarten teacher? by guessing common surnames, we ek to measure curity of the underlying answer space.We consider the correct answer to be a random variable X drawn from a nite distribution X which is known to the attacker,with|X|=N and probability p i=P(X=x i) for each possible answer x i,for i∈[1,N].We assume that X is arranged as a monotonically decreasing distribut
ion with p1≥p2≥···≥p N.Our attacker's goal is to guess X using as few queries of the form is X=x i? as possible.
Intuitively,we may rst think of the Shannon entropy
H1(X)=−
N
i=1
p i lg p i(1)
as a measure of the uncertainty of X.Introduced by Claude Shannon in1948, entropy has entered common cryptographic parlance as a measure of curity, with high-entropy crets being considered advantageous[8,11].As has been argued previously[4,23,3,20,2,6,1],H1is a poor estimator of guessing di culty for curity purpos,as it quanti es the average number of subt membership queries of the form Is X∈S? for arbitrary subts S⊆X.4
Becau cryptographic protocols are speci cally designed to require quen-tial guessing,a better m
etric is the expected number of attempts required to correctly guess X if the attacker takes up the obvious strategy of guessing each possible event in order of its likeliness,known as the guessing entropy:
G(X)=E
#guess(X R←X)
=
N
i=1
p i·i(2)
This measure was introduced by Masy[20]and later named by Cachin[4].
3.2Marginal Guessing
Guessing entropy models an attacker who will never give up in her arch,and thus it can be skewed
by exceedingly unlikely events.A simple thought experi-ment demonstrates why this is inadequate for our purpos.Suppo Eve must quentially guess k challenge questions with answers drawn from X.Some ques-tions will have uncommon answers,and Eve must make∼k·G(X)guess.
Now consider a cond adversary Mallory who goal is to guess the answers to k questions from a t of m>k total questions.Her optimal strategy is to rst guess the most likely value for each question in quence,then the cond-most 4 he proof of this is str ightforw rd onquen e of h nnon9s sour e oding theoE remF ym ols X R←X  n e en oded using ru'm n ode with ver ge it length ≤H1(X)+1D nd the dvers ry  n le rn one it t time with t queriesF
h t9s in  x mec S
likely value for each question,and so on.Mallory's e ciency will greatly increa as m increas,as she may never need to guess uncommon answers.Guessing entropy is inadequate as it doesn't account for Mallory's willingness to give up on the questions which have less probable answers.
To bound an attacker who only requires some probability αof guessing cor-rectly,we de ne the marginal guesswork µα:
µα(X )=min
j ∈[1,N ]
j  i =1
p i ≥α (3)
This function,introduced by Pliam [23],is also referred to as the α-work-factor .We de ne a similar metric λβ,the marginal success rate ,slightly adapted from Bozta³[3],as the probability of success after βguess have been made:
λβ(X )=
β i =1
p i (4)
3.3E ective Key Length Metrics
While it is important to remember that µαand λβare not measures of entropy,
we nonetheless  nd it convenient to convert them into units of bits.This makes all the metrics H 1,G ,
µαand λβdirectly comparable and has an intuitive interpretation as (logarithmically-scaled)attacker workload.We convert each metric by calculating the logarithmic size of a discrete uniform distribution U N
of size |U N |=N with p i =1
N
for all 1≤i ≤N ,which has the same value of the guessing metric.This can be thought of as the  e ective key length as it reprents the size of a randomly-chon cryptographic key which would give equivalent curity.The guessing entropy of U N is:
G (U N )=N
i =1
p i ·i =1N N  i =1i =1N ·N (N +1)2=
N +1
2The entropy of this distribution is lg N ,so given the guessing entropy of an arbi-trary distribution G (X )we can  nd the logarithmic size of a uniform distribution
with equivalent guessing entropy as:
˜G
(X )=lg[2·G (X )−1](5)
The quantity ˜G
(X )can then be interpreted as the e ective key length of X with respect to guessing entropy.We can similarly derive formulas for e ective key length with respect to marginal guesswork and marginal success rate:
˜µα(X )=lg
µα(X )α ˜λβ(X )=lg  βλβ(X )
(6)

本文发布于:2023-05-07 17:06:35,感谢您对本站的认可!

本文链接:https://www.wtabcd.cn/fanwen/fan/89/866348.html

版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。

标签:
相关文章
留言与评论(共有 0 条评论)
   
验证码:
推荐文章
排行榜
Copyright ©2019-2022 Comsenz Inc.Powered by © 专利检索| 网站地图