Joe Celko’s SQL for Smarties. DOI: 10.1016/B978-0-12-382022-8.00022-3 Copyright © 2011 by Elsevier Inc. All rights reserved.
22 EXISTS() PREDICATE
The EXISTS predicate is very natural. It is a test for a nonempty set
(read: table). If there are any rows in its subquery, it is TRUE;
otherwise, it is FALSE. This predicate does not give an UNKNOWN result.
The syntax is:
<exists predicate> ::= [NOT] EXISTS <table subquery>
It is worth mentioning that a <table subquery> is always inside
parentheses to avoid problems in the grammar during parsing.
In SQL-89, the rules stated that the subquery had to have a
SELECT clause with one column or an asterisk (*). If the SELECT *
option was used, the database engine would (in theory) pick one
column and use it. This fiction was needed because SQL-89 defined
subqueries as having only one column. Things are better today.
Some early SQL implementations would work better with
EXISTS(SELECT <column> ..), EXISTS(SELECT <constant> ..),
or EXISTS(SELECT * ..) versions of the predicate. Today, there
is no difference in the three forms in any SQL I know. The
EXISTS(SELECT * ..) is now the preferred form since it shows that
we are working at the table level, without regard to columns.
Indexes are very useful for EXISTS() predicates because they
can be searched while the base table is left completely alone. For
example, we want to find all the employees who were born on the
same day as any famous person. The query could be
SELECT P.emp_name AS famous_person_birth_date_guy
FROM Personnel AS P
WHERE EXISTS
(SELECT *
FROM Celebrities AS C
WHERE P.birth_date = C.birth_date);
If the table Celebrities has an index on its birth_date column,
the optimizer will get the current employee’s birth_date, P.birth_date,
and look up that value in the index. If the value is in the
index, the predicate is TRUE and we do not need to look at the
Celebrities table at all.
If it is not in the index, the predicate is FALSE and there is still
no need to look at the Celebrities table. This should be fast, since
indexes are smaller than their tables and are structured for very
fast searching.
However, if Celebrities has no index on its birth_date column,
the query may have to look at every row to see if there is a
birth_date that matches the current employee’s birth_date. There are
some tricks that a good optimizer can use to speed things up in
this situation.
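The index-assisted lookup described above can be tried out with a minimal sketch. It assumes Python's standard sqlite3 module and an in-memory database; the column types, the index name Celebrities_birth_date_idx, and the sample rows are all hypothetical and not taken from the book.

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Personnel (emp_name TEXT PRIMARY KEY, birth_date TEXT);
CREATE TABLE Celebrities (emp_name TEXT PRIMARY KEY, birth_date TEXT);
CREATE INDEX Celebrities_birth_date_idx ON Celebrities (birth_date);
INSERT INTO Personnel VALUES ('Alice', '1970-01-15'), ('Bob', '1982-06-30');
INSERT INTO Celebrities VALUES ('Famous Fred', '1970-01-15'),
                               ('Star Sally', '1990-12-01');
""")

query = """
SELECT P.emp_name AS famous_person_birth_date_guy
  FROM Personnel AS P
 WHERE EXISTS
       (SELECT *
          FROM Celebrities AS C
         WHERE P.birth_date = C.birth_date)
"""

# Employees who share a birth date with at least one celebrity.
matches = [row[0] for row in conn.execute(query)]
print(matches)

# Ask SQLite how it plans to run the query; the wording of the plan
# differs between SQLite versions, so it is printed rather than checked.
for row in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(row[-1])
```

Whether a given engine actually answers the subquery from the index alone depends on its optimizer, so the printed plan is worth reading rather than assuming.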
22.1 EXISTS and NULLs
A NULL might not be a value, but it does exist in SQL. This is often
a problem for a new SQL programmer who is having trouble with
the concept of NULLs and how they behave.
Think of them as being like a brown paper bag—you know
that something is inside because you lifted it up and felt a weight,
but you do not know exactly what that something is. If you felt
an empty bag, you know to stop looking. For example, we want
to find all the Personnel who were not born on the same day as
a famous person. This can be answered with the negation of the
original query, like this:
SELECT P.emp_name AS famous_birth_date_person
FROM Personnel AS P
WHERE NOT EXISTS
(SELECT *
FROM Celebrities AS C
WHERE P.birth_date = C.birth_date);
But assume that among the Celebrities, we have a movie star
who will not admit her age, shown in the row ('Gloria Glamor',
NULL). A new SQL programmer might expect that Ms. Glamor
would not match to anyone, since we do not know her birth_date
yet. In principle, she could match anyone, since there is a chance
that they may match when some tabloid newspaper finally gets
a copy of her birth certificate. But work out the subquery in the
usual way to see what SQL does with her row:
..
WHERE NOT EXISTS
(SELECT *
FROM Celebrities
WHERE P.birth_date = NULL);
becomes
..
WHERE NOT EXISTS
(SELECT *
FROM Celebrities
WHERE UNKNOWN);
which then becomes
..
WHERE TRUE;
and you will see that the comparison predicate in the subquery tests
to UNKNOWN because of the NULL comparison, and therefore fails
whenever we look at Ms. Glamor.
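The reduction above can be checked with a small sketch. It assumes Python's standard sqlite3 module; the column types and every row except Gloria Glamor's are hypothetical. SQLite reports an UNKNOWN comparison as NULL, which Python shows as None.

```python
import sqlite3

# Hypothetical schema and rows; only Gloria Glamor comes from the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Personnel (emp_name TEXT PRIMARY KEY, birth_date TEXT);
CREATE TABLE Celebrities (emp_name TEXT PRIMARY KEY, birth_date TEXT);
INSERT INTO Personnel VALUES ('Alice', '1970-01-15'), ('Bob', '1982-06-30');
INSERT INTO Celebrities VALUES ('Famous Fred', '1970-01-15'),
                               ('Gloria Glamor', NULL);
""")

# Comparing a known date with NULL yields UNKNOWN (None in Python).
comparison = conn.execute("SELECT '1982-06-30' = NULL").fetchone()[0]
print(comparison)

# Ms. Glamor's row never satisfies the comparison, so she never makes
# the subquery nonempty; only Famous Fred can exclude an employee.
not_famous = [row[0] for row in conn.execute("""
SELECT P.emp_name
  FROM Personnel AS P
 WHERE NOT EXISTS
       (SELECT *
          FROM Celebrities AS C
         WHERE P.birth_date = C.birth_date)
 ORDER BY P.emp_name
""")]
print(not_famous)
```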
Another problem with NULLs is found when you attempt to
convert IN predicates to EXISTS predicates. Using our example
of matching our Personnel to famous people, the query can be
rewritten as:
SELECT P.emp_name AS famous_birth_date_person
FROM Personnel AS P
WHERE P.birth_date
NOT IN
(SELECT C.birth_date
FROM Celebrities AS C);
However, consider a more complex version of the same query,
where the celebrity has to have been born in New York City. The
IN predicate would be:
SELECT P.emp_name, 'was born on a day without a famous New Yorker!'
FROM Personnel AS P
WHERE P.birth_date
NOT IN
(SELECT C.birth_date
FROM Celebrities AS C
WHERE C.birth_city_name = 'New York');
and you would think that the EXISTS version would be:
SELECT P.emp_name, 'was born on a day without a famous New Yorker!'
FROM Personnel AS P
WHERE NOT EXISTS
(SELECT *
FROM Celebrities AS C
WHERE C.birth_city_name = 'New York'
AND C.birth_date = P.birth_date);
Assume that Gloria Glamor is our only New Yorker and we still
do not know her birth_date. The subquery will be empty for every
employee in the NOT EXISTS predicate version, because her NULL
birth_date will not test equal to the known employee birthdays.
That means that the NOT EXISTS predicate will return TRUE and
we will get every employee back. But now look
at the IN predicate version, which will have a single NULL in the
subquery result. This predicate will be equivalent to
NOT (Personnel.birth_date = NULL), which is always UNKNOWN, and we will get no
Personnel back.
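The difference between the two versions can be seen side by side in a minimal sketch. It assumes Python's standard sqlite3 module; the column types and all rows other than Gloria Glamor's are hypothetical.

```python
import sqlite3

# Hypothetical data: Gloria Glamor is the only New Yorker and her
# birth_date is NULL, as in the scenario described above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Personnel (emp_name TEXT PRIMARY KEY, birth_date TEXT);
CREATE TABLE Celebrities (emp_name TEXT PRIMARY KEY,
                          birth_date TEXT,
                          birth_city_name TEXT);
INSERT INTO Personnel VALUES ('Alice', '1970-01-15'), ('Bob', '1982-06-30');
INSERT INTO Celebrities VALUES ('Gloria Glamor', NULL, 'New York'),
                               ('Famous Fred', '1970-01-15', 'Chicago');
""")

# NOT IN: the subquery yields a single NULL, so the predicate is
# UNKNOWN for every employee and no rows come back.
not_in_rows = [row[0] for row in conn.execute("""
SELECT P.emp_name
  FROM Personnel AS P
 WHERE P.birth_date
       NOT IN
       (SELECT C.birth_date
          FROM Celebrities AS C
         WHERE C.birth_city_name = 'New York')
 ORDER BY P.emp_name
""")]

# NOT EXISTS: the subquery is empty for every employee, so the
# predicate is TRUE and every employee comes back.
not_exists_rows = [row[0] for row in conn.execute("""
SELECT P.emp_name
  FROM Personnel AS P
 WHERE NOT EXISTS
       (SELECT *
          FROM Celebrities AS C
         WHERE C.birth_city_name = 'New York'
           AND C.birth_date = P.birth_date)
 ORDER BY P.emp_name
""")]

print(not_in_rows)
print(not_exists_rows)
```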
Likewise, you cannot, in general, transform the quantified
comparison predicates into EXISTS predicates, because of the
possibility of NULL values. Remember that x <> ALL <subquery>
is shorthand for x NOT IN <subquery> and x = ANY <subquery> is
shorthand for x IN <subquery>, and it will not surprise you.
In general, the EXISTS predicates will run faster than the IN
predicates. The problem is in deciding whether to build the query
or the subquery first; the optimal approach depends on the size
and distribution of values in each, and that cannot usually be
known until runtime.
22.2 EXISTS and INNER JOINs
The [NOT] EXISTS predicate is almost always used with a correlated
subquery. Very often the subquery can be “flattened” into
a JOIN, which will often run faster than the original query. Our
sample query can be converted into:
SELECT P.emp_name AS famous_birth_date_person
FROM Personnel AS P, Celebrities AS C
WHERE P.birth_date = C.birth_date;
The advantage of the JOIN version is that it allows us to show
columns from both tables. We should make the query more informative
by rewriting the query:
SELECT P.emp_name, C.emp_name
FROM Personnel AS P, Celebrities AS C
WHERE P.birth_date = C.birth_date;
This new query could be written with an EXISTS() predicate,
but that is a waste of resources.
SELECT P.emp_name, 'has the same birth_date as ', C.emp_name
FROM Personnel AS P, Celebrities AS C
WHERE EXISTS
(SELECT *
FROM Celebrities AS C2
WHERE P.birth_date = C2.birth_date
AND C.emp_name = C2.emp_name);
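One thing to watch when flattening an EXISTS into a JOIN is row counts: the JOIN returns one row per matching pair, while EXISTS returns each employee at most once. The sketch below illustrates this under stated assumptions: Python's standard sqlite3 module, hypothetical column types, and hypothetical sample rows in which two celebrities share a birth date.

```python
import sqlite3

# Hypothetical data: two celebrities share Alice's birth date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Personnel (emp_name TEXT PRIMARY KEY, birth_date TEXT);
CREATE TABLE Celebrities (emp_name TEXT PRIMARY KEY, birth_date TEXT);
INSERT INTO Personnel VALUES ('Alice', '1970-01-15'), ('Bob', '1982-06-30');
INSERT INTO Celebrities VALUES ('Famous Fred', '1970-01-15'),
                               ('Glam Gina', '1970-01-15');
""")

# JOIN version: shows columns from both tables, one row per pair.
join_rows = conn.execute("""
SELECT P.emp_name, C.emp_name
  FROM Personnel AS P, Celebrities AS C
 WHERE P.birth_date = C.birth_date
 ORDER BY P.emp_name, C.emp_name
""").fetchall()

# EXISTS version: each qualifying employee appears exactly once.
exists_rows = conn.execute("""
SELECT P.emp_name
  FROM Personnel AS P
 WHERE EXISTS
       (SELECT *
          FROM Celebrities AS C
         WHERE P.birth_date = C.birth_date)
 ORDER BY P.emp_name
""").fetchall()

print(join_rows)
print(exists_rows)
```

If only the employee names are wanted from the JOIN form, a SELECT DISTINCT would be one way to get back to a single row per employee.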
22.3 NOT EXISTS and OUTER JOINs
The NOT EXISTS version of this predicate is almost always used with
a correlated subquery. Very often the subquery can be “flattened”
into an OUTER JOIN, which will often run faster than the original
query. Our other sample query was:
SELECT P.emp_name AS Non_famous_New_Yorker_birth_date
FROM Personnel AS P
WHERE NOT EXISTS
(SELECT *
FROM Celebrities AS C
WHERE C.birth_city_name = 'New York'
AND C.birth_date = P.birth_date);
which we can replace with:
SELECT P.emp_name AS Non_famous_New_Yorker_birth_date
FROM Personnel AS P
LEFT OUTER JOIN
Celebrities AS C
ON C.birth_city_name = 'New York'
AND C.birth_date = P.birth_date
WHERE C.emp_name IS NULL;
This is assuming that we know each and every celebrity name
in the Celebrities table. If the column in the WHERE clause could
have NULLs in its base table, then we could not prune out the
generated NULLs. The test for NULL should always be on (a column of)
the primary key, which cannot be NULL. Relating this back to the
example, how could a celebrity be a celebrity with an unknown
name? Even The Unknown Comic had a name (“The Unknown
Comic”).
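The equivalence of the two forms can be checked on a small data set. This is a minimal sketch assuming Python's standard sqlite3 module; the column types and sample rows are hypothetical, and emp_name is declared as the primary key of Celebrities so that the IS NULL test can only be true for the padded rows the OUTER JOIN generates.

```python
import sqlite3

# Hypothetical data: Alice shares a birth date with a New Yorker,
# Bob shares one only with a celebrity born elsewhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Personnel (emp_name TEXT PRIMARY KEY, birth_date TEXT);
CREATE TABLE Celebrities (emp_name TEXT NOT NULL PRIMARY KEY,
                          birth_date TEXT,
                          birth_city_name TEXT);
INSERT INTO Personnel VALUES ('Alice', '1970-01-15'), ('Bob', '1982-06-30');
INSERT INTO Celebrities VALUES ('Famous Fred', '1970-01-15', 'New York'),
                               ('Star Sally', '1982-06-30', 'Chicago');
""")

not_exists_rows = [row[0] for row in conn.execute("""
SELECT P.emp_name
  FROM Personnel AS P
 WHERE NOT EXISTS
       (SELECT *
          FROM Celebrities AS C
         WHERE C.birth_city_name = 'New York'
           AND C.birth_date = P.birth_date)
 ORDER BY P.emp_name
""")]

# Unmatched employees are padded with NULLs on the Celebrities side;
# testing the key column keeps exactly those padded rows.
outer_join_rows = [row[0] for row in conn.execute("""
SELECT P.emp_name
  FROM Personnel AS P
       LEFT OUTER JOIN
       Celebrities AS C
       ON C.birth_city_name = 'New York'
          AND C.birth_date = P.birth_date
 WHERE C.emp_name IS NULL
 ORDER BY P.emp_name
""")]

print(not_exists_rows)
print(outer_join_rows)
```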
22.4 EXISTS() and Quantifiers
Formal logic makes use of quantifiers that can be applied to
propositions. The two forms are “For all x, P(x)” and “For some
x, P(x)”. If you want to look up formulas in a textbook, the
traditional symbol for the universal quantifier is ∀, an inverted
letter A, and the symbol for the existential quantifier is ∃, a
rotated letter E.
The big question over 100 years ago was that of existential
import in formal logic. Everyone agreed that saying “All men are
mortal” implies that “No men are not mortal,” but does it also
imply that “Some men are mortal”—that we have to have at least
one man who is mortal?