12. Exists and Not Exists

We've left the shallow end behind and are starting to get into deeper waters now. Our next operator is the exists operator, one of the most useful operators in SQL. exists allows you to test whether a subquery returns any rows. That probably sounds confusing and not helpful, let's look at an example.

Let's say you want to find a list of all transactions to the "Trees" fund where the donor who gave the gift has previously given at least once to the "Birds" fund. Breaking this task down, we want to do this by:

  1. Building a list of gifts to the "Trees" fund.
  2. For each gift in that list, check whether the donor who made it has given a gift with an earlier transaction date to the "Birds" fund.

This is a great use case for exists. We are asking the database, "For each of these Trees gifts, does there exist a prior gift from the same donor to the Birds fund?" Our SQL might look like this:

select
    *
from transactions as trees
where trees.Fund = 'Trees'
    and exists (
            select 'X'
            from transactions as birds
            where birds.account_id = trees.account_id
                and birds.transaction_date < trees.transaction_date
                and birds.Fund = 'Birds'
        )

There's a lot of new stuff here, so let's unpack it piece by piece. First, we're using a table alias, trees for our transactions table. In this case, the table alias is mandatory because we are also going to reference the same table in our exists subquery, and we want to be able to tell the database which of our where clause elements should come from each table.

Second, our subquery looks a bit strange. We're again selecting from transactions which we have aliased as birds. But what heck is select 'X'? What's happening is that the database is going to test every single row in our trees transactions table by running this subquery and checking to see if there's at least one one result of any kind. It doesn't matter what the result is. If a result exists, the subquery returns true otherwise it returns false. We just select a simple string value, X. You can use select 1 or select 'this_awesome_string', or nearly anything you want. But by convention, people often use select 'X'.

Finally, notice that we are testing a series of conditions between the two tables in our subquery where clause. Inside the subquery, you can refer to any and all tables that are part of the main query's from clause (so our subquery can "see" our trees table). For each row in our trees table, we want to consult birds to determine:

  1. Is the birds transaction from the same account_id as our trees transaction (birds.account_id = trees.account_id)?
  2. Was the birds transaction made before the trees transaction (birds.transaction_date < trees.transaction_date)?
  3. Was the birds transaction made to the "Birds" fund (birds.Fund = 'Birds')?

If all of these things are true, then such a gift exists, so our exists statement returns true for that row. Otherwise, it returns false. The database then continues like this, checking each and every row in our trees table using the same tests.

I know this is a lot and seems complicated. Don't worry, there's going to be ample opportunity to practice. The main thing I want you to understand right now is that we have a way in SQL to say "hey database, does some specific thing exist in a table I'm about to specify?" If you understand that we can do this in SQL, you're on the right track. Comfort will come with practice.

Exercise 1: Exists

You've been asked to identify all gifts from 2024 where the donor who gave the gift had previously given at least one gift of $100. Put another way, for each transaction in 2024, does there exist a transaction from the same donor with an earlier transaction date that is $100+?

No Results

Not Exists

Just like with in and between, exists has an opposite: not exists. It works much as you'd expect: it checks to see if a specific thing exists. If it does, it returns false, otherwise it returns true. For example, if I wanted to find everyone in the state of Minnesota who has never made a gift of $100 or more, I might write:

select *
from addresses as ad
where ad.state = 'MN'
    and not exists (
        select 'X'
        from transactions as t
        where t.account_id = ad.account_id
            and t.amount >= 100
    )

exists and not exists are equally useful. You will likely find yourself turning to them time and time again to identify donors who have/haven't done certain things or given certain types of gifts.

Exercise 2: Not exists

You've been asked to identify all gifts from 2024 where the donor who gave the gift was brand new. Put another way, for all gifts in 2024, find records where there does not exist a gift from the same person with an earlier transaction date.

No Results