11. In and Not In

Earlier in the course, I asked you to write an SQL query that would select everyone in the database whose last name is Fuentes, Williams, or Lopez. For that exercise, we were working to get a handle on the or operator, and I suggested the following solution:

select * 
from accounts 
where last_name = 'Fuentes'
    or last_name = 'Williams'
    or last_name = 'Lopez'

This construction seems kind of tedious, doesn't it? Fortunately, SQL has an operator that makes this a bit easier, the in operator. in allows you to check whether a given value is in some list of values. Using in, we can rewrite this query as follows:

select *
from accounts
where last_name in ('Fuentes','Williams','Lopez')

When using in, you must always enclose your list of values in parenthesis (). Note that in works with data of all types: dates, text, numbers, etc. Practically, in is most useful with text and date information.

Exercise 1: Basic in

For an upcoming campaign, you are asked to generate a list of all donors that have an address in Oregon, Washington, or California. The output should only include their account_id and the state they live in. Write an SQL query to produce this list.

No Results

Not in

As we saw with the between operator, you can also use not with in to identify rows where some value is not in a list of values. This is substantially more useful than not between and you will find yourself using it often. For example, let's say a natural disaster has impacted Florida, Georgia, and South Carolina. You need to pull a list of addresses for an upcoming campaign, but you need to exclude donors in FL, GA, and SC. You might write something like this:

select *
from addresses
where state not in ('FL','GA','SC')

Just like with in, not in requires your list of values to be inside parenthesis (). not in also works with data of all types, though you will mostly find it useful with text and date data.

Exercise 2: Not in

Membership needs a list of all offline transactions from 2023. Write an SQL query to identify all transactions in 2023 where the transaction channel is not: Website, Email, Digital Advertising, or Unpaid Social.

No Results

In and Not In a Subquery

We're going to talk about subqueries in more depth later, but one of the most useful types of subqueries is used in conjunction with in/not in, so we're going to go over that now. Briefly, a subquery is a query that is used within another query to gather data from the database for direct use in your main query.

For example, let's say we want a list of all transactions from donors who live in the state of Utah. Thinking about this in SQL terms, it would be nice if we could solve the problem like this:

  1. Build a list of all account_ids with an address in Utah
  2. Select all transactions from the transactions table where the account_id is in that list

This seems like a great use for in, because we need to work with a list. But do we really need to type in the account_id of every person who lives in Utah? No, we don't. We can use a subquery to get that information for us, like this:

No Results

This query uses a subquery (a query within a query) to build a list of all account_id values from the addresses table where the state is Utah. The database will always evaluate all your subqueries before the main queries, so behind the scenes, the database is going to take your subquery and turn it into an actual list of IDs and then use that list to execute the main query.

Let's pause for a second and notice a few things:

  • Even though we've referenced the addresses table in our subquery, our results from the select * statement only include the columns from the transactions table. This is because only the transactions table appears in our from clause.
  • Just like other in usage, we enclose our subquery in parenthesis () to let the database know that what's inside should be treated as a list.

Exercise 3: In with subquery

Your major gifts team is about to make a trip to the northeast for some donor meetings. In advance of this trip, they want a list of all transactions of $5,000 or more made by donors who live in NY, NJ, and CT. Write an SQL query to get this data for them.

No Results