16. Case statements

SQL uses case statements to express conditional logic. When you want to implement something that uses if-then-else logic, case is what you'll use. A case statement has four parts, one of which is optional:

  • case - A required keyword that tells the database we're about to evaluate some if-then statements.
  • when and then - Each individual if-then statement itself takes the form when some_condition then some_output. At least one when/then pair is required.
  • else - A default value to use as output if none of the when conditions are met. This is optional. If you don't include it and a value has no matches in your when statements, the database will output NULL.
  • end - A required keyword that tells the database we're done with all our if-then statements.

A case statement may look like this:

case 
    when source_code like 'A%' then 'Starts with A'
    when source_code like 'B%' then 'Starts with B'
    else 'Starts with another letter'
end

This simple case statement will look at source codes and output either "Starts with A", "Starts with B", or "Starts with another letter" depending on the first letter of the source code.

General notes about case

They are evaluated top to bottom

The database evaluates each of your when statements in order from top to bottom as you've written them. Once any when condition is satisfied, the then statement is immediately used to generate output and the value is no longer able to be classified by when statements lower in the list. It's crucially important to understand this behavior. For example in this case statement we evaluate donor first_name values:

case
    when first_name like 'M%' then 'Starts with M'
    when first_name like '%y' then 'Ends with y'
end

If we consider the name "Mary", we can see that it meets both conditions: it starts with "M" and ends with "y". But if we run this case statement as part of a query, the value for any row with first name "Mary" will always be "Starts with M". The logic for each row we evaluate terminates as soon as any when statement criteria is met. The database will never even check if "Mary" meets the second criteria, because it's already met the first one. Remember to always pay careful attention to the order you write your when statements in.

Use them anywhere

case statements can be used pretty much anywhere that an expression is valid. This means you can use case in your where clauses, in your select statements, and even in clauses we haven't covered yet like join on, having, and order by. case statements can also be used inside function calls or inside other operators (e.g., inside a list for an in statement, inside the substring() function, etc.) You can generally use case anywhere except the from clause, so get creative.

They can contain complex logic

The conditions evaluated by case can be as simple or complex as you'd like. You can include function calls and other operators like in, like, and between as part of your if-then logic in a case statement. Pretty much every single thing you've learned in this section so far can be used inside a case statement in one way or another.

They can be nested

Let's say you need to evaluate a list of source codes and if they end in "X" you need to implement one set of logic, but if they end in any other letter, you need to implement some other set of logic. This calls for nested case statements, like these:

select
    case 
        when source_code like '%X' then 
            case 
                when source_code like 'M%' then 'Starts M, Ends X'
                when source_code like 'B%' then 'Starts B, Ends X'
                else 'Starts other, ends X'
            end
        when source_code like 'M%' then 'Starts M, ends other'
        when source_code like 'K%' then 'Starts K, ends other'
    end as nested_case
from source_codes

Notice that to create my nested case statements, I've used a case statement in my then clause. This case statement tells the database, "when the source code ends in X, I want to execute this second case statement to determine what to do. If it doesn't end in X, then keep on going with this stream of logic." Inserting a case statement to be the result of a then clause is exactly what I meant when I said you can (and should!) use them pretty much anywhere.

Two ways to use case

There are two ways to use case:

  1. Simple case statements which compare one variable to several specific possible values.
  2. Searched case statements which evaluate a series of true/false conditions, not just a single value.

We will consider both, but note that generally you will use searched case statements much more frequently (all the examples above are searched case statements). Also note that any simple case statement can always be re-written as a searched case statement (but not vice versa).

Simple Case Statements

These statements evaluate a single expression and check through a series of of possible values for that expression. For example, let's say we want to select all the addresses in our database where the state is WI, IL, or MI. But instead of including the state abbreviations in our output, we want to output the full state names. We can do this with case as follows:

No Results

Let's break that case statement down:

  • case state tells the database that we are going to perform some if/then checks on the state column in our data.
  • when 'WI' then 'Wisconsin' tells the database that when it finds a state value of WI, it should output the value Wisconsin.
  • Additional when clauses tell the database what to do with IL and MI state abbreviations.
  • end tells the database that our case statement is done, there are no more if/then conditions to check.
  • as full_state_name is our normal method for aliasing the column in our output.

As you can see in the output, we have all the WI, IL, and MI addresses but we have the full state names rather than the abbreviations in our output.

This type of case statement can only check for equality, i.e. "does the value in this column exactly equal some value, and if so, do something." You cannot use operators like >= or like in these simple case statements. As a result, their utility is limited.

Exercise 1: Simple case

Direct marketing needs a list of all test and control source codes they have used in 2023. Write a SQL query using a simple case statement to select all source codes from the source codes table that end with the letters "C" or "T" where the campaign date is in 2023. The output should include the source code as well as a column that says "Test" for all codes ending in "T" and "Control" for all codes ending in "C".

No Results

Searched case

We've already seen several examples of searched case statements in this chapter. These statements evaluate a series of true/false conditions rather than just a single value. As a result, they are significantly more flexible and useful in daily work.

In a searched case statement, each of your when clauses will contain some logic that will either return true, false, or NULL. If the result is true, the database will evaluate your then statement. If it's false or NULL, the database will move down to the next when statement in your list. Here's another example:

No Results

This case statement looks at all the transactions in the database and puts them into one of four groups depending on their values for transaction_channel and payment_method. Note that unlike simple case statements, searched case statements allow you to evaluate multiple columns (in this case both transaction_channel and payment_method) in a single case statement.

Exercise 2: Searched case

Your gift processing team is trying to reconcile invoices from the bank and credit card payment processor. They need a list of all transactions in the database. Output should include:

  • Transaction ID
  • Transaction Amount
  • Transaction Date
  • Payment Method
  • Transaction Cost: 3% of the transaction amount for credit card gifts, 1% of the transaction amount for EFT gifts, and 0 for all other gift types.

Write an SQL query to generate this list.

No Results

Exercise 3: Nested case

The gift processing team came back to you after noticing a problem with their initial instructions. It turns out that the payment processor provides a discount for sustainer transactions. The new logic for transaction cost is:

  • Program = Sustainer
    • Credit card cost is 2% of the transaction amount
    • EFT cost is 0.5% of the transaction amount
    • Other costs are 0
  • All other programs
    • Credit card cost is 3% of the transaction amount
    • EFT cost is 1% of the transaction amount
    • All other costs are 0

Write an SQL query using nested case statements to provide this list, and add Program to the list of columns in the output.

No Results