30. What is a Subquery?

A subquery is an SQL query that is nested inside another query. The results returned by a subquery can be used in the outer query as a table, a value, or a filter. Subqueries are always enclosed by parenthesis () . There are three principal use cases for subqueries. Subsequent chapters will go into more detail on two of these use cases.

In a where Clause

When used in a where clause, subqueries are used to to filter results based on another table. We have seen this already with in and exists. Here's an example of a subquery in a where clause:

select
    *
from transactions
where account_id in (
    select account_id
    from addresses
    where state = 'FL'
)

This query contains a subquery in the where clause that builds a list of all the account_id values for people who live in Florida. The results of that subquery are made available to the outer query which uses them as a list to verify whether the account_id values in transactions are in the list or not.

In a from Clause

In a from clause, the subquery is treated by your database as though it were a table itself. Think of it like a kind of temporary table that the database builds on the fly. Here's an example of this usage:

select
    t.account_id
    ,t.transaction_date
    ,fgd.first_gift_date
from transactions as t
    inner join
        (
            select
                account_id
                ,min(transaction_date) as first_gift_date
            from transactions
            group by account_id
        ) as fgd
    on fgd.account_id = t.account_id

In this query, the subquery is used to build a table of each account_id value and the date of their first ever gift. The subquery returns a table aliased as fgd with two columns: account_id and first_gift_date. This table can then be used just like any other table in your database. As you can see in this example, it is joined to another table and one of its columns (first_gift_date) is included in the select clause.

In a select Clause

Subqueries can be used directly in a select clause to return a calculated value. In practice, this is very rarely done as it can come with performance downsides and can be a confusing design pattern. Almost every instance of a subquery being used in a select clause can be rewritten a different (and better) way. As a result, I do not recommend using this structure. If you find yourself needing to, it's very likely that there is a better way to accomplish your goal and I recommend rethinking your query from the ground up.

That said, because this is a design pattern you may see in other people's code (and because once every 5 years you may have a legitimate need for this usage), here's an example of how it is done:

select
    account_id
    ,first_name
    ,(select count(*) from transactions where transactions.account_id = accounts.account_id) as transaction_count
from accounts

In this query, we select the full list of account_id values from accounts and we use a subquery in our select clause to get the count of transactions for each account_id from the transactions table. Again, this is usually not the best way to accomplish what you're looking to do, but you may see it from time to time. We will not be reviewing this use case in detail in this course. The following chapters will cover the first two use cases in more detail and provide some additional information about subqueries.

Exercise 1: Rewrite

Rewrite the query below to get identical results using a join instead of a subquery.

select
    account_id
    ,first_name
    ,(select count(*) from transactions where transactions.account_id = accounts.account_id) as transaction_count
from accounts
No Results