30. What is a Subquery?
A subquery is an SQL query that is nested inside another query. The results returned by a subquery can be used in the outer query as a table, a value, or a filter. Subqueries are always enclosed by parenthesis ()
. There are three principal use cases for subqueries. Subsequent chapters will go into more detail on two of these use cases.
In a where
Clause
When used in a where
clause, subqueries are used to to filter results based on another table. We have seen this already with in and exists. Here's an example of a subquery in a where
clause:
select
*
from transactions
where account_id in (
select account_id
from addresses
where state = 'FL'
)
This query contains a subquery in the where
clause that builds a list of all the account_id
values for people who live in Florida. The results of that subquery are made available to the outer query which uses them as a list to verify whether the account_id
values in transactions
are in
the list or not.
In a from
Clause
In a from
clause, the subquery is treated by your database as though it were a table itself. Think of it like a kind of temporary table that the database builds on the fly. Here's an example of this usage:
select
t.account_id
,t.transaction_date
,fgd.first_gift_date
from transactions as t
inner join
(
select
account_id
,min(transaction_date) as first_gift_date
from transactions
group by account_id
) as fgd
on fgd.account_id = t.account_id
In this query, the subquery is used to build a table of each account_id
value and the date of their first ever gift. The subquery returns a table aliased as fgd
with two columns: account_id
and first_gift_date
. This table can then be used just like any other table in your database. As you can see in this example, it is joined to another table and one of its columns (first_gift_date
) is included in the select
clause.
In a select
Clause
Subqueries can be used directly in a select
clause to return a calculated value. In practice, this is very rarely done as it can come with performance downsides and can be a confusing design pattern. Almost every instance of a subquery being used in a select
clause can be rewritten a different (and better) way. As a result, I do not recommend using this structure. If you find yourself needing to, it's very likely that there is a better way to accomplish your goal and I recommend rethinking your query from the ground up.
That said, because this is a design pattern you may see in other people's code (and because once every 5 years you may have a legitimate need for this usage), here's an example of how it is done:
select
account_id
,first_name
,(select count(*) from transactions where transactions.account_id = accounts.account_id) as transaction_count
from accounts
In this query, we select the full list of account_id
values from accounts
and we use a subquery in our select
clause to get the count of transactions for each account_id
from the transactions
table. Again, this is usually not the best way to accomplish what you're looking to do, but you may see it from time to time. We will not be reviewing this use case in detail in this course. The following chapters will cover the first two use cases in more detail and provide some additional information about subqueries.
Exercise 1: Rewrite
Rewrite the query below to get identical results using a join
instead of a subquery.
select
account_id
,first_name
,(select count(*) from transactions where transactions.account_id = accounts.account_id) as transaction_count
from accounts