8. Distinct

The SQL distinct operator is used to remove duplicate rows from the result of a select statement. It ensures that only unique values are returned. One way this is helpful is in generating lists of donors from simple queries. For example, let's say we want a list of all donors who gave at least one gift in 2024. We could query our transactions table, like this:

No Results

Hm, something about our query isn't quite right. Notice for example that account_id 633 is repeated multiple times, as are many other IDs. This happens because our query is returning one row for for each transaction because we have selected from our transactions table. Donor 633 made multiple gifts in 2024, so they show up multiple times. We can fix this with the distinct operator. Usage of distinct is simple: just put it right after the word select to form select distinct:

No Results

This tells the database, "Find me the account_id that belongs to each transaction and return a list sorted by account_id. Then take that list and remove any duplicate values, leaving only unique values." As you can see, donor 633 is now only listed once, as are all other donors. Note that the order by clause is not required to use distinct, I have only used it here to make it easier to compare the outputs of these two queries.

Exercise 1: Select distinct

Your community outreach team is going to be doing some work in the state of Wyoming and needs a list of each city in Wyoming where the organization has at least one donor. Use select distinct to generate a duplicate-free list of cities in Wyoming from our addresses table.

No Results

Distinct with multiple columns

In our first example, we used distinct in a query where we only selected a single column, account_id. distinct also works when selecting multiple columns. When used this way, the database will return every distinct combination of values from your selected columns. Let's look at an example.

No Results

This query returns every unique combination of first_name and last_name in the accounts table. Notice, however, that the name Aaron is repeated. This happens because the distinct operator works over the combination of all columns in our select statement. So while there are many Aarons in our output, there is only one Aaron Barker, only one Aaron Boyd, etc.

Exercise 2: Distinct with multiple columns

The operations department is looking to change credit card processing companies. In order to get a sense of which channels are using which payment methods, they need a list of every payment method and channel combination from the transactions table. Write a query using select distinct to generate this list.

No Results