SQL Tutorial
Basic SQL
Intermediate SQL
Putting it together
SQL Aggregate Functions
SQL COUNT
SQL SUM
SQL MIN/MAX
SQL AVG
SQL GROUP BY
SQL HAVING
SQL CASE
SQL DISTINCT
SQL Joins
SQL INNER JOIN
SQL Outer Joins
SQL LEFT JOIN
SQL RIGHT JOIN
SQL Joins Using WHERE or ON
SQL FULL OUTER JOIN
SQL UNION
SQL Joins with Comparison Operators
SQL Joins on Multiple Keys
SQL Self Joins
Advanced SQL
SQL Analytics Training
Python Tutorial
Learn Python for business analysis using real-world data. No coding experience necessary.
Start Now
Mode Studio
The Collaborative Data Science Platform
SQL DISTINCT
Starting here? This lesson is part of a full-length tutorial in using SQL for Data Analysis. Check out the beginning.
In this lesson we'll cover:
- Using SQL DISTINCT for viewing unique values
- Using DISTINCT in aggregations
- DISTINCT performance
- Practice problems
Using SQL DISTINCT for viewing unique values
You'll occasionally want to look at only the unique values in a particular column. You can do this using SELECT DISTINCT
syntax. To select unique values from the month
column in the Apple stock prices dataset, you'd use the following query:
SELECT DISTINCT month
FROM tutorial.aapl_historical_stock_price
If you include two (or more) columns in a SELECT DISTINCT
clause, your results will contain all of the unique pairs of those two columns:
SELECT DISTINCT year, month
FROM tutorial.aapl_historical_stock_price
Note: You only need to include DISTINCT
once in your SELECT
clause—you do not need to add it for each column name.
Practice Problem
Write a query that returns the unique values in the year
column, in chronological order.
DISTINCT
can be particularly helpful when exploring a new data set. In many real-world scenarios, you will generally end up writing several preliminary queries in order to figure out the best approach to answering your initial question. Looking at the unique values on each column can help identify how you might want to group or filter the data.
Using DISTINCT in aggregations
You can use DISTINCT
when performing an aggregation. You'll probably use it most commonly with the COUNT
function.
In this case, you should run the query below that counts the unique values in the month
column.
SELECT COUNT(DISTINCT month) AS unique_months
FROM tutorial.aapl_historical_stock_price
The results show that there are 12 unique values (other examples may be less obvious). That's a small enough number that you might be able to aggregate by month and interpret the results fairly early. For example, you might follow this up by taking average trade volumes by month to get a sense of when Apple stock really moves:
SELECT month,
AVG(volume) AS avg_trade_volume
FROM tutorial.aapl_historical_stock_price
GROUP BY month
ORDER BY 2 DESC
Okay, back to DISTINCT
. You'll notice that DISTINCT
goes inside the aggregate function rather than at the beginning of the SELECT
clause. Of course, you can SUM
or AVG
the distinct values in a column, but there are fewer practical applications for them. For MAX
and MIN
, you probably shouldn't ever use DISTINCT
because the results will be the same as without DISTINCT
, and the DISTINCT
function will make your query substantially slower to return results.
DISTINCT performance
It's worth noting that using DISTINCT
, particularly in aggregations, can slow your queries down quite a bit. We'll cover this in greater depth in a later lesson.
Sharpen your SQL skills
Practice Problem
Write a query that counts the number of unique values in the month
column for each year.
Practice Problem
Write a query that separately counts the number of unique values in the month
column and the number of unique values in the `year` column.
Next Lesson
SQL Joins