SQL Distinct: The Complete Guide
If you want to remove duplicate items from the final result set, then SQL Distinct is the answer to your problem. It filters out duplicate items and returns unique values from the table.
The distinct keyword is used with the select statement for fetching unique values from a table. However, there also exist some situations where there are multiple duplicate records. So, it makes more sense to retrieve unique values rather than retrieving duplicate records.
SQL Distinct
SQL distinct is a built-in statement used to remove duplicate records from a column when displaying the resulting sets from any table present in a database. SQL Select DISTINCT statement returns the distinct (different) values.
Inside a table, a column often contains many duplicate values; and sometimes you only want to list the different (distinct) values.
Syntax
Select distinct column1, column2, …., column n from table_name where[condition];
Parameters
- Distinct is a keyword that is used for retrieving unique records.
- Columns are the name of the columns.
- Table_name is the name of the condition.
- Where is the condition used for retrieving records based on some condition?
Note
- When only one expression is provided in a distinct clause, the query will generate unique values for only that expression.
- When more than one expression is listed using a distinct clause, the unique combinations for the expressions will be listed.
- The distinct clause does not ignore NULL values, so if there is any NULL value in the table, then that NULL value will be displayed in the resulting set representing it as unique.
Let’s clear this by looking at an example.
Example: (For Single columns)
Consider Table: (CUSTOMERS)
CUST_CODE | CUST_NAME | CITY | GRADE | AMOUNT |
101 | SHUBH | KOLKATA | 1 | 10000 |
201 | SOURAV | KOLKATA | 1 | 10000 |
301 | KISHAN | PATNA | 2 | 20000 |
401 | DILEEP | ALLAHABAD | 2 | 30000 |
501 | SHRUTI | PATNA | 3 | 40000 |
601 | SHOUVIK | JALANDHAR | 4 | 20000 |
701 | ROHIT | PATNA | 2 | 40000 |
Query: (Without distinct keyword)
select city from customers;
Output
Explanation
The query has generated all the city names from the customers’ duplicates without removing the duplicates.
Query: (Using distinct keyword)
select distinct city from customers;
Output
Explanation
The query has generated the city names, which are unique in the customers’ table.
Example: (For multiple Columns)
Consider Table: (CUSTOMERS)
CUST_CODE | CUST_NAME | CITY | GRADE | AMOUNT |
101 | SHUBH | KOLKATA | 1 | 10000 |
201 | SOURAV | KOLKATA | 1 | 10000 |
301 | KISHAN | PATNA | 2 | 20000 |
401 | DILEEP | ALLAHABAD | 2 | 30000 |
501 | SHRUTI | PATNA | 3 | 40000 |
601 | SHOUVIK | JALANDHAR | 4 | 20000 |
701 | ROHIT | PATNA | 2 | 40000 |
QUERY: (Without Distinct Keyword)
select city, amount from customers order by amount;
Output
Explanation
In this query, all city along with the amount is displayed in the resulting set without any removal of duplicate values.
QUERY: (With Distinct Keyword)
Select distinct city, amount from customers order by amount;
Output
Explanation
This query displays the combinations of unique values from the city and amount columns. For example, you can see from the above output that city Patna and Allahabad were listed at once whose amount was the same.
Example: (With NULL VALUES)
Consider Table: (CUSTOMERS)
CUST_CODE | CUST_NAME | CITY | GRADE | AMOUNT |
101 | SHUBH | KOLKATA | 1 | 10000 |
201 | SOURAV | KOLKATA | 1 | 10000 |
301 | KISHAN | PATNA | 2 | 20000 |
401 | DILEEP | ALLAHABAD | 2 | 30000 |
501 | SHRUTI | PATNA | 3 | 40000 |
601 | SHOUVIK | JALANDHAR | 4 | 20000 |
701 | ROHIT | PATNA | 2 | 40000 |
801 | ROUNAK | NULL | 5 | 10000 |
#Query
select distinct city from customers;
Output
Explanation
NULL values are also displayed in the city name as a distinct clause does not ignore it.
EXAMPLE: (Using Where Condition)
Consider Table: (CUSTOMERS)
CUST_CODE | CUST_NAME | CITY | GRADE | AMOUNT |
101 | SHUBH | KOLKATA | 1 | 10000 |
201 | SOURAV | KOLKATA | 1 | 10000 |
301 | KISHAN | PATNA | 2 | 20000 |
401 | DILEEP | ALLAHABAD | 2 | 30000 |
501 | SHRUTI | PATNA | 3 | 40000 |
601 | SHOUVIK | JALANDHAR | 4 | 20000 |
701 | ROHIT | PATNA | 2 | 40000 |
801 | ROUNAK | NULL | 5 | 10000 |
Query
select distinct city, grade from customers where amount=10000;
Output
Explanation
The above query city and grades are displayed whose amount is equal to 10000. Compare the output table with the original table from there, and you can see that Kolkata, along with the grade 1 and amount with 10000, was displayed once, which was present twice in the customer’s table.
EXAMPLE: (Group BY VS Distinct)
Consider Table: (CUSTOMERS)
CUST_CODE | CUST_NAME | CITY | GRADE | AMOUNT |
101 | SHUBH | KOLKATA | 1 | 10000 |
201 | SOURAV | KOLKATA | 1 | 10000 |
301 | KISHAN | PATNA | 2 | 20000 |
401 | DILEEP | ALLAHABAD | 2 | 30000 |
501 | SHRUTI | PATNA | 3 | 40000 |
601 | SHOUVIK | JALANDHAR | 4 | 20000 |
701 | ROHIT | PATNA | 2 | 40000 |
801 | ROUNAK | NULL | 5 | 10000 |
QUERY
select city, grade, amount from customers group by city, grade, amount order by city, grade, amount; AND select distinct city, grade, amount from customers order by city, grade, amount;
OUTPUT
Explanation
In the above queries, you can see that the resulting sets of both group by and distinct clause combinations are the same.
Compare the output table from the Original table to spot the difference. You will notice that KOLKATA, under the city column whose grade was 1 under the column grade and whose amount was 10000 under the column amount, was displayed once.
We have also used order by clause in the above query. DISTINCT can be used with aggregates: COUNT, AVG, MAX, etc.
That’s it for this tutorial.