The main difference between UNION and UNION ALL is how they handle duplicate records.
UNION combines the results of two or more SELECT queries and removes duplicate rows from the final output. In effect, it applies a DISTINCT over the combined result set, comparing entire rows, so it also removes duplicates that come from within a single query.
UNION ALL, on the other hand, simply concatenates the results and keeps every row, duplicates included. It doesn’t check for uniqueness, so it’s usually faster because the database skips the sort or hash step needed to find duplicates.
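To make the "entire row" point concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The tables t1 and t2 and their values are made up purely for illustration; other database engines should behave the same way for this query, but only SQLite is exercised here.

```python
import sqlite3

# Hypothetical tables t1 and t2, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, amount INTEGER);
    CREATE TABLE t2 (id INTEGER, amount INTEGER);
    INSERT INTO t1 VALUES (1, 10), (1, 10), (2, 20);
    INSERT INTO t2 VALUES (1, 10), (1, 99);
""")

# UNION compares whole rows: (1, 10) and (1, 99) are different rows,
# so both survive, while the three copies of (1, 10) collapse into one.
union_rows = conn.execute(
    "SELECT id, amount FROM t1 UNION SELECT id, amount FROM t2"
).fetchall()

# UNION ALL keeps every input row: 3 from t1 plus 2 from t2.
union_all_rows = conn.execute(
    "SELECT id, amount FROM t1 UNION ALL SELECT id, amount FROM t2"
).fetchall()

print(sorted(union_rows))    # [(1, 10), (1, 99), (2, 20)]
print(len(union_all_rows))   # 5
```

Note that the duplicate (1, 10) inside t1 alone is also removed by UNION, not just the copy shared with t2.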
For example, let’s say we have two tables — Sales2024 and Sales2025. If we run:
SELECT CustomerID FROM Sales2024
UNION
SELECT CustomerID FROM Sales2025;
This will return a list of unique customers across both years.
But if we use:
SELECT CustomerID FROM Sales2024
UNION ALL
SELECT CustomerID FROM Sales2025;
Then every row from both tables is returned. If the same customer made purchases in both years, their ID will appear at least twice: once for every matching row in each table.
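The two queries above can be tried end to end with a small sketch like the following. It uses Python's sqlite3 module and an in-memory database; the sample CustomerID values are invented for the demonstration and are not from any real dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales2024 (CustomerID INTEGER);
    CREATE TABLE Sales2025 (CustomerID INTEGER);
    INSERT INTO Sales2024 VALUES (101), (101), (102);
    INSERT INTO Sales2025 VALUES (101), (103);
""")
# Made-up data: customer 101 buys twice in 2024 and once in 2025.

union_ids = sorted(row[0] for row in conn.execute(
    "SELECT CustomerID FROM Sales2024 "
    "UNION "
    "SELECT CustomerID FROM Sales2025"
))

union_all_ids = sorted(row[0] for row in conn.execute(
    "SELECT CustomerID FROM Sales2024 "
    "UNION ALL "
    "SELECT CustomerID FROM Sales2025"
))

print(union_ids)      # [101, 102, 103]
print(union_all_ids)  # [101, 101, 101, 102, 103]
```

Customer 101 shows up three times under UNION ALL, one per purchase row, which is why "twice" is only the minimum for a customer present in both tables.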
I’ve applied this concept in a data migration project where we had to merge transactional data from multiple systems. Initially, I used UNION, but it slowed down when handling millions of records because of the duplicate check. So, after verifying that duplicates were acceptable, I switched to UNION ALL, which gave a huge performance boost.
The main limitation of UNION is performance, especially on large datasets, because of the duplicate-elimination step. The limitation of UNION ALL is that you may get duplicate rows if the inputs overlap and you don’t account for it.
When you need uniqueness but still want performance, one option is to combine the data with UNION ALL in a subquery and apply DISTINCT (or GROUP BY) yourself in the outer query. A DISTINCT over the whole unfiltered result does the same work as UNION, so the benefit comes from controlling when and how the duplicate elimination happens, for example after filtering or aggregating the data, when there are fewer rows left to compare.
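As a rough sketch of that pattern, the example below wraps a UNION ALL in a subquery and deduplicates only after filtering or aggregating. The Amount column, the threshold of 100, and the sample rows are assumptions added for illustration; whether this actually beats a plain UNION depends on the database engine and the data, so it is worth checking the query plan on a real workload.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales2024 (CustomerID INTEGER, Amount INTEGER);
    CREATE TABLE Sales2025 (CustomerID INTEGER, Amount INTEGER);
    INSERT INTO Sales2024 VALUES (101, 50), (101, 150), (102, 200);
    INSERT INTO Sales2025 VALUES (101, 150), (103, 20);
""")
# Amount is a hypothetical column; the rows are made-up sample data.

# Filter first, then deduplicate only the rows that remain.
big_spenders = sorted(row[0] for row in conn.execute("""
    SELECT DISTINCT CustomerID
    FROM (
        SELECT CustomerID, Amount FROM Sales2024
        UNION ALL
        SELECT CustomerID, Amount FROM Sales2025
    ) AS combined
    WHERE Amount >= 100
"""))

# Aggregate instead of DISTINCT: GROUP BY yields one row per customer.
totals = dict(conn.execute("""
    SELECT CustomerID, SUM(Amount)
    FROM (
        SELECT CustomerID, Amount FROM Sales2024
        UNION ALL
        SELECT CustomerID, Amount FROM Sales2025
    ) AS combined
    GROUP BY CustomerID
"""))

print(big_spenders)  # [101, 102]
print(totals)        # totals per customer: 101 -> 350, 102 -> 200, 103 -> 20
```

The aggregation case also shows why UNION ALL is the right choice inside the subquery: customer 101 has an identical (101, 150) row in both years, and a plain UNION would merge those two rows and understate the total.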
