Subqueries in SQL are basically queries written inside another query. They’re used when we need to use the result of one query as a condition or input for another. It helps break down complex logic into smaller steps and makes the main query more readable.
For example, suppose I have two tables — Employees and Departments. If I want to find all employees who earn more than the average salary in their department, I can use a subquery like this:
SELECT EmployeeName, Salary, DepartmentID
FROM Employees
WHERE Salary > (SELECT AVG(Salary)
FROM Employees
WHERE DepartmentID = Employees.DepartmentID);
Here, the inner query calculates the average salary for each department, and the outer query compares each employee’s salary against that average. So, the subquery provides dynamic data that the main query uses.
I’ve used subqueries in one of my reporting projects where we needed to identify top-performing sales reps who performed better than the average in their region. Instead of creating temporary tables or joins, a correlated subquery made it simpler and more efficient to write and maintain.
One of the challenges I faced was with performance, especially when dealing with large datasets. Correlated subqueries tend to execute the inner query multiple times, which can slow things down. In such cases, I often rewrite the query using joins or common table expressions (CTEs) for optimization.
As an alternative, when performance becomes critical, I use joins or window functions. For example, using AVG(Salary) OVER (PARTITION BY DepartmentID) can achieve the same result without repeatedly executing the inner query, making it much faster.
