To filter records in SQL, we primarily use the WHERE clause, which allows us to specify conditions that determine which rows are returned from a table. The database only retrieves rows that satisfy those conditions.
For example, if I want to fetch all employees who work in the βITβ department and earn more than 50,000, Iβd write:
SELECT *
FROM employees
WHERE department = 'IT' AND salary > 50000;
This query filters the dataset based on two conditions using the AND operator β both must be true for a record to appear.
We can also use other operators with WHERE, like:
- =, !=, >, <, >=, <= for comparisons
- AND, OR, NOT for combining conditions
- BETWEEN for range filtering
- IN for checking multiple possible values
- LIKE for pattern matching with wildcards
For example:
SELECT *
FROM customers
WHERE city IN ('Chennai', 'Bangalore')
AND email LIKE '%@gmail.com';
This fetches all customers from Chennai or Bangalore whose email ends with β@gmail.comβ.
In a real-world project, I used WHERE extensively in reporting queries β for instance, filtering transactions within a specific date range or customers with pending payments. For example:
SELECT customer_id, amount_due
FROM invoices
WHERE due_date < CURRENT_DATE;
This helped our finance team easily identify overdue accounts.
One challenge Iβve faced with filtering was performance β especially when the WHERE clause used functions on columns (like WHERE YEAR(order_date) = 2025)), which prevented index usage. To optimize that, I rewrote conditions to use indexed columns directly, e.g.,
WHERE order_date BETWEEN '2025-01-01' AND '2025-12-31';
This small change significantly improved query execution time.
A limitation of WHERE is that it filters before aggregation. If we need to filter after an aggregate operation (like after GROUP BY), we use the HAVING clause instead.
As an alternative, in complex cases, Iβve used CTEs (Common Table Expressions) or subqueries to apply layered filters for readability and modularity.
So overall, the WHERE clause is the foundation of data filtering in SQL β it gives precise control over which records are returned, ensuring queries are both accurate and efficient when properly optimized.
