SQL Joins and Order of Execution: An In-Depth Guide

SQL Joins:

  1. INNER JOIN:

    • Definition: Retrieves records that have matching values in both tables.
    • Use Case: When you only want the records where there is a match in both tables.
    • Example:

      SELECT a.column1, b.column2 FROM table1 a INNER JOIN table2 b ON a.common_column = b.common_column;
  2. LEFT JOIN (LEFT OUTER JOIN):

    • Definition: Returns all records from the left table and the matched records from the right table. For unmatched rows from the right table, NULL values are returned.
    • Use Case: When you need all records from the left table regardless of whether they have a match in the right table.
    • Example:
      SELECT a.column1, b.column2 FROM table1 a LEFT JOIN table2 b ON a.common_column = b.common_column;
  3. RIGHT JOIN (RIGHT OUTER JOIN):

    • Definition: Similar to LEFT JOIN, but returns all records from the right table and the matched records from the left table.
    • Use Case: When you need all records from the right table regardless of whether they have a match in the left table.
    • Example:
      SELECT a.column1, b.column2 FROM table1 a RIGHT JOIN table2 b ON a.common_column = b.common_column;
  4. FULL JOIN (FULL OUTER JOIN):

    • Definition: Combines the results of both LEFT JOIN and RIGHT JOIN. Returns all records when there is a match in either table.
    • Use Case: When you need all records from both tables, with NULLs in places where there is no match.
    • Example:
      SELECT a.column1, b.column2 FROM table1 a FULL OUTER JOIN table2 b ON a.common_column = b.common_column;
  5. CROSS JOIN:

    • Definition: Returns the Cartesian product of both tables, pairing each row from the first table with every row from the second table.
    • Use Case: When you need all possible combinations of rows from the two tables.
    • Example:
      SELECT a.column1, b.column2 FROM table1 a CROSS JOIN table2 b;
  6. SELF JOIN:

    • Definition: A join in which a table is joined with itself to compare rows within the same table.
    • Use Case: When you need to compare rows within the same table.
    • Example:
      SELECT a.column1, b.column2 FROM table a INNER JOIN table b ON a.common_column = b.common_column;

SQL Order of Execution:

  1. FROM:

    • Purpose: Specifies the tables involved in the query.
    • Details: This is the first step where the SQL engine identifies the source tables and builds a Cartesian product if multiple tables are specified.
  2. WHERE:

    • Purpose: Filters records based on specified conditions.
    • Details: Applies conditions to filter out rows that do not meet the criteria.
  3. GROUP BY:

    • Purpose: Groups records that have identical data in specified columns.
    • Details: Aggregates data to prepare for summary functions (e.g., COUNT, SUM).
  4. HAVING:

    • Purpose: Filters groups based on specified conditions.
    • Details: Similar to WHERE but operates on groups created by GROUP BY.
  5. SELECT:

    • Purpose: Specifies the columns to be returned.
    • Details: Determines the final columns to be included in the result set.
  6. ORDER BY:

    • Purpose: Sorts the result set based on specified columns.
    • Details: Orders the rows in the result set according to one or more columns.
  7. LIMIT:

    • Purpose: Restricts the number of rows returned.
    • Details: Used to limit the number of rows in the result set, useful for pagination.

Example Query with Detailed Execution:

Let's consider a complex query to see the order of execution in action:

SELECT department, AVG(salary) AS avg_salary FROM employees WHERE hire_date > '2020-01-01' GROUP BY department HAVING AVG(salary) > 60000 ORDER BY avg_salary DESC LIMIT 5;

Order of Execution:

  1. FROM: Identify the employees table.
  2. WHERE: Filter rows where hire_date is after '2020-01-01'.
  3. GROUP BY: Group the remaining rows by department.
  4. HAVING: Filter groups where the average salary is greater than 60,000.
  5. SELECT: Choose the department and calculate the average salary as avg_salary.
  6. ORDER BY: Sort the results by avg_salary in descending order.
  7. LIMIT: Return only the top 5 rows.

No comments:

Post a Comment