Nested Loops Join – SQL Server Graphical Execution Plan
Nested Loops Join operator (represented by an inner and an outer looping arrows in SQL Server’s Graphical Execution Plan) is used by SQL Server to join together two tables or indexes, this is generally an efficient operator, and superior to Hash or Merge Joins when operating on small data-sets.
This post will goes through why the Nested Loops operator is chosen by SQL Server’s Database Engine, the different types of Nested Loop Joins, and finally how to try and optimize a costly Nest Loops Join operation.
Nested Loops Usage in SQL Server
Nested Loops Join operates on two tables, an outer table and an inner table, as indicated in the image used to represent this type of operation. Essentially this means that for each row in the outer table (outer loop), SQL Server will perform a criteria based comparison against each row of the inner table (inner loop). The Nested Loops join is sometimes referred to as Nested Iteration Operator.
Nested Loops joins are most efficient when you have a small outer table (not necessarily indexed), and a large but importantly indexed inner table. This will allow SQL Server to very quickly retrieve the required values of the inner (larger) table, even if the smaller (outer) table does not have an index that matches the operator’s requirement. Having an index that supports the criteria of the Nested Loop Join operator on the inner table is crucial in order to maintain an acceptable level of performance.
SQL Server tends not to choose this join operator if these criteria indicated above are not met, although a reused cached execution plan or an out of date statistics could cause the Database Engine to select a Nested Loop Join even though this might not be the optimum operator for the data-sets being handled.
This is interesting behavior to consider when building an indexing strategy for a particular database or data warehouse solution in general.
Types of Nested Loops Join
There are a few types of Nested Loops Join operator, knowing which one is being employed by SQL Server is crucial to optimizing this operator.
- Indexed Nested Loops Join: In this scenario the inner table rows are being matched using a SEEK operator through an index that exits (and is pre-built) on the underlying data-set. This is obviously what you want your Nested Loops Join operator to be, as it gives you the best Nested Loops Join performance over the required data-sets.
- Naive Nested Loops Join: In this scenario the inner table either has no index, or the existing indexes do not satisfy the criteria required by the join columns in the Nested Loops Join operator. SQL Server will rarely choose Nested Loops in that occasion, and generally tends to resort to Hash Match joins instead, but this could show up in smaller data-sets, or join columns that have a relatively small data-type.
- Temporarily Indexed Nest Loops Join: In this scenario SQL Server is having to create a temporary index on the underlying tables in order to satisfy the Nested Loops Join. SQL Server will essentially decide that the cost of building a temporary index in order to perform the Nested Loops Join operation outweighs the cost of running a Naive Nested Loops Join or any other type of Join operation. The index that is missing and being created by SQL Server to satisfy this join type can be found in the missing index DMVs (such as sys.dm_db_missing_index_details and sys.dm_db_missing_index_groups).
Optimizing Nested Loops Join Operator
Generally Nested Loops is an efficient joining strategy, and whenever SQL Server utilizes this operator, performance tends to be high and cost of that operator will be low in the over-all execution plan, although this is not always the case, particularly if SQL Server is using the Naive Nest Loops Join approach or building temporary indexes in order to satisfy the join operation.
In order to identify if SQL Server is building a temporary index for the purpose of improving the performance of the Nested Loops Join, you can look at the missing indexes DMVs, listed below, or run the query being diagnosed with “Show Execution Plan” button selected.
- sys.dm_db_missing_index_details
- sys.dm_db_missing_index_group_stats
- sys.dm_db_missing_index_groups
- sys.dm_db_missing_index_columns
It is highly recommended to build any index that will aid the Nested Loops Join operator in matching the relevant data-sets, this is more important for the larger data-set, were an index could make a large impact on the over-all cost of this operator.
As always, if you suspect that SQL Server is selecting Nested Loops Join operator by mistake, then ensure that all your statistics on the underlying tables are up-to-date, you can use SQL Server’s Tuning Adviser to build new relevant statistics if required. If you suspect that the Database Engine’s cached plan is out of date, try forcing an update or use query plans or hints.
Leave a Reply
Want to join the discussion?Feel free to contribute!