Diagnose SQL Server CPU Resource Starvation Issues with Email Alerts

04 Nov
November 4, 2012

Diagnosing CPU performance issues with SQL Server can be a bit tricky, particularly if you have a system that exhibit a CPU spikes pattern throughout the day. You could run some SQL DMVs in order to identify holistically how much CPU time each query is consuming, but that does not give you a real-time way of tracking CPU resource starvation on SQL Server.

Recently I have been asked to devise a system on a few of our servers that are experiencing CPU resource bottleneck issues, the monitoring system needs to identify when SQL Server is suffering from CPU bottlenecks, and report the queries currently running on the server via email alerts, along with performance statistics that can help you diagnose which queries are consuming most of the resources, and why are they doing so.

 

Identifying CPU Resource Bottleneck (Starvation) on SQL Server

The first step is to identify when SQL Server is suffering from CPU bottlenecks, there are a few ways this can be achieved, for example there are some 3rd party CLR based stored procedure that help you access Performance Monitor Counters from within SQL Server, these can be used to detect CPU over-utilization on the box in general or by the SQL process specifically, and based on pre-set threshold alerts can be sent out. Although I’ve decided to keep everything natively within SQL Server, so I opted for querying the sys.dm_os_ring_buffers DMV.

The sys.dm_os_ring_buffers Dynamic Management View regularly samples and reports on different aspects of the SQL Server system, this includes CPU utilization, memory pressure and state, security errors and many other interesting traces. This information is presented in an easy to parse XML format and also available to query without adding too much pressure on SQL Server, which means a high sampling rate can be configured without any issues.

So in order to identify CPU over-utilization issues using the sys.dm_os_ring_buffers DMV, we need to filter the ring_buffer_type column to RING_BUFFER_SCHEDULER_MONITOR, and only return rows were the XML record contains a SystemHealth element, as shown below.

DECLARE @CurrentHighCPUCount INT = 0,
    @SQLCPUThrehold_Percent INT = 75,
    @LogHighCPUCount INT = 3
;WITH topCPU AS
(SELECT TOP (5)
CONVERT (VARCHAR(30), getdate(), 126) AS runtime,
        record.value('(Record/@id)[1]', 'int') AS record_id,
        record.value('(Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') AS system_idle_cpu,
        record.value('(Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS sql_cpu_utilization
--into #tempCPU
FROM sys.dm_os_sys_info inf CROSS JOIN (
SELECT TIMESTAMP, CONVERT (xml, record) AS record
FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = 'RING_BUFFER_SCHEDULER_MONITOR'
AND record LIKE '%<SystemHealth>%') AS t
ORDER BY record.value('(Record/@id)[1]', 'int') DESC
)
SELECT @CurrentHighCPUCount = COUNT(*) FROM topCPU WHERE sql_cpu_utilization >@SQLCPUThrehold_Percent

The script essentially allows you to configure the threshold of what percentage of CPU utilization the script would consider as “over-utilization”, and the number of samples from the sys.dm_os_ring_buffers that meets this threshold before an alert is triggered, allowing you to control the length of the CPU spike you consider to be abnormal for your system.

The script is a bastardized version of Denzil Ribeiro’s Troubleshooting High CPU, which also provides a clever system of tracking the source behind CPU bottlenecks.

Reporting Currently Running (active) Queries

Now after identifying when a CPU starvation issue is in progress, we need to grab a snapshot of what is currently executing on the system, this will help us in diagnosing exactly what queries are contributing towards the high CPU utilization the system is experiencing, and what can we do in order to improve those queries performance.

I have created a handy little script that essentially:

  • Grabs the currently running queries on SQL Server: Using the sys.dm_exec_requests which returns the currently executing sessions and their sql_handle.
  • Appends memory information and query execution statistics (when available): Through sys.dm_exec_query_memory_grants and sys.dm_exec_query_stats we are able to extract important information regarding the execution of each query, including historical performance figures and execution counts.
  • Append the text of the executing query: This will help us identify the query. In the case of a Stored Procedure this will be the CREATE statement of that procedure.
  • Appends the Query Execution Plan: This will help us to dive deep into why this query is causing high CPU utilization, and optimize accordingly.
CREATE PROC [dbo].[usp_snapshot_QueryMemoryCPU]
AS
BEGIN

SELECT
    GETUTCDATE() AS 'DateStamp',
--  REPLACE(REPLACE ( REPLACE ( sqltext.TEXT, '
--' , ' ' ), '  ' , ' ' ),'|',' ') as [TEXT],
    sqltext.TEXT
    query_plan,
    req.session_id,
    req.status,
    req.command,
    req.cpu_time,
    req.total_elapsed_time, requested_memory_kb,
    granted_memory_kb,used_memory_kb, wait_order,
    qs.Execution_count AS Executions,
    qs.total_worker_time AS TotalCPU,
    qs.total_physical_reads AS PhysicalReads,
    qs.total_logical_reads AS LogicalReads,
    qs.total_logical_writes AS LogicalWrites,
    qs.total_elapsed_time AS Duration,
    qs.total_worker_time/qs.execution_count AS [Avg CPU TIME]
FROM sys.dm_exec_requests req
    INNER JOIN sys.dm_exec_query_memory_grants MG ON req.session_id = MG.session_id
    LEFT JOIN sys.dm_exec_query_stats qs ON req.sql_handle = qs.sql_handle
    CROSS APPLY sys.dm_exec_sql_text(req.sql_handle) AS sqltext
    CROSS APPLY sys.dm_exec_query_plan(MG.plan_handle)

END

This is a very handy query that I recommend always carrying in your SQL toolbox. I usually have a “Maintenance” database that contains a host of queries I collected over the years that helped me diagnosing one thing or another.

I have created an XML version of this query that returns an HTML table, handy for sticking in an email alerts that support HTML.

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[usf_snapshot_QueryMemoryCPU_XML]
(
)
RETURNS NVARCHAR(MAX)
AS
BEGIN

    DECLARE @RESULT NVARCHAR(MAX)
    DECLARE @xmlHeaderdata XML

    SELECT @xmlHeaderdata =
    (
    SELECT
      (SELECT 'DateStamp' AS th FOR xml path(''), TYPE),
      (SELECT 'TEXT'   AS th FOR xml path(''), TYPE),
      (SELECT 'session_id'  AS th FOR xml path(''), TYPE),
      (SELECT 'status'  AS th FOR xml path(''), TYPE),
      (SELECT 'command'  AS th FOR xml path(''), TYPE),
      (SELECT 'cpu_time'  AS th FOR xml path(''), TYPE),
      (SELECT 'total_elapsed_time'  AS th FOR xml path(''), TYPE),
      (SELECT 'requested_memory_kb'  AS th FOR xml path(''), TYPE),
      (SELECT 'granted_memory_kb'  AS th FOR xml path(''), TYPE),
      (SELECT 'used_memory_kb'  AS th FOR xml path(''), TYPE),
      (SELECT 'wait_order'  AS th FOR xml path(''), TYPE),
      (SELECT 'Execution_count'  AS th FOR xml path(''), TYPE),
      (SELECT 'total_worker_time'  AS th FOR xml path(''), TYPE),
      (SELECT 'total_physical_reads'  AS th FOR xml path(''), TYPE),
      (SELECT 'total_logical_reads'  AS th FOR xml path(''), TYPE),
      (SELECT 'total_logical_writes'  AS th FOR xml path(''), TYPE),
      (SELECT 'total_elapsed_time'  AS th FOR xml path(''), TYPE),
      (SELECT 'avg_cpu_time'  AS th FOR xml path(''), TYPE)
    FOR xml path('tr'), TYPE
    )
    DECLARE @xmldata XML
    SELECT @xmldata =
    (      
    SELECT
      (SELECT GETUTCDATE()     AS 'td' FOR xml path(''), TYPE),
      (SELECT sqltext.TEXT           AS 'td' FOR xml path(''), TYPE),
      (SELECT req.session_id           AS 'td' FOR xml path(''), TYPE),
      (SELECT req.status           AS 'td' FOR xml path(''), TYPE),
      (SELECT req.command           AS 'td' FOR xml path(''), TYPE),
      (SELECT req.cpu_time           AS 'td' FOR xml path(''), TYPE),
      (SELECT req.total_elapsed_time           AS 'td' FOR xml path(''), TYPE),
      (SELECT requested_memory_kb           AS 'td' FOR xml path(''), TYPE),
      (SELECT granted_memory_kb           AS 'td' FOR xml path(''), TYPE),
      (SELECT used_memory_kb           AS 'td' FOR xml path(''), TYPE),
      (SELECT wait_order           AS 'td' FOR xml path(''), TYPE),
      (SELECT qs.Execution_count           AS 'td' FOR xml path(''), TYPE),
      (SELECT qs.total_worker_time           AS 'td' FOR xml path(''), TYPE),
      (SELECT qs.total_physical_reads           AS 'td' FOR xml path(''), TYPE),
      (SELECT qs.total_logical_reads           AS 'td' FOR xml path(''), TYPE),
      (SELECT qs.total_logical_writes           AS 'td' FOR xml path(''), TYPE),
      (SELECT qs.total_elapsed_time           AS 'td' FOR xml path(''), TYPE),
      (SELECT qs.total_worker_time/qs.execution_count           AS 'td' FOR xml path(''), TYPE)
      --(select query_plan as 'td' for xml path(''), type)
    FROM sys.dm_exec_requests req
        INNER JOIN sys.dm_exec_query_memory_grants MG ON req.session_id = MG.session_id
        LEFT JOIN sys.dm_exec_query_stats qs ON req.sql_handle = qs.sql_handle
        CROSS APPLY sys.dm_exec_sql_text(req.sql_handle) AS sqltext
        CROSS APPLY sys.dm_exec_query_plan(MG.plan_handle)
    WHERE req.session_id != @@spid
    FOR xml path('tr'), TYPE
    )
    SELECT @RESULT = '<table>'+CONVERT(NVARCHAR(MAX),@XmlHeaderdata) +  CONVERT(NVARCHAR(MAX),@xmldata) + '</table>'

    RETURN @RESULT

END

 

Setting Up a SQL Server CPU Resource Starvation Diagnostic Email Alert

Now we just need to put everything together.

This is simple and all we have to do is make sure that the condition for the alert is met, as in the CPU is being over-utilized and is causing bottlenecks according to the parameters we have setup, and an email alert is triggered carrying the results of our query above.

CREATE PROC [dbo].[usp_mon_alert_CPUStarvation]
    @SQLCPUThrehold_Percent INT = 75
    ,@LogHighCPUCount INT = 3
AS
BEGIN

SET NOCOUNT ON;

DECLARE @CurrentHighCPUCount INT = 0,
    @SQLCPUThrehold_Percent INT = 75,
    @LogHighCPUCount INT = 3
;WITH topCPU AS
(SELECT TOP (5)
CONVERT (VARCHAR(30), getdate(), 126) AS runtime,
        record.value('(Record/@id)[1]', 'int') AS record_id,
        record.value('(Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') AS system_idle_cpu,
        record.value('(Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS sql_cpu_utilization
--into #tempCPU
FROM sys.dm_os_sys_info inf CROSS JOIN (
SELECT TIMESTAMP, CONVERT (xml, record) AS record
FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = 'RING_BUFFER_SCHEDULER_MONITOR'
AND record LIKE '%<SystemHealth>%') AS t
ORDER BY record.value('(Record/@id)[1]', 'int') DESC
)
SELECT @CurrentHighCPUCount = COUNT(*) FROM topCPU WHERE sql_cpu_utilization >@SQLCPUThrehold_Percent

IF @CurrentHighCPUCount >= @LogHighCPUCount
BEGIN
   
    DECLARE @tableVar NVARCHAR(MAX)

    SELECT @tableVar = dbo.usf_snapshot_QueryMemoryCPU_XML()
    SET @tableVar = '<p>Attached is the currently running queries that could be contributing towards high CPU</p><br/>' + @tableVar
    EXEC msdb..sp_send_dbmail
    @recipients='Diagnostics@Company.com',
    @body_format = 'HTML',
    @subject='CPU Starvation on box',
    @body= @tableVar

END

END

The script above assumes that SQL Server Database Mail has been setup and configured with a default profile, otherwise the parameters for the need to be sp_send_dbmail needs to be configured.

And thats it! now you can schedule this using SQL Server Job Agent as often as you would like, and any time your server is suffering from CPU bottlenecks and starvation issue you should get an email alert with enough information to action the problem immediately.

I hope this helps you in troubleshooting your SQL Server instance CPU utilization issues. If you use a different technique to diagnose your CPU starvation issues please drop me a comment.

Diagnose SQL Server CPU Resource Starvation Issues with Email Alerts rated 5 out of 5 by 1 readers

Diagnose SQL Server CPU Resource Starvation Issues with Email Alerts , 5.0 out of 5 based on 1 ratings
* * * * * 1 vote
Tags: , , ,
2 replies
  1. Melanie says:

    Any chance you have the code for 2008R2?

    Reply

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>