Truncate SSIS Catalog Database Operation Log Tables
The SSIS Catalog in SQL Server 2012 comes with the benefit of logging natively integrated within the service’s architecture, which is awesome considering before SSIS 2012 BI developers and admins had to go for 3rd party tools in order to bring that level of logging into their SSIS packages.
That being said, with a lot of log data comes the responsibility of maintaining and clearing out old log entries and execution messages, which are stored on the SSIS Catalog Database and drive the SSRS reports that are used to diagnose the execution of packages, as well as provide performance statistics. The currently provided SSIS Server Maintenance Job in SQL Server Job Agent that performs these routine maintenance operation can be very slow and extremely blocking, especially on an SSIS server that sees alot of operations and activities, and cannot afford the downtime caused by the heavy locking experienced by this operation.
In this post I provide an introduction to the [internal].[cleanup_server_retention_window] stored procedure executed by the SSIS Server Operation Records Maintenance step in the SSIS Server Maintenance Job, as well as provide an extremely fast truncation query to completely clear out the SSIS log tables. In a follow-up post I will be providing a rewrite of the [internal].[cleanup_server_retention_window] stored procedure that performs much better on SSIS services with a large number of logged operations per day.
The [internal].[cleanup_server_retention_window] stored procedure, which is the main procedure in charge of clearing out SSIS operations as per the configured logging policy, has a simple structure but does a lot in the background, so lets start by dissecting this procedure.
The first stage of the procedure grabs the logging configurations setup for the SSIS Catalog DB.
FROM [catalog].[catalog_properties]
WHERE property_name = 'OPERATION_CLEANUP_ENABLED'
SELECT @retention_window_length = CONVERT(INT,property_value)
FROM [catalog].[catalog_properties]
WHERE property_name = 'RETENTION_WINDOW'
In the second stage, the procedure logs a new clean-up operations, the id returned by this INSERT could be used to track the status of the clean-up operation.
[operation_type],
[created_time],
[object_type],
[object_id],
[object_name],
[STATUS],
[start_time],
[caller_sid],
[caller_name]
)
VALUES (
2,
SYSDATETIMEOFFSET(),
NULL,
NULL,
NULL,
1,
SYSDATETIMEOFFSET(),
@caller_sid,
@caller_name
)
SET @operation_id = SCOPE_IDENTITY()
Then the procedure recursively (in increments of 10 by default) deletes operations from the [internal].[operations] table. The important thing to note here is that the SSIS Catalog foreign key constraints is setup to CASCADE ON DELETE, hence each row deleted in the [internal].[operations] table propagates to 14 other tables that can go as far as 3 levels deep. This is clearly the bottleneck where the procedure will spend most of its time.
BEGIN
DELETE TOP (@delete_batch_size)
FROM [internal].[operations]
OUTPUT DELETED.operation_id, DELETED.operation_type INTO @deleted_ops
WHERE ( [end_time] <= @temp_date
OR ([end_time] IS NULL AND [STATUS] = 1 AND [created_time] <= @temp_date ))
SET @rows_affected = @@ROWCOUNT
END
In the last stage the procedure drops the SYMMETRIC KEY issued with each new operation of type 200 executed by the SSIS Service, although this is done on an operation by operation basis, it does not actually take that much time to go through the number of operations produced by a heavy utilization SSIS Catalog DB.
BEGIN
SET @key_name = 'MS_Enckey_Exec_'+CONVERT(VARCHAR,@execution_id)
SET @certificate_name = 'MS_Cert_Exec_'+CONVERT(VARCHAR,@execution_id)
SET @sqlString = 'IF EXISTS (SELECT name FROM sys.symmetric_keys WHERE name = ''' + @key_name +''') '
+'DROP SYMMETRIC KEY '+ @key_name
EXECUTE sp_executesql @sqlString
SET @sqlString = 'IF EXISTS (select name from sys.certificates WHERE name = ''' + @certificate_name +''') '
+'DROP CERTIFICATE '+ @certificate_name
EXECUTE sp_executesql @sqlString
FETCH NEXT FROM execution_cursor INTO @execution_id
END
CLOSE execution_cursor
DEALLOCATE execution_cursor
SSIS Catalog Database Log Tables Schema
Now in order to fully understand were the clean-up operation spends most of its time, and in order to identify the data heavyweight tables, we need to analyse the structure of the tables related to the [internal].[operations] table, and understand how the deletion is being cascaded.
All tables are related via an operation_id column, although this column could be called different names in different tables, for example in the [internal].[executions] table the column is called execution_id. The diagram below (along with the SQL text) should highlight the schema generated by following all tables related to [internal].[operations] via a foreign key reference that cascades on deletion.
-- [internal].[executions]
-- [internal].[executable_statistics]
-- [internal].[execution_component_phases]
-- [internal].[execution_data_statistics]
-- [internal].[execution_data_taps]
-- [internal].[execution_parameter_values]
-- [internal].[execution_property_override_values]
-- [internal].[extended_operation_info]
-- [internal].[operation_messages] --(DATA HEAVY)
-- [internal].[event_messages] --(DATA HEAVY)
-- [internal].[event_message_context]
-- [internal].[operation_os_sys_info]
-- [internal].[operation_permissions]
-- [internal].[validations]
SSIS Catalog Log Table Truncation Procedure
The current SSIS maintenance procedure requires a lot resources to run, this is particularly obvious when you have a heavy utilization SSIS service that generates a large number of operations per day, the main cause for this performance issue is the cascading nature of the delete operation, which can take a very long time and causes a large number of non-uniform IO access to disk, this type of access not only slows down the delete operation, but can be damaging to disk availability across the server (if disks are physically attached to servers, rather than through a SAN layer for example).
Not running the SSIS maintenance operation is usually not an option, since the performance of the SSIS reporting will increasingly be getting worse, I personally had some reports take longer than 5 minutes to generate.
For a quick-and-dirty approach to truncating the SSIS log tables, and starting all logs from scratch, you can use the [internal].[cleanup_server_retention_window_truncate] scripts that I created (below), which essentially follows the foreign key constraint tables, dropping all relationships, truncating all involved tables, then re-creating all relationships again, then dropping all SYMMETRIC KEYS as per the original SSIS maintenance procedure. This is very much a brute force approach and it will mean that you will loose all your log data, but can be a very handy quick solution if you’re jammed with a massive log that a normal clean-up operation is unable to handle in a reasonable amount of time.
This script is obviously provided as is, and you should test it on your system thoroughly before releasing it. I had no issues with it (running on SSIS 2012 Ent. Edt.), and all SSIS packages continued to execute normally.
--WITH EXECUTE AS 'AllSchemaOwner'
AS
----- STRUCTURE OF TABLES:
--[internal].[operations]
-- [internal].[executions]
-- [internal].[executable_statistics]
-- [internal].[execution_component_phases]
-- [internal].[execution_data_statistics]
-- [internal].[execution_data_taps]
-- [internal].[execution_parameter_values]
-- [internal].[execution_property_override_values]
-- [internal].[extended_operation_info]
-- [internal].[operation_messages] --(DATA HEAVY)
-- [internal].[event_messages] --(DATA HEAVY)
-- [internal].[event_message_context]
-- [internal].[operation_os_sys_info]
-- [internal].[operation_permissions]
-- [internal].[validations]
SET NOCOUNT ON
DECLARE @enable_clean_operation bit
DECLARE @retention_window_length INT
DECLARE @caller_name nvarchar(256)
DECLARE @caller_sid varbinary(85)
DECLARE @operation_id BIGINT
EXECUTE AS CALLER
SET @caller_name = SUSER_NAME()
SET @caller_sid = SUSER_SID()
REVERT
BEGIN TRY
SELECT @enable_clean_operation = CONVERT(bit, property_value)
FROM [catalog].[catalog_properties]
WHERE property_name = 'OPERATION_CLEANUP_ENABLED'
IF @enable_clean_operation = 1
BEGIN
SELECT @retention_window_length = CONVERT(INT,property_value)
FROM [catalog].[catalog_properties]
WHERE property_name = 'RETENTION_WINDOW'
IF @retention_window_length <= 0
BEGIN
RAISERROR(27163 ,16,1,'RETENTION_WINDOW')
END
INSERT INTO [internal].[operations] (
[operation_type],
[created_time],
[object_type],
[object_id],
[object_name],
[STATUS],
[start_time],
[caller_sid],
[caller_name]
)
VALUES (
2,
SYSDATETIMEOFFSET(),
NULL,
NULL,
NULL,
1,
SYSDATETIMEOFFSET(),
@caller_sid,
@caller_name
)
SET @operation_id = SCOPE_IDENTITY()
-- Remove all [internal].[executions] dependancies
TRUNCATE TABLE [internal].[executable_statistics]
TRUNCATE TABLE [internal].[execution_component_phases]
TRUNCATE TABLE [internal].[execution_data_statistics]
TRUNCATE TABLE [internal].[execution_data_taps]
TRUNCATE TABLE [internal].[execution_parameter_values]
TRUNCATE TABLE [internal].[execution_property_override_values]
-- Remove all [internal].[event_message_context] dependancies
TRUNCATE TABLE [internal].[event_message_context]
-- Remove all non-dependant tables
TRUNCATE TABLE [internal].[operation_os_sys_info]
TRUNCATE TABLE [internal].[operation_permissions]
TRUNCATE TABLE [internal].[validations]
TRUNCATE TABLE [internal].[extended_operation_info]
-- Deal with [internal].[event_messages] and [internal].[operation_messages]
ALTER TABLE [internal].[event_message_context] DROP CONSTRAINT [FK_EventMessageContext_EventMessageId_EventMessages]
TRUNCATE TABLE internal.event_messages
ALTER TABLE [internal].[event_message_context] WITH CHECK ADD CONSTRAINT [FK_EventMessageContext_EventMessageId_EventMessages] FOREIGN KEY([event_message_id])
REFERENCES [internal].[event_messages] ([event_message_id])
ON DELETE CASCADE
ALTER TABLE [internal].[event_messages] DROP CONSTRAINT [FK_EventMessages_OperationMessageId_OperationMessage]
TRUNCATE TABLE [internal].[operation_messages]
ALTER TABLE [internal].[event_messages] WITH CHECK ADD CONSTRAINT [FK_EventMessages_OperationMessageId_OperationMessage] FOREIGN KEY([event_message_id])
REFERENCES [internal].[operation_messages] ([operation_message_id])
ON DELETE CASCADE
-- Deal with [internal].[executions]
ALTER TABLE [internal].[executable_statistics] DROP CONSTRAINT [FK_ExecutableStatistics_ExecutionId_Executions]
ALTER TABLE [internal].[execution_component_phases] DROP CONSTRAINT [FK_ExecCompPhases_ExecutionId_Executions]
ALTER TABLE [internal].[execution_data_statistics] DROP CONSTRAINT [FK_ExecDataStat_ExecutionId_Executions]
ALTER TABLE [internal].[execution_data_taps] DROP CONSTRAINT [FK_ExecDataTaps_ExecutionId_Executions]
ALTER TABLE [internal].[execution_parameter_values] DROP CONSTRAINT [FK_ExecutionParameterValue_ExecutionId_Executions]
ALTER TABLE [internal].[execution_property_override_values] DROP CONSTRAINT [FK_ExecutionPropertyOverrideValue_ExecutionId_Executions]
TRUNCATE TABLE [internal].[executions]
ALTER TABLE [internal].[execution_property_override_values] WITH CHECK ADD CONSTRAINT [FK_ExecutionPropertyOverrideValue_ExecutionId_Executions] FOREIGN KEY([execution_id])
REFERENCES [internal].[executions] ([execution_id])
ON DELETE CASCADE
ALTER TABLE [internal].[execution_parameter_values] WITH CHECK ADD CONSTRAINT [FK_ExecutionParameterValue_ExecutionId_Executions] FOREIGN KEY([execution_id])
REFERENCES [internal].[executions] ([execution_id])
ON DELETE CASCADE
ALTER TABLE [internal].[execution_data_taps] WITH CHECK ADD CONSTRAINT [FK_ExecDataTaps_ExecutionId_Executions] FOREIGN KEY([execution_id])
REFERENCES [internal].[executions] ([execution_id])
ON DELETE CASCADE
ALTER TABLE [internal].[execution_data_statistics] WITH CHECK ADD CONSTRAINT [FK_ExecDataStat_ExecutionId_Executions] FOREIGN KEY([execution_id])
REFERENCES [internal].[executions] ([execution_id])
ON DELETE CASCADE
ALTER TABLE [internal].[execution_component_phases] WITH CHECK ADD CONSTRAINT [FK_ExecCompPhases_ExecutionId_Executions] FOREIGN KEY([execution_id])
REFERENCES [internal].[executions] ([execution_id])
ON DELETE CASCADE
ALTER TABLE [internal].[executable_statistics] WITH CHECK ADD CONSTRAINT [FK_ExecutableStatistics_ExecutionId_Executions] FOREIGN KEY([execution_id])
REFERENCES [internal].[executions] ([execution_id])
ON DELETE CASCADE
-- Deal with [internal].[operations]
DECLARE @deleted_ops TABLE(operation_id BIGINT, operation_type SMALLINT)
DELETE --TOP (@delete_batch_size)
FROM [internal].[operations]
OUTPUT DELETED.operation_id, DELETED.operation_type INTO @deleted_ops
WHERE operation_id != @operation_id
DECLARE @execution_id BIGINT
DECLARE @sqlString nvarchar(1024)
DECLARE @key_name [internal].[adt_name]
DECLARE @certificate_name [internal].[adt_name]
DECLARE execution_cursor CURSOR LOCAL FOR
SELECT operation_id FROM @deleted_ops
WHERE operation_type = 200
OPEN execution_cursor
FETCH NEXT FROM execution_cursor INTO @execution_id
WHILE @@FETCH_STATUS = 0
BEGIN
SET @key_name = 'MS_Enckey_Exec_'+CONVERT(VARCHAR,@execution_id)
SET @certificate_name = 'MS_Cert_Exec_'+CONVERT(VARCHAR,@execution_id)
SET @sqlString = 'IF EXISTS (SELECT name FROM sys.symmetric_keys WHERE name = ''' + @key_name +''') '
+'DROP SYMMETRIC KEY '+ @key_name
EXECUTE sp_executesql @sqlString
SET @sqlString = 'IF EXISTS (select name from sys.certificates WHERE name = ''' + @certificate_name +''') '
+'DROP CERTIFICATE '+ @certificate_name
EXECUTE sp_executesql @sqlString
FETCH NEXT FROM execution_cursor INTO @execution_id
END
CLOSE execution_cursor
DEALLOCATE execution_cursor
END
END TRY
BEGIN CATCH
IF (CURSOR_STATUS('local', 'execution_cursor') = 1
OR CURSOR_STATUS('local', 'execution_cursor') = 0)
BEGIN
CLOSE execution_cursor
DEALLOCATE execution_cursor
END
UPDATE [internal].[operations]
SET [STATUS] = 4,
[end_time] = SYSDATETIMEOFFSET()
WHERE [operation_id] = @operation_id;
THROW
END CATCH
RETURN 0
Truncate SSIS Catalog Database Operation Log Tables rated 4 out of 5 by 1 readers
Great article,
It helped us since the provided maintenance plan makes no difference.
Thank you !
Great article, i had the same issue and was thinking to rewrite the sp in a similiar way.
You saved my time !!!
Well once you truncate i would also suggest to get below indexes created , it will improve the purge performance in subsequent runs.
USE [SSISDB]
GO
CREATE NONCLUSTERED INDEX [NCIX_operation_id]
ON [internal].[event_messages] ([operation_id]);
GO
CREATE NONCLUSTERED INDEX [NCIX_operation_id]
ON [internal].[operation_messages] ([operation_id]);
Glad it helped you out Ananda.
Thanks for the awesome index suggestions too!
L
Hi,
Thanks for the article, its really great.
It is stated that in the follow up post a new re-write of the sp will be published, if its already published can you share me the code please.
Hi CHN,
Thanks!
I actually had a procedure written up for this, although I have recently changed jobs, and forgot take the procedure code with me, so unfortunately I won’t be posting a rewrite for a while, until the situation happens again.
I do hear though that with SQL Server 2012 CU-7, there has been some massive improvement to the performance of the SSIS maintenance operation, through adding a collection of indexes that support the procedure’s operations.
You can always check out Phil Brammer’s post on SSIS Catalog Indexing Recommendations, in which he provides some helpful indexes to improve the performance of this operation.
Cheers,
This should be included in the next SQL server hotfix; had ran into a brickwall with the SSISDB performance and explosive growth. Kudos.
Why do you drop the symmetric key and certificate in the sproc?
I think I was trying to replicate the behavior of the SSIS catalog maintenance job, like for like. Also, not sure exactly what are the implication of continuously adding a large number of these types of objects, without maintenance.
Why, is it causing trouble?
Yeah sorry, I noticed after I commented that you were just undoing what the SSIS maintenance sproc did. I ended up having to kill your sproc in my production environment because the part where it deletes the certs and keys was going to take hours. I tried doing the deletes on my own but it still was taking a lifetime and causing blocking somehow. I’m just going to leave the data there and hope it never causes a problem. Thanks!
Yeah this is definitely the bottleneck in this version… only if there was a way to bulk delete objects! :).
I would maybe try to schedule the deletion of those objects around whenever no other work is being done (for example when there is no job agent jobs running, or CPU is idle, or no ssis package being executed). That way you delete it bit by bit. Probably keeping them there will clog up something… eventually…. surely!
Would love to hear if you end up figuring out a better way around it. Cheers.
I wrote a script to replace the cleanup window script a while back… just got around to sharing it.
I can run it while SSIS packages run without the cascading locks that the original one created as well.
http://cryptoknight.org/index.php?post/2015/11/03/SSIS-Maintenance-Script
Thats a nice script, thanks for sharing, although I think its incomplete, it doesn’t delete the certificate keys, which means you’ll end up with loads of stray certificates.
Also, keep in mind that this is not faster at all than the truncate approach, but causes no locking (so could potentially be run while SSIS packages are still executing).
Hi,
I am new to this dba world..i have a doubt that by running the above given script will the entire database becomes empty or the unwanted logs are removed
Hey Haritha,
I’ve written this a while ago, so am not sure how relevant/needed this is today, but at the time of writing, this would truncate log tables only, and not the entire catalog db, which contains execution configuration that this script should leave untouched.
hi,
I have the same issue with my SSIS. when i try to execute your sp, i’ve got some errors like this,
Msg 3728, Level 16, State 1, Procedure cleanup_server_retention_window_truncateall, Line 102
‘FK_EventMessageContext_EventMessageId_EventMessages’ is not a constraint.
Msg 3727, Level 16, State 0, Procedure cleanup_server_retention_window_truncateall, Line 102
Could not drop constraint. See previous errors.
i’m new btw in this job, hopefully you could help me.. am I miss something?
Hi,
Some foreign keys are missing. You must create that foreign key. You can check the sp, and copy&run missing foreign keys.
e.g.:
ALTER TABLE [internal].[event_message_context] WITH CHECK ADD CONSTRAINT [FK_EventMessageContext_EventMessageId_EventMessages] FOREIGN KEY([event_message_id])
REFERENCES [internal].[event_messages] ([event_message_id])
ON DELETE CASCADE
Could anyone help me with this error, I tried to drop the foreign key constraint and add it again after truncating the table
The ALTER TABLE statement conflicted with the FOREIGN KEY constraint “FK_EventMessageContext_EventMessageId_EventMessages”. The conflict occurred in database “SSISDB”, table “internal.event_messages”, column ‘event_message_id’.
Help me guys, Could anyone help me with this error, I tried to drop the foreign key constraint and add it again after truncating the table
The ALTER TABLE statement conflicted with the FOREIGN KEY constraint “FK_EventMessageContext_EventMessageId_EventMessages”. The conflict occurred in database “SSISDB”, table “internal.event_messages”, column ‘event_message_id’.
Hi,
I am new to the SQL server world. Currently using 2014 sql server. SSISDB is growing a lot.
There are few tables like
internal.event_messages
internal.opertaion_messages
internal.event_message_context
Can i truncate them directly.
As in truncate table internal.event_messages
please suggest,
U should just use this script to truncate them cleanly. If you do a straight up truncate on these tables there will be some objects that will stick around and bloat your db.
Hi,
The job is getting failed some times due to below error
Executed as user: ##MS_SSISServerCleanupJobLogin##. The DELETE statement conflicted with the REFERENCE constraint “FK_EventMessage_Operations”. The conflict occurred in database “SSISDB”, table “internal.event_messages”, column ‘operation_id’. (Error 547)
We are able resolve the issue temporarily by deleting data manually but it is failing frequently after few runs. Could you please suggest permanent solution to us.
Thanks & Regards
Raju
Hi –
I ran this script in our dev environment – it was successful, after that I ran ssis packages as well to test. However in Production this was taking long time so I stopped and ran it one by one, the operation_permission table was not getting deleted and it was taking a long time. any clues?
Thanks for the script. It is very helpful. Do i need to add extra step for this stored procedure in cleaning job? thanks again it was helpful