SSIS 2012 “Cancelled” Status Issues
If you have used Microsoft development products long enough, then you must be familiar with the error reporting nightmares some of their products exhibit, the classic “please check the error log for more details” message, even though you are in the error log, is not just frustrating, but in my opinion is a bug that shouldn’t even reach end consumer, especially when you are handing the developers a framework for them to build on and extend.
Microsoft’s SSIS and SSRS have always been shortchanged when it comes to error logging; whether they are run-time errors on the server, or even while trying to debug a package or a report in a development environment, there is no lack of confusing error messages that send you down hours of so many long winded paths that leaves you tugging on strings hoping something might eventually make sense.
Today I will go through one of SSIS 2012 new quirks, the SSIS Canceled Status (or error), and what could be the reasons that you might receive such an error.
I have dealt with SSIS 2005 and 2008 for years now, and started to get a really good handle on what an error is trying to say, rather than what it is saying. To an outsider it might look like some Jedi mind tricks, or that perhaps I am using some sort of telepathy SSIS component, but in reality I just spend shit loads of time decoding the hidden meaning behind a ridiculous error message trying to find the actual root cause of a box turning red instead of green.
But SSIS 2012 is a new beast, and with that comes a host of new and exciting quirks and awesomely encrypted error messages that is surely bound to turn hairs white and hearts into stones. The SSIS 2012 Canceled Status (which could be seen in a Package’s Execution Messages report), is a reliable start into this hideous world of SSIS darkness and disrepair… I might be being abit over-dramatic here.
What is an SSIS Canceled Status?
The Canceled status in SSIS 2012 is an indication that the package has stopped execution, this stoppage is due to either a user or a system(such as SQL Server) forcing the termination of the SSIS package before it finished executing.
The thing to focus on here is that a package could be Canceled either by a user or by a system, the Canceled status does not distinguish between either, and does not log any error message indicating the cause of the cancellation, leaving the user to investigate why a cancellation of execution happened on their merry own.
This is bad, especially when there are many cooks managing different SSIS packages, being able to tell why a package has been Canceled, and whether this has been done by a user or due to a system or environmental error is of paramount importance.
It is important to note that SSIS Execution Statuses are statuses about the execution of the steps within an SSIS packages, and not the execution status of an SSIS package within an environment, in fact, a package execution context has no idea about the environment it is executing under, and if anything environmental (such as resource contention on the box, or Deadlocks in the SSIS Catalog DB) causes SSIS execution to be interrupted, SSIS’s Execution Status will be set Canceled, rather than to Failure, and no Error Message will be attributed to this (since Status Canceled has no error messages), leaving users to wonder why did their SSIS package get canceled.
Reasons for SSIS Canceled Status
Before we delve into what could cause an SSIS package to go to a Canceled Execution Result DataCode, there are a few logs you should check out as soon as you get this status which could help you identify the root cause:
- SSIS 2012 catalog.operation_messages View: If you have the SSIS package operational number, this log (within the SSIS Catalog DB) might contain relevant information regarding why a Canceled Execution Result DataCode might have been returned.
- Event Viewer Application Windows Log: An SSIS package service error should appear in the Windows Event Viewer logs, this is a very good place to look for any high level errors about the SSIS execution environment.
- SQL Server Log: Since SSIS is much more closely coupled with SQL Server, any problems the SQL Server instance hosting the SSIS Catalog DB is experiencing will result in execution issues with SSIS packages, resulting in Execution Result Status Canceled.
- SQL Server Profiler (if running): Again, this will help you identify and closely inspect any issues with the SQL Server instance on which the SSIS Catalog DB resides.
As mentioned before, SSIS Canceled Execution Result Status could show up for many reasons, but mostly in relation to anything outside the execution context of a package, so there is no need to try and debug the package itself, you need to debug the environment in which the package is running, and so issues similar to the following could be causing this DataCode Execution Result:
- Resource Contention on SSIS Server: You could have either allocated lower resources to SQL Server, which is causing timeouts on activities in the SSIS Catalog DB, or the SSIS service process itself does not have enough resource to execute all the currently scheduled/triggered packages. You need to allocate resources carefully when deploying an SSIS environment, this includes ensuring that both SQL Server and SSIS service process have enough resources to perform the required operations.
- SSIS Catalog Database Deadlocks or Timeouts: This is fairly common, either Deadlocks between different sessions working on the SSIS Catalog DB, or Timeouts trying to write to log tables. Difficult to say what to do here, I switched the “Is Read Committed Snapshot On” flag to TRUE in the SSIS Catalog Database Properties, and that seemed to make things run in a slightly less “blocking” manner, but sometimes you might just have to bite the bullet and scale-out your SSIS environment across 2 or more SQL Server instances (which is what am having to resort to).
- SSIS Catalog Database Log Expansion Timeout: I’ve noticed sometimes that SSIS package can go awry and deviate from its user-given path; basically SQL Server Job Agent can report an SSIS package as “Canceled” but in reality the package is still executing in the background, and instead of this package actually executing tasks, it just holds at a particular step forever while keeping its “Running” status, this causes the Log file to grow massively, and as it is growing, it becomes slower and more sluggish, eventually causing Timeouts whenever it tries to grow again. To resolve this issue you need to ensure that you set limits on package execution timeouts (i.e. no package should take longer than a day to execute, if it does it should be killed), you can also regularly check the SSIS Catalog “Active Operations” menu (or the SSIS Catalog DB views) for any packages that shouldn’t be running, or has been running for a long time.
- SSIS Server Maintenance Job Locking: This job is evil, it takes too long to executes and holds up the entire catalog while doing so, if like me you have a requirement for 24/7 SSIS package execution, this is no good. By default this is scheduled to run every day at 00:00 as a SQL Server Agent Job, the schedule could be edit to better suite your business requirements, but make sure nothing is executing while the maintenance job is doing its thing.
These are just some ideas to get you started, as I mentioned this Canceled status could be anything outside the failure of SSIS execution steps, so dig deep and hard in one of the 10s of server logs and you’re bound to find something… start tugging on those strings
Microsoft Connect ticket regarding Improved Handling of SSIS 2012 Canceled Execution Result
More on SSIS 2012 Execution Result Status (DataCode) in MSDN
Update from Microsoft (20120807): According to Microsoft’s reply to the Connect ticket above, this issue will be resolved in SQL Server 2012 SP1, and the Cancelled SSIS status will hopefully become much more descriptive in terms of error reporting.
Thanks for this. What a lifesaver! After having our SSIS 2012 server in production for a year, we suddenly started seeing mass cancellations from midnight to 3 AM every night. Turns out that nasty SSIS Maintenance Job is configured to retain 365 days of data by default. So while it completed in a few seconds for the first year, it suddenly started to take three hours without warning, canceling everything in its path and leaving no useful error messages behind. What a mess! And incredibly inefficient just for maintaining ~100 rows!
We changed that job’s schedule and gave it its own time slot with no other competing packages, and all is well again.
Awesome! am really glad this helped you. The SSIS catalog maintenance operation is definitely not thought out properly for a high usage environment.
I also recommend reading this nice article by Phil Brammer on how to optimize the SSIS catalog , which also speeds up the overall time it takes to complete the SSIS maintenance operation.
Awesome, yeah those indexes were definitely needed too. Thanks again!