Those of you running SSIS 2012 may not yet be pushing the limits of the RTM release, and while I can’t tell you where the limit boundary lies, I can give you advanced notice about a fairly big problem.
The entire SSISDB catalog is linked together via foreign keys, most of which are all linked to a single ancestor table – internal.operations. Almost all descendants of that table are set to cascade delete when removing data from internal.operations. Enter the SQL Agent job, “SSIS Server Maintenance Job.” This job by default is set to run at midnight daily, and uses two catalog parameters to function: “Clean Logs Periodically” and “Retention Period (days).” When these are set, the maintenance job purges any data outside of the noted retention period.
This maintenance job deletes, 10 records at a time in a loop, from internal.operations and then cascades into many tables downstream. In our case, we have around 3000 operations records to delete daily (10 at a time!) that translates into 1.6 million rows from internal.operation_messages. That’s just one downstream table! This entire process completely, utterly locks up the SSISDB database from any SELECT/INSERT data.
With that said, what does it mean to you? Potentially the cancellation of other SSIS packages that are running at the same time. For us, each and every time the maintenance job runs, packages get canceled. This is not good.
What can you do? For starters, move the maintenance job schedule to a more appropriate time for your environment. If you are like us and you have packages running 24×7, and there is no convenient time, then what? Not much really. At that point, you can decide not to run the maintenance job and hence accumulate records in SSISDB, or you can file a case with PSS (to add weight to the case I already have open).
Some details on the issue – I suspect based on some SQL Profiler tracing that the executable running the package is timing out waiting for its lock request to insert data into an SSISDB table. This timeout is not handled well, obviously, and needs to be reworked. The product team is aware of the issue. Please let me know if you’ve run into this problem and what you’re doing about it.