I’ve been playing with data taps in SSIS 2012 lately, and they are definitely one of my favorite new features of SSIS in the 2012 release. Data taps are very similar conceptually to the Data Viewer when you’re developing an SSIS package, but the problem they’ve always had is you can’t use Data Viewers in a production package, and since they’re only around for the duration of the package run, as soon as it completes they’re gone. Previously, if you’ve had a data related issue in production you’ve been left to pull the package down, run it again, and try and debug it. And if it works in development but not in production….(begin cursing session here). But fear not, this issue can be fixed in 2012. 🙂
To use a data tap, you have to run an SSIS package via a sql script. (Another new feature which is AWESOME, btw. :)) Navigate to Integration Services Catalogs->Your Folder –> Projects->Project->Packages->Your Package Name. Then right click and click Execute.
Once you click Execute, click on the script button to generate it as a sql script.
And you’ll see a script that gets automatically created to run the package:
This is what you’ll add your data tap call to. In this release, there is no way to add a data tap from the GUI, so you’ll have to call the package in a job through this sql script. The package you want to call is add_data_tap, and is housed in the SSISDB database in stored procedures. There are 4 parameters that you are required to pass, and one optional. The required ones you’ll need are execution_id, task_package_path, dataflow_path_id_string, and data_filename. The first 3 are inputs, and the last, data_filename one, is an output parameter. Create a line between before the last 2 exec’s and call the proc, then pass in your required parameters.
Execution_id will be equal to the current execution_id of the package, so that one’s easy. It’s just @execution_id=@execution_id. The task_package_path one and the @dataflow_path_id_string are a little bit trickier, but if you know SSIS internals a little bit it’s pretty straightforward as well. Component level items on the control flow always start with \Package, as that tells SSIS to resolve to the control flow level (thusly, telling the engine it’s at the Package level). In the above example, the name of my data flow is “Dim Events DF”. An alternative to get to that property if you have the package open in Sql Server Data Tools is to click on the data flow at the control flow level and go to properties, and under the identification section find the package path property.
To get the dataflow_path_id_string property, you’re now inside of the data flow and following a path. What Path you’re following can be determined by your data flow level component, input/output line. Again, you can type this out if you know the design of your package but I think this one’s a little easier to just grab from the properties. Inside of the data flow, click on the line you want to see and go to properties, then under “Common Properties” find the IdentificationString property and copy it, then paste it after your dataflow_path_id_string variable.
The last variable, @data_filename, is super easy. Just type in the name you want to the file to be and make sure to add your .txt at the end, and that’s it. It is possible to have multiple data taps on the same package, so if you’re processing a dimension, for example, you could have one data tap to see what’s changing, and another one to see your new writes every day, if you wanted.
So your final sql script will look like the below:
Then you can just paste this code, into a job step if you need, and voila! You have a data level auditing trail against your package. Every time it runs it will overwrite the existing file. If you want to do a daily archive, you would need to modify the data_filename output variable to take a datepart string and add the date, time, whatever to the end of the file name before you append the .txt.
Now you just need to be able to access your file you just created. All data taps are kept in the same location in the Program Files directory, or wherever you specified your SQL install at. The full location by default, for example, would be:
C:\Program Files\Microsoft SQL Server\110\DTS\DataDumps
With all this coolness, one thing I can begin to see happening (which honestly should have been in place for a long time but is impossible to enforce), is consistent naming amongst developers for data flows and package components. Which is awesome, because others will be able to see what components do without having to have been the one who created the package.