Recently I had the need to create a couple of custom tasks for SQL Server Integration Services (SSIS) at work and I was shocked at how few the resources were for doing this. There are a lot of blogs and articles about creating SSIS custom tasks, but most of them are outdated or wrong. Even Microsoft’s own documentation has clearly been promoted from older versions and is no longer accurate. This leads to a lot of wasted time and scratching of your head trying to figure out what is wrong.
In this series of articles I’m going to demonstrate how to create a custom task for SSIS 2014. This information is also applicable to SSIS 2012 but doesn’t necessarily work with earlier or later versions of SSIS. In this first article we will set up a simple SSIS package that we can use for the remainder of the series.
SQL and SSDT Versioning
In order to follow along you’ll need a copy of SQL 2014 with SSIS and SSRS installed. If you have SQL 2012 then these steps should work but you’ll need to adjust the SQL versioning (discussed later). SQL 2016 or later may work as well. To avoid confusing things even more, let’s talk about SQL versioning now. SQL versions do not line up with product versions. As of today here’s the versioning of SQL to be aware of.
- SQL 2008 R2 = 10.0
- SQL 2012 = 11.0
- SQL 2014 = 12.0
It is critically important that you remember to use the correct numeric version for your target version of SSIS. SSIS requires that you use the exact version for SSIS otherwise things won’t work correctly. For example, if you are targeting SSIS 2012 then you must use 11.0 of SQL assemblies. If you attempt to use newer or older versions then it may compile but it will not work properly.
You’ll also need a copy of SSDT in order to build SSIS packages. If you have Visual Studio installed then you already have SSDT most likely. But you must use the version of SSDT that is tied to your version of SSIS. Each SSDT version indicates what version of SSIS it supports (and each one only supports 1 version). Here’s the current mapping.
Since we will be targeting SSIS 2014 for this article then you need SSDT-BI for Visual Studio 2013. If you do not already have VS 2013 installed then it will install the shell. VS2015 and the version of SSDT that ships with it DOES NOT work with SSIS 2012.
Finally, for this article we’ll be using the AdventureWorks database so you’ll want to ensure you have a copy of that installed as well. Additionally go ahead and install the SSRS reports for AdventureWorks. Currently there is not a SQL2014 version so install SQL2012 instead. When prompted for the installation path be sure to use 120 instead of 100 as the SQL version. You’ll need to upload each report into SSRS (I recommend placing them in a separate folder), rename them to remove the database name reference and adjust the data source to use the correct server information.
What is SSIS?
If you are not sure what SSIS then refer to MSDN for a full description but ultimately it is used to perform ETLs of data into or out of SQL Server. Whenever you need to do bulk processing of data and ultimately that data is going into or coming from SQL then SSIS is a useful tool. SSIS is based upon workflow processing of sets of data and ships as part of SQL Server. An SSIS package is the equivalent of a program in .NET. A package can be scheduled to run at certain times or run manually as needed.
Out of the box SSIS comes with lots of tasks that you can use to build an ETL process. Some example tasks include running data queries against SQL and other data sources, transforming data into new formats and saving data to various data sources. Ultimately a task consists of inputs and outputs which you set to control the process to determine where and what data to store. Tasks are the building blocks of the ETL process.
There are other components in SSIS as well. Connection managers are responsible for managing connections to external resources including databases, the file system and web services. Anything outside of SSIS that requires a connection of some sort (URL, connection string, etc.) is managed through a connection manager. Connection managers are configured at the package level so all tasks within a package have access to the same connections. A connection manager isn’t an actual connection to a remote resource but merely the information needed to make the connection. The connection is created at runtime, on demand. The benefit of connection managers is that the connection details can be changed (such as at deployment) without breaking or requiring any code to be changed.
In the cases where SSIS does not have a task that accomplishes what you want there is a free-form script task that allows you to write .NET code. The .NET code runs in the SSIS sandbox and has access to the same information that standard tasks do. Oftentimes we’ll write scripts to do custom work that SSIS doesn’t support directly.
SSIS Variables
Before we dive into creating an SSIS package we should talk about variables. SSIS variables work similar to programming language variables in the sense that they have a name, type, value and scope. But the way SSIS uses variables can result in problems if you are not careful.
The first thing that is important is the variable’s scope. By default variables are scoped at the package level which means all tasks can access them. Variables can also be scoped to any container and some tasks. A package can have multiple tasks running in parallel if needed so sharing variables across tasks needs to be done carefully. In general I recommend that you scope a variable to the container or task that you need it in. If you need to change a variable’s scope after creation then use the Variables
window and click the button to move the variable.
The next thing to know about variables is that they should be locked for reading and writing. Because tasks can be running in parallel SSIS needs to know when you are reading/writing variables. You do this using the VariableDispenser. For most tasks you do not need to worry about this process but for script tasks you have to specify which variables you want to read and write as we’ll see later.
Lastly, variables are case sensitive so the namespace and variable name must match the variable declaration or you will get errors.
Creating an SSIS Package
Let’s set up a basic SSIS package.
- Start SSDT-BI for Visual Studio 2013.
- On the Start Page click New Project and then Templates\Business Intelligence\Integration Services Project.
- Name the project whatever you want and click OK.
The solution contains a single package and is ready for you to add some work. For this article we’ll query for all sales that were over $100K.
- Add an Execute SQL Task to the designer called ‘Get orders over $100K’.
-
On the
General
tab- Set
ConnectionType
to ADO.NET and click theConnection
property and select New Connection to create a new connection. - Configure the connection.
- Set
ResultSet
to Full result set to pull back all records. -
For
SQLStatement
use the following.SELECT TOP 100 SalesOrderId, SUM(LineTotal) FROM [Sales].[SalesOrderDetail] GROUP BY SalesOrderId HAVING SUM(LIneTotal) > 100000
- Set
-
On the
Result Set
tab- Set the
Result Name
column to 0. This is required for ADO.NET results. - Assign the results to a new variable of type
Object
(i.e. SalesOrder).
- Set the
For each order we’ll generate the Sales_Order_Detail
report from SSRS so we need to enumerate the rows using the foreach enumerator. Currently we have a single variable representing the entire rowset. The foreach enumerator will enumerate each row. To access the columns of the row we’ll define new variables scoped to the foreach container.
- Add a
Foreach Loop Container
task below the SQL task. - Connect the tasks using the green arrow so SSIS knows to execute the loop after the query returns.
-
On the
Collection
tab- Select
Foreach ADO Enumerator
(the enumerator must match the type of the data coming in). - In the
ADO object source variable
dropdown, select the variable that was created earlier. - Set
Enumeration mode
to be the rows in the first table.
- Select
-
On the
Variable Mappings
tab- Click in the
Variables
column to add a new variable calledSalesOrderId
. - Set the scope to the foreach container so we don’t collide with other variables elsewhere.
- Set the type to
Int32
. It is important the type matches the column type. - Click OK to create the variable.
- In the
Index
column set the index to zero (indice are zero-based).
- Click in the
Inside the foreach container we need to generate the SSRS report but SSIS does not have a task for that. For now we’ll add a script task.
- Add a
Script
task inside the foreach container. -
On the
Script
tab- Click on the
ReadOnlyVariables
field and select theSalesOrderId
variable we created earlier. - Click
Edit Script
to open the script editor. - Put in the following code.
-
Put in the following code where the TODO comment is.
var salesOrderId = Dts.Variables["User::SalesOrderId"]; bool fireAgain = true; Dts.Events.FireInformation(0, "ScriptTask", String.Format("SalesOrderId = {0}", salesOrderId.Value), null, 0, ref fireAgain);
- Build the code to ensure it is valid and then close the editor.
- Click
OK
to close the script task.
- Click on the
When we set up the script task we had to specify which variables we wanted to read and write. If you fail to do this then when you try to reference the variables it will fail. Setting up the variables in this way is actually a shortcut for locking. You don’t need to use the variable dispenser inside script tasks because of this (although you could if you wanted to).
Testing the Package
We can now test the package. In SSDT build and run the solution. If you get any errors then it will show in the designer. You have to stop the debugger and go to the Progress
view to see the errors.
If everything runs correctly then all the tasks and containers will show a green check. The informational messages should be shown in the results view.
Generating a Report
To finish up this post we’ll add the code to generate the report from the script task. For this we’ll use SSRS url access. It is a lot of boilerplate code so we won’t cover it here but we need to set up some stuff first.
We need an HTTP connection to the reporting server so add a new HTTP connection in the connection manager list. Specify the SSRS url (i.e. http://myserver/ReportServer) and credentials to use. We also need a file connection to a folder so we can store the report results. Remember the connection names for later.
Open the script task and replace the existing code with the new code. You need to add a reference to the System.Web
assembly.
public void Main() { bool fireAgain = true; //TODO: Set this to the path to the report on SSRS var reportPath = "/Sales_Order_Detail"; var salesOrderId = Dts.Variables["User::SalesOrderId"].Value; reportPath = HttpUtility.UrlEncode(reportPath); var builder = new StringBuilder(reportPath.StartsWith("/") ? reportPath : "/" + reportPath); //Add parameters, case matters to SSRS builder.AppendFormat("&SalesOrderIDStart={0}", salesOrderId); builder.AppendFormat("&SalesOrderIDEnd={0}", salesOrderId); builder.AppendFormat("&rs:Command=Render&rs:Format=PDF"); //Get the connection var cm = Dts.Connections["ReportServer"]; //Create a copy of the connection because we're going to change the URL //var conn = new HttpClientConnection(cm.AcquireConnection(Dts.Transaction)).Clone(); var conn = new HttpClientConnection(cm.AcquireConnection(Dts.Transaction)); if (conn == null) throw new Exception("Unable to acquire connection."); // SSRS url access format = {serverUrl}?{reportPath}{parameters}{options} var uri = new UriBuilder(conn.ServerURL) { Query = builder.ToString() }; conn.ServerURL = uri.Uri.ToString(); // Generate Report Dts.Events.FireInformation(0, "ScriptTask", String.Format("Generating report: {0}", conn.ServerURL), null, 0, ref fireAgain); try { var data = conn.DownloadData(); Dts.Events.FireInformation(0, "ScriptTask", String.Format("Report generated for {0}", salesOrderId), null, 0, ref fireAgain); //Save the file var file = Dts.Connections["TargetFolder"]; var target = file.AcquireConnection(Dts.Transaction) as string; File.WriteAllBytes(Path.Combine(target, salesOrderId.ToString() + ".pdf"), data); } catch (Exception e) { Dts.Events.FireError(0, "ScriptTask", e.Message, null, 0); Dts.TaskResult = (int)ScriptResults.Failure; }; Dts.TaskResult = (int)ScriptResults.Success; }
Running the package now should generate a series of reports in the specified target folder. If you are getting SSRS errors then try the URL that is generated in the log in a browser window. In most cases the issue is either report paths or security.
Next
Next time we’ll start moving the report generation logic to a custom task.