
SSIS


SSIS - How to reuse the cache between several lookups

What should a SSIS Framework have?

Posted: January 25, 2013 in SSIS. Tags: ETL, Framework, SQL, SSIS, SSIS Framework

Recently I was talking with a customer and we were discussing what an ETL framework is and what a good one should have. I had to pause for a second and really think about this. Over time I have either created or been involved in creating frameworks for several projects, and as with any code base you keep tweaking it over time, but here is my initial list of what a good framework should have:

- Flexible execution order
- Ability to restart, either from the beginning or from the last failure point
- Logging of the following: row counts, variables, errors, duration
- Easy to implement
- Easy to maintain
- Ability to send alerts

So what does each of these things actually mean? The ability to restart the ETL process is one that is quite important. Logging, to me, is self-explanatory, as you always want insight into what is going on in your ETL process. So the next question is: how do you design a framework to handle all of this?
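One common building block behind the logging and restart requirements is a package log table that each package writes to when it starts, succeeds, or fails. A minimal sketch in T-SQL; the table and column names here are my own illustration, not from the original post:

    CREATE TABLE dbo.EtlPackageLog (
        LogID        INT IDENTITY(1,1) PRIMARY KEY,
        PackageName  NVARCHAR(260)  NOT NULL,
        StartTime    DATETIME2      NOT NULL DEFAULT SYSDATETIME(),
        EndTime      DATETIME2      NULL,      -- duration = EndTime - StartTime
        RowsLoaded   INT            NULL,      -- row counts
        VariableDump NVARCHAR(MAX)  NULL,      -- variables, serialized at run time
        Status       VARCHAR(20)    NOT NULL,  -- Running / Succeeded / Failed
        ErrorMessage NVARCHAR(4000) NULL       -- errors
    );

    -- Restart support: find the most recent failure so a rerun can resume there.
    SELECT TOP (1) PackageName, StartTime
    FROM dbo.EtlPackageLog
    WHERE Status = 'Failed'
    ORDER BY StartTime DESC;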

SSIS For Loop Skip Files

When running a For Each Loop through a set of files, sometimes you will have specific files that you do not want to load. For example, I have a set of files named:

- Abc.txt
- Mno.txt
- Rts.txt
- Wln.txt
- Xyz.txt

If I want to skip the file that starts with "W", then I will need an expression in my For Each Loop to detect this file. Inside the For Each Loop I am going to place a Sequence Container. This gives me a place to anchor my expression, which I will place on the precedence constraint coming from the Sequence Container. On the precedence constraint I am going to set the evaluation operation to "Expression and Constraint", with the expression SUBSTRING(UPPER(@strFileName), 1, 1) != "W".
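Written out fully, assuming the loop maps each file name into a user variable named strFileName (as in the example above), the expression on the precedence constraint would be:

    SUBSTRING(UPPER(@[User::strFileName]), 1, 1) != "W"

The constraint then evaluates to true only for files that do not begin with "W", so everything downstream of the Sequence Container runs for the other four files and Wln.txt is skipped.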

SQL Server - SQL/SSIS data warehouse fact table loading, best practices

Using Hashbytes to track and store historical changes for SQL Server data

Problem: Change Data Capture (CDC) is a fundamental part of ETL, especially in a data warehousing context. Basically, what we have is a source table in which we have no reliable timestamp to identify which data has changed since the last load (as is often the case in operational systems).

Solution: A way to achieve this is through hashing algorithms. Even though SSIS is a great ETL tool, it lacks such a hashing function. The scenario presented here is rather simple, but it could also be applied to more complex ETL logic.
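For illustration, the usual T-SQL pattern hashes the tracked columns with HASHBYTES and compares the result against a hash stored on the target row; the table and column names below are hypothetical:

    -- Hash the tracked columns of the incoming rows.
    -- The '|' separator keeps adjacent values from running together.
    SELECT  s.CustomerID,
            HASHBYTES('SHA2_256',
                      CONCAT(s.FirstName, '|', s.LastName, '|', s.City)) AS RowHash
    FROM    dbo.StageCustomer AS s;

    -- Rows whose hash differs from the stored one changed since the last load.
    SELECT  s.CustomerID
    FROM    dbo.StageCustomer AS s
    JOIN    dbo.DimCustomer   AS d ON d.CustomerID = s.CustomerID
    WHERE   HASHBYTES('SHA2_256',
                      CONCAT(s.FirstName, '|', s.LastName, '|', s.City)) <> d.RowHash;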

SSIS Novices’ Guide to Data Warehouses: Flattening While Staging the Data

In "SSIS Novices' Guide to Data Warehouses: Moving Data Into the Data Warehouse," I showed you the basic structure of a data warehouse whose databases contain sets of tables that store raw, staged, and dimensionally modeled data. These tables are referred to as the Raw tables, Stage tables, and Dimensional tables, respectively. I also showed you how to create a SQL Server Integration Services (SSIS) package called the Raw package. This package is used to move a near-exact copy of your source data from an external location (probably the transactional database and server) to the Raw tables.
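As a small illustration of the Raw layer (the schema and column names are mine, not the article's), a Raw table simply mirrors the source columns, often with load metadata added:

    -- A near-exact copy of the source table, plus a load timestamp.
    CREATE TABLE Raw.Customer (
        CustomerID INT          NOT NULL,
        FirstName  NVARCHAR(50) NULL,
        LastName   NVARCHAR(50) NULL,
        City       NVARCHAR(50) NULL,
        LoadDate   DATETIME2    NOT NULL DEFAULT SYSDATETIME()
    );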

Override SSIS Package Variables Without Opening the Package » Bradley Schacht

SSIS packages need to have the ability to be dynamic. To an extent we are able to accomplish this through the use of configuration files, Execute SQL Tasks with results written to variables, and even the Script Task. One great way to make an SSIS connection manager dynamic is through the use of expressions. Kyle Walker recently posted a blog here on BIDN about setting the location of an Excel file in the connection manager using an expression. That expression uses a location stored inside a variable, which can then be overridden. While this can be done a number of ways (an Execute SQL Task that pulls the location from a table, or a For Each Loop that goes through all the files in a folder), what happens if you want to override that variable on demand at runtime?
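One way to do that without opening the package is the /SET switch of dtexec, which assigns a new value to a variable for that execution only. A sketch; the package path and variable name (User::FileLocation) are illustrative:

    dtexec /F "C:\SSIS\LoadExcel.dtsx" /SET \Package.Variables[User::FileLocation].Value;C:\Data\Sales.xlsx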


Running Parameterized SQL Commands Using the Execute SQL Task – I « Systems Engineering and RDBMS

In one of our previous blog posts, we saw the configuration of one of the many control flow tasks that Integration Services offers us: the Execute SQL Task. We also saw how to configure this task to use various kinds of sources for the SQL statements, like direct input, input from a variable, and input from a SQL file. In this article we will go over using parameterized queries in the Execute SQL Task and returning result sets using OLE DB. In OLTP systems, we need to run a particular SQL statement many times with a change in the parameter value supplied. The idea is to parse and compile once and execute many times. Such queries make use of bind variables (Oracle lingo), or parameters, and the queries using them are called parameterized queries. Defining the parameters in the Execute SQL Task: SQL statements and stored procedures frequently use input parameters, output parameters, and return codes.
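For illustration, with an OLE DB connection the parameter marker in the SQL statement is a question mark, and each marker is bound by ordinal name ("0", "1", ...) on the task's Parameter Mapping page. The query, table, and variable names below are hypothetical:

    -- SQLStatement (Direct Input) for the Execute SQL Task:
    SELECT OrderID, OrderDate, TotalDue
    FROM   dbo.Orders
    WHERE  OrderDate >= ?    -- Parameter Name 0, mapped to e.g. User::StartDate
      AND  CustomerID  = ?;  -- Parameter Name 1, mapped to e.g. User::CustomerID

With the ResultSet property set to "Full result set", the returned rows can be stored in an Object-typed variable for use later in the package.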