Saturday, September 29, 2007

The Source is Ready

On a number of occasions, our ETL applications needed to determine whether a SQL Server database has come back online after being restored from a backup. Such requirement arises when the systems we are sourcing from create copy of their data and restore it onto a separate server dedicated to all downstream consumption. These source systems are often mission-critical or LOB applications that restrict direct ETL operations - fearing performance impact on user queries.

A quick and simple way to find out the availability of a SQL Server database is to use the DATABASEPROPERTYEX function. This T-SQL function returns the current setting of the specified database option or property for the specified database. The SQL is listed below.

The WHILE loop checks to see if the "status" of the "SourceDB" database is online and available for query, if not, it waits for a minute then checks again.

To maximize reusability and uniformity of our ETL implementation, we wraps this piece of code in its own separate SSIS or DTS package. The package in turn can be executed as first step of a job or within another package.

Wednesday, September 12, 2007

SSIS Configuration File for Manageability

Using SSIS configuration file (.dtsConfig) to manage environment variables, such as server connections and source file paths, can contribute greatly to portability and manageability of an ETL application that comprised of SSIS packages.

We utilize configuration files extensively in our DWBI implementations to alleviate the complexity of promoting builds (DEV, TEST, PROD), server migration and deployment to multiple ETL servers. SSIS solutions can be readily deployed to different environments with simple changes to these XML-based configuration files without having to modify a single package.

For instance, one of our DWBI implementations employs 2 separate servers to "load balance" the processing of flat files (web logs) made available on a central file server. The same SSIS solution is deployed on these 2 ETL/staging servers (SQL Server, SQL Server Agent, Integration Services). But the solution instances are driven by slightly different configuration file specifications. Below is the over-simplified data flow.

The configuration file on each server specified the same source file server, destination DW server, and other control variables with the exception of one flag to indicate whether to process odd- or even- hour source files and another to indicate which one of the 2 target ETL servers to stage the extracted data.

Thursday, September 6, 2007