Thursday, January 17, 2008

Describing a Data Warehouse

While I was on a sales call yesterday, I found myself explaining what a data warehouse is. Honestly, I was somewhat caught off guard by the question. I shouldn't have been, but I guess I haven't had to do that in a while. Or maybe, because data warehousing has been around for "so long", I presumed everybody would somehow know about it. That was clearly a bad assumption on my part.

Nonetheless, it was a refreshing surprise. Answering the question of what a data warehouse is turns out to be not exactly easy. I wanted to be "accurate" and give a description that is both "textbook" and drawn from my own experience over the years.

I believe the essence of a data warehouse is data consistency and non-volatility. A data warehouse integrates information gathered from different operational applications and other data sources to support business analysis and decision-making. It captures an organization’s past transactional and operational information, along with the changes to it over time. The data in the warehouse is, in most cases, read-only. Unlike OLTP systems, the architecture and technologies of a data warehouse are optimized for efficient data analysis and reporting.
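
To make that description a little more concrete, here is a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for a real warehouse platform. Everything specific in it is an assumption made for illustration: the two source systems (a web shop and a point-of-sale system), the sales_fact table, the field names, and the sample figures. It only tries to show the three ideas above: integrating differently shaped sources into one consistent form, keeping loaded rows read-only, and querying for analysis.

```python
import sqlite3

# Hypothetical extracts from two operational systems with different shapes.
WEB_ORDERS = [
    {"order_id": "W-1", "customer": "acme", "total_usd": 120.0, "day": "2008-01-15"},
    {"order_id": "W-2", "customer": "globex", "total_usd": 80.0, "day": "2008-01-16"},
]
POS_SALES = [
    ("P-1", "ACME", 4500, "2008-01-16"),  # amount is in cents, name is upper case
]


def build_warehouse():
    """Create an in-memory warehouse with one consistent fact table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_fact ("
        " source TEXT, source_id TEXT, customer TEXT,"
        " amount_usd REAL, sale_date TEXT)"
    )
    # Non-volatility: loaded rows may be read but not updated or deleted.
    conn.execute(
        "CREATE TRIGGER sales_fact_no_update BEFORE UPDATE ON sales_fact "
        "BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END"
    )
    conn.execute(
        "CREATE TRIGGER sales_fact_no_delete BEFORE DELETE ON sales_fact "
        "BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END"
    )
    return conn


def load(conn):
    """Integrate both sources into the same units and naming conventions."""
    rows = [
        ("web", o["order_id"], o["customer"].lower(), o["total_usd"], o["day"])
        for o in WEB_ORDERS
    ]
    rows += [
        ("pos", sale_id, customer.lower(), cents / 100.0, day)
        for sale_id, customer, cents, day in POS_SALES
    ]
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


def revenue_by_customer(conn):
    """An analysis-style query across everything that has been loaded."""
    cur = conn.execute(
        "SELECT customer, SUM(amount_usd) FROM sales_fact "
        "GROUP BY customer ORDER BY customer"
    )
    return cur.fetchall()


if __name__ == "__main__":
    warehouse = build_warehouse()
    load(warehouse)
    print(revenue_by_customer(warehouse))
```

A real warehouse would of course use a dedicated platform, scheduled loads, and a far richer model. The point here is only that the consistency comes from the load step and the non-volatility from refusing changes to rows once they are in.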

I suppose my view of a data warehouse, in many ways, aligns with Inmon’s top-down approach. But that certainly does not preclude me from also seeing it as a collection of more subject-oriented data marts – Kimball’s bottom-up approach.
