Sunday, December 14, 2008

Microsoft Certifications

I recently sat for the 70-445 and 70-446 exams for the MCTS (SQL Server 2005 BI) and MCITP (BI Developer) certifications respectively.  I took them about a week and a half apart.  Even though I passed them both on my first try, which I am very pleased about, I did plan on using the free second-chance promotion vouchers from Microsoft just in case.

The MCTS Self-Paced Training Kit: Microsoft SQL Server 2005 book and its practice exams were very helpful for the 70-445 exam; I also think the book serves as a good reference in general.  As you may know, there is no training kit for the 70-446 exam.  I believe my day-to-day experience working with the technology stack and DWBI as a whole helped a bit, and I was able to leverage my 70-445 preparation for 70-446.  In hindsight I believe I made the right decision by taking them close to each other.

The exams were about the level of difficulty I anticipated.  In some ways I think 446 was a little easier than 445 - it places less emphasis on specifics and has a more straightforward question format.  There were definitely quite a few questions on data mining in both exams.  If I could go back and do one thing differently, though, I would only skim each case study rather than spend the time really understanding it before embarking on the questions.  The questions apply to only parts of the case study - read those parts when you need to.  I did run out of time on a couple of case studies.

Personally I think industry certifications are great only if you have the working experience to go with them - before and after.  The process of certification does promote well-rounded in-depth knowledge of the technology of interest. Real-life scenarios often take you to only those bits and pieces that are immediately relevant to solving the problems. 

Friday, October 3, 2008

Contributions to KPI Library

Here are some of my recent contributions to the KPI Library - a site offering free key performance indicators and metrics submitted by the community.  This collection of indicators is used to measure efficiency, and ultimately customer satisfaction, in incident management centers, help desk operations and support ticket systems.  A sketch of how a few of them might be computed follows the definitions below.

Total/Mean Time to Ticket (TTTT/MTTT)

Calculated in seconds, minutes or hours, the total or average time it takes from the reporting of an incident to the generation of the support ticket.

Total/Mean Time to Action (TTTA/MTTA)

Calculated in seconds, minutes or hours, the total or average time it takes from the creation of a support ticket to the first action taken to resolve it.

Total/Mean Time to Escalation (TTTE/MTTE)

Calculated in seconds, minutes or hours, the total or average amount of time a support ticket spends being escalated through the support tiers.

Total/Mean Time to Resolution (TTTR/MTTR)

Calculated in seconds, minutes or hours, the total or average time it takes from the creation of a support ticket to the resolution of the ticket or incident.

Total/Mean Time in Queue

Calculated in seconds, minutes or hours, the total or average amount of time the support ticket is not being "actively" worked on.

Total/Mean Time in Postmortem

Calculated in seconds, minutes or hours, the total or average time it takes to perform postmortem review of the support ticket after resolution.
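These indicators boil down to simple date arithmetic over ticket timestamps.  As a rough illustration, here is a minimal T-SQL sketch for a few of them, assuming a hypothetical dbo.Tickets table whose name and column names are purely illustrative:

-- Hypothetical schema:
-- dbo.Tickets(TicketID, IncidentReportedAt, TicketCreatedAt, FirstActionAt, ResolvedAt)
SELECT
    SUM(DATEDIFF(MINUTE, IncidentReportedAt, TicketCreatedAt)) AS TTTT_Minutes,  -- Total Time to Ticket
    AVG(DATEDIFF(MINUTE, IncidentReportedAt, TicketCreatedAt)) AS MTTT_Minutes,  -- Mean Time to Ticket
    AVG(DATEDIFF(MINUTE, TicketCreatedAt, FirstActionAt))      AS MTTA_Minutes,  -- Mean Time to Action
    AVG(DATEDIFF(MINUTE, TicketCreatedAt, ResolvedAt))         AS MTTR_Minutes   -- Mean Time to Resolution
FROM dbo.Tickets
WHERE ResolvedAt IS NOT NULL;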

Sunday, April 13, 2008

Dimension Security using SSAS Stored Procedure

Oftentimes the business needs are such that users' security restrictions change periodically due to modifications of job roles, work locations and/or other attributes.  A solution that can accommodate the addition, update and removal of users seamlessly or dynamically in SSAS is critical to reducing latency and minimizing user security maintenance.  The idea of using an SSAS stored procedure to enforce dimension security dynamically is rather straightforward: while all users belong to a single SSAS database role, each user's data access is individually restricted to the set of dimension members he or she has permission to, as determined dynamically by the SSAS stored procedure.

A short time ago my team identified a requirement to refine our overall SSAS security scheme.  The data available in our cubes needed to be filtered so that users see only their authorized sales location(s).  The list of users and their authorized sales locations is maintained in a collection of M:N relational tables in one of our SQL Server databases, and our ETL refreshes these tables regularly.  For simplicity, a database view is used to de-normalize the data in these tables into "user alias-to-sales location" pairs, as shown in the mock-up below.

UserAlias   SalesLocation
---------   -------------
usera       United States
usera       Canada
userb       United States
userc       Hong Kong
userd       United States
userd       Germany
userd       UK
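Assuming underlying tables along the lines of dbo.Users, dbo.SalesLocations and a dbo.UserSalesLocation bridge table (all of these names are illustrative), the de-normalizing view might look something like this:

-- Flatten the M:N relationship into "user alias-to-sales location" pairs.
CREATE VIEW dbo.vw_UserSalesLocation
AS
SELECT u.UserAlias, l.SalesLocation
FROM dbo.Users u
JOIN dbo.UserSalesLocation m ON m.UserID = u.UserID
JOIN dbo.SalesLocations l ON l.SalesLocationID = m.SalesLocationID;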

To avoid the potential for SQL injection attacks, a SQL Server stored procedure is used to look up the data in the view.  The stored procedure (sketched below) takes the alias of the current user as an argument and returns the corresponding sales locations.  The domain portion of the alias is stripped off for our purposes.

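A minimal sketch of the procedure against the view above (again, the parameter name and length are illustrative):

CREATE PROCEDURE dbo.usp_GetUserSalesLoc
    @UserAlias NVARCHAR(50)
AS
BEGIN
    SET NOCOUNT ON;
    -- Return the sales locations the given user is authorized to see.
    SELECT SalesLocation
    FROM dbo.vw_UserSalesLocation
    WHERE UserAlias = @UserAlias;
END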

Our SSAS stored procedure (sketched below) is implemented as a method of the UserSecurity C# class library - GetFilteredSet_SalesLocation().  The SSAS stored procedure employs the [usp_GetUserSalesLoc] SQL Server stored procedure to retrieve and return the unique names of the sales locations.

The GetFilteredSet_SalesLocation() method takes three string-type arguments: the alias of the current user, the name of the SQL Server where [usp_GetUserSalesLoc] is located (the name of the database is hard-coded in the method - yes, I know), and the dimension hierarchy level.  The dimension hierarchy level argument allows us to reuse the SSAS stored procedure across multiple SSAS databases and cubes with similar security filtering requirements but different dimensional schemas.  The method then enumerates the result set returned by [usp_GetUserSalesLoc] and assembles the MDX set as an output string, as such:

{[Dimension].[Hierarchy].[Level].&[MemberName], [Dimension].[Hierarchy].[Level].&[MemberName], [Dimension].[Hierarchy].[Level].&[MemberName], ...} 
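Here is a minimal C# sketch of the method under those assumptions; the hard-coded database name (SalesDB), the stored procedure parameter name (@UserAlias) and the single-column result set are all illustrative:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Text;

namespace SSASSecurity
{
    public class UserSecurity
    {
        public static string GetFilteredSet_SalesLocation(
            string userAlias, string sqlServerName, string levelUniqueName)
        {
            // USERNAME() passes in DOMAIN\alias; strip the domain portion
            // since the view stores bare aliases.
            int slash = userAlias.IndexOf('\\');
            if (slash >= 0) userAlias = userAlias.Substring(slash + 1);

            // The database name is hard-coded; "SalesDB" is a placeholder.
            string connStr = "Data Source=" + sqlServerName +
                ";Initial Catalog=SalesDB;Integrated Security=SSPI;";

            StringBuilder mdxSet = new StringBuilder("{");
            using (SqlConnection conn = new SqlConnection(connStr))
            using (SqlCommand cmd = new SqlCommand("dbo.usp_GetUserSalesLoc", conn))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.Parameters.AddWithValue("@UserAlias", userAlias);
                conn.Open();
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    bool first = true;
                    while (reader.Read())
                    {
                        if (!first) mdxSet.Append(", ");
                        // e.g. [Geography].[Sales Location].[Sales Location].&[Canada]
                        mdxSet.Append(levelUniqueName + ".&[" + reader.GetString(0) + "]");
                        first = false;
                    }
                }
            }
            return mdxSet.Append("}").ToString();
        }
    }
}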

Subsequently the SSAS stored procedure is compiled into a .NET assembly (SSASSecurity.dll) and deployed to the SSAS server.  I usually place the DLL in the Drive:\Program Files\Microsoft SQL Server\MSSQL.#\OLAP\bin\ directory.  In this particular instance, I registered the assembly within the server scope so that it can be used by any of the databases on the SSAS server; it can certainly be registered within the database scope instead.

Note that the assembly must be registered with the "External access" permission set at a minimum in order for the GetFilteredSet_SalesLocation() method to function properly.  It also needs to run under an account that has permission to execute the [usp_GetUserSalesLoc] SQL Server stored procedure wherever it might be located.

Finally we specified one SSAS database role with all user logins added to the membership collection.  The user members have read access to the cubes and dimensions.  For the dimension hierarchy of interest, we placed a call to the GetFilteredSet_SalesLocation() SSAS stored procedure in the MDX expression of the Allowed member set.  The STRTOSET MDX function converts the output string into a valid MDX set, and the USERNAME MDX function returns the login alias of the current user.
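Put together, the Allowed member set expression looks something like the following, with the method invoked through the registered assembly name; the SQL Server name and level unique name shown are placeholders:

STRTOSET(
    SSASSecurity.GetFilteredSet_SalesLocation(
        USERNAME(),
        "SQLServerName",
        "[Geography].[Sales Location].[Sales Location]"
    )
)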

The allowed member set can be defined for either a database or a cube dimension.  If a dimension is referenced by a number of cubes in the database, it might be desirable to define the allowed member set at the database level and propagate the filter to the cube level through inheritance.  By enabling the "Visual Totals" option, we excluded the denied members from contributing to the aggregate values.

This is it.  Users are now restricted to cube data for those sales locations they are authorized to see.  This security scheme works nicely with Excel when the workbook data connections are specified with integrated security.

There are a few approaches to implementing dynamic dimension security in SSAS, as you will find in a number of writings both online and offline.  I found the SSAS stored procedure approach to be the most straightforward of them all and the least invasive to implement.  Furthermore, security restrictions take effect much more quickly because they are queried directly from the relational database each time.  However, we do need to ensure that the SQL Server instance and the database where the [usp_GetUserSalesLoc] stored procedure resides are up and running at all times.

Monday, March 17, 2008

Data Warehouse Upgrade

This recent article from Intelligent Enterprise presents a roadmap for upgrading legacy data warehouses in an effort to increase flexibility, decrease data latency and deliver finer-grained data.  The discussion makes a case for Kimball's dimensional bus architecture over the hub-and-spoke design.  A good read for those who are looking to improve or extend existing DW implementations.

Sunday, February 3, 2008

BI Wish List

The authors of this recent article from the Business Intelligence Network talk about their wish list of DWBI capabilities and features for the coming year and beyond.  I simply couldn't agree more; I almost had tears in my eyes while reading it.  All right, that might be a little dramatic.  I especially enjoyed the bits on data self-service, the data black market and business user ownership.

Thursday, January 24, 2008

Justifying The Value of OLAP

So I was in a meeting at a client site yesterday, where we talked about the progress of their DW upgrade and the dashboard/scorecard implementation that my consulting firm is tasked with.  During our discussion, one of the IT folks, actually a contractor, indicated that the current DW (SQL Server 2000) does not have an OLAP layer, namely cubes.  All user reporting needs are satisfied through the relational DW database.  As part of his assessment, he is putting together a series of proof-of-concept cubes based upon the data sitting in the current DW.  Everything sounded good until we got further into our meeting and the following points were made by the other participants.

  • Performance with reporting is not an issue. 
  • All canned reports are created in Excel and are being pushed directly to the users.
  • All users are "casual."  There are no "power" user requirements such as more complex analysis or ad-hoc queries.
  • No one in-house knows anything about cubes or multidimensional concepts.

My inclination here is to ask, at least in the back of my mind, how much value the OLAP layer provides in this scenario.  I can see the POC as showcasing the potential value the OLAP layer may offer in the distant future.  Other than that, we are adding a whole new layer of technology to the DW framework, one that requires additional skill sets and resources to support and maintain.  I may be missing a bigger picture here, but I am not seeing the ROI on this endeavor.

Thursday, January 17, 2008

Describing a Data Warehouse

While I was on a sales call yesterday I found myself explaining what a data warehouse is.  Honestly, I was somewhat caught off guard by the question.  I shouldn't have been, but I guess I haven't had to do so in a while.  Or maybe I thought that data warehousing has been around for "so long" that I presumed everybody would somehow know about it.  That is undoubtedly a bad assumption on my part.

Nonetheless it was a refreshing surprise.  It is not exactly easy to answer the question of what a data warehouse is.  I wanted to be "accurate" and provide a portrayal that is both "textbook" and drawn from my own experience over the years.

I believe the essence of a data warehouse is data consistency and non-volatility.  A data warehouse is an integration of information gathered from different operational applications and other data sources, used to support business analysis activities and decision-making tasks.  It captures an organization's past transactional and operational information and the changes to that information.  The data in the warehouse, in most cases, is read-only.  Unlike that of OLTP systems, the architecture (and the technologies) of a data warehouse is optimized to favor efficient data analysis and reporting.

I suppose my view of a data warehouse, in many ways, aligns with Inmon’s top-down approach. But that certainly does not preclude me from also seeing it as a collection of more subject-oriented data marts – Kimball’s bottom-up approach.