Tuesday, July 15, 2008

Type-II SCD Logic in OBIEE OOB ETLs

I have been thinking about blogging about evolution of Type-II SCD in OBIEE Applications in different versions of Applications for quite sometime now but haven't had all the time. With previous versions of Analytics applications, our ability was very limited in handling SCD logic. I remember having a requirement in one of my earlier implementations to look at capturing changes as the change occur in source (SIEBEL) system, which means tracking history of all changes that occur even during the day. After much of deliberations, customer thought it is not worth putting all that effort to solve a business problem that did not exist. Let me explain what the business problem we were trying to solve.

In a contact center, team leads and team managers are changed in the source system during the work day very often than anybody can imagine. We had ETLs running at top of the half hour to gather the telephony data at CSR level and most reports roll up the data to team lead and team manager level in SIEBEL Analytics. We proposed to make the position hierarchy dimension a Type-II SCD but when the logic of SCD was explained, IT management raised that the SCD should capture intra-day changes as well. I started to put some effort back then to come up with some custom SCD logic that would accomplish this. But, business dropped that with an argument that the CSRs do not report to more than one Team Lead on any day thouh the changes may be done in the system during the work day.

I have reviewed 7.9.4 ETLs and one of the things that impressed me most is the SCD logic and how it has been implemented. It definitely deserves a post and credits to Engineering team for design and build of this logic in Informatica. I am not going to write all the logic down here but will mention at high level what is happenning in ETLs with respect to TYpe2 SCD. This may be helpful to those folks, who haven't had a chance to look at 7.9.X apps as yet. In the latest OBIEE Apps, ETL_PROC_WID is not a constant anymore. The definition of ETL_PROC_WID column is more explainable now than before. ETL_PROC_WID is different in each ETL run and is incremented by 1 for each ETL run. OBIEE ETLs leveraged this column in handling failure cases. Though there are multiple exceptions handled, when it comes to Type-II SCD logic, OOB ETLs evaluates 2 ports;
1) SYSTEM_COLLS_DIFF checks for any changes in date (Audit) columns in Stage and Target tables and marks it 'Y', if there is a change in values of all the audit columns between satge and target table.
2) TYPE2_COLLS_DIFF checks all the columns for which the history of changes need to be tracked and mark it 'Y', if there is a change in values of all the columns between stage and target table.

Once the above ports are evaluated, UPDATE_FLG is evaluated whether is an update or SCD or insert or nothing to do. Source System Change Date is one column that is compared while the record is inserted with the Effective From Date of the target record while the SCD record in inserted. I had used ETL run date so far for populating SCD Start Date and SCD End Date. Source system change date will help because lot of systems are modified after the fact and they would like to go with either change date or a date that the source system captures as effective from date. OO ETLs have another mapping to update the Effective To Date of the history record for all the records that are modified for Type2.

I am pasting a small graphic that explains the difference in older reeases and new release of OBIEE Applications.

4 comments:

deepika said...

what is SCD?? I know it stands for slowly changing dimension,but which are those dimensions? and why is it called so??

Changa Reddy said...

Hi Deepika,

There is tonnes of documentation on web on this subject. I recommend you to read on web.

- Changa

Anonymous said...

Hi Changa,

When you say "In the latest OBIEE Apps, ETL_PROC_WID is not a constant anymore." this is not quite correct. ETL_PROC_WID was never a constant I believe.

Regards
MG

Changa Reddy said...

MG,

I am refering to Analytics apps 7.8.4 and earlier. It was a Constant back then. I can't point which release made this change but think 7.9.x changed it.