Monday, September 28, 2009

DAC Full Load vs Incremental Load

I have been getting few queries around what determines DAC to do an incremental load or Full load. Though DAC Guide has it under Refresh Dates section, I thought I will simplify it for the benefit of everybody.

If you ever want to run a Full Load of an execution plan, reset Data Warehouse in DAC by navigating to Tools -> ETL Management -> Reset Data Warehouse. The warning is - "This action will cause all tables that are loaded during next ETL to be truncated". This means that refresh dates in Physical Sources will be set to NULL both on Source and Target tables and if both refresh dates are NULL, ETL truncates the target table and run Full Load command.

Full load is a straight forward exercise but most often, we encounter doing something that is not straight forward such as running full load only on certain tasks but not the whole of the execution plan. Reasons are plenty and this post is limited to provide a reasonable explanation on how DAC determines whether to run an incremental or full load.

The Refresh Dates in Physical Data Sources determine whether to run the command for incremental or full load. I am pasting a graphic that explains what happens when.



Based on what you need, determine what refresh dates are to be populated/modified in Physical data sources.

Wednesday, August 19, 2009

DAC Synchronize Tasks failing

MESSAGE:::Error while inserting a record!
EXCEPTION CLASS::: com.siebel.etl.gui.core.RecordManipulationException

com.siebel.analytics.etl.client.core.DACMessage.convertToRME(DACMessage.java:31)
com.siebel.analytics.etl.client.data.model.UpdatableDataTableModel.upsertNewRecord(UpdatableDataTableModel.java:141)
com.siebel.analytics.etl.infa.fileParsing.InfaDacWriter.insertTableList(InfaDacWriter.java:459)
com.siebel.analytics.etl.infa.fileParsing.InfaDacWriter.insertNodeTables(InfaDacWriter.java:397)
com.siebel.analytics.etl.infa.fileParsing.InfaDacWriter.insertNodeTables(InfaDacWriter.java:314)
com.siebel.analytics.etl.infa.fileParsing.TaskSync.sync(TaskSync.java:168)
com.siebel.analytics.etl.client.action.TaskSynchronizationAction.doOperation(TaskSynchronizationAction.java:123)
com.siebel.etl.gui.view.dialogs.WaitDialog.doOperation(WaitDialog.java:53)
com.siebel.etl.gui.view.dialogs.WaitDialog$WorkerThread.run(WaitDialog.java:85)
------------------------------------------------------------------------------------
We have a custom mapping that has been tuned to use DB Links and also commented the line that was checking Last Update Date in Oracle EBS tables. The commented line is as below.

-- AND (AP_INVOICE_DISTRIBUTIONS_ALL.LAST_UPDATE_DATE > TO_DATE('$$LAST_EXTRACT_DATE', 'MM/DD/YYYY HH24:MI:SS') OR AP_INVOICES_ALL.LAST_UPDATE_DATE > TO_DATE('$$LAST_EXTRACT_DATE', 'MM/DD/YYYY HH24:MI:SS'))

After this, DAC task is not synchronizing. It is throwing the error - "MESSAGE:::Error while inserting a record". After removing the line, it is just synchronizing fine.

It looks like DAC has problems avoiding the commented line in SQL Query. When I run DAC Client in interactive mode, it throws this error. So, DAC is treating MM/DD/YYYY as a table name for some reason. I can't think what the logic is but definitely wrong.

FROM W_ETL_TABLE A_W_ETL_TABLE
INNER JOIN W_ETL_OBJ_REF OBJ_REF ON OBJ_REF.OBJ_TYPE='W_ETL_TABLE'
AND OBJ_REF.SOFT_DEL_FLG='N'
AND OBJ_REF.APP_WID='PGBU_R12'
AND OBJ_REF.OBJ_WID=A_W_ETL_TABLE.ROW_WID
WHERE
((A_W_ETL_TABLE.NAME=''MM/DD/YYYY'
))
Aug 10, 2009 9:43:32 AM com.siebel.etl.engine.core.ETLUtils logException
SEVERE: MESSAGE:::ORA-00907: missing right parenthesis

EXCEPTION CLASS::: java.sql.SQLException

Friday, August 7, 2009

How to change the connection pool passwords with each migration?

I have been depending on my DBA to type the passwords in the connection pools every time we have an RPD that needs to be migrated to pre-production or Production. It is not much of an effort but it is just nuisance to bring him to my desk and let him type the password while I turn my head away.

I have been thinking it can be done by UDML and even heard from a colleague who has successfully done it but never really had all that time to try it myself. My colleague, Pradip, gave me some pointers on this and I thank him for that. Today, I got sometime to try it out for myself and I certify that it works best.

I had RPD with pre-production connections and new RPD with development database connections that need to be moved to pre-production. First, I copied the connection pool from my pre-production RPD and pasted in a notepad. Right click on the connection pool and click Copy and then Paste in a notepad. Just a reminder - Copy and Paste menu items in RPD are actually UDML generator and UDML executor respectively.

Once the UDML is pasted in notepad, you will see the password, user name and all the other connection pool properties. You can add all connection pools to notepad, if you have more than one connection pool that needs to be changed with every migration. Once done, run the below command at command line (you should have OracleBI client installed on the machine where you execute the command).

D:\OracleBI\server\Bin>nQUDMLExec.exe -U Administrator -P SADMIN -I D:\UDML_Stg.txt -B "D:\1 Changa\Work\Repository\Latest RPD\OracleBIAnalyticsApps_Dev.rpd" -O "D:\1 Changa\Work\Repository\Latest RPD\OracleBIAnalyticsApps_Stg.rpd"

Contents of D:\UDML_Stg.txt is - (This is the result of Copy of the connection pool in RPD). If you copy just the password from Production RPD and paste after PASSWORD in the below UDML, you are set for generating RPD with Production connections.

DECLARE CONNECTION POOL "Oracle Data Warehouse"."Oracle Data Warehouse Connection Pool" AS "Oracle Data Warehouse Connection Pool" UPGRADE ID 2327604
DATA SOURCE {VALUEOF(OLAP_DSN)}
TIME OUT 600
MAX CONNECTIONS 100
TYPE 'Default'
USER 'VALUEOF(OLAP_USER)'
PASSWORD 'D7EDED84BC624A917F5B462A4DCA05CDCE256EEEEEDC97D54E960C7A19ECF4EFE0BDF057F3860BD2C8F3BEE6EAF9FC34'
SHARED LOGIN
ASYNCHRONOUS
CONNECTIONS TO SAME URI 10
OUTPUT TYPE XML
HEADER PATH {D:\\OracleBI\\server\\config\\NQSQueryHeader.xml}
TRAILER PATH {D:\\OracleBI\\server\\config\\NQSQueryTrailer.xml}
ISOLATION LEVEL DIRTY READ
BULK INSERT BUFFER SIZE 32768 TRANSACTION BOUNDARY 10
TEMP TABLE PREFIX {TT} OWNER {}
DESCRIPTION {User name is VALUEOF(OLAP_USER)
Data source name is VALUEOF(OLAP_DSN)}
PRIVILEGES ( READ);

Friday, April 10, 2009

Synchronize Tasks in DAC

I had this error recently in DAC Client while trying to Synchronize Task.

MESSAGE:::Failed to create folder/workflow/session cache file
EXCEPTION CLASS::: com.siebel.analytics.etl.infa.interaction.PmrepUtilsException

This happens if you have not created/changed INFA_DOMAINS_FILE and PATH variables to point to domains.infa and path to bin directory respectively.

Friday, January 16, 2009

Data truncation with OBIEE Applications 7.9.5 OOB ETLs

This post is applicable only for OBIEE 7.9.5 applications configured to run with data movement type as UNICODE. After the recent upgrade to 7.9.5 (Informatica 8.1), in development, some ETLs (Bank, GL Account, Employee and User dimensions) started failing with either Unique constraint error or duplicate keys found. After looking at data, it was clear that OBIEE is getting loaded with duplicate values in INTEGRATION_ID column causing duplicate keys found error while creating unique index, which is a composite key on columns - DATASOURCE_NUM_ID, INTEGRATION_ID and EFFECTIVE_FROM_DT.

When I exported the source data and stage table data to a spreadsheet and started ro by row comparison with VLOOKUPs in Excel, it was clear that Informatica while loading stage table, it either truncated a character in INTEGRATION_ID or added some junk characters in some VARCHAR columns. Because the truncated values in INTEGRATION_ID matched already existing values, duplicate keys found error was thrown.

After raising with support and spending some time on it, I figured that it is Default Buffer Block Size setting in Informatica that is causing the issue. Defaukt Buffer Block size is configured as 1280000 in all of the ETL sessions that are failing. I reverted the value to 128000 and the sessions started to run with no truncation and no junk characters.

Informatica support directed me to the Performance tuning guide, which has details on 2 very important session level parameters (Default Buffer Block Size and DTM Buffer Size), confuguring which without understanding the implications can lead to data loss without even realizing the problem. I am putting dowm the explanation in brief.

Informatica Integration Service process starts the Data Transformation Manager (DTM) when you start a session and DTM allocates buffer memory to the session at runtime based on the DTM Buffer Size setting in the session properties. The DTM divides the memory into buffer blocks as configured in the Default Buffer Block Size setting in the session properties. The reader, transformation, and writer threads use buffer blocks to move data from sources to targets. The buffer block size should be larger than the precision for the largest row of data in a source or target.

The Integration Service allocates at least 2 buffer blocks for each source and target partition. Use the following calculation to determine buffer block requirements:
[(total number of sources + total number of targets)* 2] = (session buffer blocks)


Session Buffer Blocks = No. of Partitions * (DTM Buffer Size / Buffer Block Size)

In OOB ETLs, Buffer Block Size is increased to 1280000 leaving DTM Buffer Size as 32000000. Session Buffer Blocks = (32000000/1280000), which is 25 blocks and that seem to be not enough to move data from source to taget, which case it should have failed rather than truncating characters. But, my first test with 128000 for Default Buffer Block size worked, which resulted 250 Memory Blocks and this seem good enough to move data. I was not sure why this fixed but support directed me to right document.

Here is some additional information that may be useful.

You configure buffer memory settings by adjusting the following session parameters:

DTM Buffer Size. The DTM buffer size specifies the amount of buffer memory the Integration Service uses when the DTM processes a session. Configure the DTM buffer size on the Properties tab in the session properties.

Default Buffer Block Size. The buffer block size specifies the amount of buffer memory used to move a block of data from the source to the target. Configure the buffer block size on the Config Object tab in the session properties.

Refer to Power Center Workflow Administration Guide or Performance Tuning Guide for more detailed explanation.

Friday, October 31, 2008

Registering Informatica 8.1.1 Integration Service in DAC

Registering Informatica Integration service in DAC took a lot of time in the evening the other day. The documentation says that the ServerPort variable in Custom Properties in Informatica Admin Console will be the port on which Informatica Integration service will listen to. Refer to section 4.7.3 in Oracle® Business Intelligence Applications Fusion Edition Installation and Configuration Guide Release 7.9.5 document. But, after creating ServerPort variable with value 4006 as a custom property in Repository properties did not help. lsof -i|grep 4006 on Solaris returned nothing on Solaris, which indicates that the port is not in use. After shutting down and restarting of Informatica services multiple times, I thought I missed something in documentation. I was able to connect to Integration serviec and Repository service with pmrep and pmcmd respectively from the command prompt on the machine where DAC client is running.

After lot of unsucessful attempts, I looked at all of the ports in use by the Solaris user and 6009 stood out to be something of interest. Boom!! Here I go. DAC threw a message "connection successful but passwords not verified" when the port changed to 6009 from 4006. I do not think we ever configured 6009 anywhere during or post install of Informatica 8.1.1 OEM. I am wondering what if we want to create another Informatica integration service on the same box with a different user. I will post the answer as I come out of that situation.

Here is the way to fix the Port issue with Informatica Integration Service

After going through the documentation once more, it is clear that the custom property should be created in Informatica Integration Service properties NOT on the repository properties. I restarted Infaservices today and the port is different now. So, Informatica is assigning different port each time for Integration service. After adding the ServerPort variable with value as 6010 in Informatica Integration service custom properties, Informatica integration service is always running on the static port(6010). Now I am confident that we can run multiple Informatica Integration services on a Solaris machine, if required.

Thursday, October 30, 2008

ParameterfileOLTP.txt in DAC

I had many problems with DAC 7.9.4 not picking right values from parameterOLTP.txt file while generating the individual parameter files at the run time. Read the full post here... Dylan Wan talked about an enhancement then and a colleague of mine reminded me today of testing that while we were planning on modifying parameterfileOLTP.txt as we upgrade from 7.9.4 to 7.9.5. It is pretty cool now as we get to define parameters at Task level for different source system containers and requires no restart of DAC, no assemle of subject areas and no build of execution plans required after adding the parameters at task level, which is really cool. If parameterfileOLTP.txt has same parameter, what you define in DAC takes precedence over the definition in parameterfileOLTP.txt.

I was suspicious of the fact that Informatica expects the parameters to be prefixed with mapplet name in case the parameter is a mapplet parameter. DAC is able to generate the parameter as mapplet.parameter at the run time though we do not specify in DAC that it is a Mapplet parameter. I am thinking DAC queries OPB tables (Informatica Repository tables) to identify whether the parameter is Mapplet or mapping.

7.9.4 DAC in fact had same parameters feature at task level in DAC Client and in fact it is documented that we should use parameters at task level instead of attempting to change parameterfileOLTP.txt. It is clear now that I missed to read this piece of the DAC 7.9.4 documentation, while I configured OOB Order Management. To be fair to myself, applications configuration document mentions to change the parameter values in parameterfileOLTP.txt, which should have at least had a line on configuration steps in DAC. I spent lot of time trying to fix the issue with DAC not passing right values from parameterfileOLTP.txt and I ultimately created my own parameters in 7.9.4. Anyway, not bad to realize now than never.