Informatica Performance Tuning
Scenario:
Informatica Performance Tuning

Solution:

Identifying Target Bottlenecks
------------------------------
The most common performance bottleneck occurs when the Informatica Server writes to a target database. You can identify target bottlenecks by configuring the session to write to a flat file target. If the session performance increases significantly when you write to a flat file, you have
a target bottleneck.

Consider performing the following tasks to increase performance:
* Drop indexes and key constraints.
* Increase checkpoint intervals.
* Use bulk loading.
* Use external loading.
* Increase database network packet size.
* Optimize target databases.
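Dropping indexes before a bulk load and re-creating them afterwards is the first item above. A minimal sketch of generating that DDL, assuming a hypothetical table and index set (the names below are illustrative, not from the original post):

```python
# Sketch: generate DDL to drop target indexes before a bulk load and
# re-create them afterwards. Table and index definitions are hypothetical.

def index_ddl(table, indexes):
    """Return (pre_load, post_load) DDL statement lists for a bulk load.

    `indexes` maps index name -> indexed column list (as a SQL string).
    """
    pre = [f"DROP INDEX {name}" for name in indexes]
    post = [f"CREATE INDEX {name} ON {table} ({cols})"
            for name, cols in indexes.items()]
    return pre, post

pre, post = index_ddl("SALES_FACT", {"IDX_SALES_DATE": "SALE_DATE"})
print(pre)   # ['DROP INDEX IDX_SALES_DATE']
print(post)  # ['CREATE INDEX IDX_SALES_DATE ON SALES_FACT (SALE_DATE)']
```

Run the `pre` statements in a pre-session command, the `post` statements post-session, so the load itself never maintains the indexes.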

Identifying Source Bottlenecks
------------------------------
If the session reads from a relational source, you can use a Filter transformation, a read test mapping, or a database query to identify source bottlenecks:
* Filter Transformation - Measure the time taken to process a given amount of data, then add an always-false Filter transformation to the mapping after each Source Qualifier so that no data is processed past the filter. You have a source bottleneck if the new session runs in about the same time.
* Read Test Session - Make a copy of the mapping, remove all transformations after the Source Qualifiers, and connect the Source Qualifiers to file targets. Compare the time this read test session takes to process a given set of data against the original session. You have a source bottleneck if the new session runs in about the same time.
* Database Query - Extract the query from the session log and run it in a query tool. Measure the time taken to return the first row and the time taken to return all rows. If there is a significant difference, you can use an optimizer hint to eliminate the source bottleneck.
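The database-query test above can be sketched as a small timing harness. sqlite3 is used here only as a self-contained stand-in; against the real source database you would run the query extracted from the session log through your usual driver:

```python
# Sketch: time first-row versus all-rows fetch for a source query.
# sqlite3 is a stand-in database; the table and query are illustrative.
import sqlite3
import time

def time_query(conn, sql):
    """Return (seconds to first row, seconds to fetch all rows)."""
    cur = conn.cursor()
    start = time.perf_counter()
    cur.execute(sql)
    cur.fetchone()                      # time to first row
    first = time.perf_counter() - start
    cur.fetchall()                      # drain the remaining rows
    total = time.perf_counter() - start
    return first, total

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER)")
conn.executemany("INSERT INTO src VALUES (?)", [(i,) for i in range(10000)])
first, total = time_query(conn, "SELECT * FROM src")
# Per the note above: a significant gap between `first` and `total` is the
# signal that an optimizer hint on the source query may help.
```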

Consider performing the following tasks to increase performance:
* Optimize the query.
* Use conditional filters.
* Increase database network packet size.
* Connect to Oracle databases using IPC protocol.

Identifying Mapping Bottlenecks
-------------------------------
If you determine that you do not have a source bottleneck, add an Always False filter transformation in the mapping before each target definition so that no data is loaded into the target tables. If the time it takes to run the new session is the same as the original session, you have a mapping bottleneck.

You can also identify mapping bottlenecks by examining performance counters.

Readfromdisk and Writetodisk Counters: If a session contains Aggregator, Rank, or Joiner transformations, examine each Transformation_readfromdisk and Transformation_writetodisk counter. If these counters display any number other than zero, you can improve session performance by increasing the index and data cache sizes. Note that if the session uses Incremental Aggregation, the counters must be examined during the run, because the Informatica Server writes to disk when saving historical data at the end of the run.

Rowsinlookupcache Counter: A high value indicates a larger lookup cache, which is more likely to be a bottleneck.

Errorrows Counters: If a session has large numbers in any of the Transformation_errorrows counters, you might improve performance by eliminating the errors.

BufferInput_efficiency and BufferOutput_efficiency counters: Any dramatic difference in a given set of BufferInput_efficiency and BufferOutput_efficiency counters indicates inefficiencies that may benefit from tuning.
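The counter heuristics above can be sketched as a simple check over a dict of counter values. The counter-name suffixes follow the Transformation_<counter> pattern from the text; the transformation names in the example are hypothetical:

```python
# Sketch: flag likely bottlenecks from performance counters, following the
# rules above (nonzero readfromdisk/writetodisk, nonzero errorrows).

def diagnose(counters):
    """Map counter name -> value to a list of tuning hints."""
    hints = []
    for name, value in counters.items():
        if name.endswith(("_readfromdisk", "_writetodisk")) and value != 0:
            hints.append(f"{name}: increase index/data cache sizes")
        if name.endswith("_errorrows") and value > 0:
            hints.append(f"{name}: eliminate transformation errors")
    return hints

counters = {"AGG_sales_readfromdisk": 12, "EXP_clean_errorrows": 0}
print(diagnose(counters))  # ['AGG_sales_readfromdisk: increase index/data cache sizes']
```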

To enable collection of performance data:
1. Enable the Collect Performance Data session property (on the Performance tab).
2. Increase the size of the Load Manager Shared Memory by 200 KB for each session in shared memory that you configure to create performance details. If you create performance details for all sessions, multiply the MaxSessions parameter by 200 KB to calculate the additional shared memory requirement.
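The sizing rule in step 2 is simple arithmetic; a sketch:

```python
# Sketch of the shared-memory sizing rule above: each session configured to
# collect performance details needs an extra 200 KB of Load Manager shared
# memory. For all sessions, the session count is the MaxSessions parameter.

def extra_shared_memory_kb(sessions_with_perf_details):
    return sessions_with_perf_details * 200

# e.g. MaxSessions = 10, all sessions collecting performance details:
print(extra_shared_memory_kb(10))  # 2000 (KB)
```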

To view performance details in the Workflow Monitor:
1. While the session is running, right-click the session in the Workflow Monitor and choose Properties.
2. Click the Performance tab in the Properties dialog box.

To view the performance details file:
1. Locate the performance details file. The Informatica Server names the file session_name.perf, and stores it in the same directory as the session log.
2. Open the file in any text editor.
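Since the performance details file is plain text, counter values can be pulled out programmatically. The `name = value` line format assumed below is illustrative; check an actual session_name.perf file from your server, as the real layout may differ:

```python
# Sketch: extract counter name/value pairs from a .perf file's text.
# ASSUMPTION: counters appear on "name = value" lines; verify against a
# real performance details file before relying on this.

def parse_perf(text):
    counters = {}
    for line in text.splitlines():
        if "=" in line:
            name, _, value = line.partition("=")
            value = value.strip()
            if value.lstrip("-").isdigit():
                counters[name.strip()] = int(value)
    return counters

sample = "AGG_totals_readfromdisk = 0\nAGG_totals_writetodisk = 3\n"
print(parse_perf(sample))  # {'AGG_totals_readfromdisk': 0, 'AGG_totals_writetodisk': 3}
```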

General Optimizations
---------------------
Single-pass reading - Instead of reading the same data several times, combine mappings that use the same set of source data and use a single source qualifier.

Avoid unnecessary data conversions: For example, if your mapping moves data from an Integer column to a Decimal column, then back to an Integer column, the unnecessary data type conversion slows performance.

Factor out common expressions/transformations and perform them before data pipelines split.

Optimize Char-Char and Char-Varchar Comparisons by using the Treat CHAR as CHAR On Read option in the Informatica Server setup so that the Informatica Server does not trim trailing spaces from the end of Char source fields.

Eliminate Transformation Errors (conversion errors, conflicting mapping logic, and any condition set up as an error, such as null input). In large numbers they restrict performance because for each one, the Informatica Server pauses to determine its cause, removes the row from the data flow, and writes it to the session log or bad file.

As a short term fix, reduce the tracing level on sessions that must generate large numbers of errors.

Optimize lookups
----------------
Cache lookups if:
* the number of rows in the lookup table is significantly less than the typical number of source rows
* un-cached lookups perform poorly (e.g. they are based on a complex view or an unindexed table)

Optimize cached lookups:
* Use a persistent cache if the lookup data is static.
* Share caches if several lookups are based on the same data set.
* Reduce the number of cached rows using a SQL override with a restriction.
* Index the columns in the lookup ORDER BY.
* Reduce the number of co
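The cache-or-not decision above can be sketched as a heuristic. The 10x size ratio used here is an illustrative assumption, not an Informatica default:

```python
# Sketch: cache a lookup when the lookup table is much smaller than the
# expected source row count. The ratio is an assumed tuning knob.

def should_cache_lookup(lookup_rows, source_rows, ratio=10):
    """True when the lookup table is at least `ratio` times smaller
    than the source row count."""
    return lookup_rows * ratio <= source_rows

print(should_cache_lookup(1_000, 5_000_000))      # True
print(should_cache_lookup(2_000_000, 5_000_000))  # False
```

Un-cached lookups that still perform poorly (complex views, unindexed tables) remain caching candidates regardless of the ratio.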

