
Posts

Showing posts from October, 2011

Need to get the latest ID

Scenario: We have data coming from an Oracle source as below:

OLD_ID     NEW_ID
---------- ----------
101        102
102        103
103        104
105        106
106        108

The output needs to be as below:

OLD_ID     NEW_ID
---------- ----------
101        104
102        104
103        104
105        108
106        108

Can anyone help me to do this in Informatica?

Solution: Mapping definition:

                --> EXP2 -->
SQ --> EXP1 -->              JNR --> TGT
                --> AGG  -->

Explanation: In EXP1, add two variable ports as shown below:

OLD_ID     NEW_ID     Diff_of_rows     New_id
---------- ---------- ---------------- ----------
101        102        1 (1)            1
102        103        1 (102-101)      1
103        104        1 (103-102)
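If the chain resolution can instead be pushed down to the Oracle source, a hierarchical query is one way to sketch it. This is an alternative to the mapping above, not part of the original solution, and it assumes a hypothetical staging table SRC with columns OLD_ID and NEW_ID:

-- Walk each OLD_ID forward through the chain and keep only the terminal NEW_ID.
SELECT CONNECT_BY_ROOT old_id AS old_id,
       new_id
FROM   src
WHERE  CONNECT_BY_ISLEAF = 1
CONNECT BY PRIOR new_id = old_id;

Every source row acts as a root; following PRIOR new_id = old_id walks the chain, and CONNECT_BY_ISLEAF = 1 keeps the last link, giving 101/102/103 -> 104 and 105/106 -> 108 for the sample data.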

Aborting a Session in Informatica 8.6.1

Scenario: I am trying to abort a session in the Workflow Monitor by using the 'Abort' option, but the status of the session is still shown as 'Aborting' and has remained the same for the past 4 days. Finally I had to request the UNIX team to kill the process. Could anybody let me know the reason behind this, as I couldn't find any info in the log file either?

Solution:
- If the session you want to stop is part of a batch, you must stop the batch.
- If the batch is part of a nested batch, stop the outermost batch.
- When you issue the STOP command, the server stops reading data. It continues processing, writing and committing data to the targets.
- If the server cannot finish processing and committing data, you can issue the ABORT command. It is similar to the STOP command, except it has a 60-second timeout. If the server cannot finish processing and committing data within 60 seconds, you need to kill the DTM process to terminate the session. As you sa

Condition to Check for NULLS and SPACES in Informatica

Scenario: I have string data and I want to filter out NULLs and spaces from that set of data. What condition can be given in Informatica to check for NULLs and spaces in one Expression or Filter transformation?

Solution: Use LENGTH(LTRIM(RTRIM(column_name))) <> 0 in the Filter transformation. OR use IIF(ISNULL(column_name) OR LTRIM(RTRIM(column_name)) = '', 0, 1) -- compute this flag in an Expression transformation and use it in the Filter.

Combining multiple rows as one based on matching column value of multiple rows

Scenario: I have my source data like below:

ID   Line_no   Text
529  3         DI-9001
529  4         DI-9003
840  2         PR-031
840  2         DI-9001
616  1         PR-029
874  2         DI-9003
874  1         PR-031
959  1         PR-019

Now I want my target to be:

ID   Line_no   Text
529  3         DI-9001
529  4         DI-9003
840  2         PR-031&DI-9001
616  1         PR-029
874  2         DI-9003
874  1         PR-031
959  1         PR-019

That is, if both the ID and the LINE_NO are the same, the TEXT values should be concatenated; otherwise no change.

Solution: The mapping flow is like this:

source --> SQ --> SRTTRANS --> EXPTRANS --> AGGTRANS --> target

SRTTRANS: sort by ID, Line_no in ascending order.
EXPTRANS: use variable ports as below:
ID (i/o)
Line_no (i/o)
Text (i)
Text_v (v): IIF(ID = pre_id AND Line_no = pre_line_no, Text_v || '&' || Text, Text)
pre_id (v): ID
pre_line_no (v): Line_no
Text_op (o): Text_v

AGGTRANS: group by ID and Line_no. It will return the last row per group, which is the concatenated text. Then pass it to the target.
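If the concatenation is acceptable on the database side, Oracle 11gR2's LISTAGG gives a similar result in one query. A sketch, assuming the data is staged in a hypothetical table SRC_TEXT with columns ID, LINE_NO and TEXT:

-- Concatenate TEXT values that share the same ID and LINE_NO, separated by '&'.
SELECT id,
       line_no,
       LISTAGG(text, '&') WITHIN GROUP (ORDER BY text) AS text
FROM   src_text
GROUP  BY id, line_no;

Note that the order inside each group follows the ORDER BY in the WITHIN GROUP clause; if the original file order must be preserved (PR-031 before DI-9001), a row-sequence column would be needed to order by instead.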

How to Read the Data between 2 Single Quotes

Scenario: I have a record like this: field1 = "id='102',name='yty,wskjd',city='eytw' ". Note: sometimes the data might come as [id, name, city] and sometimes as [code, name, id]; it varies. I need to store the value of field1 in different fields:
value1 = id='102'
value2 = name='yty,wskjd'
value3 = city='eytw'
If I split the record based on commas, the result won't come out as expected because there is a comma in the value of name. Is there an easier way to achieve this, i.e. if a comma comes between two single quotes, we have to suppress that comma? I tried different built-in functions but couldn't make it work. Is there a way to read the data between two single quotes?

Solution: Please try the below solution; it may help you to some extent.
Field1 = "id='102',name='yty,wskjd',city='eytw' "
Steps -
1. v_1 = Replace(field1
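If the string can be parsed in Oracle instead, REGEXP_SUBSTR can pull out each key='value' pair while ignoring commas inside the quotes. This is only a sketch and assumes FIELD1 is staged in a hypothetical table STG; PowerCenter 8.x users could try the same pattern with REG_EXTRACT in an Expression transformation.

-- Each match is key='...', where the quoted part may itself contain commas.
SELECT REGEXP_SUBSTR(field1, '[^,]+=''[^'']*''', 1, 1) AS value1,
       REGEXP_SUBSTR(field1, '[^,]+=''[^'']*''', 1, 2) AS value2,
       REGEXP_SUBSTR(field1, '[^,]+=''[^'']*''', 1, 3) AS value3
FROM   stg;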

How to use Lookup Transformation for Incremental data loading in target table?

Scenario: How to load new data from one table to another. For example: I have done a mapping from a source table (which contains bank details) to a target table. The first time, I will load all the data from source to target. If I run the mapping the next day, I need to get only the data which was newly entered in the source table. That is, the first run has to load all the data from source to target; on the second or third run, if there is any new record in the source table, only that record must be loaded to the target, by comparing the source and the target. How to use the Lookup transformation for this?

Solution: 1) In the mapping, create a lookup on the target table and select dynamic lookup cache on the Properties tab. Once you check it, you can see the NewLookupRow column in the lookup ports, through which you can identify whether incoming rows are new or existing. So after the lookup you can use a Router to insert or update
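As a cross-check of what the dynamic lookup flags, the "new rows only" set can also be expressed in SQL. The table and key names here (SRC_BANK, TGT_BANK, BANK_ID) are assumptions for illustration:

-- Rows present in the source but not yet in the target (what NewLookupRow = 1 identifies).
SELECT s.*
FROM   src_bank s
LEFT JOIN tgt_bank t
       ON s.bank_id = t.bank_id
WHERE  t.bank_id IS NULL;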

Is it possible to use a parameter to specify the 'Table Name Prefix'?

Scenario: Is it possible to use a parameter to specify the 'Table Name Prefix'? Solution: Yes, you can use a parameter to specify the table name prefix. Suppose you have a table x with different table name prefixes, like p1.x and p2.x; you can load into these tables separately by specifying the table name prefix value in the parameter file. All you need to do is create a workflow variable, assign a value to that variable in the parameter file, and use the variable in the Table Name Prefix property.

Informatica PowerCenter performance - Lookups

Lookup performance

Lookup is an important and useful transformation when used effectively. What is a Lookup transformation? It is not just another transformation that fetches data to look up against the source data; used improperly, it can make your flow run for ages. I will now explain different scenarios where you can face problems with lookups and how to tackle them.

Unwanted columns: By default, when you create a lookup on a table, PowerCenter gives you all the columns of the table. Be sure to delete the unwanted columns from the lookup, as they inflate the lookup cache. You only need the columns that are used in the lookup condition and the ones that have to be returned from the lookup.

SQL query: We will start from the database. Find the execution plan of the SQL override and see if you can add some indexes or hints to the query to make it fetch data faster. You may have to t
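For the "find the execution plan" step, the usual Oracle route is EXPLAIN PLAN plus DBMS_XPLAN. The lookup query below is only a placeholder standing in for your actual lookup override:

-- Check how the lookup override will be executed before tuning it with indexes or hints.
EXPLAIN PLAN FOR
SELECT cust_id, cust_name
FROM   customers
WHERE  active_flag = 'Y';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);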

PowerCenter objects – Introduction

• A repository is the highest physical entity of a project in PowerCenter.
• A folder is a logical entity in a PowerCenter project. For example, Customer_Data is a folder.
• A workflow is synonymous with a set of programs in any other programming language.
• A mapping is a single program unit that holds the logical mapping between source and target with the required transformations. A mapping will just say that a source table by the name EMP exists with some structure, and a target flat file by the name EMP_FF exists with some structure. The mapping doesn't say in which schema this EMP table exists or in which physical location the EMP_FF file is going to be stored.
• A session is the physical representation of the mapping. The session defines what the mapping didn't: it stores the information about where this EMP table comes from, in which schema, and with what username and password we can access the table in that schema. It also tells about the target flat

Dynamically generate parameter files

Scenario: Dynamically generate parameter files.

Solution: Parameter file format for PowerCenter:

For a workflow parameter, which can be used by any session in the workflow, the parameter file has to be created in the format below:

[Folder_name.WF:Workflow_Name]
$$parameter_name1=value
$$parameter_name2=value

For a session parameter, which can be used by that particular session, the parameter file has to be created in the format below:

[Folder_name.WF:Workflow_Name.ST:Session_Name]
$$parameter_name1=value
$$parameter_name2=value

3. Parameter handling in a data model:
• To have flexibility in maintaining the parameter files.
• To reduce the overhead for the support team of changing the parameter file every time the value of a parameter changes.
• To ease deployment, all the parameters have to be maintained in Oracle or any database tables, and a PowerCenter session is created to generate the parameter file in the required format auto
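One way to implement the "generate the parameter file from database tables" idea is a query whose output is written to a flat-file target by the generating session. The PARAM_MASTER table and its columns are hypothetical, and LISTAGG needs Oracle 11gR2 or later:

-- One output value per workflow: the section header followed by its parameter lines.
SELECT '[' || folder_name || '.WF:' || workflow_name || ']' || CHR(10) ||
       LISTAGG('$$' || param_name || '=' || param_value, CHR(10))
         WITHIN GROUP (ORDER BY param_name) AS param_file_text
FROM   param_master
GROUP  BY folder_name, workflow_name;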

How to generate target file names (like YYYYMMDDHH24:MISS.csv) dynamically from the mapping?

Scenario: How to generate target file names (like YYYYMMDDHH24:MISS.csv) dynamically from the mapping?

Solution: In order to generate the target file names from the mapping, we should make use of the special "FileName" port in the target file. You can't create this special port with the usual New port button; there is a special button labelled "F" at the right-most corner of the target flat file when viewed in the Target Designer. Two screenshots below show how to create the special port in your target file. Once this is done, the job is done. When you want to create the file name with a timestamp attached to it, just pass in a port from an Expression transformation before the target, with an output port whose expression is $$FILE_NAME || TO_CHAR(SESSSTARTTIME, 'YYYYMMDDHH24:MISS') || '.csv'. Please note that $$FILE_NAME is a parameter to the mapping and I've us

To find the inserted, deleted and updated rows count

Scenario: I want to find the number of rows inserted, updated and deleted on successful execution of a session. These details are present only in the session log file, so how do I grep these details from the log file? Or is there any other method? I actually have to insert these details into a table. The other details which I have to include in the table are session name, target table name, session start time and session end time. Thus my table structure is:

Session_name  Tgt_Table_name  Start_time  End_time  Inserted_count  Deleted_count  Updated_count

Solution: Hey, you will get this info through the INFA metadata tables

How to load first, last and remaining records into different targets when the source is a flat file?

Scenario: How to load first, last and remaining records into different targets when the source is a flat file?

Solution: If you are using a Sequence Generator and an Aggregator, then the mapping flow should be like below:

               --> AGG -->
SRC --> SQ --> EXP         JNR --> RTR --> TGT1
SEQ ----------->                       --> TGT2
                                       --> TGT3

(the Sequence Generator feeds the Expression; the Expression feeds both the Aggregator and the Joiner)

In the Router: if the sequence value = 1, that record goes to target 1; if the sequence value equals the Aggregator's count output, that is the last record, so it goes to target 3; all remaining records pass to target 2.

For SQL queries to get the first, last and remaining records, try the below:

For the first record:
select * from emp where rownum = 1;

For the last record:
select * from (select * from (select empno, ename, sal, job, mgr, rownum rn from emp) order by rn desc) where rownum = 1;

For the remaining records you can use the MINUS functio
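One way the MINUS approach could look, keeping the same loose notion of "first" and "last" as fetch order (which, as with the queries above, is not guaranteed without an ORDER BY on a real key):

-- Everything except the first and the last fetched rows.
SELECT * FROM emp
MINUS
(
  SELECT * FROM emp WHERE rownum = 1
  UNION
  SELECT * FROM (SELECT * FROM emp ORDER BY rownum DESC) WHERE rownum = 1
);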

Email date in subject line

Scenario: Is there a way to add the sysdate to the email subject sent from Informatica? I am running a mapping which creates an error file. I am sending this error file at the end of the process via an email, but the requirement is to send it with some text such as "error report" and the system date in the subject line.

Solution: Below is the approach. Create a workflow variable $$Datestamp with the date/time datatype. In an Assignment task, assign the sysdate to that variable, then use the $$Datestamp variable in the email subject and it will send the timestamp in the subject.

Route records as UNIQUE AND DUPLICATE

Scenario: I have a source table with a single column containing: A, B, C, C, B, D, B. I have two target tables, UNIQUE and DUPLICATE. The first target should contain the following output:
A
D
The second target should contain the following output:
B
B
B
C
C
How do I do this?

Solution: Try the following approach.

               --> AGG -->
SRC --> SQ -->             JNR --> RTR --> TGT1
               ---------->             --> TGT2

From the source, pass all the data to the Aggregator and group by the source column, with one output port count(column). So from the Aggregator you have two output ports:

COLUMN, COUNT
A, 1
B, 3
C, 2
D, 1

Now join this data back with the source based on the column. The output of the Joiner will be like below:

COLUMN, COUNT
A, 1
B, 3
C, 2
C, 2
B, 3
D, 1
B, 3

In the Router create two groups, one for unique and another one for duplicate:
Unique = (count = 1)
Duplicate = (count > 1)
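The same split can be sanity-checked in SQL; SRC and COL are assumed names for the source table and its column:

-- Values that occur exactly once (target 1).
SELECT col FROM src GROUP BY col HAVING COUNT(*) = 1;

-- Every occurrence of values that repeat (target 2).
SELECT col
FROM   src
WHERE  col IN (SELECT col FROM src GROUP BY col HAVING COUNT(*) > 1);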

Informatica Source Qualifier (Inner Joins)

Scenario: I have 3 tables:

ENO  ENAM  HIREDATE
001  XXX   MAY/25/2009
002  JJJJ  OCT/12/2010
008  KKK   JAN/02/2011
006  HJJH  AUG/12/2012

ENO  S-ID
001  OO
002  OO
007  OO

ENO  V-ID
006  DD
008  DD
001  DD

Using the Informatica Source Qualifier or other transformations, I should be able to club the above tables in such a way that if HIREDATE > JAN/01/2011 then ENO should pick V-ID, and if HIREDATE < JAN/01/2011 then ENO should pick S-ID, making a target table that leaves the other ID column blank based on the condition. It should have either S-ID or V-ID, but not both.

ENO  ENAM  HIREDATE  S-ID  V-ID

Please give me the best advice for this situation.

Solution: Better you do it in the Source Qualifier SQL query with a CASE statement:

select a.eno, a.enam, a.hiredate,
       case when a.hiredate < to_date('01-JAN-2011', 'DD-MON-YYYY') then b.s_id else c.v_id end as id
from table1 a, table2 b, table3 c
where a.eno = b.eno(+) and a.eno = c.eno(+)

CMN_1650 A duplicate row was attempted to be inserted into a dynamic lookup cache Dynamic lookup error.

Scenario: I have 2 ports going through a dynamic lookup, and then to a router. In the router it is a simple case of inserting new target rows (NewRowLookup=1) or rejecting existing rows (NewRowLookup=0). However, when I run the session I'm getting the error: "CMN_1650 A duplicate row was attempted to be inserted into a dynamic lookup cache Dynamic lookup error. The dynamic lookup cache only supports unique condition keys." I thought that I was bringing through duplicate values so I put a distinct on the SQ. There is also a not null filter on both ports. However, whilst investigating the initial error that is logged for a specific pair of values from the source, there is only 1 set of them (no duplicates). The pair exists on the target so surely should just return from the dynamic lookup newrowlookup=0. Is this some kind of persistent data in the cache that is causing this to think that it is duplicate data? I haven't got the persistent cache or

Update Strategy for Deleting Records in Informatica

Scenario: I am using an Update Strategy transformation for deleting records from my target table. In my Warehouse Designer, I have defined one column (say Col1) as Primary Key and another column (say Col2) as Primary/Foreign Key. My target has rows like this:

Col1  Col2  Col3
----  ----  ------
1     A     value1
2     A     value2
3     B     value3

I want to delete the record from the target which has the combination (Col1 = "2" and Col2 = "A"). Will linking the fields Col1 and Col2 from the Update Strategy transformation to the target serve the purpose?

Solution: Define both the columns as primary key in the target definition and link only Col1 and Col2 in the mapping. This will serve your purpose. BTW, if you only do deletes, then an Update Strategy is not required at all.
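For reference, with both columns defined as keys, the delete issued for a DD_DELETE row amounts to a delete on that key combination. A sketch using the example values above, with target_table as a hypothetical name:

-- What the delete amounts to for the row (Col1 = 2, Col2 = 'A'); literals are just the example above.
DELETE FROM target_table
WHERE  col1 = 2
AND    col2 = 'A';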

Target Rows as Update or Insert Option in the Target

Scenario: When you have the option to treat target rows as Update, or the Insert option in the target, why do you need a Lookup transformation? I mean, why do you need a Lookup transformation with an Update Strategy in a mapping to mark records for update or insert when you have the "update else insert" option in the target? Is there any difference between the two? Can someone please let me know what the difference is and when to use which option?

Solution: In slowly growing targets (delta loads), the target is loaded incrementally. You need to know whether a particular record already exists in the target. A Lookup is used to cache the target records and compare the incoming records with the records in the target. If an incoming record is new, it will be inserted into the target; otherwise not. An Expression is used to flag a record as new or existing; if it is new, the record is flagged as 'I' in the sense of Insert. In Slowly Changing Dimensions (SCD), the history of a dimension is m

How to Fill Missing Sequence Numbers in Surrogate Key Column in Informatica

Scenario: Hello all, I am new to working with surrogate key columns in a database. Recently I developed a workflow/mapping that populates an SCD table with a surrogate key column. For each record that is inserted, I created logic in an Expression transformation that generates a new sequence number. This seems fine and works OK. Now, we have a purge logic that runs every day in post-SQL and deletes records that have not been updated for the last 10 days. For this reason, after testing the ETL process for over 15 days, I find a lot of gaps in the surrogate key column. Is there a way/logic in Informatica with which I can fill these gaps while loading the target, and create a new sequence number only if there are no gaps? Or can this be done at database level? I searched over the Internet but did not find any solution whatsoever. Please advise.

Solution: Hello, if you can make a few changes to your mapping you can achieve it. 1. First delete the record which is not
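Whether the gaps are then filled in Informatica or in the database, finding them is straightforward with an analytic function. DIM_TABLE and SK are assumed names for the dimension table and its surrogate key column:

-- Report each gap in the surrogate key column as a start/end range.
SELECT sk + 1       AS gap_starts_at,
       next_sk - 1  AS gap_ends_at
FROM   (SELECT sk,
               LEAD(sk) OVER (ORDER BY sk) AS next_sk
        FROM   dim_table)
WHERE  next_sk - sk > 1;

That said, since a surrogate key is supposed to carry no business meaning, gaps are usually harmless and most teams simply leave them.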

Surrogate Key

Scenario: What is a surrogate key and where do you use it?

Solution: A surrogate key is a substitution for the natural primary key. It is just a unique identifier or number for each row that can be used as the primary key of the table. The only requirement for a surrogate primary key is that it is unique for each row in the table. Data warehouses typically use a surrogate key (also known as an artificial or identity key) for the dimension tables' primary keys. You can use an Informatica Sequence Generator, an Oracle sequence, or SQL Server identity values for the surrogate key. It is useful because the natural primary key (e.g. Customer Number in the Customer table) can change, and this makes updates more difficult. Some tables have columns such as AIRPORT_NAME or CITY_NAME which are stated as the primary keys (according to the business users), but not only can these change, indexing on a numerical value is probably better and you could consider creating a surrogate key called
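As a small illustration of the Oracle sequence option mentioned above (the table and column names are hypothetical):

-- A sequence that supplies surrogate key values for a dimension table.
CREATE SEQUENCE customer_dim_sk_seq START WITH 1 INCREMENT BY 1;

-- Used while inserting a new dimension row.
INSERT INTO customer_dim (customer_sk, customer_number, customer_name)
VALUES (customer_dim_sk_seq.NEXTVAL, 'C-1001', 'Some Customer');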

Unique Constraint Violated

Scenario: Database errors occurred:

ORA-00001: unique constraint (INF_PRACTICE1.SYS_C00163872) violated
Database driver error.
Function Name : Execute
SQL Stmt : INSERT INTO D_CLAIM_INJURY_SAMPLEE (CK_SUM, DM_ROW_PRCS_DT, DM_ROW_PRCS_UPDT_DT, CLAIM_INJRY_SID, DM_CRRNT_ROW_IND, INCDT_ID, ENAME, JOB, FIRSTNAME, LASTNAME) VALUES ( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Database driver error...
Function Name : Execute Multiple
SQL Stmt : INSERT INTO D_CLAIM_INJURY_SAMPLEE (CK_SUM, DM_ROW_PRCS_DT, DM_ROW_PRCS_UPDT_DT, CLAIM_INJRY_SID, DM_CRRNT_ROW_IND, INCDT_ID, ENAME, JOB, FIRSTNAME, LASTNAME) VALUES ( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

Solution: Check the definition of the unique index columns and then run the below query on the source to find out the duplicate rows. If the index definition is like:

create index <index_name> on targettable (col1, col2, col3);

then run:

select col1, col2, col3, count(1)
from sourcetable
group by col1, col2, col3
having count(1) > 1;

Either you have to delete those records from the source or use an Aggregator in Informatica
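If the decision is to delete the duplicates from the source (the first option above), the classic ROWID approach keeps one row per key combination; sourcetable and the column list simply mirror the duplicate-finding query:

-- Keep the row with the lowest ROWID for each key combination and delete the rest.
DELETE FROM sourcetable s
WHERE  s.rowid NOT IN (SELECT MIN(t.rowid)
                       FROM   sourcetable t
                       GROUP  BY t.col1, t.col2, t.col3);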

Validating Multiple Sessions in Informatica

Scenario: Is there any way of validating multiple workflows and their respective sessions at the same time in Informatica? Validating them separately is tedious.

Solution: The best approach is to create worklets instead of workflows: put a set of sessions into a worklet and then call all those worklets in a single workflow. After doing this, you can validate the workflow which contains the multiple worklets to validate multiple sessions at once.

Informatica Workflow Execution based on Conditions

Scenario: I have a table which contains a single row with a column ABC. The value of ABC defines different scenarios. For example, if the value of ABC is 1, the 1st workflow should be executed; if 2, the 2nd workflow should be executed, and so on.

Solution: If there are only a few values (1, 2, 3) for ABC, then we can have a Filter in a mapping that reads the source table with column ABC. Filter the records with the conditions ABC=1, ABC=2 and ABC=3 and load target tables in three different mappings. Create three different sessions and then use a Decision task at the workflow level:
If session1 loaded a row (TgtSuccessRows = 1), then run worklet1
If session2 loaded a row (TgtSuccessRows = 1), then run worklet2
If session3 loaded a row (TgtSuccessRows = 1), then run worklet3

Finding Objects in Checkout Status

Scenario: Does anyone know of a way to find which objects are in checked-out status and who has them checked out?

Solution: Under the repository, there are the folders that you have created. Open the folder, then right-click and go to Versioning -> Find Checkouts -> All Users. This will show the history of changes made and saved on that particular code, with details such as last check-out, last-saved time, saved-by, etc.

Convert date as per specific region

Scenario: Convert date as per specific region.

Solution: Specifying an NLS parameter for an SQL function means that any user session NLS parameters (or the lack of them) will not affect evaluation of the function. This feature may be important for SQL statements that contain numbers and dates as string literals. For example, the following query is evaluated correctly only if the language specified for dates is American:

SELECT ENAME FROM EMP WHERE HIREDATE > '1-JAN-01'

This can be made independent of the current date language by specifying NLS_DATE_LANGUAGE:

SELECT ENAME FROM EMP WHERE HIREDATE > TO_DATE('1-JAN-01', 'DD-MON-YY', 'NLS_DATE_LANGUAGE = AMERICAN')

Using all numerics is also language-independent:

SELECT ENAME FROM EMP WHERE HIREDATE > TO_DATE('1-01-01', 'DD-MM-YY')

NLS settings include character set, language and territory. Common character sets: WE8ISO8859P15 European English includes euro

Compare the Total Number of Rows in a Flat File with the Footer of the Flat File

Scenario: I have a requirement where I need to find the number of rows in a flat file and then compare that row count with the row count mentioned in the footer of the flat file.

Solution:

Using Informatica: I believe you can identify the data records from the trailer record. You can use the following method to get the count of the records:
1. Use a Router to create two data streams: one for the data records and the other for the trailer record.
2. Use an Aggregator (without defining any group key) and use the COUNT() aggregate function; now both data streams will have a single record.
3. Use a Joiner to get one record from these two data streams; it will give you the two different count ports in a single record.
4. Use an Expression to compare the counts and proceed as per your rules.

Using UNIX: If you are on Unix, then go for a couple of lines of script or commands. Count the number of lines in the file with wc -l. Assign the count to a variable, x = (wc -l) - 1, i.e. neglecting the footer record

Loading Multiple Flat Files using one mapping

Scenario: Can anyone explain how we can load multiple flat files using one mapping?

Solution: Use the Indirect option in the session properties and give the file list name. The file list contains the actual file names with their complete paths.

Ex: In session properties:
Source File Type --- Indirect
File Name --- ABC.txt

ABC.txt will contain all the input file names with complete paths, like:
/home/.../...filename.dat
/home/.../...filename1.dat

Capture filename while using indirect file

Scenario: I have 5 source files which I am planning to load using indirect file as they all are of same format and go to same target table. One requirement is to capture the source file name in the target. Is there any simple way to achieve this? The filename column is there only for file targets, not for file sources.

Solution:
Sol 1. Effective with PowerCenter 8.5 there is an option called Add Currently Processed Flat File Name Port. If this flat file source option is selected, the file name port will be added to the ports of the source. To add the CurrentlyProcessedFileName port:
1. Open the flat file source definition in the Source Analyzer.
2. Click the Properties tab.
3. Select Add Currently Processed Flat File Name Port. The Designer adds the CurrentlyProcessedFileName port as the last column on the Columns tab. The CurrentlyProcessedFileName port is a string port with default precision of 256 characters.
4. Click the Columns tab to see your change