Informatica Interview Questionnaire
1.
What are the components of Informatica?
And what is the purpose of each?
Ans:
Informatica Designer, Server Manager & Repository Manager. The Designer
is used for creating source & target definitions, and for creating mapplets and mappings.
The Server Manager is used for creating sessions & batches, scheduling the
sessions & batches, monitoring the triggered sessions and batches, issuing
pre- and post-session commands, creating database connections to various
instances, etc. The Repository Manager is used for creating and adding repositories, creating
& editing folders within a repository, establishing users, groups,
privileges & folder permissions, copying, deleting and backing up a repository, viewing
the history of sessions, and viewing and removing the locks on various objects.
2.
What is a repository? And how to add it
in an informatica client?
Ans: It is the
location where all mapping- and session-related information is stored;
essentially, it is a database where the metadata resides. We can add a repository
through the Repository Manager.
3.
Name at least 5 different types of
transformations used in mapping design and state the use of each.
Ans: Source Qualifier – represents the rows that the server reads from the source;
Expression – performs simple row-level calculations;
Filter – serves as a conditional filter;
Lookup – looks up values and passes them to other objects;
Aggregator – performs aggregate calculations.
4.
How can a transformation be made
reusable?
Ans: In the
Edit properties of any transformation there is a check box to make it reusable;
checking it makes the transformation reusable. You can also create reusable
transformations directly in the Transformation Developer.
5.
How are the sources and targets
definitions imported in informatica designer? How to create Target definition
for flat files?
Ans: When
you are in the Source Analyzer there is an option in the main menu to import the source
from a database, flat file, COBOL file or XML file; by selecting any one of
them you can import a source definition. When you are in the Warehouse Designer
there is an option in the main menu to import the target from a database, from an XML
file, or from XML sources; you can select any one of these.
There is no
way to import a target definition as a file in the Informatica Designer. So while
creating the target definition for a flat file in the Warehouse Designer, it is
created as if it were a table, and then in the session properties of that
mapping it is specified as a file.
6.
Explain what a SQL override is for a source
table in a mapping.
Ans: The Source Qualifier provides
the SQL Query option to override the default query. You can enter any SQL
statement supported by your source database. You might enter your own SELECT
statement, or have the database perform aggregate calculations, or call a
stored procedure or stored function to read the data and perform some tasks.
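A minimal sketch of such an override (table, column and filter names are illustrative; the SELECT column order must match the Source Qualifier's connected output ports):
SELECT EMP.EMPNO, EMP.ENAME, DEPT.DNAME
FROM EMP, DEPT
WHERE EMP.DEPTNO = DEPT.DEPTNO
AND EMP.HIREDATE >= TO_DATE('01-JAN-2000', 'DD-MON-YYYY')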
7.
What is lookup override?
Ans: This
feature is similar to entering a custom query in a Source Qualifier
transformation. When entering a Lookup SQL Override, you can enter the entire
override, or generate and edit the default SQL statement.
The lookup query override can
include a WHERE clause.
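A minimal sketch of a lookup override (names are illustrative; the column aliases should match the lookup port names):
SELECT DEPT.DEPTNO AS DEPTNO, DEPT.DNAME AS DNAME
FROM DEPT
WHERE DEPT.ACTIVE_FLAG = 'Y'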
8.
What are mapplets? How are they different
from a reusable transformation?
Ans: A mapplet is a reusable
object that represents a set of transformations. It allows you to reuse
transformation logic and can contain as many transformations as you need. You
create mapplets in the Mapplet Designer.
It is different from a reusable
transformation in that it may contain a whole set of transformations, while a reusable
transformation is a single one.
9.
How to use an oracle sequence generator
in a mapping?
Ans: We
have to write a stored procedure (or function) that takes the sequence name as input and
dynamically fetches the NEXTVAL from that sequence. Then in the mapping we can
call that stored procedure through a Stored Procedure transformation.
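A minimal PL/SQL sketch of such a function, assuming Oracle 8i or later (EXECUTE IMMEDIATE); the function name is illustrative:
CREATE OR REPLACE FUNCTION get_seq_nextval (p_seq_name IN VARCHAR2)
RETURN NUMBER
AS
  v_val NUMBER;
BEGIN
  -- build and run the SELECT dynamically, since the sequence name is a parameter
  EXECUTE IMMEDIATE 'SELECT ' || p_seq_name || '.NEXTVAL FROM dual' INTO v_val;
  RETURN v_val;
END;
/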
10.
What is a session and how to create it?
Ans: A session is a set of
instructions that tells the Informatica Server how and when to move data from
sources to targets. You create and maintain sessions in the Server Manager.
11.
How to create the source and target
database connections in server manager?
Ans: In the
main menu of the Server Manager there is a menu "Server Configuration", and under it
a menu "Database Connections". From there you can create the source and
target database connections.
12.
Where are the source flat files kept
before running the session?
Ans: The
source flat files can be kept in a folder on the Informatica Server machine, or on any
other machine that is in its domain.
13.
What are the oracle DML commands possible
through an update strategy?
Ans:
DD_INSERT, DD_UPDATE, DD_DELETE & DD_REJECT.
14.
How to update or delete the rows in a
target, which do not have key fields?
Ans: To
update a table that does not have any keys, we can use a SQL override on the
target (a target update override), specifying the WHERE condition explicitly.
A delete cannot be done this way. In that case
you have to explicitly mark key columns on the target table definition in the
Warehouse Designer and delete the rows using an Update Strategy
transformation.
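A minimal sketch of a target update override, assuming the :TU qualifier syntax of the target update override (table and port names are illustrative):
UPDATE T_EMPLOYEE
SET SALARY = :TU.SALARY, DEPTNO = :TU.DEPTNO
WHERE ENAME = :TU.ENAME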
15.
What is the option by which we can run all
the sessions in a batch simultaneously?
Ans: In the
batch edit box there is an option called Concurrent. By checking it, all the
sessions in that batch will run concurrently.
16.
Informatica settings are available in
which file?
Ans:
Informatica settings are available in the file pmdesign.ini in the Windows
folder.
17.
How can we join the records from two
heterogeneous sources in a mapping?
Ans: By using a Joiner transformation.
18.
Difference between Connected &
Unconnected look-up.
Ans: An
unconnected Lookup transformation exists separately from the pipeline in the
mapping; you write an expression using the :LKP reference qualifier to call the
lookup from within another transformation. A connected Lookup, on the other hand,
forms part of the data flow of the mapping.
19.
Difference between Lookup Transformation
& Unconnected Stored Procedure Transformation – Which one is faster ?
20.
Compare Router Vs Filter & Source
Qualifier Vs Joiner.
Ans: A Router transformation
has input ports and output ports. Input ports reside in the input group, and
output ports reside in the output groups. Here you can test data against one
or more group filter conditions and route rows accordingly,
whereas in a Filter you can only filter data on a single condition before writing it to
the targets.
A Source Qualifier can join data coming from the same source database, while a Joiner is
used to combine data from heterogeneous sources; it can even join data from two
tables in the same database.
A Source Qualifier can join more than two sources, but a Joiner can join only two
sources.
21.
How to Join 2 tables connected to a
Source Qualifier w/o having any relationship defined ?
Ans: By
writing a SQL override.
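A minimal sketch of such an override, writing the join explicitly even though no relationship is defined in the database (table and column names are illustrative):
SELECT ORDERS.ORDER_ID, ORDERS.CUST_ID, CUSTOMERS.CUST_NAME
FROM ORDERS, CUSTOMERS
WHERE ORDERS.CUST_ID = CUSTOMERS.CUST_ID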
22.
In a mapping there are 2 targets to load
header and detail, how to ensure that header loads first then detail table.
Ans:
Constraint-based loading (if there is no relationship at the Oracle level), OR a target load
plan (if there is only one source qualifier for both tables), OR select the header
target table first and then the detail table while dragging them into the mapping.
23.
A mapping takes just 10 seconds to run; it
takes a source file and inserts into the target, but before that there is a Stored
Procedure transformation which takes around 5 minutes to run and gives output
'Y' or 'N'. If 'Y' then continue the feed, or else stop the feed. (Hint: since the SP
transformation takes more time compared to the mapping, it shouldn't run row-wise.)
Ans: There
is an option to run the stored procedure once, before starting to load the rows
(a pre-load source stored procedure).
Data warehousing concepts
1.What is the difference between a view and a materialized view?
Views contain only the query; whenever you execute a view, it reads from the base tables.
With a materialized view, the loading or replication takes place only once, which gives you better query
performance.
Materialized views can be refreshed 1. on commit or 2. on demand
(with refresh methods complete, never, fast, or force).
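A minimal sketch of creating and refreshing a materialized view on demand (object names are illustrative):
CREATE MATERIALIZED VIEW mv_sales_summary
REFRESH COMPLETE ON DEMAND
AS
SELECT prod_id, SUM(amount) AS total_amount
FROM sales
GROUP BY prod_id;
EXEC DBMS_MVIEW.REFRESH('MV_SALES_SUMMARY', 'C');  -- 'C' = complete refresh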
2.What is a bitmap index and why is it used for a DWH?
A bitmap
for each key value replaces a list of rowids. Bitmap indexes are more efficient for
data warehousing because of low cardinality and low update activity, and they are very
efficient for WHERE clauses.
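A minimal sketch, assuming a low-cardinality column such as a region code (names are illustrative):
CREATE BITMAP INDEX idx_sales_region ON sales (region_code);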
3.What is a star schema? And what is a snowflake schema?
The center
of the star consists of a large fact table, and the points of the star are the
dimension tables.
Snowflake
schemas normalize the dimension tables to eliminate redundancy; that is, the
dimension data is grouped into multiple tables instead of one large table.
A star schema
contains denormalized dimension tables and a fact table; each primary key value
in a dimension table is associated with a foreign key in the fact table.
Here the fact
table contains all the business measures (normally numeric data) and the foreign key
values, and the dimension tables hold the details about the subject area.
A snowflake
schema is basically a set of normalized dimension tables, used to reduce redundancy in the
dimensions.
4.Why do we need a staging area database for a DWH?
A staging
area is needed to clean operational data before loading it into the data warehouse.
Cleaning here also means merging data that comes from different sources.
5.What are the steps to create a database manually?
Create the OS service,
create the init file, start the database in the NOMOUNT stage, then give the CREATE
DATABASE command.
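A minimal sketch of that sequence on Windows (SID, passwords and file names are illustrative; the trailing comments are annotations):
C:\> oradim -NEW -SID PROD -INTPWD secret -STARTMODE auto
SQL> STARTUP NOMOUNT PFILE='initPROD.ora';
SQL> CREATE DATABASE PROD
       DATAFILE 'system01.dbf' SIZE 200M
       LOGFILE GROUP 1 ('redo01.log') SIZE 50M,
               GROUP 2 ('redo02.log') SIZE 50M;
SQL> @catalog.sql   -- data dictionary views
SQL> @catproc.sql   -- PL/SQL packages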
6.Difference between OLTP and DWH?
An OLTP system
is basically application-oriented (e.g., a purchase order is functionality of
an application),
whereas the
DWH's concern is subject-oriented (subjects in the sense of customer, product, item,
time).
OLTP
· Application Oriented
· Used to run business
· Detailed data
· Current up to date
· Isolated Data
· Repetitive access
· Clerical User
· Performance Sensitive
· Few Records accessed at a time (tens)
· Read/Update Access
· No data redundancy
· Database Size 100MB-100 GB
DWH
· Subject Oriented
· Used to analyze business
· Summarized and refined
· Snapshot data
· Integrated Data
· Ad-hoc access
· Knowledge User
· Performance relaxed
· Large volumes accessed at a time (millions)
· Mostly Read (Batch Update)
· Redundancy present
· Database Size 100 GB - few terabytes
7.Why do we need a data warehouse?
A single,
complete and consistent store of data obtained from a variety of different
sources, made available to end users in a form they can understand and use in a
business context.
A process
of transforming data into information and making it available to users in a
timely enough manner to make a difference.
A technique
for assembling and managing data from various sources for the purpose of
answering business questions, thus making decisions that were not previously
possible.
8.What is the difference between a data mart and a data warehouse?
A data mart
is designed for a particular line of business, such as sales, marketing, or
finance,
whereas a
data warehouse is enterprise-wide/organizational.
The direction of data
flow between the warehouse and the marts depends on the design approach.
9.What is the significance of a surrogate key?
A surrogate
key is used in slowly changing dimension tables to track old and new values; it
is derived from the primary key.
10.What is a slowly changing dimension? What kind of SCD was used in your project?
Dimension
attribute values may change constantly over time. (Say, for example, the customer
dimension has customer_id, name, and address; a customer's address may change over
time.)
How will
you handle this situation?
There are three
types: type 1, overwrite the existing record; type 2, create an
additional new record at the time of the change, with the new attribute values;
type 3, create a new field to keep the new value in the original dimension row.
11.What is the difference between primary key and unique key constraints?
A primary key maintains uniqueness
and disallows NULL values,
whereas a unique constraint
maintains uniqueness but allows NULL values.
12.What are the types of index? And what type of index is used in your project?
Bitmap index, B-tree index,
function-based index, reverse key index and composite index.
We used bitmap indexes in our
project for better performance.
13.How is
your DWH data modeled (details about the star schema)?
14.A table
has 3 partitions but I want to update the 3rd partition; how will you do it?
Specify the partition name in
the UPDATE statement, for example:
UPDATE employee
PARTITION (part3) a SET a.empno = 10 WHERE a.ename = 'Ashok';
15.When you
issue an UPDATE statement, what is the memory flow and how does Oracle allocate
memory for it?
Oracle first checks the
shared SQL area to see whether the same SQL statement is already available; if it is, it reuses it.
Otherwise it allocates memory in the shared SQL area and then creates run-time memory in the
private SQL area to hold the parse tree and execution plan. Once parsing completes,
these are stored in the shared SQL area in the previously allocated memory.
16.Write a
query to find out the 5th max salary (in Oracle, DB2, SQL Server).
In Oracle:
select salary from
  (select salary, rownum rn from
     (select distinct salary from employee order by salary desc)
   where rownum <= 5)
where rn = 5;
17.When you
issue an UPDATE statement, how does the undo/rollback segment work? What are the
steps?
Oracle keeps the
old values in the undo segment and the new values in the redo entries. When you say
ROLLBACK, it restores the old values from the undo segment. When you say COMMIT, the
undo segment entries are released and the new values are kept permanently.
Informatica Administration
18.What is
DTM? How will you configure it?
The DTM transforms the data received
from the reader buffer; it moves the data from transformation to transformation on a row-by-row
basis, and it uses transformation caches when necessary.
19.You
transfer 100,000 rows to the target but some rows get discarded; how will you trace
them? And where do they get loaded?
Rejected
records are loaded into bad files. Each row has a record indicator and column
indicators.
The record
indicator is identified by (0-insert, 1-update, 2-delete, 3-reject), and the column
indicator by (D-valid, O-overflow, N-null, T-truncated).
Normally,
data may get rejected for different reasons, such as the transformation logic.
20.What are
the different uses of a repository manager?
The Repository Manager is used
to create the repository, which contains the metadata Informatica uses to transform
data from source to target. It is also used to create Informatica users and
folders, and to copy, back up and restore the repository.
21.How do
you take care of security using a repository manager?
Using repository
privileges, folder permissions and locking.
Repository
privileges (Session operator, Use designer, Browse repository, Create sessions
and batches, Administer repository, Administer server, Super user)
Folder
permissions (owner, group, users)
Locking (read,
write, execute, fetch, save)
22.What is
a folder?
A folder contains repository
objects such as sources, targets, mappings and transformations, which help
logically organize our data warehouse.
23.Can you create a folder within the Designer?
Not
possible.
24.What are shortcuts? Where can they be used? What are the
advantages?
There are two kinds of shortcuts (local and global): local shortcuts are used in a local repository,
and global shortcuts in a global repository. The advantage is reusing an object without
creating multiple copies. Say, for example, a source definition is wanted in
10 mappings in 10 different folders: without creating 10 copies of the source, you
create 10 shortcuts.
25.How do you increase the performance of mappings?
Use a single-pass read (one
source qualifier instead of multiple SQs for the same table).
Minimize data type conversions
(e.g., Integer to Decimal and back to Integer).
Optimize transformations (when you
use Lookup, Aggregator, Filter, Rank and Joiner).
Use caches for lookups.
For an Aggregator, use presorted ports,
increase the cache size, and minimize input/output ports as much as possible.
Use a Filter wherever possible to
avoid unnecessary data flow.
26.Explain
Informatica Architecture?
Informatica consists of a
client and a server. The client tools are the Repository Manager, Designer and Server
Manager. The repository database contains the metadata; it is read by the Informatica Server,
which uses it to read data from the source, transform it and load it into the target.
27.How will
you do session partitions?
It's not
available in PowerMart 4.7.
Transformation
28.What are
the constants used in update strategy?
DD_INSERT,
DD_UPDATE, DD_DELETE, DD_REJECT
29.What is
the difference between connected and unconnected lookup transformations?
A connected
Lookup can return multiple values to another transformation,
whereas an
unconnected Lookup returns one value (through its return port).
If the lookup
condition does not match, a connected Lookup returns the user-defined default value,
whereas an
unconnected Lookup returns NULL.
A connected Lookup
supports a dynamic cache, whereas an unconnected Lookup supports only a static cache.
30.What will you do at the session level for an Update Strategy transformation?
In the session
property sheet, set Treat rows as to "Data Driven".
31.What are the ports available for the Update Strategy,
Sequence Generator, Lookup and Stored
Procedure transformations?
Transformation            Ports
Update Strategy           Input, Output
Sequence Generator        Output only
Lookup                    Input, Output, Lookup, Return
Stored Procedure          Input, Output
32.Why did
you use a connected stored procedure; why not use an unconnected stored procedure?
33.What are
active and passive transformations?
An active
transformation can change the number of records passing through it to the target (example: Filter),
whereas a
passive transformation does not change the number of records (example: Expression).
34.What are
the tracing levels?
Normal – contains session initialization details, transformation details, and the
numbers of records rejected and applied.
Terse – only initialization details.
Verbose Initialization – the Normal setting's information
plus detailed information about the transformations.
Verbose Data – the Verbose Initialization settings plus all information about every row
processed in the session.
35.How will you make groups of records?
Using the group by port in an Aggregator transformation.
36.You need to
store a value like 145 into the target when you use an Aggregator; how will you do that?
Use the ROUND()
function; for example, ROUND(144.9) returns 145.
37.How will
you move mappings from development to production database?
Copy all the mappings from the
development repository and paste them into the production repository. While pasting, it will
prompt whether you want to replace or rename; if you say replace, Informatica replaces
the existing objects in the production repository.
38.What is
difference between aggregator and expression?
The Aggregator
is an active transformation and the Expression is a passive transformation.
The Aggregator
transformation is used to perform aggregate calculations on groups of records,
whereas the
Expression is used to perform calculations on a single record at a time.
39.Can you
use a mapping without a source qualifier?
Not
possible. If the source is an RDBMS/DBMS or a flat file, use a SQ; use a Normalizer if the source
is a COBOL feed.
40.When do
you use a normalizer?
A Normalizer can be used with
relational sources to normalize (pivot) denormalized data, and it is required for COBOL sources.
41.What are
stored procedure transformations? What is the purpose of the SP transformation, and how did you go
about using it in your project?
There are connected
and unconnected stored procedure transformations.
An unconnected
Stored Procedure is used for database-level activities such as pre- and post-load tasks,
while a connected
Stored Procedure is used within the Informatica flow,
for example passing one parameter as input and capturing the return value
from the stored procedure.
Normal – row-wise check
Pre-load of the
Source – (capture source incremental
data for incremental aggregation)
Post-load of the
Source – (delete temporary tables)
Pre-load of the
Target – (check the disk space available)
Post-load of the
Target – (drop and recreate indexes)
42.What is
a lookup, and what is the difference between the types of lookup? What exactly happens when a
lookup is cached? How does a dynamic lookup cache work?
A Lookup transformation is
used to check values in the source and target tables (primary key values).
There are two
types: connected and unconnected.
A connected
Lookup returns multiple values if the condition is true,
whereas an
unconnected Lookup returns a single value through its return port.
A connected
Lookup returns the user-defined default value if the condition does not match,
whereas an
unconnected Lookup returns NULL.
What the lookup
cache does:
it reads the
source/target lookup table and stores it in the lookup cache.
43.What is a joiner transformation?
Used for
heterogeneous sources (e.g., a relational source and a flat file).
Types of
joins:
Assume the two
tables have values (Master – 1, 2, 3 and Detail – 1, 3, 4).
Normal (if the
condition matches in both the master and detail tables, the records are
displayed; result set 1, 3)
Master
Outer (takes all the rows from the detail table and the matching rows from the master
table; result set 1, 3, 4)
Detail
Outer (takes all the values from the master source and the matching values from the detail
table; result set 1, 2, 3)
Full
Outer (takes all values from both tables)
44.What is
an Aggregator transformation, and how did you use it in your project?
It is used to perform aggregate
calculations on groups of records, and we can use a conditional clause to filter
the data being aggregated.
45.Can you
use one mapping to populate two tables in different schemas?
Yes, we can.
46.Explain
lookup cache, various caches?
A Lookup transformation is
used to check values in the source and target tables (primary key values).
The various caches are:
Persistent cache
(we can save the lookup cache files and reuse them the next time the server processes the
lookup transformation)
Re-cache from database
(if the persistent cache is not synchronized with the lookup table, you can configure
the lookup transformation to rebuild the lookup cache)
Static cache (when
the lookup condition is true, the Informatica Server returns a value from the lookup
cache; it does not update the cache while it processes the lookup
transformation)
Dynamic cache
(the Informatica Server dynamically inserts new rows or updates existing rows in the
cache and in the target; if we want to look up a target table, we can use a
dynamic cache)
Shared cache
(we can share a lookup cache between multiple lookup transformations in a
mapping; two Lookups in a mapping can share a single lookup cache)
47.In which
path will the cache be created?
In a user-specified directory. If we say c:\, all the cache files are created in that
directory.
48.Where do
you specify all the parameters for lookup caches?
Lookup property
sheet/tab.
49.How do you remove the cache files after the transformation?
After the session completes, the
DTM releases the cache memory and deletes the cache files.
If a persistent cache or incremental aggregation is used,
the cache files are saved.
50.What is
the use of aggregator transformation?
To perform
aggregate calculations.
Use a
conditional clause to filter data in the expression: SUM(commission, commission > 2000)
Use
non-aggregate functions: IIF(MAX(quantity) > 0, MAX(quantity), 0)
51.What are
the contents of the index and data cache files?
The index cache files hold the
unique group values, as determined by the group by ports in the transformation.
The data cache
files hold the row data until the server performs the
necessary calculations.
52.How do
you call a stored procedure within a transformation?
In an Expression
transformation, create a new output port and, in its expression, write :SP.procedure_name(arguments).
53.Is there any performance issue between connected & unconnected
lookups? If yes, how?
Yes.
An unconnected Lookup is much
faster than a connected Lookup, because it is not connected to
any other transformation; we call it from another transformation only when needed, which
minimizes the values held in the lookup cache,
whereas a
connected Lookup is connected to other transformations, so it keeps values
in the lookup cache for every row.
54.What is
a dynamic lookup?
When we use a lookup on the target
table, the Informatica Server dynamically inserts new values into the cache, or updates them if the
values already exist, and passes the rows on to the target table.
55.How does
Informatica read data if the sources are one relational table and one flat file?
Use a Joiner transformation
after the source qualifiers, before the other transformations.
56.How will you load unique records into a target flat file when the
source flat files have duplicate data?
There are two
ways we can do this: either use a Rank transformation or an Oracle external table.
In the Rank
transformation, group the records using the group by port and set the number of ranks to
1; the Rank transformation then returns one value per group, so the values will be
unique.
57.Can you use a flat file for the repository?
No, we can't.
58.Can you
use a flat file for a lookup table?
No, we can't.
59.Without a Source Qualifier SQL override or a Joiner, how will you join tables?
At the session level we have the
option of a user-defined join, where we can write the join condition.
60.The Update Strategy sets DD_UPDATE but the session level has
Insert. What will happen?
The insert takes
place, because the session-level option overrides the mapping-level option.
Sessions and batches
61.What are
the commit intervals?
Source-based commit: based on the number of records the active source (the source
qualifier) reads. Say the commit interval is set to 10,000 rows and the source qualifier reads
10,000, but due to transformation logic 3,000 rows get rejected; when the remaining 7,000 reach the
target, the commit fires, so the writer buffer does not hold rows back.
Target-based commit: based on the rows in the writer buffer and the commit interval.
Say the target-based commit is set to 10,000 but the writer buffer fills at every 7,500 rows; the next
time the buffer fills is at 15,000, and the commit fires there, then at 22,500, and so on.
62.When do we
use a Router transformation?
When we want to apply
multiple filter conditions, we go for a Router. (Say, for example, out of 50
source records 10 match the filter condition and the remaining 40 get
filtered out, but we still want to apply a few more filter conditions to the
remaining 40 records.)
63.How did
you schedule sessions in your project?
Run once (set two parameters: the
date and time when the session should start)
Run every
(the Informatica Server runs the session at a regular interval, as configured; parameters:
days, hours, minutes, end on, end after, forever)
Customized
repeat (repeat every 2 days; daily frequency in hours and minutes; every week; every month)
Run only on
demand (run manually); this is not session scheduling.
64.How do
you use pre-session and post-session commands in the session wizard, and what are they
used for?
Post-session
commands are used, for example, for the email option: when the session succeeds/fails, send an email. For that we
should configure:
Step 1. Have an
Informatica startup account and create an Outlook profile for that user.
Step 2. Configure the
Microsoft Exchange server in the Mail applet (Control Panel).
Step 3. In the
Informatica Server configuration, the Miscellaneous tab has an option called MS Exchange profile,
where we specify the Outlook profile name.
Pre-session
commands are used for event scheduling. (Say, for example, we don't know whether the source file is
available or not in a particular directory. For that we write one DOS command to
move the file from one directory to the destination, and set the event-based scheduling option
"Indicator file to wait for" in the session property sheet.)
65.What are
the different types of batches? What are the advantages and disadvantages of a
concurrent batch?
Sequential (runs
the sessions one by one)
Concurrent
(runs the sessions simultaneously)
Advantage
of a concurrent batch:
it uses the
Informatica Server's resources in parallel and reduces the time compared with running the sessions separately.
Use this
feature when we have multiple sources that process large amounts of data in one
session: split the work into several sessions and put them into one concurrent batch to complete
quickly.
Disadvantage:
it requires more shared memory,
otherwise sessions may fail.
66.How do you handle a session if some of the
records fail? How do you stop the session in case of errors? Can it be achieved
at the mapping level or the session level?
It can be achieved at the session
level only. In the session property sheet, on the Log Files tab, one option is the error
handling "Stop on ------ errors"; based on the error count we set, the Informatica Server
stops the session.
67.How do
you improve the performance of a session?
If we use an Aggregator
transformation: use sorted ports, increase the aggregate cache size, and use a Filter
before the aggregation so that it minimizes unnecessary aggregation.
For a Lookup
transformation, use lookup caches.
Increase the
DTM shared memory allocation.
Eliminate
transformation errors and use a lower tracing level. (Say, for example, a mapping has
50 transformations; when a transformation error occurs, the Informatica Server has to
write to the session log file, which affects session performance.)
68.Explain
incremental aggregation. Will that increase the performance? How?
Incremental aggregation captures whatever changes were made in the source and uses them
for the aggregate calculation in a session, rather than processing the entire
source and recalculating the same calculations each time the session runs. Therefore it
improves session performance.
Only use incremental aggregation in the following situations:
- the mapping has an aggregate calculation;
- the source table changes incrementally;
- you can filter the source's incremental data by timestamp.
Before aggregating, you have to do the following:
- use a Filter transformation to remove pre-existing (already processed) records;
- reinitialize the aggregate cache when the source table changes completely. For example,
incremental changes happen daily and a complete change happens monthly;
when the source table completely changes, reinitialize the aggregate
cache, truncate the target table, and use the new source table. Choose "Reinitialize aggregate cache"
in the Transformations tab of the session.
69.A concurrent batch has 3 sessions, each
set to run if the previous completes; if the 2nd one fails, what happens to the batch?
The batch will fail.
General Project
70. How many mappings, dimension tables and fact tables
did you build, and any complex mapping? What is your database size, and how frequently
do you load the DWH?
I did 22 mappings, 4 dimension tables
and one fact table. One complex mapping I did was for a slowly changing dimension
table. The database size is 9 GB, and we load data every day.
71. What
are the different transformations used in your project?
Aggregator,
Expression, Filter, Sequence generator, Update Strategy, Lookup, Stored
Procedure, Joiner, Rank, Source Qualifier.
72. How did
you populate the dimension tables?
73. What
are the sources you worked on?
Oracle
74. How
many mappings have you developed on your whole dwh project?
45 mappings
75. What is
the OS used in your project?
Windows NT
76. Explain
your project (Fact table, dimensions, and database size)
The fact table contains all the business measures (numeric values) and foreign
key values; the dimension tables contain the details about subject areas like customer
and product.
77.What is the
difference between Informatica PowerMart and PowerCenter?
Using PowerCenter
we can create a global repository;
PowerMart is
used to create a local repository.
A global
repository can be configured with multiple servers to balance the session load;
a local repository
can be configured with only a single server.
78.Have you
done any complex mapping?
Developed one mapping to handle a slowly changing dimension table.
79.Explain
details about DTM?
Once the session starts, the Load Manager starts the DTM, which allocates the session's
shared memory and contains the reader and the writer. The reader reads the source data via the
source qualifier using a SQL statement and moves the data to the DTM; the DTM passes the
data from transformation to transformation on a row-by-row basis and finally moves it
to the writer, which writes the data into the target using a SQL statement.
I-Flex Interview (14th May 2003)
80.What keys did you use other than the primary key and foreign key?
We used a surrogate key to maintain uniqueness and to overcome duplicate values in the primary key.
81.Data flow of your data warehouse (architecture)
Ours is a basic DWH architecture: from OLTP to the data warehouse, and from the DWH
to OLAP analysis and report building.
82.Difference
between PowerMart and PowerCenter?
Using PowerCenter
we can create a global repository;
PowerMart is
used to create a local repository.
A global
repository can be configured with multiple servers to balance the session load;
a local
repository can be configured with only a single server.
83.What are
the batches, and their details?
Sequential (runs
the sessions one by one)
Concurrent
(runs the sessions simultaneously)
Advantage
of a concurrent batch:
it uses the
Informatica Server's resources in parallel and reduces the time compared with running the sessions separately.
Use this
feature when we have multiple sources that process large amounts of data in one
session: split the work into several sessions and put them into one concurrent batch to complete
quickly.
Disadvantage:
it requires more shared memory,
otherwise sessions may fail.
84.What is an external table in Oracle? How does Oracle read the flat
file?
Used to read flat files. Oracle internally uses the SQL*Loader mechanism, driven by
access parameters that play the role of a control file.
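A minimal sketch of an external table (Oracle 9i syntax; directory, table and file names are illustrative):
CREATE DIRECTORY ext_dir AS 'C:\data';
CREATE TABLE emp_ext (
  empno NUMBER,
  ename VARCHAR2(30)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY ext_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
  )
  LOCATION ('emp.csv')
);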
85.What are
the indexes you used? Bitmap join index?
A bitmap
index is used in a data warehouse environment to increase query response time, since
a DWH has low cardinality and low update rates, and bitmap indexes are very efficient for WHERE clauses.
A bitmap join
index is used to join the dimension and fact tables, instead of reading 2 different indexes.
86.What are
the partitions in 8i/9i? Where will you use a hash partition?
In Oracle 8i there are 3 partitioning methods (range, hash, composite);
in Oracle 9i,
list partitioning is an additional one.
Range (used for
ranges such as date values, for example in a DWH: Quarter 1, Quarter 2, Quarter
3, Quarter 4)
Hash (used for
unpredictable values, say when we cannot predict which value should go to
which partition; if we set 5 partitions
for a column, Oracle allocates the values across the 5 partitions accordingly)
List (used for
literal values; say a country has 24 states, create 24 partitions, one for
each state)
Composite
(a combination of range and hash)
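A minimal sketch of a range-partitioned table along the quarters mentioned above (names and dates are illustrative):
CREATE TABLE sales_fact (
  sale_date DATE,
  amount    NUMBER
)
PARTITION BY RANGE (sale_date) (
  PARTITION q1 VALUES LESS THAN (TO_DATE('01-APR-2003','DD-MON-YYYY')),
  PARTITION q2 VALUES LESS THAN (TO_DATE('01-JUL-2003','DD-MON-YYYY')),
  PARTITION q3 VALUES LESS THAN (TO_DATE('01-OCT-2003','DD-MON-YYYY')),
  PARTITION q4 VALUES LESS THAN (MAXVALUE)
);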
91.What is
the main difference between mapplets and mappings?
A mapplet lets you reuse a set of transformations in several mappings, whereas a mapping
cannot be reused that way.
If any
change is made to a mapplet, it is automatically inherited by all instances of that
mapplet.
92. What is
the difference between the source qualifier filter and the Filter transformation?
The source qualifier filter can be used only for relational sources, whereas the Filter
transformation can be used with any kind of source.
The source
qualifier filters the data while reading it, whereas the Filter transformation filters later in the
pipeline, before loading into the target.
93. What is
the maximum no. of return values when we use an unconnected
transformation?
Only one.
94. What
are the environments the Informatica Server can run on?
The Informatica client runs on Windows 95 / 98 / NT, Unix Solaris, Unix
AIX (IBM).
The Informatica
Server runs on Windows NT / Unix.
Minimum
hardware requirements:
Informatica
client: hard disk 40 MB, RAM 64 MB
Informatica
Server: hard disk 60 MB, RAM 64 MB
95. Can
unconnected lookup do everything a connected lookup transformation can do?
No. An unconnected Lookup is called from within another transformation and returns
only a single value; the rest of the things are possible.
96. In 5.x
can we copy part of a mapping and paste it into another mapping?
I think it's
possible.
97. What
option do you select for the sessions in a batch, so that the sessions run one
after the
other?
We have to
select the option called "Run if previous completed".
98.
How do you really know that paging to disk is
happening while you are using a Lookup transformation? Assume you have access
to the server.
We have to
collect the performance data first, then look at the counter
lookup_readtodisk; if it is greater than 0, the lookup is reading from disk.
Step 1.
Choose the option "Collect performance data" in the General tab of the session
property sheet.
Step 2.
Monitor the server, then choose Server Requests -> Session Performance Details.
Step 3.
Locate the performance details file, named session_name.perf, in the
session log file directory.
Step 4. Find
the counter lookup_readtodisk; if it is greater than 0, Informatica is
reading lookup table values from the disk. To find out how many rows are in the cache, see
Lookup_rowsincache.
99.
List three options available in Informatica to tune
the Aggregator transformation.
Use sorted
input to sort the data before aggregation.
Use a Filter
transformation before the Aggregator.
Increase the
Aggregator cache size.
100.Assume
there is a text file as source having a binary field. What
native data type will Informatica convert this binary field to in the source
qualifier?
Binary data type for a relational source; for a flat file?
101.Variable
v1 has values set as 5 in the designer (default), 10 in the parameter file, and 15 in the
repository. While running the session, which
value will Informatica read?
Informatica
reads the value 10 from the parameter file: the precedence is the parameter file first,
then the value saved in the repository, then the designer default.
102. A Joiner
transformation is joining two tables s1 and s2; s1 has 10,000 rows and s2 has
1,000 rows. Which table will you set as master for better performance of the Joiner
transformation?
Why?
Set table s2 as the master table, because the Informatica Server has to keep the
master table in the cache; holding 1,000 rows in the cache gives better performance
than holding 10,000 rows in the cache.
103. The source
table has 5 rows. Rank in the Rank transformation is set to 10. How many rows will the
Rank transformation output?
5 rows.
104. How to
capture performance statistics of individual transformations in the mapping, and
explain some important statistics that can be captured?
Use the tracing
level Verbose Data (or the "Collect performance data" session option described in
question 98, which exposes per-transformation counters).
105. Give a
way in which you can implement a real-time scenario where data in a table is changing and you need to
look up data from it. How will you configure the lookup transformation for this
purpose?
Use a dynamic lookup cache so the cache stays synchronized with the changing table,
as in a type 2 slowly changing dimension load.
106. What
is the DTM process? How many threads does it create to process data? Explain each
thread in
brief.
The DTM receives data from the reader and moves it from transformation to
transformation on a row-by-row basis. It creates two main threads: one reader thread and
one writer thread.
107.
Suppose a session is configured with a commit interval of 10,000 rows and the source
has 50,000 rows. Explain the commit points for source-based commit & target-based
commit. Assume appropriate values wherever required.
Target-based
commit (the first time the buffer fills at 7,500, the next time at 15,000):
commits at
15,000, 22,500, 30,000, 40,000, 50,000.
Source-based
commit (not affected by the rows held in the buffer):
commits at
10,000, 20,000, 30,000, 40,000, 50,000.
108.What
does the first column of the bad file (rejected rows) indicate?
The first
column is the row indicator (0, 1, 2, 3);
the second
column is the column indicator (D, O, N, T).
109. What
is the formula for calculating the Rank data cache? And also the Aggregator data and
index caches?
Index cache
size = total no. of rows * size of the columns in the lookup/group-by condition (e.g., 50 * 4).
Aggregator/Rank
transformation data cache size = (total no. of rows * size of the columns in the
condition) + (total no. of rows * size of the connected output ports).
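As a worked example under the formulas above (all sizes illustrative): with 50 rows and a 4-byte condition column, the index cache is about 50 * 4 = 200 bytes; if the connected output ports add up to 16 bytes per row, the data cache is about 200 + 50 * 16 = 1,000 bytes.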
110. Can an
unconnected lookup return more than 1 value? No.
INFORMATICA TRANSFORMATIONS
· Aggregator
· Expression
· External Procedure
· Advanced External Procedure
· Filter
· Joiner
· Lookup
· Normalizer
· Rank
· Router
· Sequence Generator
· Stored Procedure
· Source Qualifier
· Update Strategy
· XML Source Qualifier
Expression
Transformation
- You can use an ET to calculate values in a single row
before you write to the target.
- You can use an ET to perform any non-aggregate
calculation.
- To perform calculations involving multiple rows, such
as sums or averages, use the Aggregator. Unlike the ET, the Aggregator
transformation allows you to group and sort data.
Calculation
To use the Expression transformation to calculate values for
a single row, you must include the following ports:
- an input port for each value used in the calculation
- an output port for the expression
NOTE
You can enter multiple expressions in a single ET. As long
as you enter only one expression for each output port, you can create any number
of output ports in the Expression
transformation. In this way, you can use one Expression transformation rather
than creating separate transformations for each calculation that requires the
same set of data.
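A small illustration of that port setup (port names are illustrative): with input ports QUANTITY and UNIT_PRICE, an output port TOTAL_PRICE would carry the expression QUANTITY * UNIT_PRICE, evaluated once per row.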
Sequence
Generator Transformation
- Creates keys
- Replaces missing values
- It contains two output ports that you can connect
to one or more transformations. The server generates a value each time a row
enters a connected transformation, even if that value is not used.
- The two ports are NEXTVAL and CURRVAL.
- The SGT can be reusable.
- You cannot edit the default ports (NEXTVAL, CURRVAL).
SGT Properties
- Start Value
- Increment By
- End Value
- Current Value
- Cycle (if
selected, the server cycles through the sequence range; otherwise it
stops at the configured end value)
- Reset
- Number of Cached Values
NOTE
- Reset is disabled for a reusable SGT.
- Unlike other transformations, you cannot override SGT
properties at the session level. This protects the integrity of the sequence values
generated.
Aggregator
Transformation
Difference between the Aggregator and the Expression transformation:
we can
use the Aggregator to perform calculations on groups, whereas the Expression
transformation permits you to perform calculations on a row-by-row basis only.
The server performs aggregate calculations as it reads, and stores the necessary group and row
data in an aggregator cache.
When incremental aggregation occurs, the server passes new
source data through the mapping and uses the historical cache data to perform the new
calculations incrementally.
Components
- Aggregate expression
- Group by port
- Aggregate cache
When a session using an Aggregator transformation is run,
the server creates index and data caches in memory to process the
transformation. If the server requires more space, it stores overflow values in
cache files.
NOTE
The performance of the Aggregator transformation can be improved
by using the "Sorted Input" option. When this is selected, the server assumes all
data is sorted by the group by ports.
Incremental
Aggregation
- Using this, you apply captured changes in the source
to the aggregate calculations in a session. If the source changes only incrementally
and you can capture those changes, you can configure the session to process only
those changes.
- This allows the server to update the target incrementally,
rather than forcing it to process the entire source and recalculate the same
calculations each time you run the session.
Steps:
- The first time you run a session with incremental aggregation enabled,
the server processes the entire source.
- At the end of the session, the server stores the aggregate data from that
session run in two files, an index file and a data file. The server creates the
files in its local directory.
- The second time you run the session, use only the changes in the source as
the source data for the session. The server then performs the following actions:
(1)
For each input record, the
session checks the historical information in the index file for a corresponding
group, then:
if it finds a corresponding group,
the server performs the
aggregate operation incrementally, using the aggregate data for that group, and
saves the incremental changes;
otherwise,
the server
creates a new group and saves the record data.