Informatica Interview Questions



What is Enterprise Data Warehousing? When the organization's data is created at a single point of access, it is called enterprise data warehousing. Data can be provided with a global view to the server via a single source store, and one can do periodic analysis on that same source. It gives better results, but the time required is high.

What is the difference between a database, a data warehouse and a data mart? A database includes a set of sensibly affiliated data which is normally small in size compared to a data warehouse. In a data warehouse there are assortments of all sorts of data, and data is taken out only according to the customer's needs. A data mart, on the other hand, is also a set of data, but one designed to cater to the needs of different domains. For instance, an organization may have a different chunk of data for each of its departments, i.e. sales, finance, marketing etc.

What is a domain in Informatica? When all related relationships and nodes are covered by a sole organizational point, it is called a domain. Through this, data management can be improved.

What is the difference between a repository server and a powerhouse? The repository server controls the complete repository, which includes tables, charts, various procedures etc. Its main function is to assure the integrity and consistency of the repository. A powerhouse server governs the implementation of various processes among the factors of the server's database repository.

How many repositories can be created in Informatica? There can be any number of repositories in Informatica, but eventually it depends on the number of ports.

What is the benefit of partitioning a session? Partitioning a session means solo implementation sequences within the session. Its main purpose is to improve the server's operation and efficiency. Other transformations, including extractions and other outputs of single partitions, are carried out in parallel.

How are indexes created after completing the load process? For the purpose of creating indexes after the load process, command tasks at session level can be used. Index-creating scripts can be brought in line with the session's workflow or the post-session implementation sequence. Moreover, this type of index creation cannot be controlled after the load process at transformation level.
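For illustration only, a post-session index script of the kind referred to above might contain statements like the following; the table and index names are placeholders, not taken from any project in this document:

SQL> CREATE INDEX idx_sales_fact_date ON sales_fact (date_key);
SQL> CREATE INDEX idx_sales_fact_cust ON sales_fact (customer_key);

Such a script can be attached to the session as a post-session command or run through a command task once the load completes.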

Explain sessions. Explain how batches are used to combine executions? A set of instructions that needs to be implemented to convert data from a source to a target is called a session. A session can be carried out using the session manager or the pmcmd command. Batch execution can be used to combine session executions either in a serial or in a parallel manner. Batches can have different sessions carried forward in a parallel or serial manner.

How many sessions can one group in batches? One can group any number of sessions, but it is easier for migration if a batch has fewer sessions.

Explain the difference between mapping parameter and mapping variable? When values change during the session's execution it is called a mapping variable. Upon completion the Informatica server stores the end value of a variable and reuses it when the session restarts. Values that do not change during the session's execution are called mapping parameters. The mapping procedure explains mapping parameters and their usage. Values are allocated to these parameters before starting the session.

What is complex mapping? Following are the features of complex mapping.
1. Difficult requirements
2. Many numbers of transformations
3. Complex business logic

How can one identify whether a mapping is correct or not without connecting the session? One can find whether the mapping is correct or not without connecting the session with the help of the debugging option.

Can one use mapping parameters or variables created in one mapping in any other reusable transformation? Yes, because a reusable transformation does not contain any mapplet or mapping.

Explain the use of the aggregator cache file? Aggregator transformations are handled in chunks of instructions during each run. They store transitional values, which are kept in local buffer memory. Aggregators provide extra cache files for storing the transformation values if extra memory is required.

Briefly describe lookup transformation? Lookup transformations are those transformations which have access to RDBMS-based data sets. The server makes the access faster by using the lookup tables to look at explicit table data or the database. The final data is achieved by matching the lookup condition for all lookup ports delivered during transformations.

What does role playing dimension mean? The dimensions that are utilized for playing diversified roles while remaining in the same database domain are called role playing dimensions.

How can repository reports be accessed without SQL or other transformations? Repository reports are established by the metadata reporter. There is no need for SQL or other transformations since it is a web app.

What are the types of metadata stored in the repository? The types of metadata include source definitions, target definitions, mappings, mapplets and transformations.

Explain code page compatibility? When data moves from one code page to another, provided that both code pages have the same character sets, no data loss can occur. All the characters of the source page must be available in the target page. If all the characters of the source page are not present in the target page, the target is a subset and data loss will definitely occur during transformation, due to the fact that the two code pages are not compatible.

How can you validate all mappings in the repository simultaneously? All the mappings cannot be validated simultaneously because only one mapping can be validated at a time.

Briefly explain the Aggregator transformation? It allows one to do aggregate calculations such as sums, averages etc. Unlike the Expression transformation, it lets one do calculations on groups of rows.

Describe Expression transformation? Values can be calculated in a single row before writing to the target in this form of transformation. It can be used to perform non-aggregate calculations. Conditional statements can also be tested before output results go to target tables.

What do you mean by filter transformation? It is a medium of filtering rows in a mapping. Data needs to be transformed through the filter transformation and then the filter condition is applied. The filter transformation contains all ports of input/output, and only the rows which meet the condition can pass through that filter.

What is Joiner transformation? The Joiner transformation combines two affiliated heterogeneous sources living in different locations, while a source qualifier transformation can combine data emerging from a common source.

What is Lookup transformation? It is used for looking up data in a relational table through a mapping. The lookup definition from any relational database is imported from a source which has the ability to connect client and server. One can use multiple lookup transformations in a mapping.

How is Union Transformation used? It is a multiple input group transformation which can be used to combine data from different sources. It works like the UNION ALL statement in SQL, which is used to combine the result sets of two SELECT statements.

What do you mean by Incremental Aggregation? The option for incremental aggregation is enabled whenever a session is created for a mapping aggregate. PowerCenter performs incremental aggregation through the mapping and uses historical cache data to perform new aggregation calculations incrementally.

What is the difference between a connected lookup and an unconnected lookup? When the inputs are taken directly from other transformations in the pipeline it is called a connected lookup. An unconnected lookup doesn't take inputs directly from other transformations, but it can be used in any transformation and can be raised as a function using an :LKP expression. So it can be said that an unconnected lookup can be called multiple times in a mapping.

What is a mapplet? A reusable object that is created using the Mapplet Designer is called a mapplet. It permits one to reuse the transformation logic in multiple mappings, and it also contains a set of transformations.

Briefly define reusable transformation?

A reusable transformation is used numerous times in a mapping. It is different from other mappings which use the transformation since it is stored as metadata. The transformations will be nullified in the mappings whenever any change in the reusable transformation is made.

What does update strategy mean, and what are its different options? Row-by-row processing is done by Informatica. Every row is inserted in the target table because it is marked as the default. The update strategy is used whenever the row has to be updated or inserted based on some sequence. Moreover, the condition must be specified in the update strategy for the processed row to be marked as updated or inserted.

What is the scenario which compels the Informatica server to reject files? This happens when it faces DD_Reject in the update strategy transformation. It also happens when a database constraint is violated or a field in the row is truncated.

What is a surrogate key? A surrogate key is a replacement for the natural primary key. It is a unique identification for each row in the table. It is very beneficial because the natural primary key can change, which eventually makes updates more difficult. Surrogate keys are always used in the form of a digit or integer.

What are the prerequisite tasks to achieve session partitioning? In order to perform session partitioning one needs to configure the session to partition source data and then install the Informatica server on a machine with multiple CPUs.

Which files are created during session runs by the Informatica server? During session runs, the files created are the errors log, bad file, workflow log and session log.

What is a Session task? It is a set of instructions that guides the PowerCenter server about how and when to transfer data from sources to targets.

What does Command task mean? This specific task permits one or more shell commands in UNIX, or DOS commands in Windows, to run during the workflow.

What is a Standalone Command task? This task can be used anywhere in the workflow to run shell commands.

What is meant by pre- and post-session shell command? A Command task can be called as the pre- or post-session shell command for a Session task. One can run it as a pre-session command, a post-session success command or a post-session failure command.

What is a Predefined event? It is a file-watch event. It waits for a specific file to arrive at a specific location.

What is a User defined event? A user defined event can be described as a flow of tasks in the workflow. Events can be created and then raised as the need arises.

What is an Informatica workflow? A workflow is a bunch of instructions that tells the server how to implement tasks.

What are the different tools in the Informatica Workflow Manager? Following are the different tools in the Workflow Manager:
1. Task Developer
2. Worklet Designer
3. Workflow Designer

Tell me any other tool for scheduling purposes other than Workflow Manager and pmcmd? The tool for scheduling purposes other than Workflow Manager can be a third-party tool like CONTROL-M.

What is OLAP (On-Line Analytical Processing)? A method by which multi-dimensional analysis occurs.

What are the different types of OLAP? Give an example? ROLAP e.g. BO, MOLAP e.g. Cognos, HOLAP, DOLAP.

What is a Worklet? When the workflow tasks are grouped in a set, it is called a worklet. Workflow tasks include timer, decision, command, event wait, mail, session, link, assignment, control etc.

What is the use of Target Designer? Target definitions are created with the help of the Target Designer.

Where can we find the throughput option in Informatica? The throughput option can be found in the Workflow Monitor. In the Workflow Monitor, right click on the session, then click on Get Run Properties, and under Source/Target Statistics we can find the throughput option.

What is target load order? The target load order is specified on the basis of the source qualifiers in a mapping. If there are multiple source qualifiers linked to different targets, one can specify the order in which the Informatica server loads data into the targets.

Aggregator Transformation in Informatica

Aggregator Transformation: This is an active transformation which allows you to calculate summaries for a "group of records". An Aggregator transformation is created with the following components.

Group by

This component defines the group for a specific port (or ports) which participates in aggregation.

Aggregate Expression

Use aggregate functions to derive the aggregate expression, which can be developed either in variable ports or in output ports only.

Sorted Input

"Group by" ports are sorted using a Sorter transformation, and the Aggregator receives the sorted data as input to improve the performance of data aggregation. Keep the Sorter transformation prior to the Aggregator transformation to perform sorting on the group-by ports.

Aggregate Cache

The Integration Service creates an aggregate cache for unsorted inputs.

The aggregate cache contains group by ports, non group by input ports and output ports which contain the aggregate expressions.

This transformation offers even more functionality than SQL's GROUP BY statement, since one can apply conditional logic to groups within the Aggregator transformation. Many different aggregate functions can be applied to individual output ports within the transformation. One is also able to code nested aggregate functions. Below is a list of these aggregate functions:

AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE
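For comparison, the plain SQL grouping that the Aggregator mirrors might look like this sketch; the EMP table and its columns are used only as an assumed example here:

SQL> SELECT deptno,
            SUM(sal)  AS total_sal,
            AVG(sal)  AS avg_sal,
            COUNT(*)  AS emp_count
     FROM emp
     GROUP BY deptno;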

Creating an Aggregator Transformation in Informatica Go to the Mapping Designer, click on transformation in the toolbar -> create.

Select the Aggregator transformation, enter the name and click Create. Then click Done. This will create an aggregator transformation without ports. To create ports, you can either drag the ports to the aggregator transformation or create them in the Ports tab of the aggregator.

Configuring the aggregator transformation: You can configure the following components of the aggregator transformation in Informatica.
Aggregate Cache: The Integration Service stores the group values in the index cache and row data in the data cache.
Aggregate Expression: You can enter expressions in the output port or variable port.
Group by Port: This tells the Integration Service how to create groups. You can configure input, input/output or variable ports for the group.
Sorted Input: This option can be used to improve session performance. You can use this option only when the input to the aggregator transformation is sorted on the group by ports.

Informatica Nested Aggregate Functions: You can nest one aggregate function within another aggregate function. You can use either single-level aggregate functions or multiple nested functions in an aggregator transformation. You cannot use both single-level and nested aggregate functions in the same aggregator transformation. The Mapping Designer marks the mapping as invalid if an aggregator transformation contains both single-level and nested aggregate functions. If you want to create both single-level and nested aggregate functions, create separate aggregator transformations. Example: MAX(SUM(sales))
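In plain SQL a nested aggregate such as MAX(SUM(sales)) corresponds to aggregating over an already grouped result, roughly as in the following sketch; the sales_fact table and region column are assumptions used only for illustration:

SQL> SELECT MAX(region_sales) AS max_region_sales
     FROM (SELECT region, SUM(sales) AS region_sales
           FROM sales_fact
           GROUP BY region);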

Conditional clauses

You can reduce the number of rows processed in the aggregation by specifying a conditional clause. Example: SUM(salary, salary > 1000). This will include only the salaries which are greater than 1000 in the SUM calculation.

Non-aggregate functions

You can also use non-aggregate functions in the aggregator transformation. Example: IIF(SUM(sales) < 20000, SUM(sales), 0)

Note: By default, the Integration Service treats null values as NULL in aggregate functions. You can change this by configuring the integration service.

Incremental Aggregation in Informatica After you create a session that includes an Aggregator transformation, you can enable the session option, Incremental Aggregation. When the Integration Service performs incremental aggregation, it passes source data through the mapping and uses historical cache data to perform aggregation calculations incrementally.

Rank Transformation in Informatica

What is Rank Transformation: This is an active transformation which allows you to calculate the "Top" and "Bottom" performers. The Rank transformation is created with the following types of ports.

Input port (I)
Output port (O)
Variable port (V)
Rank port (R)

Rank Port: The port which participates in the rank calculation is known as the rank port.
Variable Port: A port which allows you to develop an expression to store data temporarily for the rank calculation is known as a variable port. Variable ports support writing the expressions which are required for rank calculation.
Set the following properties to calculate the ranks: Top or Bottom, and Number of Ranks.

What is Rank Index? The Developer tool creates a RANKINDEX port for each Rank transformation. The Data Integration Service uses the Rank Index port to store the ranking position for each row in a group. After the Rank transformation identifies all rows that belong to a top or bottom rank, it then assigns rank index values. If two rank values match, they receive the same value in the rank index and the transformation skips the next value.

The RANKINDEX is an output port only. You can pass the rank index to another transformation in the mapping or directly to a target.

Advanced Properties for Rank Transformation:
Top/Bottom – Specifies whether you want the top or bottom ranking for a column.
Number of Ranks – Number of rows to include in the top or bottom ranking.
Case-Sensitive String Comparison – Specifies whether the Data Integration Service uses case-sensitive string comparisons when it ranks strings. Clear this option to have the Data Integration Service ignore case for strings.
Cache Directory – Local directory where the Data Integration Service creates the index cache files and data cache files. Default is the CacheDir system parameter.
Rank Data Cache Size – Data cache size for the transformation. Default is Auto.
Rank Index Cache Size – Index cache size for the transformation. Default is Auto.
Tracing Level – Amount of detail that appears in the log for this transformation. You can choose terse, normal, verbose initialization, or verbose data. Default is normal.

Defining Groups: Like the Aggregator transformation, the Rank transformation lets you group information. Example: If you want to select the 10 most expensive items by manufacturer, you would first define a group for each manufacturer.
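A roughly equivalent SQL sketch of that example uses an analytic RANK; the ITEMS table and its columns are assumed here for illustration only:

SQL> SELECT *
     FROM (SELECT i.manufacturer,
                  i.item_name,
                  i.price,
                  RANK() OVER (PARTITION BY i.manufacturer ORDER BY i.price DESC) AS rnk
           FROM items i)
     WHERE rnk <= 10;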

Rank Transformation in Informatica with Examples. Procedure to create and configure a Rank Transformation: In the Mapping Designer, open a mapping. Click Transformation > Create. Select the Rank transformation, enter a name and click Done. You will see one port, RANKINDEX, already there. This port stores the ranking of each record and can be used to populate the target as well. Add all additional ports from the source input which are going to be used in the following transformation. Open the Ports tab and first check the Group By option for the desired column (for example deptno in our case).

Also check the Rank (R) option for the port on which you want to do the ranking, for example salary in our case.

We can set the Group By indicator for multiple ports, but ranking can be done on a single port only. Go to the Properties tab, select the Top/Bottom value as Top and set the Number of Ranks property as needed.

Click OK. Connect the output ports of the Rank transformation to another transformation or to the target.

Lookup Transformation in Informatica

Lookup Transformation: The Lookup transformation is used to look up a source, source qualifier, or target to get the relevant data. You can look up flat files and relational tables. The Lookup transformation in Informatica works on similar lines as the Joiner, with a few differences. For example, a lookup does not require two sources. Lookup transformations can be connected or unconnected. They extract the data from the lookup table or file based on the lookup condition. This is a passive transformation which allows you to perform a lookup on relational tables, flat files, synonyms and views. When the mapping contains a Lookup transformation, the Integration Service queries the lookup data and compares it with the lookup input port values. The Lookup transformation is created with the following types of ports:

1. Input port (I)
2. Output port (O)
3. Lookup port (L)
4. Return port (R)

The Lookup transformation supports horizontal merging, such as equi-join and non-equi-join.

Informatica Lookup Transformation Uses: The Lookup transformation is used to perform the following tasks:
– get a related value
– update slowly changing dimensions

Connected Lookups in Informatica: In a connected lookup transformation, we connect input and output ports to other transformations in the mapping. For each input row, the Integration Service queries the lookup source or cache based on the lookup ports and the condition in the transformation. If the transformation is uncached or uses a static cache, the Integration Service returns values from the lookup query. If the transformation uses a dynamic cache, the Integration Service inserts the row into the cache when it does not find the row in the cache. When the Integration Service finds the row in the cache, it updates the row in the cache or leaves it unchanged. It flags the row as insert, update, or no change. The Integration Service passes return values from the query to the next transformation. If the transformation uses a dynamic cache, you can pass rows to a Filter or Router transformation to filter new rows to the target. If there is no match for the lookup condition, the Integration Service returns the default value for all output ports.
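Conceptually, for each input row an uncached connected lookup issues (and a cached one pre-builds the result of) a query of roughly this shape; the table, columns and bind variable below are placeholders, not the SQL Informatica actually generates:

SQL> SELECT customer_name, customer_type
     FROM customer_dim
     WHERE customer_id = :input_customer_id;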

Informatica Unconnected Lookups: An unconnected lookup is not part of the mapping data flow.

– It is neither connected to the source nor connected to the target.
– An unconnected lookup transformation can receive multiple input ports but returns a single output port, which is designated by the return port (R).
– An unconnected lookup transformation is commonly used when the lookup is not needed for every record.
– An unconnected lookup transformation supports writing expressions.
– The lookup function can be used in a conditional statement.

Mapping Naming Standards in Informatica

Mapping Naming Standards: Stage to star mappings

1. M_Account_Dim (stage to star)
2. M_Date_Dim (stage to star)
3. M_Employee_Dim (stage to star)
4. M_Market_Dim (stage to star)
5. M_Product_Dim (stage to star)
6. M_Transaction_Details_Fact (stage to star)

Date Dimension Loading: create transformations of type Expression and Sequence Generator. In the Expression transformation create the following expressions.

Port Name                  Expression
Day_of_Week          ——>  TO_DECIMAL(TO_CHAR(Full_Date, 'D'))
Date_Number_In_Month ——>  TO_DECIMAL(TO_CHAR(Full_Date, 'DD'))
Date_Number_In_Year  ——>  TO_DECIMAL(TO_CHAR(Full_Date, 'DDD'))
Week_Number_In_Month ——>  TO_DECIMAL(TO_CHAR(Full_Date, 'W'))
Week_Number_In_Year  ——>  TO_DECIMAL(TO_CHAR(Full_Date, 'WW'))
Month_Number         ——>  TO_DECIMAL(TO_CHAR(Full_Date, 'MM'))
Month_Name           ——>  TO_CHAR(Full_Date, 'MON')
Quarter_Number       ——>  TO_DECIMAL(TO_CHAR(Full_Date, 'Q'))
Year                 ——>  TO_DECIMAL(TO_CHAR(Full_Date, 'YYYY'))
Run_Week_Number      ——>  TO_DECIMAL(TO_CHAR(Full_Date, 'W'))
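The same format masks can be checked directly in SQL before building the expressions; this sketch assumes the T_Date staging table has a FULL_DATE column, which this document does not spell out:

SQL> SELECT TO_CHAR(full_date, 'D')    AS day_of_week,
            TO_CHAR(full_date, 'DD')   AS date_number_in_month,
            TO_CHAR(full_date, 'DDD')  AS date_number_in_year,
            TO_CHAR(full_date, 'W')    AS week_number_in_month,
            TO_CHAR(full_date, 'WW')   AS week_number_in_year,
            TO_CHAR(full_date, 'MM')   AS month_number,
            TO_CHAR(full_date, 'MON')  AS month_name,
            TO_CHAR(full_date, 'Q')    AS quarter_number,
            TO_CHAR(full_date, 'YYYY') AS year_number
     FROM t_date;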

Market Dimension – SCD Type 1:

In the Expression transformation, create an expression for NULL handling for all columns except INS_UPD_User.

Port Name        Expression
Flag_Exception   IIF(ISNULL(<column>) OR ... )

Double click on the Router T/R and click on the Groups tab.

Group Name
Exp_Correct
Exp_Exception

In this Router T/R, copy the Exp_Exception ports to an Expression T/R and connect that Expression T/R's ports to the exception target (Excp_Target).

From the Router T/R copy the Exp_Correct ports to an Expression T/R.

From the transformation menu select the Lookup transformation, then select an Expression transformation and create the following ports.

Port Name   Expression
Insert      IIF(ISNULL(<lookup key>), True, False)
Update      IIF(NOT ISNULL(<lookup key>), True, False)

From the transformation menu select a Router T/R. Double click on the Router T/R and click on the Groups tab.

Group Name
New Flag
Update Flag

From the transformation menu select an Expression and a Sequence Generator.

–> Copy the New Flag ports to the Expression
–> Copy the Update Flag ports to the Expression and connect to the target

Product Dimension – SCD Type 1:

Note: Write an expression for NULL handling for all the ports except the following ports.
1. INS_Upd_Date
2. INS_Upd_User

Employee Dimension – SCD Type 2:

Create a mapping with the name m_Employee_Dim. Drop the source definition T_Employee. Drop the target definition as three instances:

Dm_T_Employee_Dim (new record to insert)
Dm_T_Employee_Dim 1 (update record to insert)
Dm_T_Employee_Dim 2 (update record to update the end date)

Drop the exception table. Create a transformation of type Expression to handle the null values. Create a Router T/R to pass the correct data to one Expression T/R and the exception data to another Expression T/R.
NOTE: define the null handling on all the ports except the following ports: Emp_Insert_Date, Emp_Update_Date, INS_Upd_User.

LOADING THE ACCOUNT DIMENSION – SCD TYPE 2:

Note: In the Expression T/R write an expression to handle the nulls on all ports except rp_details_last_modified_date and rp_details_last_modified_by. From the Lookup transformation click on Condition; from the Properties tab set the connection (BSR). In the Expression T/R create:

Port Name     Expression
New_Flag      IIF(ISNULL(account_key), 'True', 'False')
Update_Flag   IIF(NOT ISNULL(account_key) AND ISNULL(end_date) AND (client_name != src_client_name), 'True', 'False')

FACT table loading: There are three sources to load the data into the fact table.
1. Client order
2. Client allocation
3. Client execution

Creation of source definition:

From the Tools menu select Source Analyzer; from the Sources menu select Create; enter the name stg_transaction_detail_fact; select database type Oracle; click Create and Done. Double click the source definition and select the Columns tab.

Column name           Data type   Precision   Scale   Not null   Key type
Branch                Varchar2    4
Account number        Varchar2    10
Deport                Varchar2    2
Client-flag           Varchar2    3
Counter party-flag    Varchar2    3
Full date             Date
Emp-web-SSO-ID        Varchar2    200
Market                Varchar2    50
Product-ISIN          Varchar2    200
Client-order-amount   Number      20
Allocation-amount     Number      20
Execution-amount      Number      20
Pending-amount        Number      20

Create a mapping with the name m_transaction_detail_fact. Drop the source and target definitions. Create the transformations. Double click the SQ transformation and select the Properties tab.

Transformation Attribute   Value
SQL Query                  (inline query to join the client source tables)

Click Apply, click OK. Double click on the Expression T/R and select the Ports tab. Uncheck the output ports (except client order amount, allocation amount, execution amount). Create output ports to perform the trim operation:

Branch      I
O_Branch    O    LTRIM(RTRIM(Branch))
(similarly for all ports of this type)
Full_Date   O    TO_CHAR(Full_Date)

Create a transformation of type Expression named exp_date_conversion. From the first Expression transformation copy the ports to this Expression transformation. Double click on the Expression T/R and select the Ports tab. Uncheck the output port for the port named O_Full_Date and create an output port with the name Full_Date with the following expression:

TO_DATE(TO_CHAR(full_date, 'mm-dd-yyyy'), 'mm-dd-yyyy')

Click Apply and click OK. Create Lookup T/Rs which perform lookups on date_dim, market_dim, account_dim, product_dim and employee_dim. From the Lookup transformations copy the following ports to an Expression transformation: date_key, product_key, account_key, market_key, employee_key. From the Expression transformation (from the source) copy the 13 ports to this Expression T/R. Create a transformation of type Lookup which performs a lookup on the target table (transaction_detail_fact). From the Expression T/R copy the following ports to the Lookup transformation (branch, account_no, deport, client_flag, counterparty_flag, full_date, market_code, product_ISIN, emp_web_SSO_ID). Create a transformation of type Expression. From the source copy the entire source to the Expression transformation. From the Lookup T/R copy transaction_sequence_ID. Double click on the Expression T/R and select the Ports tab.

New_Flag      O    IIF(ISNULL(transaction_sequence_ID), 'True', 'False')
Update_Flag   O    IIF(NOT ISNULL(transaction_sequence_ID), 'True', 'False')

Click Apply, click OK.

Create a transformation of type Router. From the Expression T/R copy all ports to the Router T/R. Double click on the Router and select the Groups tab.

Group Name    Group Filter Condition
New_Flag      new_flag = 'True'
Update_Flag   update_flag = 'True'

Click Apply and click OK.

Define the new record flow: Create transformations of type Expression, Update Strategy and Sequence Generator.

Define the update flow: Create transformations of type Expression and Update Strategy.

SQL Transformation: Create a target table with the below structure.

SQL> CREATE TABLE EMP_SQL (
       EMPNO     NUMBER(4),
       ENAME     VARCHAR2(15),
       JOB       VARCHAR2(15),
       DEPTNO    NUMBER(4),
       HIREDATE  DATE,
       SAL       NUMBER(7),
       COMM      NUMBER(5),
       MGR       NUMBER(4)
     );

Go to the Tools menu, click on Target Designer and import the target table.

EMP_SQL:

Create a mapping with the name m_emp_SQL. Drag and drop the source emp and the target (EMP_SQL) to the Mapping Designer. Go to the Transformation menu, click on Create and select the transformation type SQL T/R.

Click on Create and select Query mode. The DB type is Oracle; click OK.

Double click on the SQL transformation and click on SQL Ports.

Input ports:
  EmpNo      Number

Output ports:
  EmpNo      Number
  EmpName    Varchar2
  Job        Varchar2
  DeptNo     Number
  HireDate   Date
  Sal        Number
  Comm       Number
  MGR        Number

Click on SQL Query and write the SQL query in that window, as below:

SELECT empno, ename, job, deptno, hiredate, sal, comm, mgr
FROM emp
WHERE empno = ?empno?

Click OK. Click Apply and click OK. Connect empno from the SQ to the SQL T/R input port. Connect all the output columns from the SQL T/R to the target. At session level, provide the relational connection information (BSR_Reader) to the SQL T/R.

EVENT WAIT AND EVENT RAISE in Informatica

EVENT WAIT AND EVENT RAISE:

Create a workflow with the name w_event_wait_raise. Go to the Task menu and click on Create. Select the task type Event Wait, enter the name event_source and click Create. Drag the sessions s10, s20 and s30 to the Workflow Designer. Double click on the Event Wait task and click on the Events tab. Enter the name of the file to watch, D:/path/filename.txt, and click OK.

Tasks and types of tasks:

Event Raise task: The Integration Service raises the user-defined events during the workflow run.

– The user-defined events are declared in the workflow properties Events tab.
– The Event Raise task is always used in conjunction with an Event Wait task.
– The Event Raise task sends a signal to the Event Wait task that a particular set of predetermined events has occurred.
– A user-defined event is defined as the completion of the tasks from the Start task to the Event Raise task.

Event Wait task: An Event Wait task waits for a specific event to occur to start the next task in the workflow.

– It pauses the processing of the pipeline until a specified event occurs.
– Two types of events can be defined:
1. File watch event (predefined event)
2. User defined event

Procedure:

– From the Tools menu select Workflow Designer.
– From the Workflow menu select Create.
– Enter the workflow name w_star_load.
– Select the Events tab.
– From the toolbar click on New.
– Enter the event name dim_load_complete and click OK.
– Drop the sessions beside the workflow.
– From the Workflow Designer select Tasks, select Event Raise and Event Wait.
– Create the task types Event Raise and Event Wait.

Make the links between the tasks as described in the workflow diagram. Double click on the Event Raise task and select the Properties tab.

Attribute            Value
User Defined Event   dim_load_complete

From the Workflow Designer select Tasks, select Command and Event Wait. Double click on the Command task and select the Command tab. From the toolbar click on Add a New Command.

Name      Command
Success   copy e:/result.txt to c:/batch 4pm

Click on Apply and click OK. Double click on the Event Wait task and select the Events tab. Select the predefined event and enter the name of the file to watch: C:\batch 4pm\result.txt. Click on Apply and click on OK.

Assignment task: It allows you to assign values or expressions to the user-defined workflow variables. The user-defined workflow variables are declared in the workflow properties Variables tab.

Decision task: The decision task allows you to define a condition; the condition is evaluated by the Integration Service and returns true or false. The decision task has a predefined condition variable called $decision_task_name.Condition, which is assigned the result of the decision condition.

– Use the decision task instead of using multiple link conditions.

Procedure:

– From the Tools menu select Workflow Designer; from the Workflow menu select Create.
– Enter the workflow name w_daily_weekly_wad.
– Select the Variables tab; from the toolbar click on Add a New Variable.

Name         Data type   Persistent
$$wkf_runs   Integer     Yes

Enter the default value = 0 below and click OK.

– Drop the sessions beside the workflow.
– Create the task types Assignment and Decision.
– Make links between the tasks as described above.
– Define the link condition between S10 and the Assignment task: $S10.Status = SUCCEEDED.
– Double click on the Assignment task and select the Expressions tab.
– From the toolbar click on Add a New Expression.

User Defined Variable   Operator   Expression
$$wkf_runs              =          $$wkf_runs + 1

Click on Apply and click OK.

Define the link condition between the Assignment and the Decision task: $assignment.Status = SUCCEEDED. Double click on the Decision task and select the Properties tab.

Attribute       Value
Decision Name   MOD($$wkf_runs, 7) = 0

Click on Apply and click OK. Define the link condition between the Decision task and S20: $Decision.Condition = TRUE. For example, on the seventh run MOD($$wkf_runs, 7) = 0 evaluates to true, so the weekly session S20 executes. To view the persistent value, right click on the workflow and select View Persistent Values.

ETL Project Architecture in Informatica

ETL Project Architecture: There are two stages defined in the current project architecture.
1. ETL Stage 1
2. ETL Stage 2

Data Profiling (ODS):

It is the process of studying and analyzing the source data. We can detect records with null values, duplicate records, inconsistent data and data definition issues.

Key Points – Sequence Generator:

1) Start Value – first value
2) Increment By – amount to increment at each iteration
3) End Value – highest value to use
4) Cycle – if checked, the sequence generator returns to the start value when the end value is reached; otherwise it stops
5) Number of Cached Values – enables storing multiple values when the same sequence generator is used in multiple sessions simultaneously
6) Reset – if checked, each session returns to the start value; otherwise each new session continues from the last stored value
7) Tracing Level – level of detail to be written to session logs
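For comparison, an Oracle sequence exposes analogous options; the sequence name and limits below are only an illustrative sketch, not part of any mapping in this document:

SQL> CREATE SEQUENCE dim_key_seq
       START WITH 1        -- Start Value
       INCREMENT BY 1      -- Increment By
       MAXVALUE 999999     -- End Value
       CYCLE               -- return to the start value when the end value is reached
       CACHE 20;           -- roughly comparable to Number of Cached Values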

Design a Mapping to perform Round robin Loading:

Scenario 2:

Workflow scenario: A workflow has 5 sessions. Run the first 4 sessions in parallel; if all four sessions succeed, then execute the 5th session. Hint: use a link condition.

Dynamic LOOKUP Cache: The Integration Service inserts records and updates records in the cache. Use a dynamic LOOKUP cache for eliminating duplicates or for implementing slowly changing dimensions Type 1. The dynamic LOOKUP cache is used when you perform a LOOKUP on the target table. The dynamic LOOKUP transformation allows for the synchronization of the dynamic target LOOKUP table image in memory with its physical table in the database.

New LOOKUP Row   Description
0                The Integration Service does not update or insert the row in the cache
1                The Integration Service inserts the row into the cache
2                The Integration Service updates the row in the cache

Key Points:

The LOOKUP transformation's associated port matches a LOOKUP input port with the corresponding port in the LOOKUP cache. The "Ignore NULL inputs for updates" option should be checked for ports where NULLs should be ignored. The "Ignore in comparison" option should be checked for any port that is not to be compared. The "New LOOKUP Row" flag indicates the type of row manipulation of the cache: if an input row creates an insert in the LOOKUP cache, the flag is set to "1"; if an input row creates an update of the LOOKUP cache, the flag is set to "2"; if no change is detected, the flag is set to "0". A Filter or Router T/R can be used with an Update Strategy to set the proper row flag to update a target table.
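The insert-else-update behaviour that the dynamic cache applies to the target is roughly what a SQL MERGE does. The sketch below uses the EMP source and EMP_DC target from the procedure that follows, and leaves out the EMPKEY surrogate key for brevity; it is an illustration, not what the Integration Service actually executes:

SQL> MERGE INTO emp_dc t
     USING (SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno FROM emp) s
     ON (t.empno = s.empno)
     WHEN MATCHED THEN
       UPDATE SET t.ename = s.ename, t.job = s.job, t.sal = s.sal, t.deptno = s.deptno
     WHEN NOT MATCHED THEN
       INSERT (empno, ename, job, sal, deptno)
       VALUES (s.empno, s.ename, s.job, s.sal, s.deptno);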

Procedure: Source definition emp. Create a target definition with the name emp_DC (EmpKey + all 8 columns).

Create a mapping with the name m_employee_dynamic_cache.

– Drop the source definition emp.
– Drop the target definition as two instances.
– Create a transformation of type LOOKUP which performs a LOOKUP on the target table (emp_DC).

From SQ_emp copy all the ports to the LOOKUP transformation. Double click on the LOOKUP transformation and select the Condition tab.

LOOKUP Table Column   Operator   Transformation Port
EMPNO                 =          EmpNo

Select the Properties tab.

Transformation Attribute   Value
Connection Information     BSR_Writer
Dynamic Lookup Cache       (checked)
Insert Else Update         (checked)

Select the Ports tab. For the port named EmpKey select the data type Integer.

Port Name   Associated Port
Ename       Ename1
Job         Job1
MGR         MGR1
Hiredate    Hiredate1
Sal         Sal1
Comm        Comm1
Deptno      Deptno1

Click Apply and click OK. Create a transformation of type Router. From the LOOKUP transformation copy the following ports to the Router transformation (NewLookupRow, EmpKey, EmpNo, …, Deptno). Double click the Router transformation and select the Groups tab.

Group Name   Group Filter Condition
New          NewLookupRow = 1
Update       NewLookupRow = 2

Click Apply and click OK.

New Record Flow: Create a transformation of type Update Strategy. From the New output group copy the ports (except NewLookupRow) to the Update Strategy and develop the expression DD_INSERT. From the Update Strategy transformation connect the ports to the target.

Update Flow: Create a transformation of type Update Strategy. From the Update output group copy the ports to the Update Strategy transformation and develop the expression DD_UPDATE. From the Update Strategy transformation connect the ports to the target.

Unconnected Stored Procedure – Drop and Create Index: Create the following two stored procedures in the target database account.

CREATE OR REPLACE PROCEDURE Emp_Create_Index
( V_Table_Name      IN VARCHAR2,
  V_Index_Col_Name  IN VARCHAR2,
  V_Index_Name      IN VARCHAR2 )
AS
BEGIN
  EXECUTE IMMEDIATE 'CREATE INDEX ' || V_Index_Name || ' ON ' || V_Table_Name || ' (' || V_Index_Col_Name || ')';
END;
/

Procedure 2:

CREATE OR REPLACE PROCEDURE Emp_Drop_Index
( V_Index_Name IN VARCHAR2 )
AS
BEGIN
  EXECUTE IMMEDIATE 'DROP INDEX ' || V_Index_Name;
END;
/

Set Up the Target Table Definition With an Index:

SQL> CREATE TABLE emp_tgt AS SELECT * FROM scott.emp WHERE 1 = 2;
SQL> SELECT index_name FROM user_indexes WHERE table_name = 'EMP_TGT';
SQL> CREATE INDEX emp_idx ON emp_tgt (empno);
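Once both procedures exist they can be tested from SQL*Plus before being wired into the mapping; the arguments below are the EMP_TGT, EMPNO and EMP_IDX names created above:

SQL> EXEC Emp_Drop_Index('EMP_IDX');
SQL> EXEC Emp_Create_Index('EMP_TGT', 'EMPNO', 'EMP_IDX');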

Mapping Design: Create a source definition with the name emp. Create a target definition with the name emp_tgt [using the Target Designer tool]. Create a mapping with the name m_drop_create_index. Drop the source and target definitions. From the Source Qualifier connect the ports to the target. Create two stored procedure transformations with the names SP_Drop_Index and SP_Create_Index.

Double click on the stored procedure transformation with the name SP_Drop_Index and select the Properties tab.

Transformation Attribute   Value
Stored Procedure Type      Target Pre-Load
Call Text                  emp_drop_index("EMP_IDX")
Connection Information     BSR_Writer

Click Apply and click OK. Double click on the stored procedure transformation with the name SP_Create_Index and select the Properties tab.

Transformation Attribute   Value
Stored Procedure Type      Target Post-Load
Call Text                  emp_create_index("EMP_TGT", "EMPNO", "EMP_IDX")
Connection Information     BSR_Writer

Click Apply and Click Ok

Unconnected Stored Procedure – Returning Multiple Output Ports: Execute the following stored procedure in the source database account (username scott).

SQL> CREATE OR REPLACE PROCEDURE PROC_VAR
     ( V_EmpNo   IN  NUMBER,
       TOTALSAL  OUT NUMBER,
       TAX       OUT NUMBER,
       HRA       OUT NUMBER )
     AS
     BEGIN
       SELECT SAL + NVL(COMM, 0), SAL * 0.1, SAL * 0.4
       INTO TOTALSAL, TAX, HRA
       FROM EMP
       WHERE EMPNO = V_EmpNo;
     END;
     /
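A quick way to verify the procedure from SQL*Plus before calling it through the unconnected stored procedure transformation; empno 7369 is just a sample key from SCOTT.EMP, and the variable names are illustrative:

SQL> SET SERVEROUTPUT ON
SQL> DECLARE
       v_totalsal NUMBER;
       v_tax      NUMBER;
       v_hra      NUMBER;
     BEGIN
       PROC_VAR(7369, v_totalsal, v_tax, v_hra);
       DBMS_OUTPUT.PUT_LINE(v_totalsal || ' ' || v_tax || ' ' || v_hra);
     END;
     /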

Create a source definition with the name emp. Create a target definition with the name emp_prc_var (EmpNo, EName, Job, Sal, TOTALSAL, TAX, HRA, DeptNo). Create a mapping with the name m_prc_var and drop the source and target definitions. Create transformations of type Stored Procedure and Expression. From the Source Qualifier copy the required ports to the Expression transformation. Double click on the Expression transformation and select the Ports tab.

Port name   Data type   Precision   Scale   I/O/V   Expression
V_TAX       Decimal     7           2       V
V_HRA       Decimal     7           2       V
TOTALSAL    Decimal     7           2       O       :SP.PROC_VAR(EmpNo, PROC_RESULT, V_TAX, V_HRA)
TAX         Decimal     7           2       O       V_TAX
HRA         Decimal     7           2       O       V_HRA

Click Apply and click OK.

From the Expression transformation connect the ports to the targets. From the Repository menu click on Save.

SHORTCUTS:
–> A shortcut is a reusable component.
–> To create a shortcut, first we have to create a shareable folder.
–> Open the destination folder.
–> Drag the object (source, target, mapping) from the shared folder into the destination folder, i.e. into the Source Analyzer or Mapping Designer.
–> You can create a shortcut to a shared folder in the same repository.
–> A shortcut created in the same repository is known as a local shortcut.

Creation of a Shareable Folder: Open the Repository Manager client. From the Folder menu select Create, enter the folder name AXEDW (any name), select Allow Shortcut and click OK. Open the PowerCenter Designer client and activate the shareable folder (AXEDW). Create a source definition with emp in the Source Analyzer tool. Activate the destination folder where we want the shortcut. From the shareable folder drag the source definition emp and drop it on the Source Analyzer workspace which belongs to the destination folder. Click on Yes.

Version Control:
–> By using version control we maintain the history of the metadata objects.
–> A versioned repository stores multiple versions of an object.
–> Each version is a separate object with a unique number.
–> You can perform the following change management tasks to create and manage multiple versions of objects in the repository.
1. Check in
2. Check out

Check in:
– You must save an object before you can check it in.
– When you check in an object the repository creates a new version of the object and assigns it a version number.
– The repository increments the version number when you check in an object.

Check out:
– To edit an object you must check out the object.
– When you check out an object the repository obtains a write-intent lock on the object.
– No other users can edit the object while you have it checked out.

Check in: Select the mapping m_customer_dimension, right click on that mapping, select Versioning and click on Check In. Enter the name, date and goal.

Click on Apply All.

Check out: Select the mapping m_customer_dimension, right click on that mapping, select Versioning and click on Check Out. Enter the comments.

Click on Apply to All.

From the Versioning menu select the two versions of the mapping, right click on Compare, and select the selected versions.

Click on Save File, enter the file name 4pm_compare and click on Save.

Partition Points Overview:
–> Partition points mark the boundaries between threads in a pipeline; the Integration Service redistributes rows of data at partition points.
–> You can edit partition points to increase the number of transformation threads and increase session performance.

Types of Partition:
1. Key range partition
2. Pass-through partition
3. Round robin partition
4. Hash partition
5. Database partition

1) Key Range Partition:
–> With key range partitioning, the Integration Service distributes rows of data based on a port that you define as the partition key.
–> For each port you define a range of values.
–> The Integration Service uses the key and ranges to send rows to the appropriate partition.

(1) How many repositories are created in real time? Answer: minimum 3, maximum 5.
1. Designer (development)
2. Testing (QA)
3. Pre-Production
4. Production

Procedure:
–> Create a mapping with the name m_emp_partition.
–> Drag and drop your source and target emp_partition to the Mapping Designer.
–> Copy all the columns to SQ_emp and connect to the target.
–> Double click on the session and click on the Mapping tab.
–> From the left pane click on the Partitions tab (below).
–> Select the Source Qualifier (SQ_emp) and click on Edit Partition Point.
–> Select the partition type Key Range.

Name          Destination
Partition 1
Partition 2
Partition 3

Click on OK. Click on Edit Keys, select the SAL column, click on Add, click OK. Click on the target emp_partition and click on Edit Partition Point. Select the partition type Key Range and click on OK. Select the SAL column and click on Add.

Click OK and save to the repository.

Business Requirements:

Scenario 1:
Client Name         BSR (customer name)
Market              HYD
Product             IBM share
Client order        5000
Account             RAM (a person from Bank of America)
Client allocation   500
Client execution    500
Pending Amount      0

Scenario 2:
Client Name
Market              HYD
Product             IBM share
Client order        1000
Account             RAM (a person from Bank of America)
Client allocation   1000
Client execution    700
Pending Amount      300

In these scenarios the pending amount equals the client allocation minus the client execution.

Scenario 3: Multiple client order and single market order

Scenario 4: Designing the Database from Requirements: The data mart is designed with the following dimension tables and fact table.

DM_T_DATE_DIM
DM_T_MARKET_DIM
DM_T_PRODUCT_DIM
DM_T_ACCOUNT_DIM
DM_T_EMPLOYEE_DIM
TRANSACTION_DETAIL_FACT

* ERwin is a data modeling tool
* TOAD is a database tool

ETL Stage 1 Implementation: The source is defined with flat files; the following are the delimited flat files which provide the data for extraction.

Accounts.txt
Market.txt
Product.txt
Client_Allocation.txt
Client_Execution.txt
Client_Order.txt
Employee.txt

Defining the Staging Database: Define the staging database as a target to perform the data profiling. Create the following tables in the staging database account.

List of Tables:
T_Product
T_Account
T_Employee
T_Market
Client_Order
Client_Allocation
Client_Execution
T_Date

ETL Stage 1 Mappings: Design the simple pass-through mappings which migrate the data from source to staging.

MT_stg_Account_Flatfile_Ora (source to stage)
MT_stg_Client_Allocation_Flatfile_Ora (source to stage)
MT_stg_Client_Execution_Flatfile_Ora (source to stage)
MT_stg_Client_Order_Flatfile_Ora (source to stage)
MT_stg_Employee_Flatfile_Ora (source to stage)
MT_stg_Market_Flatfile_Ora (source to stage)
MT_stg_Product_Flatfile_Ora (source to stage)

Defining Source Metadata: Log on to Oracle with the user account system.
–> Create a user account with the name SRC.
–> Execute the following source table structures:
Product_SRC
Account_SRC
Employee_SRC
Market_SRC
Client_Order_SRC
Client_Allocation_SRC
Client_Execution_SRC
Create an ODBC connection with the name SRC_ODBC. Import the source table definitions from the user account SRC. Convert the table definitions to flat files.

Defining Staging Metadata: Log on to Oracle with the user account system.

Create a user account stg. Execute the following table structures:
T_Product
T_Account
T_Employee
T_Market
Client_Order
Client_Allocation
Client_Execution
T_Date

Create the following stored procedure to populate the data into the stage table T_Date.

SQL> CREATE OR REPLACE PROCEDURE date_stg_proc AS
       v_start_date DATE := TO_DATE('01-01-2007', 'dd-mm-yyyy');
     BEGIN
       FOR i IN 1..2000 LOOP
         INSERT INTO t_date VALUES (v_start_date);
         v_start_date := v_start_date + 1;
       END LOOP;
     END;
     /

Execute the above stored procedure in the staging database.

SQL> EXEC date_stg_proc;
SQL> COMMIT;

ETL Stage 2 Implementation: Source System: Define the staging database source system with the following tables:
T_Account
T_Product
T_Market
T_Employee
Client_Order
Client_Execution
T_Date

Defining the Target System: Log on to Oracle with the user account system.

SQL> CREATE USER CDM IDENTIFIED BY CDM;
SQL> GRANT DBA TO CDM;
SQL> CONNECT CDM/CDM

Execute the following tables:
DM_T_ACCOUNT_DIM
DM_T_ACCOUNT_DIM_EXCEP
DM_T_DATE_DIM
DM_T_EMPLOYEE_DIM
DM_T_EMPLOYEE_DIM_EXCEP
DM_T_PRODUCT_DIM
DM_T_PRODUCT_DIM_EXCEP
DM_T_MARKET_DIM
DM_T_MARKET_DIM_EXCEP
TRANSACTION_DETAIL_FACT

Create the ODBC connection with the name CDM_ODBC. Import the target table definitions from the CDM user account.
–> For the product and exception product tables, change the product key and exception key so that they come first, and generate the SQL.

Designing a mapping with SCD implementation in Informatica. Design a mapping with SCD Type 1 implementation:

A Type 1 dimension keeps only the current data in the target. It doesn't keep any history.

Procedure:

Create a source definition with the name EMP. Create a target definition with the name EMP_DIM_Type1 (Emp key, Emp no, ename, job, sal).

Note: Emp key should be the primary key. Create a mapping with the name M_EMPLOYEE_DIM_Type1, drop the source definition EMP, and drop the target definition as 2 instances:

1. EMP_DIM_Type1 (insert)
2. EMP_DIM_Type1 (update)

From the Transformation menu select Create. Select the transformation type Lookup, enter the name LKP_TRG and click on Create. From the targets select the table EMP_DIM_Type1, click OK, and click Done. From the source qualifier (SQ_EMP) copy the port empno to the Lookup transformation. Double click on the Lookup transformation, select the Condition tab and from the toolbar click on Add a New Condition.

Lookup Table Column   Operator   Transformation Port
EMPNO                 =          EMPNO1

Click Apply, click OK. Create a transformation of type Expression (Exp_Src_Lookup). From SQ_EMP copy the following ports to the Expression transformation: empno, ename, job, sal. From the Lookup transformation copy the port Emp key to the Expression transformation. Double click on the Expression transformation and select the Ports tab.

Port Name     Data type   Precision   I/O/V   Expression
New_Flag      String      10          O       IIF(ISNULL(Emp_key), 'True', 'False')
Update_Flag   String      10          O       IIF(NOT ISNULL(Emp_key), 'True', 'False')

Create a transformation of type Router. From the Expression transformation copy the ports to the Router transformation, double click on the Router transformation and select the Groups tab.

Group Name      Group Filter Condition
New_Record      New_Flag = 'True'
Update_Record   Update_Flag = 'True'

Click Apply and click OK.

Defining new records data flow:

Create transformations of type Expression, Update Strategy and Sequence Generator. From the Router transformation's New Record output group copy the ports to the Expression transformation (emp no, ename, job, sal). From the Expression transformation copy the ports to the Update Strategy transformation. Double click on the Update Strategy transformation and select the Properties tab.

Transformation Attribute     Value
Update Strategy Expression   DD_INSERT (or 0)

Click Apply and click OK. From the Update Strategy transformation connect the ports to the target; from the Sequence Generator transformation connect the NEXTVAL port to the EMP key of the target table.

Defining update record data flow:

Create transformations of type Expression and Update Strategy. From the Router transformation's Update Record output group copy the following ports to the Expression transformation (Emp key, emp no, ename, job, sal).

From the Expression transformation copy the ports to the Update Strategy transformation. Double click on the Update Strategy transformation and select the Properties tab.

Transformation Attribute     Value
Update Strategy Expression   DD_UPDATE

From the Update Strategy transformation connect the ports to the target; from the Repository menu click on Save. Create a session with the name S_M_EMPLOYEE_DIM_Type1, double click the session and select the Properties tab.

Attribute                  Value
$Source connection value   Batch 4pm - source - reader
$Target connection value   Batch 4pm - target - writer

Select the Mapping tab, set the reader and writer connections, click Apply and click OK.

SQL> UPDATE emp SET sal = sal + 11000 WHERE empno = 10;
SQL> COMMIT;

Design a mapping with SCD Type 2 implementation:

A Type 2 dimension stores the complete history in the target; for each update it inserts a new record in the target. There are 3 different methods to maintain the history (a query sketch follows below):
1. Keep the version number in a separate column
2. Mark the current dimension record with a flag
3. Use a start date and end date [date range]
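As a sketch of what method 1 (version numbers) produces, the current row for each employee in the EMP_Type2 target can be selected with a query like the one later used for the lookup SQL override; this is illustrative SQL, not part of the mapping itself:

SQL> SELECT *
     FROM emp_type2 t
     WHERE t.version = (SELECT MAX(i.version)
                        FROM emp_type2 i
                        WHERE i.empno = t.empno);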

Procedure:

Create a source definition with the name EMP. Create a target definition with the name EMP_Type2 (emp key, emp no, ename, sal, job, deptno, version). Create a mapping with the name M_EMP_DIM_TYPE2. Drop the source definition. Drop the target definition as 2 instances:
1. EMP_Type2 (new record to insert)
2. EMP_Type2 (update record as insert)

From the Transformation menu select Create, select the transformation type Lookup, enter the name LKP_TRG and click on Create. From the targets select the table EMP_Type2, click OK, click Done. From SQ_EMP copy the port EMPNO to the Lookup transformation. Double click on the Lookup transformation, select the Condition tab and from the toolbar click on Add a New Condition.

Lookup Table Column   Operator   Transformation Port
EMPNO                 =          EMPNO1

Click Apply and click OK. Create a transformation of type Expression; from the source qualifier copy the ports to the Expression transformation. From the Lookup transformation copy the following ports to the Expression transformation (Emp key, sal, version). Double click on the Expression T/R and select the Ports tab.

Port Name     Data type   Precision   I/O/V   Expression
New_Flag      String      10          O       IIF(ISNULL(Emp_key), 'True', 'False')
Update_Flag   String      10          O       IIF(NOT ISNULL(Emp_key) AND (Sal != Sal1), 'True', 'False')

Click Apply and click OK. Create a transformation of type Router; from the Expression T/R copy the following ports to the Router transformation (EmpNo, Ename, job, sal, deptno, version, New_Flag, Update_Flag). Double click the Router T/R and select the Groups tab.

Group Name      Group Filter Condition
New_Record      New_Flag = 'True'
Update_Record   Update_Flag = 'True'

Click Apply and click OK.

Defining new record dataflow:

Create transformations of type Expression, Update Strategy and Sequence Generator. From the Router T/R's New Record output group copy the following ports to the Expression transformation: emp no, ename, job, sal, deptno. Double click on the Expression T/R and select the Ports tab.

Port Name   Data type   Precision   I/O/V   Expression
Version     Decimal     5           O       0

Click Apply, click OK.

From the Expression T/R copy the ports to the Update Strategy T/R. Double click on the Update Strategy T/R and select the Properties tab.

Transformation Attribute     Value
Update Strategy Expression   0 (or DD_INSERT)

Click Apply, click OK. From the Update Strategy T/R connect the ports to the target. From the Sequence Generator T/R connect the NEXTVAL port to the EMP key of the target table.

Defining the update records as insert:

Create transformations of type Expression and Update Strategy. From the Router T/R's Update Record output group copy the following ports to the Expression T/R (EmpNo, Ename, job, sal, deptno, version). Double click on the Expression T/R, select the Ports tab, uncheck the output port for the port named version, and from the toolbar click on Add a New Port.

Port Name   Data type   Precision   I/O/V   Expression
Version     Decimal     5           O       version + 1

Click Apply and click OK. From the Expression T/R copy the ports to the Update Strategy T/R and develop the following update strategy expression: DD_INSERT. From the Update Strategy T/R connect the ports to the target.

From the Sequence Generator T/R connect the NEXTVAL port to the EMP key of the target table.

Lookup SQL Override:

Double click on the Lookup T/R and select the Properties tab.

Transformation Attribute   Value
Lookup SQL Override        SELECT Out.Emp_key AS Emp_key,
                                  Out.Ename AS Ename,
                                  Out.Job AS Job,
                                  Out.Sal AS Sal,
                                  Out.Deptno AS Deptno,
                                  Out.Version AS Version,
                                  Out.Empno AS Empno
                           FROM Emp_Type2 Out
                           WHERE Out.Emp_key = (SELECT MAX(Inn.Emp_key)
                                                FROM Emp_Type2 Inn
                                                WHERE Inn.Empno = Out.Empno)

This override returns only the latest row (the highest Emp_key) for each employee, so the lookup always compares the incoming row against the current version.
