Saturday, 21 January 2017

Informatica Interview Question and Answers



1.
What is Data warehouse?

According to Bill Inmon, known as the father of data warehousing, “A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management’s decision-making process.”

2.
What are the types of data warehouses?

There are three types of data warehouses:
  • Enterprise Data Warehouse
  • ODS (operational data store)
  • Data Mart

3.
What is Data mart?

A data mart is a subset of a data warehouse that is designed for a particular line of business, such as sales, marketing, or finance. In a dependent data mart, data is derived from an enterprise-wide data warehouse. An independent data mart collects data directly from the sources.

4.
What is star schema?

A star schema is the simplest form of data warehouse schema; it consists of a central fact table connected to one or more dimension tables.

5.
What is snow flake schema?

A snowflake schema is one fact table connected to a number of dimension tables, with the dimension tables further normalized into multiple related tables. The snowflake and star schemas are methods of storing data that are multidimensional in nature.
6.
What are ETL Tools?

ETL stands for Extraction, Transformation, and Loading of data into the data warehouse for decision making. ETL tools implement the methods involved in accessing and manipulating source data and loading it into the target database.

7.
What are Dimension tables?

Dimension tables contain attributes that describe fact records in the fact table.

8.
What is data Modelling?

Data modeling is representing a real-world set of data structures or entities and their relationships in the form of a data model, as required for a database. Data modeling comes in various types:
  • Conceptual data modeling
  • Logical data modeling
  • Physical data modeling
  • Enterprise data modeling
  • Relational data modeling
  • Dimensional data modeling.

9.
What is Surrogate key?

A surrogate key is a substitute for the natural primary key. It is just a unique identifier or number for each row that can be used as the primary key of the table.
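As an illustrative sketch (not Informatica syntax), a surrogate key is typically just a sequence number assigned by the warehouse as rows are loaded, independent of any natural key coming from the source system; the table and column names below are hypothetical:

```python
import itertools

# Hypothetical source rows keyed by a natural key (customer code).
source_rows = [
    {"cust_code": "A100", "name": "Alice"},
    {"cust_code": "B200", "name": "Bob"},
]

# The warehouse generates surrogate keys; they are never taken from the source.
surrogate_seq = itertools.count(start=1)

dim_customer = [{"cust_sk": next(surrogate_seq), **row} for row in source_rows]

print(dim_customer[0]["cust_sk"], dim_customer[1]["cust_sk"])  # 1 2
```

The natural key (cust_code) is kept only as an attribute; joins to fact tables would use cust_sk.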

10.
What is Data Mining?

Data mining is the process of analyzing data from different perspectives and summarizing it into useful information.
11.
What is Operational Data Store?

An ODS is an operational data store, which comes as a second layer in a data warehouse architecture. It has the characteristics of both OLTP and DSS systems.

12.
What is the Difference between OLTP and OLAP?

OLTP (Online Transaction Processing) systems contain normalized tables.
OLAP (Online Analytical Processing) contains the history of OLTP data, is non-volatile, and acts as a decision support system.

13.
How many types of dimensions are available in Informatica?

There are three types of dimensions:
  • Junk dimension
  • Degenerate Dimension
  • Conformed Dimension

14.
What is Difference between ER Modeling and Dimensional Modeling?

ER modeling is used for normalizing the OLTP database design.
Dimensional modeling is used for de-normalizing the ROLAP/MOLAP design.

15.
What is a mapplet?

A mapplet is a set of transformations that you build in the Mapplet Designer and can reuse in multiple mappings.
16.
What is Session and Batches?

Session: A session is a set of instructions that tells the server how to move data to the target.
Batch: A batch is a set of one or more tasks (sessions, event wait, email, command, etc.).

17.
What are slowly changing dimensions?

Dimensions that change over time are called Slowly Changing Dimensions (SCD).
  • Slowly Changing Dimension Type 1: keeps only current records.
  • Slowly Changing Dimension Type 2: keeps current records plus full historical records.
  • Slowly Changing Dimension Type 3: keeps current records plus one previous record.
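The three SCD types can be sketched in plain Python to show the logic only; this is not Informatica syntax, and all names and rules here are hypothetical:

```python
from datetime import date

def scd_type1(dim, key, new_attrs):
    """Type 1: overwrite in place - only the current record survives."""
    dim[key] = new_attrs

def scd_type2(history, key, new_attrs):
    """Type 2: expire the current row and append a new one - full history."""
    for rec in history:
        if rec["key"] == key and rec["end_date"] is None:
            rec["end_date"] = date.today()          # expire the current record
    history.append({"key": key, **new_attrs,
                    "start_date": date.today(), "end_date": None})

def scd_type3(dim, key, attr, new_value):
    """Type 3: keep the current value plus exactly one previous value."""
    dim[key]["prev_" + attr] = dim[key].get(attr)
    dim[key][attr] = new_value

# Type 3 example: a customer moves city; old and new city are both kept.
dim = {"C1": {"city": "Pune"}}
scd_type3(dim, "C1", "city", "Mumbai")
print(dim["C1"])  # {'city': 'Mumbai', 'prev_city': 'Pune'}
```

In Informatica itself these patterns are built with Lookup, Expression, and Update Strategy transformations (or the SCD mapping wizards).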

18.
What are 2 modes of data movement in Informatica Server?

There are two modes of data movement:
Normal mode, in which a separate DML statement is prepared and executed for every record.
Bulk mode, in which one DML statement is prepared and executed for multiple records, which improves performance.
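The idea can be illustrated outside Informatica with any database API: per-row statements versus one batched statement. A rough sketch using Python's built-in sqlite3 module (the table is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tgt (id INTEGER, name TEXT)")
rows = [(1, "a"), (2, "b"), (3, "c")]

# "Normal mode": one DML statement executed per record.
for r in rows:
    conn.execute("INSERT INTO tgt VALUES (?, ?)", r)

# "Bulk mode": one prepared statement executed for many records at once,
# which reduces round trips and improves load performance.
conn.executemany("INSERT INTO tgt VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM tgt").fetchone()[0]
print(count)  # 6 - each three-row list was loaded once by each method
```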

19.
What is the difference between Active and Passive transformation?

Active Transformation: An active transformation can change the number of rows that pass through it from source to target, i.e., it can eliminate rows that do not meet the condition in the transformation.
Passive Transformation: A passive transformation does not change the number of rows that pass through it, i.e., it passes all rows through the transformation.

20.
What is the difference between connected and unconnected transformation?

Connected Transformation: A connected transformation is connected to other transformations or directly to the target table in the mapping.
Unconnected Transformation: An unconnected transformation is not connected to other transformations in the mapping. It is called within another transformation, and returns a value to that transformation.
21.
What are different types of transformations available in Informatica?

There are various types of transformations available in Informatica:
  • Aggregator
  • Application Source Qualifier
  • Custom
  • Expression
  • External Procedure
  • Filter
  • Input
  • Joiner
  • Lookup
  • Normalizer
  • Output
  • Rank
  • Router
  • Sequence Generator
  • Sorter
  • Source Qualifier
  • Stored Procedure
  • Transaction Control
  • Union
  • Update Strategy
  • XML Generator
  • XML Parser
  • XML Source Qualifier

22.
What is Aggregator transformation?

Aggregator transformation is an Active and Connected transformation. This transformation is useful to perform calculations such as averages and sums (mainly to perform calculations on multiple rows or groups).

23.
What is Expression transformation?

Expression transformation is a Passive and Connected transformation. This can be used to calculate values in a single row before writing to the target.

24.
What is Filter transformation?

Filter transformation is an Active and Connected transformation. This can be used to filter rows in a mapping that do not meet the condition.

25.
What is Joiner transformation?

Joiner transformation is an Active and Connected transformation. It can be used to join two sources coming from two different locations or from the same location.
26.
Why do we use Lookup transformations?

Lookup transformations can access data from relational tables that are not sources in the mapping.

27.
What is Normalizer transformation?

Normalizer Transformation is an Active and Connected transformation. It is used mainly with COBOL sources where most of the time data is stored in denormalized format. Also, Normalizer transformation can be used to create multiple rows from a single row of data.

28.
What is Rank transformation?

Rank transformation is an Active and Connected transformation. It is used to select the top or bottom rank of data.

29.
What is Router transformation?

Router transformation is an Active and Connected transformation. It is similar to the Filter transformation. The only difference is that the Filter transformation drops the data that does not meet the condition, whereas the Router has an option to capture the data that does not meet the condition. It is useful for testing multiple conditions.

30.
What is Sorter transformation?

Sorter transformation is a Connected and Active transformation. It sorts data in ascending or descending order according to a specified field.
31.
Name four output files that the Informatica server creates during a session run.

  • Session Log
  • Workflow Log
  • Errors Log
  • Bad file (reject file)

32.
Why do we use Stored Procedure transformation?

A Stored Procedure transformation is an important tool for populating and maintaining databases.

33.
What is the difference between static cache and dynamic cache?

A dynamic cache decreases performance in comparison to a static cache, because it is updated as rows are processed.
A static cache does not check for such changes; it simply passes data through as many times as it comes.

34.
Define mapping and session.

Mapping: It is a set of source and target definitions linked by transformation objects that define the rules for data transformation.
Session: It is a set of instructions that describes how and when to move data from sources to targets.

35.
Which command is used to run a batch?

pmcmd is used to start a batch.
36.
What is Datadriven?

The Informatica server follows instructions coded into Update Strategy transformations within the session mapping to determine how to flag records for insert, update, delete, or reject.

37.
What is power center repository?

The PowerCenter repository allows you to share metadata across repositories to create a data mart domain.

38.
What is parameter file?

A parameter file is a text file created with an editor such as WordPad or Notepad. You can define the following values in a parameter file:
  • Mapping parameters
  • Mapping variables
  • Session parameters
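As an illustration, a minimal parameter file might look like the sketch below. The folder, workflow, session, and parameter names are all hypothetical; the bracketed section header scopes the values to one session, names beginning with `$$` are mapping parameters/variables, and names beginning with `$` are session parameters:

```
[MyFolder.WF:wf_load_sales.ST:s_m_load_sales]
$$LoadDate=2017-01-21
$$MaxRows=10000
$DBConnection_Source=SALES_SRC
$InputFile_Orders=/data/in/orders.dat
```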

39.
What are the types of lookup caches?

  • Static cache
  • Dynamic cache
  • Persistent cache
  • Shared cache
  • Recache from database

40.
What is Stored Procedure transformation?

Stored Procedure transformation is a Passive transformation that can be Connected or Unconnected. It is useful for automating time-consuming tasks and is also used in error handling, to drop and recreate indexes, to determine free space in the database, and for specialized calculations.
41.
What is fact table?

The centralized table in a star schema is called the fact table. Facts are of three types:
  • additive
  • non-additive
  • semi additive

42.
What is Data warehouse?

According to Bill Inmon, known as the father of data warehousing, “A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management’s decision-making process.”

43.
What is Data Transformation Manager(DTM)?

After the load manager performs validations for the session, it creates the DTM process. The DTM process is the second process associated with the session run.

44.
How can you define a transformation?

A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions.

45.
What is Lookup transformation?

Lookup transformation is Passive and can be both Connected and Unconnected. It is used to look up data in a relational table, view, or synonym. The lookup definition can be imported from either source or target tables.
46.
What is Source Qualifier transformation?

Source Qualifier transformation is an Active and Connected transformation. When adding a relational or flat-file source definition to a mapping, you must connect it to a Source Qualifier transformation. The Source Qualifier performs various tasks such as overriding the default SQL query, filtering records, and joining data from two or more tables.

47.
What is the difference between a mapplet and a reusable transformation?

A mapplet consists of a set of transformations that is reusable.
A reusable transformation is a single transformation that can be reused.

48.
What is Update Strategy transformation?

Update Strategy transformation is an Active and Connected transformation. It is used to update data in the target table, either to maintain a history of the data or to keep only recent changes. You can specify how to treat source rows: insert, update, delete, or data driven.
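In a mapping, the Update Strategy expression flags each row using the DD_* constants, e.g. IIF(ISNULL(target_key), DD_INSERT, DD_UPDATE). The flagging logic can be sketched in Python; the numeric constant values are Informatica's documented ones, but the decision rule itself is hypothetical:

```python
# Informatica's documented row-flag constants.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(source_row, target_keys):
    """Hypothetical data-driven rule: insert new keys, update known
    keys, and reject rows arriving without any key."""
    if source_row.get("key") is None:
        return DD_REJECT
    if source_row["key"] in target_keys:
        return DD_UPDATE
    return DD_INSERT

target_keys = {"K1", "K2"}
print(flag_row({"key": "K3"}, target_keys))  # 0 -> insert
print(flag_row({"key": "K1"}, target_keys))  # 1 -> update
print(flag_row({"key": None}, target_keys))  # 3 -> reject
```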

49.
How many types of dimensions are available in informatica?

There are three types of schemas:
  • Star schema
  • Snowflake schema
  • Galaxy schema

50.
What is the difference between a mapplet and a reusable transformation?

Mapplet: one or more transformations; a set of transformations that is reusable.
Reusable transformation: only one transformation; a single transformation that is reusable.
51.
What are the different types of parsing?

  • Quick parsing
  • Thorough parsing

52.
What are Lookup and Fact Tables?

A lookup (dimension) table contains information about the entities. In general, the dimension and detail objects are derived from lookup tables. A fact table contains the statistical information about transactions.

53.
What is Designer?

Designer is the Business Objects product that is used to develop universes. A universe is the semantic layer of the database structure that isolates users from technical issues.

54.
What is Surrogate Key?

Surrogate keys are keys that are maintained within the data warehouse instead of keys taken from source data systems.

55.
What are the pitfalls of DWH?

  • Limited value of data (Historical data not current data)
  • DW solutions complicate business processes
  • DW solutions may have too long a learning curve
  • Costs of cleaning, capturing and delivering data
56.
How do you handle large datasets?

By using bulk utility mode at the session level and, if possible, by disabling constraints after consulting with the DBA. Using bulk utility mode means that no writing takes place in the rollback segment, so loading is faster. However, the pitfall is that recovery is not possible.

57.
What are the limitations of handling long datatypes?

When the length of a datatype (e.g. varchar2(4000)) goes beyond 4000, Informatica treats it as varchar2(2000).

58.
What are the types of OLAP?

ROLAP (Relational OLAP) - Users see their data organized in cubes and dimensions, but the data is really stored in an RDBMS; it is a storage mode that uses tables in a relational database to store multidimensional structures. Query performance is slow.
MOLAP (Multidimensional OLAP) - Users see their data organized in cubes and dimensions, and the data is really stored in an MDBMS. Query performance is fast.
HOLAP (Hybrid OLAP) - A combination of ROLAP and MOLAP. In HOLAP one will find queries on aggregated data as well as on detailed data.

59.
What is the difference between data mart and data warehouse?

A data mart is used at a business division/department level, whereas a data warehouse is used at an enterprise level.

60.
What is Meta data?

Data about data; it contains the location and description of data warehouse system components such as names, definitions, and end-user views.
61.
How does the recovery mode work in informatica?

In case of load failure, an entry is made in the OPB_SERV_ENTRY(?) table, from which the extent of loading can be determined.

62.
What is Aggregate Awareness?

Aggregate awareness is a feature of Designer that makes use of aggregate tables in a database. These are tables that contain pre-calculated data. The purpose of these tables is to enhance the performance of SQL transactions; they are thus used to speed up the execution of queries.

63.
What is a difference between OLTP and OLAP?

OLTP
  • Focuses on day-to-day transactions.
  • Data stability: dynamic.
  • Highly normalized.
  • Access frequency: high.
OLAP
  • Focuses on future predictions and decisions.
  • Data stability: static until refreshed.
  • Denormalized and replicated data.
  • Access frequency: medium to low.

64.
When should you use a star schema and when a snowflake schema?

A star schema is the simplest data warehouse schema. A snowflake schema is similar to the star schema, but it normalizes dimension tables to save storage space and can be used to represent hierarchies of information.

65.
What parameters can be tweaked to get better performance from a session?

DTM shared memory, index cache memory, data cache memory, indexing, using a persistent cache, increasing the commit interval, etc.
66.
What are the benefits of DWH?

  • Immediate information delivery
  • Data Integration from across, even outside the organization
  • Future vision of historical trends
  • Tools for looking at data in new ways
  • Enhanced customer service.

67.
Is it possible to invoke an Informatica batch or session outside the Informatica UI?

Yes, by using the pmcmd command-line utility.

68.
Why do we go for surrogate keys?

  • Data tables in various source systems may use different keys for the same entity.
  • Keys may change or be reused in the source data systems.
  • Changes in organizational structures may move keys in the hierarchy.

69.
When is it more convenient to join in the database rather than in Informatica?

  • Definitely at the database level,
  • in the Source Qualifier query itself,
  • rather than using a Joiner transformation.

70.
How do you measure session performance?

By checking the Collect Performance Data check box.

What are the differences between Connected and Unconnected Lookup?

The differences are illustrated below.
  • Connected lookup participates in the data flow and receives input directly from the pipeline; unconnected lookup receives input values from the result of a :LKP expression in another transformation.
  • Connected lookup can use both dynamic and static cache; unconnected lookup cache cannot be dynamic.
  • Connected lookup can return more than one column value (output port); unconnected lookup can return only one column value, i.e. the return port.
  • Connected lookup caches all lookup columns; unconnected lookup caches only the lookup output ports used in the lookup conditions and the return port.
  • Connected lookup supports user-defined default values (the value to return when the lookup condition is not satisfied); unconnected lookup does not support user-defined default values.

What is the difference between Router and Filter?

The following differences can be noted.
  • Router transformation divides the incoming records into multiple groups based on conditions; the groups can be mutually inclusive (different groups may contain the same record). Filter transformation restricts or blocks the incoming record set based on one given condition.
  • Router transformation itself does not block any record; if a record does not match any of the routing conditions, it is routed to the default group. Filter transformation does not have a default group; if a record does not match the filter condition, it is blocked.
  • Router acts like a CASE..WHEN statement in SQL (or a switch..case statement in C); Filter acts like a WHERE condition in SQL.
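The CASE-versus-WHERE analogy can be sketched in Python: a router assigns every row to some group (falling back to a default group), while a filter simply drops non-matching rows. The conditions and data below are hypothetical:

```python
rows = [{"amt": 50}, {"amt": 500}, {"amt": 5000}]

# Router: every row lands in the groups it matches, or in the
# DEFAULT group if it matches none - no row is ever lost.
def route(row):
    groups = []
    if row["amt"] < 100:
        groups.append("SMALL")
    if row["amt"] >= 1000:
        groups.append("LARGE")
    return groups or ["DEFAULT"]

routed = {g: [] for g in ("SMALL", "LARGE", "DEFAULT")}
for row in rows:
    for g in route(row):
        routed[g].append(row)

# Filter: rows that fail the single condition are simply blocked.
filtered = [row for row in rows if row["amt"] >= 1000]

print({g: len(v) for g, v in routed.items()})  # every row routed somewhere
print(len(filtered))                           # only matching rows survive
```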

What can we do to improve the performance of Informatica Aggregator Transformation?

Aggregator performance improves dramatically if records are sorted before passing to the aggregator and "sorted input" option under aggregator properties is checked. The record set should be sorted on those columns that are used in Group By operation.
It is often a good idea to sort the record set at the database level, e.g. inside a Source Qualifier transformation, unless there is a chance that the already sorted records from the Source Qualifier can become unsorted again before reaching the Aggregator.

What are the different lookup cache(s)?

Informatica Lookups can be cached or un-cached (No cache). And Cached lookup can be either static or dynamic. A static cache is one which does not modify the cache once it is built and it remains same during the session run. On the other hand, A dynamic cache is refreshed during the session run by inserting or updating the records in cache based on the incoming source data. By default, Informatica cache is static cache.
A lookup cache can also be classified as persistent or non-persistent, based on whether Informatica retains the cache after the completion of the session run or deletes it.
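The static-versus-dynamic distinction can be sketched with a plain dictionary standing in for the cache; the lookup contents and values below are hypothetical:

```python
# Lookup table contents as they stand at session start.
target = {"A": 1, "B": 2}

def lookup_static(cache, key):
    """Static cache: built once, read-only for the whole session run."""
    return cache.get(key)

def lookup_dynamic(cache, key, new_value):
    """Dynamic cache: unseen keys are inserted, so later rows see them
    and the cache stays synchronized with the target."""
    if key not in cache:
        cache[key] = new_value
    return cache[key]

static_cache = dict(target)
dynamic_cache = dict(target)

print(lookup_static(static_cache, "C"))          # None - miss, cache unchanged
print(lookup_dynamic(dynamic_cache, "C", 3))     # 3 - inserted into the cache
print("C" in static_cache, "C" in dynamic_cache) # False True
```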

How can we update a record in target table without using Update strategy?

A target table can be updated without using 'Update Strategy'. For this, we need to define the key in the target table in Informatica level and then we need to connect the key and the field we want to update in the mapping Target. In the session level, we should set the target property as "Update as Update" and check the "Update" check-box.
Let's assume we have a target table "Customer" with fields as "Customer ID", "Customer Name" and "Customer Address". Suppose we want to update "Customer Address" without an Update Strategy. Then we have to define "Customer ID" as primary key in Informatica level and we will have to connect Customer ID and Customer Address fields in the mapping. If the session properties are set correctly as described above, then the mapping will only update the customer address field for all matching customer IDs.

Under what condition selecting Sorted Input in aggregator may fail the session?

  • If the input data is not sorted correctly, the session will fail.
  • Also if the input data is properly sorted, the session may fail if the sort order by ports and the group by ports of the aggregator are not in the same order.
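This failure mode is the same as for any streaming group-by: if the input is not sorted on the group-by keys, groups fragment and aggregates come out wrong. Python's itertools.groupby, which also assumes sorted input, shows the effect on hypothetical data:

```python
from itertools import groupby

rows = [("B", 1), ("A", 2), ("B", 3)]  # NOT sorted on the group key

# Streaming aggregation that assumes sorted input (like the Aggregator
# with Sorted Input checked): a group is closed as soon as the key changes.
unsorted_groups = [(k, sum(v for _, v in g))
                   for k, g in groupby(rows, key=lambda r: r[0])]
print(unsorted_groups)  # [('B', 1), ('A', 2), ('B', 3)] - 'B' split in two!

sorted_rows = sorted(rows, key=lambda r: r[0])
sorted_groups = [(k, sum(v for _, v in g))
                 for k, g in groupby(sorted_rows, key=lambda r: r[0])]
print(sorted_groups)    # [('A', 2), ('B', 4)] - correct aggregation
```

Informatica goes further than this sketch: rather than silently producing fragmented groups, the session fails when the sort order does not match the group-by ports.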

Why is Sorter an Active Transformation?

This is because we can select the "distinct" option in the sorter property.
When the Sorter transformation is configured to treat output rows as distinct, it assigns all ports as part of the sort key. The Integration Service discards duplicate rows compared during the sort operation. The number of Input Rows will vary as compared with the Output rows and hence it is an Active transformation.

Is lookup an active or passive transformation?

From Informatica 9.x, the Lookup transformation can be configured as an "Active" transformation.
However, in older versions of Informatica, Lookup is a passive transformation.

What is the difference between Static and Dynamic Lookup Cache?

We can configure a Lookup transformation to cache the underlying lookup table. In case of static or read-only lookup cache the Integration Service caches the lookup table at the beginning of the session and does not update the lookup cache while it processes the Lookup transformation.
In case of dynamic lookup cache the Integration Service dynamically inserts or updates data in the lookup cache and passes the data to the target. The dynamic cache is synchronized with the target.

What is the difference between STOP and ABORT options in Workflow Monitor?

When we issue the STOP command on the executing session task, the Integration Service stops reading data from source. It continues processing, writing and committing the data to targets. If the Integration Service cannot finish processing and committing data, we can issue the abort command.
In contrast ABORT command has a timeout period of 60 seconds. If the Integration Service cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.

What are the new features of Informatica 9.x in developer level?

From a developer's perspective, some of the new features in Informatica 9.x are as follows:
  • Now Lookup can be configured as an active transformation - it can return multiple rows on successful match
  • Now you can write SQL override on un-cached lookup also. Previously you could do it only on cached lookup
  • You can control the size of your session log. In a real-time environment you can control the session log file size or time
  • Database deadlock resilience feature - this will ensure that your session does not immediately fail if it encounters any database deadlock, it will now retry the operation again. You can configure number of retry attempts.

How to delete duplicate rows using Informatica

Scenario 1: Duplicate rows are present in relational database

Suppose we have Duplicate records in Source System and we want to load only the unique records in the Target System eliminating the duplicate rows. What will be the approach?
Assuming that the source system is a Relational Database, to eliminate duplicate records, we can check the Distinct option of the Source Qualifier of the source table and load the target accordingly.
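The Distinct option effectively does what this small Python sketch does: keep each fully identical row only once (the data is hypothetical):

```python
source_rows = [
    (1, "Alice"), (2, "Bob"), (1, "Alice"), (3, "Carol"), (2, "Bob"),
]

# Keep the first occurrence of each distinct row, preserving input
# order, the way SELECT DISTINCT collapses exact duplicates.
seen = set()
target_rows = []
for row in source_rows:
    if row not in seen:
        seen.add(row)
        target_rows.append(row)

print(target_rows)  # [(1, 'Alice'), (2, 'Bob'), (3, 'Carol')]
```

If the source is a flat file instead of a relational table, the same result is usually achieved with a Sorter transformation configured for distinct output.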

 



