How to integrate Informatica Data Quality (IDQ) with Informatica MDM

Overview

Data cleansing and standardization is an important aspect of any Master Data Management (MDM) project. Informatica MDM Multi-Domain Edition (MDE) provides reasonable number of cleanse functions out-of-the-box. However, there are requirements when the OOTB cleanse functions are not enough and there is a need for comprehensive functions to achieve data cleansing and standardization, for e.g. address validation, sequence generation. Informatica Data Quality (IDQ) provides an extensive array of cleansing and standardization options. IDQ can easily be used along with Informatica MDM.

This blog post describes the various options to integrate Informatica MDM and IDQ, explains the advantages and disadvantages of each approach to aid in deciding the optimal approach based on the requirements.

Informatica MDM-IDQ Integration Options

There are three options through which IDQ can be integrated with Informatica MDM.

  1. Informatica Platform staging
  2. IDQ Cleanse Library
  3. Informatica MDM as target

Option 1: Informatica Platform Staging

Starting with Informatica MDM’s Multi-Domain Edition (MDE) version 10.x, Informatica has introduced a new feature called “Informatica Platform Staging” within MDM to integrate with IDQ (Developer Tool). This feature enables to directly stage/cleanse data using IDQ mappings to MDM’s Stage tables bypassing Landing tables.

Figure 1: Informatica Platform Staging Process

Advantages

  • Stage tables are immediately available to use in the Developer tool after synchronization eliminating the need to manually create physical data objects.
  • Changes to the synchronized structures are reflected into the Developer tool automatically.
  • Enables loading data into Informatica MDM’s staging tables bypassing the landing tables.

Disadvantages

  • Creating connection for each Base Object folder in Developer tool can be cumbersome to maintain.
  • Hub Stage options like Delta detection, hard delete detection, and audit trails are not available.
  • System generated columns need to be populated manually.
  • Rejected records are not captured in _REJ table of the corresponding stage table but get captured in .bad file.
  • Invalid lookup values are not rejected while data loads to stage unlike in Hub Stage Process. The record with invalid value gets rejected and captured by the Hub Load process.

Option 2: IDQ Cleanse Library

IDQ allows creating functions as operation mappings and deploy them as web service which can then be imported in Informatica MDM Hub implementation as a new type of cleanse library defined as IDQ cleanse library. This functionality allows usage of the imported IDQ cleanse functions, just like any other out-of-the-box cleanse function. Informatica MDM Hub acts as a Web service client application that consumes IDQ’s web services.

Figure 2: Hub Stage Process design using IDQ imported function

Advantages

  • Easily build transformations in IDQ’s Informatica Developer tool rather than creating complex java functions.
  • Unlike Informatica Platform staging, Hub Stage process options such as delta detection, hard delete detection, audit trail are available for use.

Disadvantages

  • Physical data objects need to be manually created for each staging table and manually updated for any changes to the table.
  • IDQ function must contain all transformation logic to leverage batching of records. If any transformation logic is additionally defined in the MDM map, then calls to the IDQ web service will be single record leading to performance issue.
  • Web service invocations are synchronous only, which can be a concern for large data volume.

Option 3: Informatica MDM as target

3.1 Loading data landing tables

Informatica MDM can be used as target for loading the data to landing tables in Informatica MDM.

Figure 3: Using IDQ as ETL tool for loading data into landing table

Advantages

  • Single connection created in Developer tool for Informatica MDM is less cumbersome when compared to creating multiple connections with Informatica platform staging.
  • No need to standardize data in the Hub Stage Process.
  • Unlike Informatica Platform staging, Hub Stage process options – delta detection, hard delete detection, audit trail are available to use.

Disadvantages

  • Physical data objects need to be manually created for each landing table and manually updated for any changes to the table.
  • Need to create mappings at two levels (i) source to landing and (ii) landing to staging (direct mapping).

3.2 Loading data staging tables (bypassing landing tables)

Informatica MDM can be used as target for loading the directly to staging tables in Informatica MDM, bypassing landing tables.

Figure 4: Using IDQ as ETL tool to load data directly into staging tables

Advantages

  • Single connection created in Developer tool for Informatica MDM is less cumbersome when compared to creating multiple connections with Informatica platform staging.
  • Can be used for lower version of Informatica MDM where Informatica Platform staging option is not available.

Disadvantages

  • Physical data objects need to be manually created for each staging table and manually updated for any changes to the table.
  • Hub Stage Delta detection, hard delete detection, and audit trails options are not available.
  • System generated columns need to be populated manually.
  • Rejected records are not captured in _REJ table of the corresponding stage table but get captured in .bad file.
  • Invalid lookup values are not rejected while data loads to stage unlike in Hub Stage Process. The record with invalid value gets rejected and captured by the Hub Load process.

Conclusion

While there are multiple options to integrate IDQ with Informatica MDM, based on client’s requirements, the ideal approach is analyzed and recommended.

About the Author

Author: Meenatchi D.

Meenatchi D. is a tech savvy Consultant with widespread experience in Data Integration (DI) and Business Intelligence (BI) space and worked on leading tools from IBM, Informatica and Microsoft. She also has sound knowledge of Master Data Management (MDM) concepts and is trained in IBM and Informatica MDM.

Co-Author: Sachin Dedhia

Sachin Dedhia is an Architect at Mastech InfoTrellis and has 13 plus years of extensive experience in design and development of Data Integration (DI), Business Intelligence (BI) and Master Data Management (MDM) solutions. He leads the Informatica MDM practice providing his expertise on the projects, designing internal trainings, assets and accelerators.

Data Quality, IDQ, Informatica Data Quality, Informatica IDQ, Informatica MDM, Landing Table, MDE, MDM, Multi-Domain Edition, Staging Table

18 responses to “How to integrate Informatica Data Quality (IDQ) with Informatica MDM”

  1. […] http://www.infotrellis.com/integrate-informatica-data-quality-idq-informatica-mdm/ […]

    • Kevin says:

      Hello Mate,

      Love it absolutely! So crystalline. No mumbo jumbo. No non-sense. Straight and simple. You guys need a standing ovation for your good work.

      I have requirement for processing a dynamic expression logic.
      It allows you to proactively discover, profile, monitor, and cleanse your data in a consistent and reusable manner- regardless of the underlying platform and technologies Informatica IDQ training USA.
      Ex: I have fields A, B, C read from a source. I have to do arthmatic operations on these fields and save in a port named D.
      The logic for the calculation is stored in a table as a string ‘A+B+C’. and the logic changes very often. the next day the logic might be like ‘A-(B+C)’ or          ‘A-B’ or ‘A*B/C’.
      So i need to retrieve this logic from lookup and based on the logic I need to calculate the value of port D.
      If I lookup the table and fetch the string of logic and place it in a variable port, it is treated as a string value but not as a Expression logic.
      Ex: ‘A+B+C’ is treated as ‘A+B+C’ , but i want the port to treat it as A+B+C.

      Super likes !!! for this amazing post. I thinks everyone should bookmark this.

      Kind Regards,
      Kevin

  2. […] Read more at, http://www.infotrellis.com/integrate-informatica-data-quality-idq-informatica-mdm/ […]

  3. […] Land Process is optional and can be skipped, along with other Land Process options as discussed in our blog, How to integrate Informatica Data Quality (IDQ) with Informatica MDM. […]

  4. Wilk says:

    Great information, thanks a lot !

  5. kevin says:

    “Aloha,
    You make learning and reading addictive. All eyes fixed on you. Thank you being such a good and trust worthy guide.
    could you please check the link again that you’ve sent? I only get an error message that a Sharepoint error had occurred.
    As of my understanding the PowerExchange products cannot be used with PowerCenter Express (PCX). PCX does provide quite a few connectivity options, so if you can find some ODBC or JDBC driver to access SFDC you might be able to get your job done. Otherwise I fear you will have to switch to one of the full-fledged PowerCenter or Informatica Data Quality products.
    But great job man, do keep posted with the new updates.
    Best Regards,
    kevin”

  6. Hai thanks for sharing such a nice information. you can here more on informatica mdm. https://goo.gl/7XhTgG

  7. Hi Man,

    Hip Hip Hooray!
    I was always told that slightly slow in the head, a slow learner. Not anymore! It’s like you have my back. I can’t tell you how much I’ve learnt here and how easily! Thank you for blessing me with this effortlessly ingestible digestible content.

    Please refer to Informatica Application Services Guide for more information on configuring Metadata Manager Service & refer to Informatica Metadata Manager Administrator Guide & User Guide for more information about Service usage.

    Anyways great write up, your efforts are much appreciated.

    Gracias
    Krishna Kumar

  8. Arya says:

    I appreciate your work on Informatica. It’s such a wonderful read on Informatica.Keep sharing stuffs like this. I am also educating people on similar technologies so if you are interested to know more you can watch this:-

  9. Shivangi Roy says:

    Privileged to read this blog on Informatica.Commendable efforts to put on research the Informatica.Please enlighten us with regular updates on Informatica.Friends if you’re keen to learn more about AI you can watch this amazing tutorial on the same.

  10. Shivangi Roy says:

    I appreciate your work on Informatica. It’s such a wonderful read on Informatica.Keep sharing stuffs like this. I am also educating people on similar technologies so if you are interested to know more you can watch this:-

  11. Ramakrishna Kotlo says:

    where can create delta detection, hard delete detection, audit trail?

    • Asha Shyam Sundar, Mastech InfoTrellis says:

      Hi Ramakrishna Kotlo,
      Thanks for posting your query. Please email your query at info@mastechinfotrellis.com which will be addressed by our technical team.

      Regards
      Asha

  12. Linda says:

    Hi There,

    Gratitude for putting up this prolific article! You truly make everything a cake walk. Genuinely good stuff, saving time and energy.

    I am newbie to Informatics, we have a requirement to Integrate to SFDC to SQL server using Power enter, for this I downloaded Power enter express Developer edition but couldn’t see any provision for Sales force say any connectors or Adaptors’ read Power exchange is used to connect to SFDC and register as Salesforce objects and then plug-in needs to be installed in PCExpress then integration could happen. If anyone could provide the document or steps for this that would be really helpful.
    By the way do you have any YouTube videos, would love to watch it. I would like to connect you on LinkedIn, great to have experts like you in my connection (In case, if you don’t have any issues).
    Please keep providing such valuable information.

    Merci Beaucoup,
    Linda

    • Asha Shyam Sundar, Mastech InfoTrellis says:

      Hi Linda,

      Thanks for posting your query. Please send a detailed mail to info@mastechinfotrellis.com to get assistance from our technical team.

      Regards
      Asha

  13. Irene Hynes says:

    Hi,
    Thanks a trillion mate!
    It works like charm, saved a lot of energy & time.

    Informatica MDM Training Take the value from ROWID_MQ_RULE column in C_REPOS_MQ_DATA CHANGE table and check that ROWID_MQ_RULE value in C_REPOS_MQ_RULE table where you have all the triggers information.

    I read multiple articles and watched many videos about how to use this tool – and was still confused! Your instructions were easy to understand and made the process simple.
    Many Thanks,
    Irene Hynes

  14. Akhil Govind says:

    Thanks for the valuable information. Please keep posting more. For more on Blueprism Training

Leave a Reply

Your email address will not be published. Required fields are marked *