How can you govern your master data without knowing your master data?
For many years I’ve been saying that the one thing all MDM clients have in common is that the quality of data in their source systems is not as good as they thought. Over the past several years I’ve found that all MDM clients have a second thing in common: they are unaware of the quality of data in their MDM hub and they don’t know how the data is changing. This is surprising, since an MDM hub contains your most critical business data, which is used in real-time processes and analytics across the organization. How can you govern your data when you don’t know its trend in quality, how it is being used, and how it is changing over time? This is flying blind.
There are a few contributing factors to this issue. The first is that MDM products don’t provide capabilities to analyze and report on data. The second is that an MDM hub is not the appropriate place to do this.
MDM products provide capabilities for master data management and not master data analytics.
Popular MDM products such as IBM InfoSphere MDM, Informatica Siperian and others don’t have any practical capabilities to analyze and report on the master data. Yes, they all have a very strong focus on entity resolution so that you can de-duplicate data, but they all fall short on addressing and tracking other types of quality issues (also known as policies), reporting on how the data is changing over time, and reporting on who is using the data and how they are using it. These MDM products do come with some reporting capabilities, but these usually carry disclaimers that they may negatively impact the performance and operations of the MDM hub and should therefore be used with care, which is a strong indication that the hub is not the appropriate place for these activities. There are a couple of reasons for this.
First, the underlying data models in the MDM hubs are operational in nature and designed for managing master data. They aren’t designed for analyzing and reporting on master data and transactions against the master data. For example, they lack dimensional structures that can be used to slice and dice the data in different ways, including along the critical time dimension. They also lack structures (such as aggregate tables) that directly support efficient reporting. It should be noted, however, that many of the products do contain features to broadcast information on activities within the hub. That information can be collected, aggregated, analyzed and reported on, or streamed to dashboards to show near real-time activity. So in a sense they are enabling master data analytics.
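To make the idea of an aggregate table with a time dimension concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. It assumes change events have already been captured outside the hub (for example, from the broadcast notifications mentioned above). The table names `change_event` and `agg_daily_change`, the column names and the sample rows are all hypothetical and do not reflect any vendor's schema.

```python
import sqlite3

# Hypothetical change events captured outside the MDM hub.
# Columns: (event_date, entity_type, change_type)
EVENTS = [
    ("2024-01-01", "party", "new"),
    ("2024-01-01", "party", "updated"),
    ("2024-01-01", "party", "new"),
    ("2024-01-02", "party", "consolidated"),
    ("2024-01-02", "contract", "new"),
]


def build_daily_aggregate(events):
    """Load raw events into an in-memory database and build a daily aggregate table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE change_event "
        "(event_date TEXT, entity_type TEXT, change_type TEXT)"
    )
    conn.executemany("INSERT INTO change_event VALUES (?, ?, ?)", events)
    # The aggregate table pre-computes counts by day, entity type and change
    # type, so reports do not need to scan the raw events each time.
    conn.execute(
        """
        CREATE TABLE agg_daily_change AS
        SELECT event_date, entity_type, change_type, COUNT(*) AS change_count
        FROM change_event
        GROUP BY event_date, entity_type, change_type
        """
    )
    return conn


conn = build_daily_aggregate(EVENTS)
# Slice the aggregate by one entity type along the time dimension.
rows = conn.execute(
    "SELECT event_date, change_type, change_count "
    "FROM agg_daily_change "
    "WHERE entity_type = 'party' "
    "ORDER BY event_date, change_type"
).fetchall()
for row in rows:
    print(row)
```

The point of the sketch is the separation: the raw events and the aggregate live in a store of their own, so reporting queries never touch the operational hub.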
Second, MDM hubs are often used to support real-time, low-latency operations such as integration with call centers, web channels and other business processes. You don’t normally mix real-time operations with analytical operations without putting one or the other at risk. In other words, MDM products are there to manage master data. They are not there to analyze master data. It is analogous to operational systems versus analytical systems.
What is master data analytics?
I define “master data analytics” as the discipline of gaining insights and uncovering issues with master data to support data governance, increase the effectiveness of the MDM program and justify the investment in it. A master data analytics solution should provide the following capabilities:
- Describe how the master data is changing over various time dimensions and time periods including new, updated, consolidated and de-consolidated data.
- Provide deep analytics and discovery of quality issues that go beyond what is done within the MDM hub, with traceability back to the source systems so that issues can be addressed at the source.
- Describe trends in discovery of quality issues AND resolution of those issues.
- Provide the current state of outstanding data stewardship tasks and how they are trending across different time dimensions.
- Describe the composition of the data.
- Describe the transactional activity: who is consuming the master data, how they are consuming it and whether their SLAs are being met.
- Provide insights into capacity planning for future phases in the MDM program.
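Two of the capabilities above, trends in issue discovery versus resolution and SLA tracking per consumer, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layouts, the consumer names, the response-time thresholds in `SLA_MS` and the sample data are all hypothetical, and a real solution would read them from wherever hub activity and stewardship data are collected.

```python
from collections import Counter
from datetime import date

# Hypothetical quality issues: (issue_id, discovered_on, resolved_on or None)
ISSUES = [
    (1, date(2024, 1, 3), date(2024, 1, 20)),
    (2, date(2024, 1, 15), None),
    (3, date(2024, 2, 2), date(2024, 2, 10)),
    (4, date(2024, 2, 5), None),
]

# Hypothetical transaction log: (consumer, response_time_ms)
TRANSACTIONS = [
    ("call_center", 120),
    ("call_center", 480),
    ("web", 90),
    ("web", 110),
]

# Hypothetical SLA thresholds in milliseconds, per consumer.
SLA_MS = {"call_center": 300, "web": 200}


def monthly_issue_trend(issues):
    """Count issues discovered and resolved in each calendar month."""
    discovered = Counter(d.strftime("%Y-%m") for _, d, _ in issues)
    resolved = Counter(
        r.strftime("%Y-%m") for _, _, r in issues if r is not None
    )
    months = sorted(set(discovered) | set(resolved))
    return {
        m: {"discovered": discovered[m], "resolved": resolved[m]}
        for m in months
    }


def sla_compliance(transactions, sla_ms):
    """Return the fraction of transactions within the SLA for each consumer."""
    total = Counter()
    met = Counter()
    for consumer, response_ms in transactions:
        total[consumer] += 1
        if response_ms <= sla_ms[consumer]:
            met[consumer] += 1
    return {consumer: met[consumer] / total[consumer] for consumer in total}


trend = monthly_issue_trend(ISSUES)
compliance = sla_compliance(TRANSACTIONS, SLA_MS)
print(trend)
print(compliance)
```

Comparing the discovered and resolved counts month over month shows whether the stewardship backlog is growing or shrinking, which is the kind of trend a governance council would want to see.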
Emerging master data analytics products
It is very common for clients to build some level of analytics and reporting on their master data. They’ve had no choice, because it is a need in their MDM program and vendor MDM products don’t provide sufficient capabilities for it. There are, however, products starting to emerge in this important space. InfoTrellis has the most mature offering.
InfoTrellis was first to market with a master data analytics product in 2011 called “Reporting and Operational Monitoring for MDM”, or ROM for short.
If you want to learn more about ROM, including how InfoTrellis uses it in services engagements to accelerate MDM implementations, don’t hesitate to email us.