There has been an interesting shift in the MDM space over the last few years. Not long ago, the most common question was “What is MDM?” These days the question is instead “What are the best practices for implementing and sustaining MDM?”
There are best practices that have become common knowledge, one example being the practice of approaching MDM as a “program” and not a “project”, employing phased implementations that provide incremental business value.
Other best practices have yet to enter the mainstream. Among them is the absolutely essential practice of establishing MDM not in isolation but as part of a broader Data Governance program, a practice whose impact on long-term success should not be underestimated. The value of this approach takes time to become visible, which goes a long way towards explaining why it so often gets overlooked, especially given that MDM is still a relatively young idea for many companies. You can get MDM off the ground without Data Governance, but over time you will certainly feel the effects of gravity much more without it.
We understand that successful Data Governance will lead to better and higher value business outcomes by managing data as a strategic asset. It is also widely recognized that a critical success factor in effective Data Governance is having the right metrics and insights into the data. Taking it one step further, if you concede that master data is the most strategic data for many organizations (most people would), having the right metrics and insights into that master data is a must.
MDM requires Data Governance to be successful beyond the first phases of implementation – and Data Governance requires metrics and insights into master data to be successful. So what are these required metrics and insights and where do they come from?
Metrics and insight
The most important metrics and insights about your master data are as follows:
What is the composition of your master data?
When you bring data in from multiple sources and “mesh” it together you’ll want to understand what the resulting “360 view” of that data looks like, as it will provide interesting insights and opportunities. For example, on average how many addresses does each customer have? How many customers have no addresses? More than five addresses? How many customers have both US and Canadian addresses?
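To make the composition questions above concrete, here is a minimal sketch in Python. The sample data, the country codes and the function name `composition_metrics` are all hypothetical and stand in for whatever the “360 view” in your hub actually exposes.

```python
# Hypothetical sample: customer id -> country code of each address on file.
customer_addresses = {
    "C1": ["US", "CA"],
    "C2": [],
    "C3": ["US"],
    "C4": ["US", "US", "CA", "US", "US", "US"],
}


def composition_metrics(addresses_by_customer):
    """Answer the composition questions for a customer -> addresses mapping."""
    counts = [len(addresses) for addresses in addresses_by_customer.values()]
    total = len(counts)
    return {
        # On average, how many addresses does each customer have?
        "avg_addresses": sum(counts) / total if total else 0.0,
        # How many customers have no addresses?
        "no_address": sum(1 for c in counts if c == 0),
        # How many customers have more than five addresses?
        "more_than_five": sum(1 for c in counts if c > 5),
        # How many customers have both US and Canadian addresses?
        "us_and_ca": sum(
            1
            for addresses in addresses_by_customer.values()
            if {"US", "CA"} <= set(addresses)
        ),
    }


metrics = composition_metrics(customer_addresses)
for name, value in metrics.items():
    print(name, value)
```

In practice these counts would be computed against the hub's data store rather than an in-memory dictionary, but the questions being asked are the same.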
How is your master data changing and who is impacting the change?
In any operational system you want to know how many new records have been added and how many existing records have been updated, by time dimension (e.g., daily, weekly, monthly) and over a given time period. In an MDM hub, you need to take this a step further and understand entity resolution metrics, such as how many master data records have been collapsed together and how many have been split apart. Entity resolution is the key capability of an MDM hub, responsible for matching and linking/merging records from multiple sources, so you need ongoing metrics on it in order to optimize it.
Furthermore, because master data records are composed of records from multiple sources, it is important to understand which sources are causing the changes. Is the flow of information what you expect?
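As a rough illustration of the change metrics described above, the sketch below counts adds, updates, collapses and splits per month and per source. The change log layout, the source names (“CRM”, “BILLING”) and the function name are assumptions made for the example; a real hub would record these events in its own format.

```python
from collections import Counter
from datetime import date

# Hypothetical change log: (date of change, source system, change type).
change_log = [
    (date(2024, 1, 3), "CRM", "add"),
    (date(2024, 1, 3), "BILLING", "update"),
    (date(2024, 1, 17), "CRM", "collapse"),
    (date(2024, 2, 2), "CRM", "split"),
    (date(2024, 2, 9), "BILLING", "update"),
]


def changes_by_month_and_source(log):
    """Count changes keyed by (month, source, change type)."""
    counts = Counter()
    for day, source, change_type in log:
        month = day.strftime("%Y-%m")  # monthly time dimension
        counts[(month, source, change_type)] += 1
    return counts


monthly = changes_by_month_and_source(change_log)
for key in sorted(monthly):
    print(key, monthly[key])
```

Swapping the month key for a day or an ISO week gives the other time dimensions, and comparing the per-source counts with what you expect from each system is one way to check the flow of information.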
How are quality issues trending and where are they originating?
It is obviously important to know the current state of quality and how many issues are outstanding, so that you can address them in priority order. It is, however, also important to see the bigger picture and be aware of how quality issues are trending over given time dimensions and time periods. Ultimately you want to fix any data quality issues at their source, and in order to do this you will need to understand which of your sources are providing poor quality data to the MDM hub.
Take address data, for example. You may detect that a number of address records in the MDM hub have a value of “UNKNOWN” for the city element. With proper Data Governance you are able to trace these values back to a particular source system, and from there address the issue at source. The result is being able to see and track this particular quality issue trending downwards over time.
Not only does this help increase the quality of the data, it can also be used to justify the existence of the MDM program, especially if you can put a unit cost on a quality issue (possible for some quality issues, like bad addresses). It is extremely difficult to put a price on data, but comparatively easy to put a cost on bad data.
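The address example can be sketched as follows. Everything here is illustrative: the monthly snapshots, the source names and especially the unit cost of 2.50 per bad address are assumed values, not figures from any real program.

```python
from collections import defaultdict

# Assumed cost of one bad address (e.g., a returned mailing); illustrative only.
UNIT_COST_PER_BAD_ADDRESS = 2.50

# Hypothetical monthly snapshots of (source system, city value) per address.
snapshots = {
    "2024-01": [("CRM", "Toronto"), ("LEGACY", "UNKNOWN"),
                ("LEGACY", "UNKNOWN"), ("CRM", "UNKNOWN")],
    "2024-02": [("CRM", "Toronto"), ("LEGACY", "UNKNOWN"),
                ("LEGACY", "Boston"), ("CRM", "Austin")],
}


def unknown_city_trend(snapshots_by_period):
    """Count addresses whose city is 'UNKNOWN', per period and per source."""
    trend = {}
    for period in sorted(snapshots_by_period):
        per_source = defaultdict(int)
        for source, city in snapshots_by_period[period]:
            if city.strip().upper() == "UNKNOWN":
                per_source[source] += 1
        trend[period] = dict(per_source)
    return trend


trend = unknown_city_trend(snapshots)
for period, per_source in trend.items():
    total = sum(per_source.values())
    cost = total * UNIT_COST_PER_BAD_ADDRESS
    print(period, per_source, "estimated cost:", cost)
```

Reading the output period by period shows both where the issue originates and whether it is trending downwards, with a cost figure attached to each period.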
How is quality issue resolution trending?
Ultimately you want to see new quality issues trending downwards over time, but oftentimes you still need to deal with resolving existing quality issues. It is important to be able to see if the overall quality of the MDM hub is increasing or decreasing. As above, having metrics and trends on the resolution of issues measured against a unit cost is a valuable and meaningful resource for data governance councils to have in hand to justify their efforts.
Sometimes quality issues are resolved through external events, such as a customer calling to update an address that has a “returned mail” flag on it. Other times quality issues are resolved by data stewards. Quality issue resolution trends help you understand not just the outstanding data stewardship workload but also the productivity of the data stewards, which is useful in team planning.
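One simple way to picture a resolution trend is to track, per period, how many issues were opened, how many were resolved and by whom, and what backlog remains. The sketch below assumes a hypothetical issue record with `opened`, `resolved` and `resolved_by` fields; the field names and values are made up for illustration.

```python
# Hypothetical issue records; periods are "YYYY-MM" strings so they sort correctly.
issues = [
    {"opened": "2024-01", "resolved": "2024-01", "resolved_by": "steward"},
    {"opened": "2024-01", "resolved": "2024-02", "resolved_by": "external_event"},
    {"opened": "2024-02", "resolved": None, "resolved_by": None},
    {"opened": "2024-02", "resolved": "2024-02", "resolved_by": "steward"},
]


def resolution_trend(issue_list):
    """Per period: issues opened, resolved (by channel), and running backlog."""
    periods = sorted(
        {i["opened"] for i in issue_list}
        | {i["resolved"] for i in issue_list if i["resolved"]}
    )
    backlog = 0
    rows = []
    for period in periods:
        opened = sum(1 for i in issue_list if i["opened"] == period)
        resolved = [i for i in issue_list if i["resolved"] == period]
        backlog += opened - len(resolved)
        rows.append({
            "period": period,
            "opened": opened,
            "resolved_by_steward": sum(
                1 for i in resolved if i["resolved_by"] == "steward"),
            "resolved_by_external_event": sum(
                1 for i in resolved if i["resolved_by"] == "external_event"),
            "backlog": backlog,
        })
    return rows


rows = resolution_trend(issues)
for row in rows:
    print(row)
```

The steward column is a rough proxy for stewardship throughput, and the backlog column is the outstanding workload at the end of each period.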
Who is using the master data, how are they using it and are you meeting their SLAs?
It is common for the number of consumers of an MDM hub to grow over time until eventually there are many consuming front-end channel systems and back-end systems. I’ve seen MDM implementations grow from one or two consumers in initial phases to many consumers across the enterprise, invoking millions or even tens of millions of transactions a day against the MDM hub. Understanding the workload each consumer is putting on the MDM hub, along with error rates and SLA attainment, is essential information for a data governance council to have. To give the most obvious example, having access to this information allows for capacity planning to ensure the MDM hub will continue to handle future workloads.
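Per-consumer workload, error rate and SLA attainment can be derived from a transaction log along the lines of the sketch below. The consumer names, the latency thresholds and the rule that a failed call counts as missing its SLA are all assumptions chosen for the example.

```python
# Hypothetical transaction log: (consumer, latency in ms, call succeeded).
calls = [
    ("web_portal", 120, True),
    ("web_portal", 480, True),
    ("web_portal", 90, False),
    ("billing_batch", 200, True),
    ("billing_batch", 210, True),
]

# Assumed SLA: maximum acceptable latency per consumer, in milliseconds.
SLA_MS = {"web_portal": 300, "billing_batch": 500}


def consumer_metrics(call_log, sla_ms):
    """Workload, error rate and SLA attainment for each consumer."""
    stats = {}
    for consumer, latency, ok in call_log:
        s = stats.setdefault(
            consumer, {"calls": 0, "errors": 0, "within_sla": 0})
        s["calls"] += 1
        if not ok:
            s["errors"] += 1  # assumption: a failed call also misses its SLA
        elif latency <= sla_ms[consumer]:
            s["within_sla"] += 1
    for s in stats.values():
        s["error_rate"] = s["errors"] / s["calls"]
        s["sla_attainment"] = s["within_sla"] / s["calls"]
    return stats


report = consumer_metrics(calls, SLA_MS)
for consumer, s in report.items():
    print(consumer, s)
```

Tracked over time, the `calls` figure per consumer is the raw input for the capacity planning mentioned above.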
The missing link – where do the metrics and insight come from?
The key metrics and insights listed above are required for successful MDM and Data Governance. But where do you get them from? They are not something provided by operational MDM hubs, as the hubs themselves are focused on operational real-time management and use of master data. It is not their duty to capture facts and trending information to support the analysis of master data that produces the metrics and insights. That’s more of an “analytical process”, and it doesn’t fit well within an operational hub. Instead, what we’re talking about is the job of “Master Data Analytics”.
I define Master Data Analytics as the discipline of gaining insights and uncovering issues with master data to support Data Governance, increase the effectiveness of the MDM program, and justify the investment in it.
This has been a missing capability in the overall MDM space for some time now. Some clients have addressed it by custom-building a solution and, even worse, some clients have done nothing at all. Having seen firsthand the need for a solution to this universal stumbling block, our team began work on one some time ago. There is now a best-of-breed Master Data Analytics product by InfoTrellis called ROM. It incorporates our experience of over 12 years of implementing MDM and delivers the metrics and insights required for success.
InfoTrellis ROM provides a set of analytics and reports that are configurable and extensible to support Data Governance. You can think of it as a technology component in your overall MDM program that extends your existing MDM hub.
One major advantage of ROM is that it allows you to capture your master data policies (e.g., quality concerns) and test them against your source system data before implementing MDM, providing initial snapshots of quality issues to prioritize and manage during the implementation.