MDM (Master Data Management)

Post date: Mar 31, 2013 6:39:50 PM

I would like to share my understanding of MDM, starting from why an organization needs it.

Master Data Management

Management is not a technology, agreed? Management is a practice or an approach, anything but a technology. For the time being I am treating it as a practice.

Master data is simply the non-transactional data of an organization. So what? Why should I worry about it? DBAs already manage the databases, so they manage the master data too. Exactly.

Then what are the challenges of managing master data?

Consider an organization with various departments and offices located in different parts of a country, or even across the globe. All of them belong to a single organization and obviously hold data about that same organization, but because they sit in different places and perform different functions, they can end up with duplicate data that does not match in structure or definition. Inconsistencies creep in when the master data common to the whole organization is not defined, standardized and properly controlled. So the main challenge is consistency of the organization's master data.

There should be a process to do the following things with the organization's master data:

  • collect

  • aggregate

  • match

  • consolidate

  • assure quality

  • persist

  • distribute

Obviously tools will also be required to carry out these activities.
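To make the list above a little more concrete, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: the two source systems, the field names, and especially the matching rule (same phone digits means same customer). Real matching is far more involved. The persist and distribute steps are left out to keep it short.

```python
from collections import defaultdict

# Hypothetical customer records collected from two departmental systems.
billing = [{"name": "Asha Rao ", "phone": "555-0101", "email": ""}]
provisioning = [{"name": "asha rao", "phone": "5550101", "email": "asha@example.com"}]


def match_key(record):
    """Match on the digits of the phone number (an assumed matching rule)."""
    return "".join(ch for ch in record["phone"] if ch.isdigit())


def consolidate(records):
    """Merge matched records, keeping the first non-empty value per field."""
    merged = {}
    for record in records:
        for field, value in record.items():
            value = value.strip()
            if value and not merged.get(field):
                merged[field] = value
    return merged


def build_master(*sources):
    # collect + aggregate: pull every record into one pool
    pool = [record for source in sources for record in source]
    # match: group records that refer to the same real-world entity
    groups = defaultdict(list)
    for record in pool:
        groups[match_key(record)].append(record)
    # consolidate: one merged record per group
    master = {key: consolidate(group) for key, group in groups.items()}
    # assure quality: keep only records that carry the mandatory fields
    return {k: r for k, r in master.items() if r.get("name") and r.get("phone")}


master = build_master(billing, provisioning)
print(master)
```

The two departmental records collapse into one master record that carries the name from billing and the email from provisioning.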

So MDM is a set of processes and tools to manage the master data of an organization. One important point: when we talk about distributed departments each having their own copy of the data, what does that indicate? A distributed environment calls for the so-called silver bullet, SOA :). So the thing to note here is that MDM is very important for the success of SOA.

If an organization has 40, 50 or more databases and systems and its data is not managed properly, every system may end up with its own definition of master data. Since master data is shared among various operations, several operational issues may arise.

Customer satisfaction

If you don't have a single view of master / shared information, there will be communication issues. For example, suppose a mobile service provider has no standard shared master data for its customers. If a customer has already discontinued the service, the billing department may continue sending monthly bills to that customer. This is my personal experience with more than one service provider.

This leads to dissatisfied customers. It applies not only to discontinued customers but also to customers who have already paid their bills and still get calls from agencies working for the service provider.

Operational efficiency

What will a provisioning or billing department do if each has a different view of customer data? Obviously the efficiency of the overall system will suffer.


How can you trust a piece of information if you know that whatever it shows may not be correct, or that exactly the same thing exists elsewhere under a different name or definition? In such a situation, how can a decision support system work?

Regulatory compliance

Inconsistency may create issues with regulatory compliance. If you don't have a single definition for many things in the organization, you will present wrong facts and figures to regulators. This can lead to compliance issues.

Managing master data becomes even more challenging in mergers and acquisitions, where two organizations have their own data but will ultimately become a single organization, so some data must become common and shared.

What should the process of master data management be? What activities need to be done? Let us examine this very quickly.

    • Identification of source of data

    • Collect Meta-data

    • Collect Data

    • Create Master Data Model

    • Select appropriate tools

    • Transform

    • Normalize

    • Apply Rules

    • Data correction

    • Generate and Test Master Data

    • Change producer or consumer if required
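The transform, normalize and apply-rules steps from this list can be sketched roughly as follows. This is only an illustration under my own assumptions: the source column names, the country lookup table and the three rules are all made up, and a real MDM tool would let you configure these rather than code them.

```python
import re

# Hypothetical raw rows as they might arrive from a source system.
raw_rows = [
    {"cust_name": "  ACME ltd ", "country": "in", "phone": "(022) 555-0199"},
    {"cust_name": "", "country": "India", "phone": "12"},
]

COUNTRY_CODES = {"in": "IN", "india": "IN"}  # assumed lookup table


def transform(row):
    """Map source column names onto the master data model."""
    return {"name": row["cust_name"], "country": row["country"], "phone": row["phone"]}


def normalize(record):
    """Bring values into one standard format."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "country": COUNTRY_CODES.get(record["country"].strip().lower(), "??"),
        "phone": re.sub(r"\D", "", record["phone"]),
    }


# Assumed business rules: rule name -> check that must pass.
RULES = {
    "name is mandatory": lambda r: bool(r["name"]),
    "country must be known": lambda r: r["country"] != "??",
    "phone needs at least 7 digits": lambda r: len(r["phone"]) >= 7,
}


def apply_rules(record):
    """Return the names of the rules this record violates."""
    return [name for name, check in RULES.items() if not check(record)]


clean, rejected = [], []
for row in raw_rows:
    record = normalize(transform(row))
    errors = apply_rules(record)
    (rejected if errors else clean).append((record, errors))
```

Records that end up in `rejected` are the candidates for the data correction step; the list of violated rules tells the data steward what to fix.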

The following run alongside the above activities throughout the MDM journey.

Data governance

The body that sets the rules, decides standards and monitors the MDM process.

Data stewardship

The people who run the MDM projects for their own departments: the owners of the data/information.

There are many ways to maintain master data, especially in an existing landscape where a number of applications each have their own data along with a copy of the master. I am trying to summarize the various options for maintaining the master list here. In future posts I will revisit these approaches.

Only one copy of master - There is a single copy of the master data, and every change is made directly to it. Applications that use master data have to be changed to use the new data. The benefit is consistency, but this is not always practical.

Many copies, one maintenance - Master data is added to and changed in a single copy, and the changes are sent to the systems where copies are stored locally. These applications can change the data that is not part of the master. The benefit is that it requires minimal changes in the applications.

Merging master - Applications can change their copy of the master data. The changes are sent to the master, where they are merged, and then sent back to the source systems to update the local copies. The benefit of this approach is minimal change in the source systems. Issues can arise, such as conflicts when two systems update the same item, and the need for a re-merge when one system adds an item that another system already has. A matching process is needed to avoid duplicates in the master.
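A rough sketch of the "many copies, one maintenance" idea, assuming a simple push from the master to every subscribed application. The class names `MasterHub` and `Application` are hypothetical and not taken from any product.

```python
class MasterHub:
    """Single place where master data is maintained (hypothetical class)."""

    def __init__(self):
        self.records = {}
        self.subscribers = []

    def subscribe(self, application):
        self.subscribers.append(application)

    def upsert(self, key, attributes):
        # The change is made once, on the master copy...
        self.records[key] = dict(attributes)
        # ...and then pushed to every application holding a local copy.
        for application in self.subscribers:
            application.receive(key, attributes)


class Application:
    """A consuming system with a replica of master data plus its own data."""

    def __init__(self, name):
        self.name = name
        self.master_copy = {}  # replica of master data, updated only by the hub
        self.local_data = {}   # application-specific data, freely editable

    def receive(self, key, attributes):
        self.master_copy[key] = dict(attributes)


hub = MasterHub()
billing, crm = Application("billing"), Application("crm")
hub.subscribe(billing)
hub.subscribe(crm)

hub.upsert("C001", {"name": "Asha Rao", "status": "active"})
crm.local_data["C001"] = {"last_call": "2013-03-01"}  # not part of the master
hub.upsert("C001", {"name": "Asha Rao", "status": "discontinued"})
```

After the second `upsert`, billing sees the customer as discontinued at the same moment as every other system, while the CRM keeps its own local data untouched.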
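The conflict problem in the merging approach can be illustrated with a small sketch. I am assuming version numbers on master records (optimistic concurrency) as the way to detect that two systems changed the same item. That is one possible technique, not the only one, and the class `MergingMaster` is hypothetical.

```python
class MergingMaster:
    """Hypothetical hub that merges changes coming from applications."""

    def __init__(self):
        self.records = {}    # key -> attributes
        self.versions = {}   # key -> version number of the master record
        self.conflicts = []  # changes that need manual review

    def submit(self, source, key, changes, based_on_version):
        current = self.versions.get(key, 0)
        if based_on_version != current:
            # Another system changed this record first: do not overwrite.
            self.conflicts.append((source, key, changes))
            return False
        self.records.setdefault(key, {}).update(changes)
        self.versions[key] = current + 1
        return True


master = MergingMaster()
master.submit("crm", "C001", {"name": "Asha Rao", "city": "Pune"}, based_on_version=0)

# billing and provisioning both edit their local copy of version 1
ok_billing = master.submit("billing", "C001", {"city": "Mumbai"}, based_on_version=1)
ok_prov = master.submit("provisioning", "C001", {"city": "Delhi"}, based_on_version=1)
```

The billing change is merged, while the provisioning change is parked in `conflicts` because it was based on a version of the record that is no longer current.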

An organization has different sets of data, such as product information, customer data and location data. An MDM product or solution may use various hubs to manage these sets of data. The hubs can manage their data using any of the approaches I explained in my previous post (one copy, multiple copies etc.). Here I am putting down some more thoughts on styles of master data management:

Single Copy Master (Central Hub) - A single database holds the master data with all the information that applications require. After linking and matching, data is consolidated at a central hub and published to the various data sources. The central hub prevents duplication. Applications that are already running may require changes to use the master data.

Many Copies, one Look-up Service - Master data is maintained within the user application databases. The central hub stores only lists of keys. These keys are used to reach the attributes of an entity, and each attribute is linked to the database that holds it. To access master data, the data is located, a query is distributed among the various databases, and the requested data is assembled on the fly.

This style doesn't require many changes in existing systems, but with huge databases and more and more databases being added it can become too complex for efficient data access. Another issue, which I already mentioned in my previous post, is the chance of duplicates across systems. Queries may also need changes whenever new databases are added.

Mixed style - A mix of both styles. Master data is kept in the native databases, and keys and IDs are generated to access it. Some of its important attributes are replicated to the central hub. The central hub can service the common requests, while queries for system-specific attributes (which are generally not used frequently) are distributed among the native databases. This improves the efficiency of data access and at the same time requires fewer changes in the applications. Keeping the replica up to date can be an issue in this style, and identifying the attributes and standardizing their format can be challenging.
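Here is a minimal sketch of the look-up service style, with plain dictionaries standing in for the application databases. The system names, local IDs and the "first source wins" rule for overlapping attributes are my assumptions.

```python
# Hypothetical application databases, each holding its own slice of customer data.
databases = {
    "billing": {"B-17": {"name": "Asha Rao", "balance": 0}},
    "crm": {"42": {"name": "Asha Rao", "email": "asha@example.com"}},
}

# The hub stores only keys: master id -> where the attributes live.
registry = {
    "C001": [("billing", "B-17"), ("crm", "42")],
}


def lookup(master_id):
    """Assemble a view of the entity on the fly by querying each source."""
    view = {}
    for system, local_id in registry[master_id]:
        for attribute, value in databases[system][local_id].items():
            view.setdefault(attribute, value)  # first source wins on overlap
    return view


customer = lookup("C001")
```

Note that every call to `lookup` touches every linked database, which is exactly why this style gets slower and more complex as databases are added.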
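The mixed style might look like this in a very reduced form. Which attributes count as "important" is an assumption I made for the example, and so are all the names and data.

```python
# Attributes considered important enough to replicate to the hub (an assumed choice).
HUB_ATTRIBUTES = {"name", "status"}

hub = {"C001": {"name": "Asha Rao", "status": "active"}}
native = {"C001": ("crm", "42")}  # master id -> native database and local id
databases = {
    "crm": {"42": {"name": "Asha Rao", "status": "active", "email": "asha@example.com"}},
}

calls = {"remote": 0}  # counts how often a native database had to be queried


def get_attribute(master_id, attribute):
    """Serve common attributes from the hub, rare ones from the native database."""
    if attribute in HUB_ATTRIBUTES:
        return hub[master_id][attribute]
    calls["remote"] += 1
    system, local_id = native[master_id]
    return databases[system][local_id][attribute]


status = get_attribute("C001", "status")  # answered by the hub
email = get_attribute("C001", "email")    # forwarded to the CRM database
```

Only the request for the rarely used `email` attribute reaches a native database; the common `status` request is answered from the replica in the hub, which is where the replica-update issue comes from.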

Managing master data has been one of the most difficult tasks for organizations. There are ways to solve this problem, and tools can help manage master data across the organization. Some of these tools are data networks, data warehouses, data marts and operational data stores. Nowadays almost all vendors offer tools for master data management. Here I will throw some light on a few types of tools: what they are, what they provide, etc.

Data networks

Data networks facilitate the transmission of data within an organization. They provide information transfer and storage facilities. A common server can hold data that is used by different departments or even by the group companies of a corporation. With data networks it is possible to gather master data from various sources. Obviously this will require consolidation and management. Another function data networks provide is sharing the important business information used in daily transactions.

Data warehouses

Data warehouses provide storage for the master data, and data can be analyzed and reported through them. Along with storage, they also provide processes for the retrieval, analysis and management of data for proper access. Since they use a common data model and are optimized for data retrieval, they make analysis and reporting easy. This way the source systems (those involved in transactions) are relieved of this task and can work efficiently without their performance degrading. Data warehouses also provide long-term storage of data, which may not be possible in transactional database systems.

Data marts

Although data marts can exist independently, they are often consolidated to form a data warehouse. In other cases, subsets of data are extracted to create a data mart. Generally, data marts are subsets of a data warehouse. More specific data (related to a particular aspect) is kept in a data mart. Individual departments or groups of users use data marts to access data related to their business requirements. Since they are focused on specific information, accessing frequently used data from data marts is easy.

Transactional Databases

They store real time transactional data.

(Note: You can also vote in this blog's poll to choose the best MDM product, but I have also posted a detailed feature-based survey at Survey Monkey.)

MDM is becoming more important and relevant for enterprises nowadays. There are tools and products that can help implement master data management, and lots of vendors have launched MDM products in the market. Some are big players and some are small but specialist players. Which product to choose is a question that needs to be answered.

MDM Selection Criteria

Before choosing a product, an organization needs to prioritize some criteria; a product whose features best support the organization's requirements can then be considered on the basis of those criteria. Here are a few things to consider:

    • Support for multiple entity types, for example customers, accounts, products, etc.

    • Support for the requirements of various user types and usage scenarios (analysis, reporting etc.).

    • Data governance and administration features and capabilities.

    • Support for unstructured data and search capabilities for it at enterprise level.

    • Support for centralised or decentralised repositories.

    • Ease of deployment

    • Pre-built Domain Data Model
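One simple way to turn prioritized criteria into a comparison is a weighted score. The sketch below is only an illustration: the weights, the product names and every score in it are invented, not real vendor data or survey results.

```python
# Priorities an organization might assign to each criterion (assumed weights, 1-5).
weights = {
    "multi_entity": 5,
    "governance": 4,
    "unstructured_search": 2,
    "ease_of_deployment": 3,
    "prebuilt_model": 1,
}

# Hypothetical products scored 0-10 per criterion; not real vendor data.
products = {
    "Product A": {"multi_entity": 8, "governance": 6, "unstructured_search": 9,
                  "ease_of_deployment": 5, "prebuilt_model": 7},
    "Product B": {"multi_entity": 7, "governance": 9, "unstructured_search": 4,
                  "ease_of_deployment": 8, "prebuilt_model": 3},
}


def weighted_score(scores):
    """Weighted average of the criterion scores, using the weights above."""
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


ranking = sorted(products, key=lambda p: weighted_score(products[p]), reverse=True)
print(ranking)
```

With these made-up numbers Product B ranks first even though Product A scores higher on multi-entity support and search, because governance and ease of deployment carry more combined weight. Change the weights and the ranking can flip, which is the whole point of prioritizing first.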

Comparison of MDM Products

To compare various MDM products, I am conducting a survey through Survey Monkey to gather feedback on these features for selecting an MDM product.

I request you all to vote in this survey. It is a single-page survey and doesn't require much effort. Just click and finish. Please click here to vote.

This survey will provide enough ground for an MDM product comparison. Which one is the best MDM product? The answer depends on the organization's priorities, but these features will help organizations choose an MDM product based on those priorities.

TDWI and Baseline Consulting have designed an assessment tool for MDM readiness. According to the TDWI website: "This interactive survey has been designed to help you gauge how prepared your company is to acquire an MDM solution and launch a sustainable MDM program"

It asks about 37 questions spread over several pages. The questions draw on your general understanding of your organization to figure out what people there think about master data, what you do to clean duplicate data, etc. Based on these and other types of questions, it sends a report to your email ID.

An important thing about this survey is that its questions touch various aspects of data: Quality, Processing, Change Management, Organization, the perception of MDM among people in your organization, Rules and Policies, Data Access & Navigation etc. Its report shows your score in various categories compared with other organizations that have taken the survey.

Since the report shows your scores relative to other organizations' scores, it can be useful for gauging the maturity of MDM at your organization. But at the same time it can be misleading, because the organizations that happened to participate are not necessarily a reliable scale for measuring your maturity.