How Well Does the Model Leverage Generic Structures?
An application’s flexibility and data quality depend quite a bit on the underlying data model. In other words, a good data model can lead to a good application and a bad data model can lead to a bad application. Therefore we need an objective way of measuring what is good or bad about the model. After reviewing hundreds of data models, I formalized the criteria I have been using into what I call the Data Model Scorecard.
By: Steve Hoberman
The Scorecard contains 10 categories:
- How well does the model capture the requirements?
- How complete is the model?
- How structurally sound is the model?
- How well does the model leverage generic structures?
- How well does the model follow naming standards?
- How well has the model been arranged for readability?
- How good are the definitions?
- How well has real world context been incorporated into the model?
- How consistent is the model with the enterprise?
- How well does the metadata match the data?
This is the fifth in a series of articles on the Data Model Scorecard. The first article summarized the 10 categories, the second focused on the correctness category, the third on the completeness category, and the fourth on the structure category. This article focuses on the abstraction category: how well does the model leverage generic structures? For more on the Scorecard, please refer to my book, Data Modeling Made Simple: A Practical Guide for Business & IT Professionals.
How well does the model leverage generic structures?
This category gauges the use of generic data element, entity, and relationship structures. One of the most powerful tools a data modeler has at their disposal is abstraction: the ability to increase the types of information a design can accommodate by using generic concepts. Going from Customer Location to a more generic Location, for example, allows the design to more easily handle other types of locations, such as warehouses and distribution centers. This category checks that the correct level of abstraction has been applied to the model.
In applying this category to a model, I look for structures that appear to be under abstracted or over abstracted:
Under abstracting. If a data model contains structures that appear to be similar in nature (i.e., similar types of things), I would question whether abstraction would be appropriate. Factored into this equation is the type of application we are building. A data mart, for example, would rarely contain abstract structures, while a data warehouse, which requires flexibility and longevity, would be a good candidate for abstraction.
See Figure 1 for an example of under abstracting. If this structure is part of a data warehouse model that requires longevity in the face of ever-changing requirements, we would question whether the Customer’s phone numbers should have been abstracted. Removing the phone number data elements and creating a separate Customer Phone structure, where phone numbers are stored as values instead of elements, will provide more flexibility.
Figure 1 – Possibly under abstracting
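The difference between storing phone numbers as elements and storing them as values can be sketched in SQL, run here through Python's built-in sqlite3 module. This is only an illustration: the table and column names (customer_flat, customer_phone, phone_type_code, and so on) and the sample rows are assumptions, not the contents of Figure 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Under abstracted (hypothetical): one data element per kind of phone.
    CREATE TABLE customer_flat (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        home_phone    TEXT,
        work_phone    TEXT,
        fax_phone     TEXT
    );

    -- Abstracted (hypothetical): each phone number is a row,
    -- and the kind of phone is a value rather than a column name.
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL
    );
    CREATE TABLE customer_phone (
        customer_id     INTEGER NOT NULL REFERENCES customer(customer_id),
        phone_type_code TEXT NOT NULL,
        phone_number    TEXT NOT NULL,
        PRIMARY KEY (customer_id, phone_type_code)
    );
""")

# Made-up sample data. MOBILE has no column in customer_flat,
# but fits the abstract structure without any schema change.
conn.execute("INSERT INTO customer VALUES (1, 'Acme Ltd')")
conn.executemany(
    "INSERT INTO customer_phone VALUES (?, ?, ?)",
    [(1, "HOME", "555-0100"), (1, "WORK", "555-0101"), (1, "MOBILE", "555-0102")],
)

phone_types = [row[0] for row in conn.execute(
    "SELECT phone_type_code FROM customer_phone "
    "WHERE customer_id = 1 ORDER BY phone_type_code"
)]
print(phone_types)  # ['HOME', 'MOBILE', 'WORK']
```

In the flat table, a new kind of phone number means a new column and changes to every script that loads or reads it. In the abstract structure it is just another row. The price is that the model itself no longer shows which kinds of phone numbers the business recognizes.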
Over abstracting. Likewise, if I see too much abstraction on a model, I would question whether the flexibility abstraction can bring is worth the loss of business information on the model and the additional cost in time and money to implement such a structure. Writing the scripts to load data into an abstract structure or extract data out of an abstract structure is no easy task. It is a complete generalization, but I have found that modelers who used to be developers tend to be the shrewdest abstracters because they understand the cost.
See Figure 2 for an example of over abstracting. The purpose of this model was limited to obtaining a detailed understanding of Customer. Specifically, the business sponsor summarized the requirement as:
“We need to get our arms around Customer. Our company has customer maintained in multiple places with multiple definitions. We need a picture which captures a single agreed-upon view of customer.”
Figure 2 – Definitely over abstracting
A Party can be a person or organization, and that person or organization can play many roles. One of these roles is Customer. Although the final Customer model might contain such an abstract structure, jumping straight to Party and Party Role before understanding Customer mistakenly skips the painful activity of getting a single view of customer.
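To make the cost of this kind of structure concrete, here is a minimal sketch of a Party and Party Role design, again in SQL run through Python's sqlite3 module. The table names, column names, role codes, and sample rows are hypothetical and are not taken from Figure 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical abstract structure: any person or organization is a party.
    CREATE TABLE party (
        party_id        INTEGER PRIMARY KEY,
        party_type_code TEXT NOT NULL,   -- PERSON or ORGANIZATION
        party_name      TEXT NOT NULL
    );
    -- A party can play many roles; Customer is only one code value among them.
    CREATE TABLE party_role (
        party_id       INTEGER NOT NULL REFERENCES party(party_id),
        role_type_code TEXT NOT NULL,    -- CUSTOMER, SUPPLIER, ...
        PRIMARY KEY (party_id, role_type_code)
    );
""")

# Made-up sample data for illustration only.
conn.executemany("INSERT INTO party VALUES (?, ?, ?)", [
    (1, "PERSON", "Ann Lee"),
    (2, "ORGANIZATION", "Acme Ltd"),
    (3, "ORGANIZATION", "Bolt Supplies"),
])
conn.executemany("INSERT INTO party_role VALUES (?, ?)", [
    (1, "CUSTOMER"), (2, "CUSTOMER"), (2, "SUPPLIER"), (3, "SUPPLIER"),
])

# "Who are our customers?" now needs a join plus a filter on a code value.
customers = [row[0] for row in conn.execute("""
    SELECT p.party_name
    FROM party AS p
    JOIN party_role AS r ON r.party_id = p.party_id
    WHERE r.role_type_code = 'CUSTOMER'
    ORDER BY p.party_name
""")]
print(customers)  # ['Acme Ltd', 'Ann Lee']
```

Nothing in this schema says what a customer is. The word appears only as a code value in the data, so the single agreed-upon view of customer that the sponsor asked for is not visible anywhere in the model.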
As a proactive measure to ensure the correct level of abstraction, I recommend performing the following activities:
- Ask the “value” question. I find myself constantly asking the “value” question. That is, if a structure is abstracted, can we actually reap the benefits in the not-so-distant future? In Figure 1, for example, the Customer’s names are abstracted into the Customer Name entity. The “value” question might take the form of, “I see you have abstracted Customer Name. What other types of customer names do you envision in the next 2-3 months?”
- Abstract after normalizing. When you normalize, you learn how the business works. This gives you a substantial amount of information to make intelligent abstraction decisions.
- Consider the type of application. Some types of applications, such as data warehouses and operational data stores, require more abstraction than others, such as data marts. A good rule of thumb: if the application needs to be around for a long time, yet its future data requirements cannot be determined, abstraction tends to be a good fit.
About the Author
Steve Hoberman is the most requested data modelling instructor in the world. Introduced at over 50 international conferences as everything from a “data modelling guru” to “data modelling rock star”, Steve balances the formality and precision of data modelling with the realities of building software systems with severe time, budget, and people constraints. In his consulting and teaching, he focuses on templates, tools, and guidelines to reap the benefits of data modelling with minimal investment. He taught his first data modelling class in 1992 and has educated more than 10,000 people about data modelling and business intelligence techniques since then, spanning every continent except Africa and Antarctica. Steve is known for his entertaining, interactive teaching and lecture style (watch out for flying candy!), and is the author of five books on data modelling, including the bestseller Data Modeling Made Simple. He is the founder of the Design Challenges group, inventor of the Data Model Scorecard®, and CEO of Technics Publications.
All articles are © 2014 by the authors