Given the limitations of data quality dimensions and categories as discussed in the previous post, is there a useful alternative?
Well, Suzanne Embury and colleagues think so. They start by acknowledging that data quality cannot be measured precisely; the best one can hope for is an estimate. Real-world information changes constantly, so we cannot know how good our representation of it is at any point in time.
They advocate using quality views: user-configurable software components that take input data, apply quality checks to each element in the data set, and assign each element a quality score. The output is the same data set with quality labels attached, each relating to an aspect of quality that the user has specified. The picture below outlines the steps in the execution of a quality view.
As can be seen from the diagram, the quality view is analogous to the transformation stage of an ETL (Extract, Transform, Load) process. In the first part of the quality view, the input data is examined to gather the information that the decision procedure needs. In the second part, that decision procedure assigns each element a quality score. The third part applies some transformation to the data according to its score.
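To make the three parts concrete, here is a minimal Python sketch of a quality view. It is only an illustration under my own assumptions, not Embury and Missier's implementation: the names QualityView, annotators, decide and act are hypothetical, as are the example checks, the scoring rule (the fraction of checks passed) and the thresholds used to label or drop records.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

Record = dict[str, Any]
Evidence = dict[str, Any]


@dataclass
class QualityView:
    """Hypothetical three-part quality view; all names are illustrative."""
    annotators: dict[str, Callable[[Record], Any]]        # part 1: gather evidence
    decide: Callable[[Evidence], float]                   # part 2: score the element
    act: Callable[[Record, float], Optional[Record]]      # part 3: transform by score

    def run(self, records: list[Record]) -> list[Record]:
        output = []
        for record in records:
            evidence = {name: check(record) for name, check in self.annotators.items()}
            score = self.decide(evidence)
            result = self.act(record, score)
            if result is not None:  # act() returns None to drop a record
                output.append(result)
        return output


# User-specified checks (assumed for this example).
def has_email(record: Record) -> bool:
    return bool(record.get("email"))


def age_plausible(record: Record) -> bool:
    age = record.get("age")
    return isinstance(age, int) and 0 <= age <= 120


# Assumed scoring rule: the fraction of checks that passed.
def fraction_passed(evidence: Evidence) -> float:
    return sum(bool(v) for v in evidence.values()) / len(evidence)


# Assumed action: drop records that fail every check, label the rest.
def label_or_drop(record: Record, score: float) -> Optional[Record]:
    if score == 0.0:
        return None
    label = "accept" if score == 1.0 else "review"
    return {**record, "quality_score": score, "quality_label": label}


view = QualityView(
    annotators={"has_email": has_email, "age_plausible": age_plausible},
    decide=fraction_passed,
    act=label_or_drop,
)

records = [
    {"name": "Ada", "email": "ada@example.org", "age": 36},
    {"name": "Bob", "email": "", "age": 36},
    {"name": "Cy", "email": None, "age": 430},
]

labelled = view.run(records)
for row in labelled:
    print(row["name"], row["quality_score"], row["quality_label"])
```

The point of the design is that the checks, the scoring rule and the action are all passed in by the user, so a different user could reuse the same view with a different notion of quality simply by supplying different functions.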
A major advantage of this approach is that it empowers users to define their own quality requirements rather than having to accept those of the database or information system designer. In a chapter entitled “Forget Dimensions – Define Your Information Quality using Quality View Patterns” in the recent book The Philosophy of Information Quality, Embury & Missier outline how useful quality views proved in a number of data quality projects that they worked on.