Over the past twenty years, companies, leaders and the public have come to realize the importance of data. What still eludes many businesses is how to properly leverage its value. One term that comes up often in these discussions is “data governance,” but like many data terms (e.g., “big data”), its meaning is a bit nebulous.
Defining data governance
According to the Data Governance Institute, “Data Governance is the exercise of decision-making and authority for data-related matters.” That’s a bit vague and sounds like it’s achieved through powerful magic handed down by the data gods. No wonder many companies don’t want to try that. The reality, as always, is a bit messier but also a bit more hopeful.
Here’s my definition: Data governance is the process (and documentation of said process) by which data is gathered, interpreted, stored, processed and made available.
A look around the web will return many different analogies for data governance: it works like a government creating and enforcing laws; it serves as the rules and referee of a sport; it’s building a house and following building codes, etc. In my mind, what many of these comparisons miss is the discovery aspect of data governance that must occur before the regulating processes can be competently established.
Creating the natural laws of data
I consider data governance to be more akin to discovering and codifying the laws of physics of a new universe.
- In the beginning, there’s a lot of research, competing theories and maybe some trial and error before a new rule (or law) is agreed upon.
- Extensive input and testing must be performed to confirm the accuracy of the proposed rule.
- Over time, greater understanding may lead to a rule being altered or discarded.
- The most significant technological advances build upon knowledge gleaned from these rules and must weigh the implications based upon them very carefully.
- Before establishing these “natural” laws, there’s no way to firmly differentiate the possible from the impossible.
This aligns closely with the processes that should occur when a new data source is introduced and integrated into a company’s data ecosystem.
To be fair, there are areas of data governance that work more like rules and regulations than like the laws of physics, but even these must often be based upon those initial laws. Just as many state and local regulations are ostensibly designed to protect citizens from running afoul of the laws of science (e.g., cars must have airbags), some data governance rules must be designed to protect corporate citizens from running afoul of “natural” data laws.
Sometimes these are as simple as naming conventions for data fields on reports. Other times they involve complicated steps, like security procedures for sharing data with other companies, that could prove catastrophic to a business if not followed to the letter. In less mature data governance, many of these rules, just like some local laws, are not based on the natural laws but are simply the whims of an overzealous administrator.
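To make the simplest kind of rule concrete, here is a minimal sketch of a naming-convention check for report fields. The convention itself (lowercase snake_case), the function name and the sample field names are all assumptions made up for illustration; a real convention would come out of the discovery work described above.

```python
import re

# Hypothetical convention: report field names are lowercase snake_case.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def find_nonconforming_fields(field_names):
    """Return the field names that break the assumed snake_case convention."""
    return [name for name in field_names if not SNAKE_CASE.fullmatch(name)]


# Made-up field names standing in for the columns of a report.
report_fields = ["customer_id", "PolicyStartDate", "quote total", "zip_code"]
violations = find_nonconforming_fields(report_fields)
print(violations)
```

A check like this is only useful if the rule behind it is written down somewhere people can find it, which is the documentation half of governance.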
Building on a solid foundation
Creating these foundational laws can sound daunting. But the truth is that any company that has data is practicing at least minimal data governance, although maybe not in the most helpful ways. Simply having data structures — like tables in a relational database or documents in a document store — is a basic form of governance.
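As a rough illustration of how a table definition already acts as governance, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table, its columns and the two-letter state rule are hypothetical examples, not anything from a real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the column rules are the "governance" here.
conn.execute(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL,
        state_code  TEXT NOT NULL CHECK (length(state_code) = 2)
    )
    """
)


def try_insert(email, state_code):
    """Attempt an insert; return True if the table's rules accepted the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO customers (email, state_code) VALUES (?, ?)",
                (email, state_code),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("a@example.com", "TX")
rejected_missing_email = try_insert(None, "TX")
rejected_bad_state = try_insert("b@example.com", "Texas")
row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

Nobody had to hold a governance meeting for the second and third rows to be rejected; the structure itself enforced the rule, whether or not anyone remembers why the rule exists.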
Most people would associate these structures with data architecture, but just as building architecture has to be rooted in the laws of physics, data architecture must be rooted in the laws of data governance. Data governance includes many other areas like data stewardship, compliance, quality analysis and information architecture. All of these disciplines must combine with company-wide attitudes towards data in general to contribute to an overall data governance atmosphere.
So who is tasked with the job of figuring out all these rules, both natural and derived? Some companies have a full-time Data Governance team that works hand in hand with the various analytics, engineering, data, compliance and business teams. For others, it might be handled by individuals embedded in engineering teams or systems architecture. The Zebra currently has Data Stewards working inside the Data team. The existence of Data Steward roles was a big plus to me when I was approached by The Zebra’s recruiting staff, since it’s far from a given at many companies. Further discussions about the importance of governance with Vice President of Data Claire Look during my interview process were big factors in luring me to The Zebra. Exactly who handles the specifics is less important, though, than the answer to the next question.
Who needs to be involved in data governance?
Short answer: Everyone. As mentioned above, attitudes towards data and governance within a company are what make things succeed or fail. Human beings can dispute or ignore the existence of gravity, but they do so at their own peril the next time they walk across a footbridge. The same goes for anyone in a company who ignores the laws of its data.
Data governance is happening even if it isn’t assigned to someone. Therefore, it’s probably a good idea to formalize the processes to make sure everyone is informed of the dangers and possibilities that come along with the data.