“Data … we need to talk”
“Uh, oh.” Have you ever heard that phrase from someone? Typically, it’s not good news, and it leads to the end of some type of relationship. In some ways, our relationship with our data has reached a “we need to talk” moment.
You might think that, as a Chief Data Officer leading an Enterprise Data Office for a Fortune 100 company, I would see a requirement to “keep all data, forever” as a type of job security. In actuality, an active, robust Information Lifecycle Management (ILM) strategy is essential in today’s data deluge. Even though storage mechanisms exist that come close to meeting that requirement at a reduced cost, they do not alleviate the other risks associated with data sprawl and unmanaged data. Believe me, I know, it’s hard to let go. That 15-year-old monthly product snapshot, those 10 gigabytes of e-mails sitting in your .pst folder, those hundreds of boxes of deposit tickets in that warehouse: they are your friends, your security blanket, and it’s hard to put them out to pasture. Data, structured or unstructured, has a life expectancy, though, like any other asset.
It’s important to augment your acquisition-to-destruction strategy with a solid architecture and storage systems robust enough to acquire, hold and provision what’s needed and brave enough to purge what’s not. Experts estimate that it costs an organization approximately $4 per year to store a single box of paper records and $2 to $20 per year to store each gigabyte of electronic data. That warehouse with a couple hundred thousand boxes of potentially expired records and that legacy data warehouse with over a petabyte of data are now sounding more like career suicide for the data officer than job security.
The cost of discovery in litigation is another area where housecleaning can be very valuable. The average cost a company incurs in connection with the discovery of electronic data ranges from $1 million to $3 million per terabyte, compounded by the potential liability in litigation if damaging, unnecessary structured or unstructured data is discovered.
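To put those figures together, here is a rough back-of-the-envelope sketch in Python. The unit costs are the ones quoted above; the volumes are assumptions made for illustration only: “a couple hundred thousand boxes” is read as 200,000, “over a petabyte” as exactly one petabyte in decimal units, and the 10 terabytes in scope for discovery is a purely hypothetical number. The function names are also made up for this example.

```python
# Unit costs quoted in the article (USD).
COST_PER_BOX_PER_YEAR = 4
COST_PER_GB_PER_YEAR = (2, 20)                 # (low, high)
DISCOVERY_COST_PER_TB = (1_000_000, 3_000_000)  # (low, high)

# Assumed volumes, for illustration only.
BOXES = 200_000            # "a couple hundred thousand boxes"
GIGABYTES = 1_000_000      # 1 PB in decimal units
DISCOVERY_TERABYTES = 10   # hypothetical amount of data in scope


def annual_storage_cost(boxes, gigabytes):
    """Return the (low, high) yearly cost of paper plus electronic storage."""
    paper = boxes * COST_PER_BOX_PER_YEAR
    low_gb, high_gb = COST_PER_GB_PER_YEAR
    return paper + gigabytes * low_gb, paper + gigabytes * high_gb


def discovery_cost(terabytes):
    """Return the (low, high) cost of discovery for the given volume."""
    low_tb, high_tb = DISCOVERY_COST_PER_TB
    return terabytes * low_tb, terabytes * high_tb


storage_low, storage_high = annual_storage_cost(BOXES, GIGABYTES)
disc_low, disc_high = discovery_cost(DISCOVERY_TERABYTES)

print(f"Storage:   ${storage_low:,} to ${storage_high:,} per year")
print(f"Discovery: ${disc_low:,} to ${disc_high:,}")
```

Under these assumptions, storage alone runs from $2.8 million to $20.8 million a year, and a single discovery event touching 10 terabytes would add another $10 million to $30 million. Swap in your own volumes; the point is the order of magnitude, not the exact totals.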
Additionally, the analytic business value of your historic data should not be taken for granted or treated lightly. Variables shift significantly over time, so it is wise to examine the potential value carefully rather than assume it, and to weigh it against the potential risk and liability of keeping that older data. We have found that an approach driven by a data-governance statement of direction, paired with an architectural design and incorporated in the data integration framework, can set the guardrails to guide the appropriate action and oversight.
That said, it’s not just a governance play. For initiatives that are purely about necessity and compliance, I use the analogy of the “red shirts” in a Star Trek landing party. You know what is going to happen: at the first sign of trouble, they’re going to get carried off, never to be seen again. Connecting every governance component to added business value can help ensure you get beamed back up. According to recent IDC research, businesses can find, access and analyze only about 3 percent of their information easily. That is a disheartening statistic considering the time, resources and investment we spend acquiring, storing, protecting and provisioning our data assets. ILM, along with other key strategies such as Metadata Management, Reference Data Management and Data Cataloging, to name a few, can greatly improve that harsh reality. A partnership among the data office, legal, compliance, risk management, and our business process owners and data stewards feeds requirements on retention, protection, usage and orchestration, meeting the needs of the business while appropriately managing the data assets.
If you’re in a large, legacy organization in any industry, this idea could seem like an overwhelming effort. Any business built over many years will invariably have disparate systems, data environments in various stages of maturity, siloed information domains and at least pockets of data users, from senior leadership to data scientists, who are underserved for data and analytics. Just adding another layer of what could feel like controls, oversight or governance will not be received well. Start small: connect a piece of ILM as a service to very real business-value use cases, integrated across the stages of data management, and your relationship with data may yet live happily ever after.