Data Management

By Patti Gilchrist

I remember as a child my mother would never throw anything away. If I tried to throw away even a useless scrap of paper, she'd frantically dig through the garbage, excitedly proclaiming that I would one day look back and thank her for not letting me throw away something with such sentimental value. Clothing was packed into bags and stored in the back of the closet for decades. Even if it was 10 years out of date, 10 sizes too small, it could not be thrown out, just in case one day it would come back in style.  She refused to throw away a pitted and dusty punch bowl, with remnants of a party held over 20 years ago inside. Although she hadn't hosted a party in over two decades, she claimed that it may come in handy one day, thus it was safely wrapped and stored in a box in the basement alongside other aged and impractical items which she deemed indispensable. It used to drive me crazy.

And now that I am older, I still firmly believe that this obsession borders on insanity, which is probably why I very much enjoy asserting my independence, confidently purging outdated and useless items.  

However, not everyone shares my view. In fact, it seems that many organizations have adopted my mother’s philosophy of never throwing anything away, particularly when it comes to their data management strategy. An overabundance of data is accumulated and then ritualistically stored, backed up, transferred to a disaster recovery site, and protected with the belief that it holds some intrinsic value to the organization.

  • The McKinsey Global Institute estimates that data volume is growing 40% per year.
  • Twitter produces more than 8 TB of data per day.
  • Health care organizations are collecting 85 percent more data than two years ago, according to a recent Oracle report. Also according to the report, the data managed in all industries came from the following sources:
      - 48 percent came from customer information
      - 34 percent from operations
      - 33 percent from sales and marketing
  • According to a recent article, eBay consumes 2 petabytes of raw digital space daily to run the site and store its data.
  • Machine-generated data is produced in much larger quantities than non-traditional data.   

Many organizations do not even know what all of this data is or if it is even needed. And like my mother, they look for more storage as the solution to handle their amassed data. (“If only I had more closet space!”) In reality, it isn’t the abundance of data itself that has value. The value lies in the ability to extract useful purpose and meaning from the data.

According to Ash Ashutosh, CEO of Actifio, in his article “Best Practices for Managing Big Data,”  “the vast majority of Big Data is either duplicated data or synthesized data.”  

So how can organizations achieve their goal of realizing benefits from all this accumulated information? A sound data management strategy is essential to derive business value from this large undefined mass of data.

Ashutosh offers a sound data management strategy that consists of two steps.

  1. “The first step is to bring the data down to its unique set and reduce the amount of data to be managed.”  Sorry mom, it’s got to go!
  2. “Next, leverage the power of virtualization technology. Organizations must virtualize this unique data set so that not only multiple applications can reuse the same data footprint, but also the smaller data footprint can be stored on any vendor-independent storage device.”
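
The first step, bringing the data down to its unique set, can be illustrated with a small sketch. This is not Ashutosh's or Actifio's method; it is a minimal example of content-based deduplication, assuming that records are byte strings and that two records count as duplicates only when their contents are identical. The function name `unique_records` and the sample data are hypothetical.

```python
import hashlib


def unique_records(records):
    """Return the records with exact duplicates removed.

    Assumption: each record is a bytes object, and "duplicate" means
    byte-for-byte identical content. The first occurrence is kept.
    """
    seen_digests = set()
    unique = []
    for record in records:
        # Hash the content so only a fixed-size digest is remembered per record.
        digest = hashlib.sha256(record).hexdigest()
        if digest not in seen_digests:
            seen_digests.add(digest)
            unique.append(record)
    return unique


if __name__ == "__main__":
    # Hypothetical sample: the same invoice stored twice.
    stored = [b"invoice-001", b"invoice-002", b"invoice-001"]
    kept = unique_records(stored)
    print(f"{len(stored)} records reduced to {len(kept)} unique records")
```

Real deduplication products typically work on blocks or chunks of data rather than whole records, but the idea is the same: identify identical content once and keep a single copy.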

So what is data virtualization? Wikipedia defines data virtualization as “the process of abstracting disparate systems (databases, applications, file repositories, websites, data services vendors, etc.) through a single data access layer (which may be any of several data access mechanisms).”
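
To make that definition concrete, here is a toy sketch of a single data access layer placed in front of two disparate sources: an in-memory SQLite table and a CSV document. Every name in it (`SqliteCustomers`, `CsvCustomers`, `CustomerAccessLayer`, `build_demo_layer`) and all of the sample data are hypothetical. Commercial data virtualization tools are far more capable, so treat this only as an illustration of the idea.

```python
import csv
import io
import sqlite3


class SqliteCustomers:
    """Adapter that reads customer rows from a SQLite database."""

    def __init__(self, connection):
        self._connection = connection

    def rows(self):
        cursor = self._connection.execute("SELECT id, name FROM customers")
        return [{"id": row[0], "name": row[1]} for row in cursor]


class CsvCustomers:
    """Adapter that reads customer rows from CSV text."""

    def __init__(self, text):
        self._text = text

    def rows(self):
        reader = csv.DictReader(io.StringIO(self._text))
        return [{"id": int(r["id"]), "name": r["name"]} for r in reader]


class CustomerAccessLayer:
    """Single access point: callers never see where a row is stored."""

    def __init__(self, sources):
        self._sources = sources

    def all_customers(self):
        merged = []
        for source in self._sources:
            merged.extend(source.rows())
        return sorted(merged, key=lambda row: row["id"])


def build_demo_layer():
    """Build a layer over two hypothetical sources with sample data."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    connection.execute("INSERT INTO customers VALUES (1, 'Ada')")
    csv_text = "id,name\n2,Grace\n"
    return CustomerAccessLayer(
        [SqliteCustomers(connection), CsvCustomers(csv_text)]
    )


if __name__ == "__main__":
    for customer in build_demo_layer().all_customers():
        print(customer)
```

An application that calls `all_customers()` gets one consistent view and does not need to know which system holds which row, which is the kind of abstraction the definition describes.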

For more information on how to virtualize data and mistakes to avoid, see the Data Virtualization Leadership Blog, which provides a list of 10 mistakes to avoid when virtualizing data:

  1. Trying to Virtualize too Much
  2. Failing to Virtualize Enough
  3. Missing the Hybrid Opportunity
  4. Assuming Perfect Data is Prerequisite
  5. Anticipating Negative Impact on Operational Systems
  6. Failing to Simplify the Problem
  7. Treating SQL/Relational and XML/Hierarchical as Separate Silos
  8. Implementing Data Virtualization Using the Wrong Infrastructure
  9. Segregating Data Virtualization People and Processes
  10. Failing to Identify and Communicate Benefits

For more information on these lessons learned, read the full article on the Data Virtualization Leadership Blog.

©Copyright 2000-2017 Emprend, Inc. All Rights Reserved.