The Data Cloud


For the past few years, people have been saying that data is the new oil. This is wrong. In the digital economy, data is water -- not oil.

Water is one of the most critical substances on Earth. It is the essential molecule that allows us to transfer matter from a cell to a cell’s environment, thus nourishing and perpetuating life. It has no real alternative. Oil, however, is an energy source and has plenty of alternatives, including the sun, natural gas, coal, wind, biomass, geothermal and, incidentally, water (hydropower).

Water is an apt metaphor for the transforming role data plays in how we live, work, learn and play. Like water, data must be available, timely, accurate and secure everywhere, all the time. For information systems to flourish, data must be accessed, transferred and protected across the digital world. Applications and devices must be “well hydrated” to keep up with the speed of today’s always-on economy.


In the H2O world, clouds play a pivotal role in moving water from sources such as rivers or lakes to critical areas of consumption like farms. Most of the world relies on rainfed (i.e., cloud-delivered) agriculture to eat. Similarly, in the enterprise world, we need a new data management architecture that can easily move data where it needs to be, and that new place is something I call the data cloud.

Traditional data sources -- including primary and secondary storage, data networks and backup and recovery systems -- are a layered hodgepodge of legacy and new systems glued together by middleware (some human, some made of software) that was built for the old mainframe and client-server worlds.

The new information world can truly be optimized only if it is built on top of a data cloud, where information moves freely across locations and can be accessed rapidly and efficiently wherever it is stored, whether it is primary, secondary or even ephemeral data. New data pipelines like Apache Kafka are the equivalent of a torrential downpour of data, whereas earlier architectures offered a light sprinkle. They need big data clouds.

Data is a lot like water. We have a massive amount of water, but it is fragmented and frequently difficult to access. Only about 3% of all water on Earth is fresh, a parallel to primary storage. And only a small percentage of that fresh water is found on the surface in lakes, rivers, streams and reservoirs. If that is not mind-boggling enough, a large percentage of water is underground or contained in oceanic salt water (nearly 97% of all water is in the ocean).

Similarly, the data world suffers from mass data fragmentation, a condition where most of our data is poorly cataloged, redundant, expensive and difficult to access. While primary data (about 20%) is relatively easy to access, the other 80% is much harder to use, creating operational, cost and security concerns for organizations. Most organizations have little understanding of, access to or security over their secondary data systems. As a result, a lot of redundancy is built in, consuming resources that could be used by other IT systems. Imagine conserving your water resources for a ski resort but lacking what you need to grow staple crops like corn and wheat.

Advances in big data and machine-learning applications will further test traditional siloed data architectures. Only data cloud architectures can support this important advancement in computer systems.

If we look at the leading infrastructure clouds (Amazon Web Services and Microsoft Azure), there are several lessons we can discern. The leading infrastructure clouds securely provide on-demand compute power, database storage, content delivery and other functionality to help businesses scale and grow. This is commonly understood as infrastructure as a service. To manage the various capabilities, the cloud providers include a management layer referred to as platform as a service.

So, what does a data cloud need to do versus legacy data management approaches?

The data cloud will allow us to deliver data in ways that are incredibly challenging today. The seven building blocks include:

  • Software-defined data infrastructure for any app at any scale, anywhere, that spans on-premises data centers, public clouds and edge locations.
  • Data and app mobility that lets businesses run apps anywhere with no infrastructure lock‐in. We need to bring the data to the applications and the applications to the data.
  • A single management interface that provides simplified global operation and control through automation and intelligence.
  • Elastic consumption that allows businesses to quickly spin up/down resources on demand, removing overprovisioning and improving responsiveness.
  • The ability to run apps on the same platform as the data, either from a third-party marketplace or developed on an open, API-first architecture.
  • Support for emerging cloud-native architectures that present ephemeral data needs.
  • Security to protect the privacy and integrity of the data in the data cloud.
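
To make the single-management-interface idea a little more concrete, here is a minimal Python sketch of one catalog that tracks copies of data across on-premises, public cloud and edge locations and estimates how much capacity is tied up in redundant copies. It is an illustration only, not a description of any real product: the names DataCopy and DataCatalog, the fields, and the rule that every copy beyond the largest one counts as redundant are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One stored copy of a dataset (hypothetical fields for illustration)."""
    dataset: str     # logical name, e.g. "orders"
    location: str    # "on-prem", "public-cloud" or "edge"
    tier: str        # "primary" or "secondary"
    size_gb: float


class DataCatalog:
    """A toy single view over copies scattered across locations."""

    def __init__(self):
        self._copies = defaultdict(list)

    def register(self, copy):
        """Record a copy under its logical dataset name."""
        self._copies[copy.dataset].append(copy)

    def locations(self, dataset):
        """Return the sorted, de-duplicated locations holding a dataset."""
        return sorted({c.location for c in self._copies.get(dataset, [])})

    def redundant_gb(self):
        """Sum the size of every copy except the largest one per dataset.

        Assumption: the largest copy is the one worth keeping, and all
        others are treated as redundant. Real policies would differ.
        """
        total = 0.0
        for copies in self._copies.values():
            sizes = sorted(c.size_gb for c in copies)
            total += sum(sizes[:-1])
        return total


catalog = DataCatalog()
catalog.register(DataCopy("orders", "on-prem", "primary", 100.0))
catalog.register(DataCopy("orders", "public-cloud", "secondary", 100.0))
catalog.register(DataCopy("orders", "edge", "secondary", 40.0))
catalog.register(DataCopy("logs", "public-cloud", "secondary", 10.0))

print(catalog.locations("orders"))
print(catalog.redundant_gb())
```

Even a toy like this shows why a shared catalog matters: once every copy is visible in one place, questions such as "where does this data live?" and "how much of it is duplicated?" become simple lookups instead of cross-team investigations.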

Like the global cloud cover that provides water across the earth, we need to start thinking of the data cloud as a connected and continuous set of resources.

The data cloud is the answer to mass data fragmentation. Just as condensation and cloud movement bring water up from sources of fresh water and deliver it to areas like the African Serengeti, the data cloud will generate enormous value by tapping unused or under-accessed data sources wherever they reside.

Data, like water, is precious. It needs a new architecture. It needs a data cloud.



This article originally appeared in Forbes

Sanjeev Desai

Product Marketing & Management Leader

5y

Awesome article, Alan. I really like the analogy of data is water - not oil in the digital economy. Well done!
