While some contemporary business models position data as the fuel that drives business forward, it’s increasingly important for complex organizations to recognize data as a product in and of itself. In fact, “Data as a Product,” or DaaP, is a relatively new term that describes exactly that.
At the core of the DaaP model is the data mesh, which centralizes data governance and structure guidelines but distributes the actual management of data assets to business stakeholders (rather than strictly data specialists). This helps eliminate the bottlenecks associated with centralization of data capabilities.
So what happens when organizations embrace data as a product? People across the organization gain autonomy and use data more consistently, and the organization as a whole becomes more data-driven.
Here’s what you need to know to bring this model to your organization, either by productizing your own data or by incorporating third-party data as a product.
Data as a product vs. data products
First, let’s get clear on definitions of two similar terms: data as a product (DaaP) and data products.
Data as a product
Data as a product refers to a way of structuring datasets internally so they can meet the needs of data consumers. In the world of DaaP, data consumers can be either employees of an organization or external customers.
Rather than having the goal of meeting a handful of specific needs of end users, the goal of data as a product is to make data discoverable and usable to anyone who might need it. This means, among other things, ensuring there are ways for both technical and non-technical users to interact with the data.
Data products
Data products, on the other hand, are digital tools with limited use cases. As with DaaP, the end users of data products can be internal or external. Examples include performance dashboards that show sales pipeline performance (internal) and an app that shows weather forecasts (external).
DaaP and data products: what they have in common
Both data as a product and data products only work when they are built with the principles of product management and product thinking in mind.
That is, in both cases, the end function should drive decision making about infrastructure, design, and feature sets. But while basic product management principles apply in both cases, the ways in which they manifest are different.
For data products, one primary product management concern is choosing and structuring the right data to enable a product’s functions – for example, annotating and combining data on existing drugs so researchers can query existing findings as they explore potential new drugs.
With DaaP, the data itself is the product. End users can interact with it in such a way that they create new value – manifested in new nodes in the data mesh – by bringing different data sets together.
For example, maybe a manufacturing organization has multiple sources of data for machine parts: vendor catalogs, equipment repair history, and training manuals. One DaaP application might be to create a new data product called “machine parts 360” that pulls machine part data from each of these three sources together and provides a more efficient way for employees to find information specific to machine parts.
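To make the “machine parts 360” idea concrete, here is a minimal sketch of combining the three sources into one record per part. The sample records, the field names, and the `build_parts_360` function are all hypothetical and exist only for illustration; a real implementation would read from the organization's actual systems.

```python
# Hypothetical sample records; field names are illustrative assumptions.
vendor_catalog = [
    {"part_id": "P-100", "name": "Drive belt", "vendor": "Acme"},
    {"part_id": "P-200", "name": "Bearing", "vendor": "Borealis"},
]
repair_history = [
    {"part_id": "P-100", "date": "2024-01-15", "action": "replaced"},
    {"part_id": "P-100", "date": "2024-06-02", "action": "tightened"},
]
training_manuals = [
    {"part_id": "P-200", "manual": "Bearing lubrication guide"},
]


def build_parts_360(catalog, repairs, manuals):
    """Combine three sources into one record per part, keyed by part_id."""
    view = {}
    # The vendor catalog defines which parts exist in the combined view.
    for row in catalog:
        view[row["part_id"]] = {
            "name": row["name"],
            "vendor": row["vendor"],
            "repairs": [],
            "manuals": [],
        }
    # Attach repair events to parts that appear in the catalog.
    for row in repairs:
        if row["part_id"] in view:
            view[row["part_id"]]["repairs"].append(
                {"date": row["date"], "action": row["action"]}
            )
    # Attach training manuals the same way.
    for row in manuals:
        if row["part_id"] in view:
            view[row["part_id"]]["manuals"].append(row["manual"])
    return view


parts_360 = build_parts_360(vendor_catalog, repair_history, training_manuals)
```

An employee looking up part `P-100` would then find its vendor and both repair events in one place instead of searching three systems.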
The evolution of data utilization
The existence of both data products and data as a product serves as an illustration of how dramatically data utilization has evolved. Businesses have always generated vast amounts of data, after all. But it wasn’t until the dawn of the computer age, when our ability to organize and process that data increased exponentially, that businesses became able to use the data they generate at scale.
The technological developments of the last several decades have led to transformative changes in our data utilization capabilities, including these:
Data storage: The exponential increases in computer memory since the middle of the last century mean that we can store vastly more data today in smaller physical areas than ever before. Businesses can now store and use a constant stream of new data from multiple data sources (think: IoT-connected sensors on factory equipment, web-connected software tracking user activity, internet-connected devices capturing customer behavior, etc.). As our ability to store more data increases, we can peruse data for insights—that is, we can discover things we didn’t know were interesting. Contrast that with the past, where storage limitations restricted us to tracking the “known knowns” and therefore limited business leaders’ ability to discover new opportunities.
Data transfer: From wired communication to wireless and from File Transfer Protocol (FTP) to APIs, evolving ways of transmitting data have had a huge impact on our ability to use it. As transfer mechanisms have become more flexible, data’s usability and utilization have increased.
Data formats: As storage capacity increased, so too did the need to organize and structure data to make it more easily searchable. Metadata and schemas help here, offering a kind of digital card catalog to make data searchable and usable to anyone who understands the organization system.
Data access: Whereas once data access was limited to data scientists and others with high-level expertise, the structures that exist today make data and data-informed offerings much more accessible to non-expert users. This democratization of data manifests as data visualizations (like dashboards that capture the health of a facility’s equipment) and user-friendly apps and tools that let business users and laypeople query datasets in plain language.
Of course, all these advancements mean that data management and data governance have become top concerns for every organization. There is no card catalog without librarians to maintain it, and there is no schema or metadata without data scientists doing similar work.
High utilization levels of modern data are possible thanks to the rigorous data engineering work that happens behind the scenes to make the data usable.
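The card catalog analogy can be sketched in a few lines of code. The example below assumes a hypothetical in-memory catalog in which each entry stores metadata and a schema for a dataset rather than the data itself. The `DatasetEntry` class, the `search_catalog` function, and the sample entries are illustrative assumptions, not a reference to any particular catalog tool.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """One 'catalog card': metadata describing a dataset, not the data itself."""
    name: str
    owner: str
    description: str
    schema: dict  # column name -> type name
    tags: list = field(default_factory=list)


# Hypothetical catalog entries for illustration.
catalog = [
    DatasetEntry(
        name="equipment_sensor_readings",
        owner="plant-operations",
        description="Temperature and vibration readings from factory equipment",
        schema={"machine_id": "str", "timestamp": "datetime", "temperature_c": "float"},
        tags=["iot", "maintenance"],
    ),
    DatasetEntry(
        name="sales_pipeline",
        owner="sales-ops",
        description="Open opportunities by stage",
        schema={"opportunity_id": "str", "stage": "str", "amount": "float"},
        tags=["sales"],
    ),
]


def search_catalog(entries, keyword):
    """Return names of datasets whose name, description, tags, or columns mention keyword."""
    keyword = keyword.lower()
    matches = []
    for entry in entries:
        haystack = [entry.name, entry.description, *entry.tags, *entry.schema.keys()]
        if any(keyword in text.lower() for text in haystack):
            matches.append(entry.name)
    return matches
```

Calling `search_catalog(catalog, "temperature")` finds the sensor dataset through its description and column names, which is the kind of lookup that only works if someone maintains the metadata.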
How does this look in the real world?
In the manufacturing space, Siemens uses DaaP principles to gather data from IoT-connected sensors on equipment. That data feeds systems that analyze it in real time and highlight things like early warning signs of wear and tear. This lets Siemens perform predictive maintenance, which reduces unplanned downtime.
On the consumer side, streaming platforms often use DaaP to track streaming behavior, search behavior, and engagement to produce personalized recommendations that keep customers subscribed longer.
Value generation opportunities from data products
The business value that data products deliver is significant.
In many cases, long-standing business needs can be met—or met better—by data products built with the data flowing through various data pipelines into an organization’s preferred data repository (e.g., a data lake or data warehouse).
One way to conceptualize the business value data products deliver is to think of them as a way of automating the otherwise tedious work of gathering and interpreting disparate data points. Dashboards are a common example.
A manufacturing business may need to understand, say, the overall health of its factory floor. A leader could have employees gather every morning and report on various metrics—worker capacity, temperature and humidity conditions, results of most recent equipment inspections, status of inventory vs. orders in progress, etc.
But that method is slow (because it requires end users to gather data manually and report it in a synchronous setting), backward-looking (because the data is gathered before the meeting takes place), and limited to a snapshot in time.
A dashboard, on the other hand, could meet the business need much better: by gathering metrics in real time from various digitized data sources, the dashboard can provide a visual summary of these component pieces of overall health, thus empowering leaders to see at a glance how the factory floor is doing—at any time of day or night.
This can deliver significant value to a manufacturing ecosystem. It might, for example, enable the organization to detect problems (like unexpectedly high temperatures) in real time, investigate them immediately, and potentially avoid costly long-term impacts like faster deterioration of equipment.
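As a rough sketch of the logic behind such a dashboard, the snippet below checks the latest reading for each metric against an acceptable range and flags anything outside it. The metric names, the ranges, and the `summarize_floor_health` function are hypothetical; a real dashboard would pull readings from live data sources and use limits set by the business.

```python
# Hypothetical latest readings and limits; names and values are illustrative.
latest_readings = {
    "temperature_c": 31.5,
    "humidity_pct": 48.0,
    "worker_capacity_pct": 92.0,
}
acceptable_ranges = {
    "temperature_c": (15.0, 28.0),
    "humidity_pct": (30.0, 60.0),
    "worker_capacity_pct": (80.0, 100.0),
}


def summarize_floor_health(readings, ranges):
    """Flag each metric as 'ok' or 'alert' based on its acceptable range."""
    summary = {}
    for metric, value in readings.items():
        low, high = ranges[metric]
        summary[metric] = "ok" if low <= value <= high else "alert"
    return summary


floor_health = summarize_floor_health(latest_readings, acceptable_ranges)
# Metrics that need attention right now.
alerts = [metric for metric, status in floor_health.items() if status == "alert"]
```

With these sample values, only the temperature reading falls outside its range, so it is the one metric a leader would see flagged.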
Building the infrastructure for data products
While data as a product is a valuable long-range goal for many organizations, it’s usually not the first stop on the way to data maturity.
Most organizations can get new value from their data more quickly by turning raw data into a structured data product. Here’s how to achieve that:
Identify a problem the data product should solve. In any organization, there are likely dozens of workflows that could be improved or specific business needs that could be met with a data product. Choosing the right one to build—typically one that combines user desirability, technical feasibility, and business viability—increases the odds that you’ll see ROI faster.
Get user input from a variety of stakeholders. You’ll need input from end users (whether they’re customers or employees), people knowledgeable about the business side of things, and anyone who will be involved in designing and building the actual product, including engineers, designers, data scientists, and delivery experts.
Build mockups and prototypes of increasing fidelity. Iteration is key to data product success, and various versions should incorporate feedback from end users as well as business, technical, data, and design stakeholders.
Launch and iterate. The best data products evolve throughout their lifecycle as the organization adjusts to user feedback and advances its capabilities.
One way data products add business value is by enabling self-service—think of how banks introduced self-service portals that let customers avoid calling tellers or going into branches. For these products to work, of course, they must be fueled by high-quality data. In fact, data quality is one of the most important considerations when building a data product.
The flip side of democratizing data access is ensuring adequate access controls and addressing other data governance considerations. Users must be able to complete specific tasks but must not have access to data beyond what they need, because excess access would create a massive security and liability risk for the organization.
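One simple way to picture this kind of access control is a mapping from each role to the fields it is allowed to see. The sketch below assumes hypothetical roles, field names, and a `view_for_role` function. Production systems typically enforce such rules in the data platform itself rather than in application code.

```python
# Hypothetical roles and the fields each one may see.
ROLE_FIELDS = {
    "support_agent": {"customer_id", "name", "open_tickets"},
    "billing_analyst": {"customer_id", "name", "balance"},
}


def view_for_role(record, role):
    """Return only the fields the role is allowed to see; unknown roles get nothing."""
    allowed = ROLE_FIELDS.get(role, set())
    return {key: value for key, value in record.items() if key in allowed}


# Illustrative record; "tax_id" is a sensitive field no role above may see.
customer = {
    "customer_id": "C-1",
    "name": "Dana",
    "balance": 120.5,
    "open_tickets": 2,
    "tax_id": "example-only",
}
```

Defaulting unknown roles to an empty set means a missing or misspelled role sees no data, which is the safer failure mode.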
Challenges in transforming data into products
While it’s hard to overstate the benefits data products can offer a business, it’s also important to recognize the challenges to developing and adopting them.
First, the technical challenges: data teams must be heavily involved with data product development. In many organizations, data teams are already stretched thin to support initiatives that require data from around the organization.
While building data products could introduce self-service capabilities that would ultimately reduce some demand for the data team’s time, that demand would rise during the development process before it falls.
Other key technical challenges involve preparing the data itself. In most organizations, data exists in many formats and many locations; to be usable in a data product, it has to have a consistent, uniform format and be located in a central place governed by rules for data ingestion, maintenance, access, security, and more.
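To illustrate what a consistent, uniform format can mean in practice, the sketch below maps records from two hypothetical source formats onto one target format. The field names, both source formats, and the `normalize_reading` function are assumptions made for this example.

```python
from datetime import datetime


def normalize_reading(raw):
    """Map a record from either hypothetical source format onto one target format."""
    if "temp_f" in raw:  # legacy format: Fahrenheit, US-style timestamp
        return {
            "machine_id": raw["machine"],
            "temperature_c": round((raw["temp_f"] - 32) * 5 / 9, 2),
            "timestamp": datetime.strptime(raw["ts"], "%m/%d/%Y %H:%M").isoformat(),
        }
    # Newer format: already Celsius with an ISO 8601 timestamp.
    return {
        "machine_id": raw["machine_id"],
        "temperature_c": float(raw["temperature_c"]),
        "timestamp": datetime.fromisoformat(raw["timestamp"]).isoformat(),
    }


# Illustrative inputs, one per source format.
legacy = {"machine": "M1", "temp_f": 212.0, "ts": "03/15/2024 14:30"}
modern = {"machine_id": "M2", "temperature_c": 40, "timestamp": "2024-03-15T14:30:00"}
normalized = [normalize_reading(legacy), normalize_reading(modern)]
```

After normalization, both records share the same keys, units, and timestamp format, so a data product can treat them identically regardless of where they originated.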
The data product also has to be built for interoperability—that is, it has to work with whatever legacy software and systems the organization uses.
Then there's the cultural hurdle to clear: data is most valuable for organizations that have a data-centric culture. Without establishing that core component of culture, data products and DaaP become very expensive accessories that may or may not improve operations.
Finally, any time an organization activates its data, privacy and regulatory compliance become concerns. Regulations at the state, national, and international level dictate how companies can use data and how they must protect data; failing to comply can be expensive and cause significant brand damage.
Still, the investment required will pay off for most organizations, in part because data is only becoming more central to how businesses operate and compete.
What's next for data?
Today, many organizations find themselves in the position of having a lot of raw data with high potential value. The next step is often to organize that data so it can fuel data products, be deployed as a product in and of itself, or form part of a data suite meant to power algorithms, machine learning models, AI, and more.
In any industry, organizations can expect their use of data to increase substantially in the coming years and decades—that is, if they plan to stay competitive. Increasingly, data models will become an important component of business models and data science will be an essential capability to have in the C-Suite.
To prepare for the shift toward more data-centric business models, business leaders can focus today on assessing the maturity of their data and creating a road map toward a point where that data can act as both a product and a fuel for products and business operations more generally.
If your organization could benefit from a data maturity assessment, get in touch. We’d love to help you evaluate your current situation and devise a path toward the data-centric future.