Voices | Data Fabric: New Buzzword or Revolutionary Approach?
The world of data and analytics is changing faster than ever before. Technological advances, along with the widespread adoption of Artificial Intelligence and Machine Learning in the modern enterprise, are challenging the way we think about data and how it is managed. The volume of data that today’s organizations have at their disposal is vast, too vast to be handled by traditional data management techniques.
Dealing with this evolution requires a shift in approach, and any shift in approach requires a fancy and intriguing name. Enter Data Fabric.
Now, whenever I hear a new term coined in Data and Analytics, my immediate reaction is one of skepticism – are these just new buzzwords to describe the same things we’ve been doing for years? Or perhaps worse – a claim that we can just throw away everything we ever learned about data management and solve all our problems with a fancy new piece of technology? Thankfully, Data Fabric is neither of these things and, in my opinion, has the potential to revolutionize the way enterprises manage, share and interact with their data. Crucially, it does not encourage us to throw away our existing data infrastructure either, but rather to augment and complete it by taking advantage of modern techniques not traditionally applied to data management.
So, what is it that makes Data Fabric different?
A good strategy as a data professional, it could be said, is to pursue the activities that people find the most boring, and try to make them less so, or to take them off their plate altogether. When a young entrepreneur sets out building, let’s say, a restaurant business, I’m pretty certain they’re not thinking ‘wow, imagine all of that data I’ll get to manage, I can’t wait until we define a data governance framework’. Of course, they aren’t, and neither should they be, but the best leaders recognize the importance of such things and understand their potential to achieve competitive advantage.
So… when businesses start thinking about the potential of Artificial Intelligence and Machine Learning, we can all be fairly sure that the first thought that jumps into their heads is not ‘wow, imagine what exciting things I could achieve if I applied machine learning techniques to my Metadata.’
But this is exactly how Data Fabric promises to deliver value: by learning from the way people interact with data, we create an ecosystem that is adaptive and self-healing, allowing us to focus on the data products and pipelines that will deliver the most value to the business rather than constantly attending to the ‘plumbing’.
So what do we mean when we say Data Fabric? Well, there are multiple definitions out there, but, for me, there are 3 key elements:
- It is a logical architecture, not a software product
- It is an integrated data layer from which data can be discovered and utilized by people and applications alike
- It uses advanced analytic and AI/ML techniques to constantly analyze the available metadata and make recommendations on how it should evolve
Let’s explore this last element in a bit more depth… Data Fabric places at its core a number of concepts that even the most skeptical of data professionals would struggle to contest:
- The data needs of an organization change more frequently than traditional approaches to data solution design can keep up with.
- People articulate their data requirements far more effectively through the things they do than through the things they say.
- Organizations constantly waste time and effort integrating and preparing data that is barely used, often at the expense of the data that is actually needed.
- However hard we try to centralize data into warehouses, lakes, etc., we always end up with data in multiple places, sometimes spread across multiple cloud and on-premises solutions.
By performing continuous analysis of the actual interactions we and our business systems have with our data, the Fabric is, at least in theory, able to predict and recommend, or even automate, modifications to our data pipelines that deliver what we need, rather than what we say we want.
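To make the idea a little more concrete, here is a minimal sketch of what 'learning from interactions' could look like in its very simplest form: counting how often each dataset is actually accessed and comparing that with the pipelines we maintain. Everything in it (the `AccessEvent` record, the `recommend` function, the `min_hits` threshold and the dataset names) is a hypothetical illustration rather than the API of any real Data Fabric product, and a real implementation would draw on far richer metadata than a simple access count.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    """One observed interaction with a dataset (hypothetical log record)."""
    user: str
    dataset: str


def recommend(events, maintained, min_hits=3):
    """Compare observed usage with the datasets we currently maintain.

    Returns two sorted lists:
      - datasets accessed at least `min_hits` times that have no maintained pipeline
      - maintained datasets that nobody accessed at all
    """
    hits = Counter(e.dataset for e in events)
    to_integrate = sorted(
        d for d, n in hits.items() if n >= min_hits and d not in maintained
    )
    to_retire = sorted(d for d in maintained if hits[d] == 0)
    return to_integrate, to_retire


# Made-up usage log: people keep going to the raw data, not the curated mart.
events = [
    AccessEvent("ana", "raw_orders"),
    AccessEvent("ben", "raw_orders"),
    AccessEvent("ana", "raw_orders"),
    AccessEvent("cy", "sales_mart"),
]
maintained = {"sales_mart", "legacy_inventory"}

to_integrate, to_retire = recommend(events, maintained)
print(to_integrate)  # ['raw_orders']
print(to_retire)     # ['legacy_inventory']
```

In this toy example the usage log says what nobody asked for in a requirements workshop: `raw_orders` deserves a proper pipeline, while `legacy_inventory` is being maintained for no one.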
It all sounds a bit too good to be true, right? Well, in some ways, it probably is. At least right now. Even analyst organizations such as Gartner highlight that currently available technology solutions aren’t yet at a level of maturity to deliver against the full promise of the Data Fabric. But that doesn’t mean we shouldn’t take the first steps on our journey.
Why should you consider a Data Fabric approach?
No matter how much we are schooled to ‘start with why?’, there is still a massive tendency amongst organizations to focus on the ‘what?’. If an enterprise is pursuing a Data Fabric approach because ‘[Gartner/Forrester] said we should’ or because ‘our biggest competitor is doing it’ then the initiative is most likely already destined to fail. If the first thing we do when we hear about Data Fabric is ask ‘where can I buy one?’ then we’re probably setting off in the wrong direction.
The best data projects begin when there is a real business initiative with an established business case driving the need for change. Arguably, we shouldn’t even mention concepts like ‘Data Fabric’ except in the context of how we will attempt to address an already-articulated business challenge or opportunity.
But we must also consider that the benefits of Data Fabric grow as more use cases adopt the approach and that, in isolation, a single data initiative may derive little or no more benefit from a Data Fabric approach than could be delivered by traditional methods. This is where we should consider the importance of:
- continuously adapting the solution as the business evolves
- reducing the ‘time to value’ for new data solutions and changes to existing ones
- reducing the emphasis on manual data solution design and data maintenance tasks
- integrating the solution with other data products
- putting the data directly in the hands of analysts and data scientists to combine it with other data and potentially derive further value from it
If these considerations are important to your organization, then maybe it’s time to start considering Data Fabric as a target architecture for your future data initiatives.
So where should you start?
Remember when I said that, as data professionals, we should home in on the boring bits first? Well, here’s where to start: Metadata! If you don’t have a good handle on the metadata in your organization, then how can you hope to gain insights and recommendations from it? And I don’t just mean technical metadata (definitions of tables, columns, and the like), but also data about its usage and lineage, inferred metadata from data analysis, and business-enriched metadata about what the data actually means to the people that consume it. Analytics, ML, and AI are only as good as the data available to them, so an abundance of metadata is absolutely key.
So, as good a place as any to start is to look at how good a handle you have on the metadata within your organization today. If you already have a rich data catalog enriched with business terminology and semantics, then that’s great, and it might be time to start looking at how you can make increased use of that metadata to help inform decision-making. If, on the other hand, you don’t have a data catalog at all, or if it is partial, under-utilized, or under-maintained (don’t worry, you’re not alone), then I would say there is no better time than the present to start building up that metadata catalog. I’ve seen some research that suggests that if you have a poor handle on your metadata then Data Fabric might not be for you. For me, that’s not the right answer: if you have a poor handle on your metadata, now is the time to get a good handle on it, or face being left behind by the competition.
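As a rough illustration of those different kinds of metadata, the sketch below models a single catalog entry and reports which kinds are still missing. The `DatasetMetadata` class, its field names and the `missing_facets` helper are assumptions made up for this example; they do not reflect the schema of any particular catalog tool.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DatasetMetadata:
    """A hypothetical catalog entry covering several kinds of metadata."""
    name: str
    columns: dict = field(default_factory=dict)        # technical: column -> type
    upstream: list = field(default_factory=list)       # lineage: source datasets
    reads_last_30_days: Optional[int] = None           # usage: None means not tracked
    inferred_tags: list = field(default_factory=list)  # inferred from data analysis
    business_definition: str = ""                      # business-enriched meaning


def missing_facets(entry):
    """Return the kinds of metadata that have not been captured for this entry."""
    checks = [
        ("technical", bool(entry.columns)),
        ("lineage", bool(entry.upstream)),
        ("usage", entry.reads_last_30_days is not None),
        ("inferred", bool(entry.inferred_tags)),
        ("business", bool(entry.business_definition.strip())),
    ]
    return [facet for facet, present in checks if not present]


# Made-up entry: technically well described, but nobody has recorded
# how it is used or what it means to the business.
orders = DatasetMetadata(
    name="orders",
    columns={"order_id": "int", "email": "str"},
    upstream=["crm.customers"],
    inferred_tags=["contains_email"],
)
gaps = missing_facets(orders)
print(gaps)  # ['usage', 'business']
```

A simple gap report like this, run across a whole catalog, is one way to see where the human effort of enrichment is most needed.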
Can technology help?
Well, yes, it can. In fact, it’s absolutely essential! But as with nearly all business and data challenges, technology isn’t the whole answer. Building up a catalog of metadata and enriching it with semantic meaning is going to require a lot of human effort initially before we can start reaping those rewards. The route to success here is an effective data governance organization and operating framework, with the promise of simplification and even automation of many future governance activities.
We should also be very clear that Data Fabric is an architecture and design approach. It isn’t a software product, despite what some vendors may claim.
That said, a good metadata management solution that is compatible with the Data Fabric architecture and the concept of ‘Active Metadata’ (applying continuous analytics/AI to your metadata to derive insight and make recommendations) will certainly help you on your journey to Data Fabric. Further, a data integration platform that supports Data Fabric, can provide an abundance of metadata, and can connect consumers with the data they require is an absolute must-have. But maybe your organization already has some of these technologies. Data Fabric aims to make the most of your existing IT investments by harnessing them in a connected ecosystem – it doesn’t mean we have to replace everything. Speak to your data integration and metadata management vendors to understand their product roadmap and how they are embracing the Data Fabric architecture, whilst examining your existing IT landscape and identifying the gaps which stop you from achieving a Data Fabric design today.
So, what next?
As with any emerging trend, we can’t expect to deliver on the promise of the Data Fabric overnight. But if, like me, you buy into its potential to reshape the data management industry, now is the time to start laying those critical foundations of data governance, metadata management, and data integration. These are not new disciplines: their benefits are already well understood, and they will have a positive impact in their own right, whilst also setting us off in the right direction for Data Fabric.
I feel like I’m stating the obvious here, but the old ‘don’t boil the ocean’ advice is as relevant in this context as it ever has been – don’t try to overhaul your entire data architecture. Focus on the data initiatives that are strategically important to the business and apply Data Fabric thinking to the way you solve them.
And remember, as hot a topic as Data Fabric is today, tomorrow there will be a new buzzword, maybe even a new approach or framework. Embrace the essence of Data Fabric – its core components are here to stay – but be ready to adapt and adopt new technologies and approaches as they arise. Taking these steps now can only enhance your organization’s ability to do so.