Enterprises built their early AI strategies in a world that assumed relative freedom of data movement. That world no longer exists, so how do they adjust to the new world?
While it’s a cliché that data is the new oil, the truth is that data does need to be able to flow around an organisation and come to rest in some central location, where it can be refined to extract insight and value.
In recent years, that refining has come, of course, via AI.
Director, Solutions Architecture at Starburst.
But if we really think that analogy through, it also captures some of the problems enterprises currently face managing data.
Recently, the world has seen what happens when the free flow of oil, gas, and other fundamental resources is disrupted. Governments and companies are scrambling to work around supply chain interruptions in the short term and looking for energy sovereignty and resilience in the long term. In this case, centralization has become a liability, thanks to a single, but critical, chokepoint.
In the same way, the upheaval created by AI, paired with governments’ and regulators’ attempts to manage its impact, means companies need to rethink how they architect and manage their data. The difference is that it is the way data has traditionally flowed around organizations that is changing.
After all, a centralized approach made sense when moving data was straightforward. Now, however, governance and sovereignty concerns and data movement costs have changed the calculus.
AI transformation
AI alone has transformed the equation when it comes to the volume of data that must be managed. It’s not just model training that requires massive amounts of data. Those models need constant updating and tuning with fresh data. And, if those models are to deliver value to the business, companies will be constantly running inference, which both requires and generates more data.
Unsurprisingly, governments and regulators have an interest in AI governance. This includes existing concerns about data residency and the possibility of new problems, such as data leakage into LLMs.
In Europe, for example, the regulatory landscape laid out by the GDPR mandate has been complicated by the rollout of the EU AI Act, which comes fully into force from August this year. In the US, companies face multiple federal and state-level regulations. Any, or all, of these could come into play as data is moved to that central refinery.
Governments also have an understandable strategic interest in developing sovereign AI, further complicating matters when it comes to both data and the models that work on it.
Once we consider this, it’s clear that only a few organizations are architected to handle the sheer scale of data involved in AI, and the governance it requires.
Companies grow messily, whether organically or through M&A, inheriting different infrastructures – and different governance regimes – as they sprawl across borders.
Data in the old world
In the old world, free-flowing data might have been seen as efficient. Now, given the vast amount of data involved, generating copies of data and moving it across borders is fraught with risk from a governance perspective. And with escalating cloud egress fees, data movement becomes extremely expensive. At the same time, moving or copying data on-prem both presents a financial challenge and imposes a burden on already stretched technology teams.
What are our options in this new world? Few technology leaders have the option of simply opting out of the whole AI revolution. Whatever your personal views on the technology, few C-suites are prepared to sidestep the AI race.
But we can lay out a roadmap for how to manage data – and compute – in this new world.
To start with, we need to understand the environment we’re really operating in. This may well include accepting that a hybrid or multi-cloud architecture is going to be the normal state of affairs. The very nature of AI, with companies needing to access multiple models and multiple services, means that traditional monolithic, central approaches simply won’t scale.
This, in turn, means every technology leader needs to be crystal clear on what governance means for their organisation and its data, whether that’s data privacy or residency requirements. And it’s imperative that governance is embedded in the AI workflow from the outset. It’s too important to be an afterthought or kicked down the road as other priorities arise.
What this makes clear is that it will normally make more sense to bring compute and those critical AI models to the data, not the other way round.
This isn’t just about reducing data movement costs or avoiding multiple copies of data. It’s about reducing the friction that comes with reconciling governance requirements as data moves around organizations and across borders. But it can also mean reduced latency and fresher data, making AI more effective.
That’s not to say that it’s not sometimes necessary to move data. But if that’s the case, let’s be mindful and deliberate about it.
But while data becomes increasingly decentralized, it’s imperative that we centralize the management of data access and build the platform accordingly. This lays the groundwork for clear data governance and sovereign AI alike.
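To make the idea concrete, here is a minimal sketch of what centralized access decisions over decentralized data could look like. It is an illustration only, not a description of any particular product: the names Dataset, Job and AccessPolicy, the region labels and the rules themselves are all hypothetical. The point it shows is that one policy layer decides who may touch which data, that the default is to run the work where the data lives, and that movement happens only when it is both requested and permitted.

```python
from dataclasses import dataclass


# All classes, fields and region names here are hypothetical illustrations.
@dataclass(frozen=True)
class Dataset:
    name: str
    region: str             # where the data physically lives
    may_leave_region: bool  # residency rule set by governance


@dataclass(frozen=True)
class Job:
    user_role: str
    dataset: str
    compute_region: str         # where the requester would like to run
    request_move: bool = False  # moving data must be asked for explicitly


class AccessPolicy:
    """One central place where access and residency rules are decided."""

    def __init__(self, datasets, allowed_roles):
        self._datasets = {d.name: d for d in datasets}
        self._allowed_roles = allowed_roles  # dataset name -> set of roles

    def plan(self, job):
        """Return (decision, region) for a job."""
        dataset = self._datasets[job.dataset]

        # Access control is checked centrally, wherever the data sits.
        if job.user_role not in self._allowed_roles.get(dataset.name, set()):
            return ("deny", None)

        # Data moves only when requested, needed and permitted.
        wants_move = job.request_move and job.compute_region != dataset.region
        if wants_move and dataset.may_leave_region:
            return ("move_data", job.compute_region)

        # Default: bring the compute to the data.
        return ("run_in_place", dataset.region)
```

In this sketch, a request to pull a residency-restricted dataset into another region is not rejected outright; it is simply planned to run in the data's home region, which reflects the preference for moving compute rather than data.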
By rethinking how they manage data movement, technology leaders can bypass the escalating egress costs and compliance traps embedded in the old way of doing things. The result is an AI strategy that is both scalable and sustainable, enabling enterprise-grade AI, wherever their data lives.
This article was produced as part of TechRadar Pro Perspectives, our channel to feature the best and brightest minds in the technology industry today.
The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/pro/perspectives-how-to-submit




