Telco EdgeWash 2.0: AI-RAN inference edition
We're entering a new era of telco #edgecomputing hype. It is essentially a replay of the MEC debacle of 5-6 years ago, updated for the AI era with added GPUs. I've included some unedited charts I created during 2018-19 in this article for reference.
The misguided idea is that deploying #AI compute resources in the radio network (AI-RAN) yields a large opportunity for mobile operators to host inferencing workloads at the edge, for third party applications and content providers.
We've been here before. Telcos have carved out only a minimal role in edge computing services, whether as localised low-latency cloud computing suppliers, or simply as providers of colocation space in exchanges, mobile towers and aggregation sites.
Nothing about AI workloads changes this in a way that makes in-network servers and compute magically become a huge opportunity. If anything, it's even harder.
You may have read my debates on Alok Tripathi's numerous, repeated threads about this already, plus some input on James Crawshaw's posts, but it's worth collating my thoughts and criticisms into a single post of my own.
First, some starting points:
So what's new?
The central new trend is that telcos are considering deploying AI-optimised compute (think GPUs + NPUs combined with CPUs) both at centralised sites and also deeper in the network.
The central datacentres will be used for in-house AI training and inferencing, perhaps for local-language LLMs, but also for OSS tasks such as network planning and analytics, BSS functions such as customer support and marketing, various maintenance and security capabilities and so on. These applications are mostly not latency-sensitive, and typically don't need massive data volumes or energy supply. Operators don't envisage their own nuclear reactors for power, unlike some of the larger hyperscale facilities being planned.
They may also be used as "AI factories" in markets where the operator is the leading #supercomputing provider, perhaps working with government on #SovereignAI. In other markets, operators are buyers of GPU-aaS, not sellers. They may have some inhouse resource, but will then rely on cloud #hyperscalers, new AI cloud specialists like CoreWeave, or academic or other supercomputing centres.
AI-RAN sounds good but expensive. "Monetising RAN AI" is unlikely
As well as centralised facilities, distributed compute is mostly being aimed at acceleration and AI functions for the radio network. For mobile, that means things like Open RAN beamforming, radio resource management and so on. Potentially it can also be used for spectrum-sensing, as I wrote recently in another LinkedIn post. The same edge servers might also handle UPFs and core / security functions.
All fair enough.
This is where I expect to see a lot of work in making 6G more AI-enabled in future. There's a whole slew of potential cleverness that can be added to the radio, whether that's relating to future MIMO variants and metasurfaces, channel estimation, energy optimisation and so on. Early examples will appear in 3GPP Releases 18/19/20.
But where I draw the line is the notion that (expensive) spare capacity can be sold off easily by the MNO, allowing the AI-RAN to be monetised for "edge inferencing as a service". Given the costs of the GPUs, I can understand why this appeals in theory, but it has all the same practical limits as MEC, plus some new ones.
The challenges for edge inferencing-aaS
It's worth revisiting the various reasons that existing MEC propositions haven't gained traction for telcos selling edge-cloud services to enterprises or application developers. Most have a direct read-across to AI-RAN.
A key point I made in the past: network edge services can be of three main types:
It's entirely unclear what types of application are ideal for the single-network bubble in the chart, except ones specifically created for / deployed by an individual operator. Perhaps a specific MNO offering a national-language LLM for its subscribers and partners could fly, or one with a dedicated vertical team working in a specific sector.
What can be done, realistically?
I'm very skeptical that there is a major market available for MNOs in edge inferencing. In some cases, I can see certain operators as sellers of GPU-aaS from central computing sites, if there's a supply/demand imbalance in a given country, or a desire for "sovereign AI" capability for some applications or verticals.
But edge AI, especially based on AI-RAN? Why is AI inferencing any different from all the other supposed edge workloads that have failed to drive a market for in-network edge computing over the past 5-6 years?
The telecom industry claimed that MEC servers would be used for enterprise apps, XR rendering offload, smart cities, telemedicine, vehicles etc. Almost none of that has ended up in telco servers. Where enterprises have used edge compute, it's been either new mini data centres (e.g. shipping-container-sized) or local metro / regional DCs.
It's not clear that the move to AI-based applications will change the central vs. edge demand pattern for compute. Even where more edge-side processing is needed, there's a lot of work ongoing on AI PCs, smartphones, home gateways, cars and so on.
Maybe there's a higher-level battle ongoing between NVIDIA and Qualcomm here, with the latter pushing for more device-centric compute, while NVIDIA is trying to get telcos to buy into their messaging and ecosystem.
So what could move the needle here? Here are some suggestions that could help operators still interested in edge computing, whether that's for AI or other purposes.
1) As I've also written about recently, telcos need to eat their own dogfood. Let's see the AI-RAN inferencing resource being used by operators for their own new applications - perhaps offloading IoT data processing such as managed security cameras, or being used for local-language telemedicine or retail or smart home GenAI. If their own product teams won't use the spare capacity, then why would anyone else? They could even ask their finance and treasury teams if they'd like to run some Bitcoin miners in the RAN, if there's all that spare compute and power available and unused.
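To make the "eat your own dogfood" point concrete, the basic mechanic is opportunistic scheduling: the operator's own batch inference jobs run on RAN GPUs only when radio workloads leave headroom. This is a minimal sketch under assumed conditions; the names (`get_gpu_utilisation`, `InferenceJob`, the 60% ceiling) are all illustrative, the load curve is simulated, and no real RAN telemetry API is implied.

```python
# Hypothetical sketch: run the operator's own inference jobs on AI-RAN
# GPUs only in hours when RAN load leaves spare capacity.
# All names and the policy threshold are illustrative assumptions.
from collections import deque
from dataclasses import dataclass

UTILISATION_CEILING = 0.6  # assumed policy: only borrow the GPU below 60% RAN load


@dataclass
class InferenceJob:
    name: str
    gpu_seconds: float  # rough estimate of compute demand


def get_gpu_utilisation(t: int) -> float:
    """Stand-in for a real telemetry call: simulate a daily RAN load curve."""
    # Busy hours (t = 8..22) keep the GPU loaded; overnight it mostly idles.
    return 0.9 if 8 <= t % 24 <= 22 else 0.2


def schedule(jobs: deque, hours: int) -> list:
    """Dispatch queued jobs, one per hour, only when RAN load has headroom."""
    dispatched = []
    for t in range(hours):
        if jobs and get_gpu_utilisation(t) < UTILISATION_CEILING:
            dispatched.append(f"t={t}h: {jobs.popleft().name}")
    return dispatched


queue = deque([
    InferenceJob("cctv-analytics-batch", 1800),
    InferenceJob("local-llm-eval", 3600),
])
for line in schedule(queue, hours=24):
    print(line)  # both jobs land in the overnight low-load window
```

Even a toy model like this exposes the commercial question: if the overnight windows go unused by the operator's own product teams, the "spare capacity" pitch to third parties is even less credible.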
2) The next category of customer should be other service providers. They already have wholesale relationships such as MVNOs or roaming. Surely those CSPs would want to take advantage of outsourced inferencing? They could just ask their account manager and have it added to their wholesale charging and billing, no?
3) MNOs need to sort out local interconnect and peering for all sorts of reasons, but it's an absolute prerequisite for AI applications. It counts as "necessary but not sufficient", but is something they should have done years ago. This means working more closely with Internet, cloud and CDN providers, as well as exchange operators like DE-CIX and London Internet Exchange (LINX). They will also need to have long chats with their regulators about this, and tell their internal IP transit subsidiaries to get onboard, or shut up. It will probably need awkward conversations with the GSMA / IPX folk as well.
4) Work closely with the hyperscalers and some of the new AI cloud specialists like CoreWeave. While the initial edge forays with AWS, GCP and Azure haven't amounted to much, maybe they have more ideas about managing a distributed compute marketplace now. However, MNOs should expect to share any in-network AI-edge-aaS marketplaces [I wish that had a funny acronym] with their fixed broadband peers, who I expect have considerably more opportunity, as well as various dedicated colocation and edge datacentre owners.
5) Show willingness to be a buyer, as well as seller. If edge AI inferencing is so important, I'd expect operators to be customers as well as putative suppliers. If you're not checking out who else has spare "edge AI" capable GPUs in your HQ city, maybe you should? After all, you're supposedly in competition with them.
But above all, drop the #EdgeWash hype. We've heard it before.
As I said years ago, 5G needs edge, much more than edge needs 5G. (And the slide below also mentioned GPUs, back in 2019).
Interesting article. Am I being too cheeky in summarising it as "if-you-build-it-they-will-come is not a business model"? I'm a believer that edge applications will come (as a set of app-dev requirements are satisfied), but like MEC, I don't see them happening if we just put some compute power in the corner and start bragging. We need clear applications driving actual requirements, with business intent up front instead of at the back.
Informative as ever, Dean Bubley, but it's intriguing to me that you compare edge at the RAN with the 6-year-old MEC (or MAEC, I guess) balloon. I agree edge computing couldn't take shape earlier, but no doubt its time has come, given application demand for low latency and real-time inference in the AI context too. Still, I'm in a kind of conundrum: when you mix this with AI-RAN and talk of edge GPUs there for RAN-specific inference, are we talking about the same thing at all?
I appreciate the insights, Dean! Edge computing is not a hype, there are clear use cases to justify wide deployment.
Fully agree. Edge infra is probably the hyperscalers' play… surely not at the cell site, but at metro / regional data centres. The network can only help with QoD, routing optimisations etc.