India is at an interesting point in its AI story. The foundation model debate (how large, how many parameters, and who builds them) is largely settled. What remains is the harder question: how do you actually run AI in production, at India's scale, under India's cost constraints, without the infrastructure budgets the problems seem to demand? That was the question the Akamai Digital Leadership Summit set out to explore, and on a Friday evening in Bengaluru, with 125 senior technology leaders in the room, it found some honest answers.
The third edition of the summit was designed, from the outset, as something other than a product event. "This is truly an attempt at trying to create a neutral platform," said Sumant Narayanan, Akamai’s Regional Sales Director for India and SAARC, in his welcome address.
Setting the context
Narayanan noted a shift that most in the room would have recognized. "Over the last 12 months or so, a lot of the conversation around AI has been mostly about foundation models, making them bigger and bigger. But now, the conversation has shifted towards how enterprises use these foundational models and actually deliver value to their customers." He also flagged the other side of the equation early: "There’s never been a better time to be an attacker. Especially with AI tools, it's probably the best time to be an attacker."
Dr Robert Blumofe, Akamai’s CTO, traced the company’s history with AI from the deep learning breakthrough of 2012 ("AlexNet was the birth of deep learning, and deep learning was, for Akamai, the perfect tool for our cybersecurity business") through to the ChatGPT moment of November 2022, which he described as the first time most people had any direct interaction with AI.
His caution to the room: "A lot of companies who have come to it late find themselves becoming LLM one-trick ponies. To get real value out of AI, you not only need to know how to use the LLM, but you need to know how to use other forms of deep learning and other forms of ML."
Akamai's partnership with NVIDIA, deploying RTX Pro 6000 GPUs across a distributed network, is its response to where inference is heading.
India’s infrastructure layer
Jigar Halani from NVIDIA’s solution architecture team followed with the infrastructure context that sits beneath everything else discussed throughout the evening. India’s challenge, he argued, is fundamentally different from the West’s: not an ageing population or a labour shortage, but the need to deliver services at population scale for near-zero cost per transaction.
He pointed to deployments already in operation: the Bhashini platform handling 20 million API calls a day across 36 Indian languages, an AI pipeline reducing Aadhaar biometric authentication failure rates by 50%, and a RAG system helping navigate five million pending court cases by surfacing relevant precedents and applicable law. On the NPCI fraud detection system, his point was stark: “Fraud as a standalone use case is running on 100 times more powerful infrastructure, using GPUs, as we speak today.” His framing of what AI means for India cut through the noise: “AI, for India, is not a replacement of humans. It is solving something at population scale that humans simply haven’t been able to solve over a long period of time.”
Scaling AI in production
The first panel of the evening, moderated by Hrishikesh Varma, Director of Product Management at Akamai, moved from the macro picture to the specifics of what running AI at scale actually looks like inside three very different organisations.
Sanath Moguluri, VP of Voice AI Engineering at Reliance Jio, described the latency challenge of serving hundreds of millions of users across smartphones, televisions, and automotive systems. “On a telecom network, achieving 500 milliseconds is very challenging. In our experience, around one second of latency is good enough for people to converse with agents, when the use case is specific and domain-focused.” Jio runs a hybrid model: “Not everything needs to go to LLMs. We have hybrid deployments, from edge to cloud, and even within the cloud, there are smaller and larger models doing orchestration.”
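Jio's exact orchestration is not public, but the hybrid pattern Moguluri describes can be sketched in a few lines. Everything here is illustrative: the `Request` shape, the `EDGE_DOMAINS` set, and the tier names are assumptions, not Jio's implementation.

```python
# Hypothetical sketch of hybrid edge/cloud routing: narrow, domain-specific
# requests with a tight latency budget go to a small edge-hosted model;
# everything else falls back to a larger cloud LLM.
from dataclasses import dataclass

@dataclass
class Request:
    text: str
    domain: str            # e.g. "recharge", "channel_search" (illustrative)
    latency_budget_ms: int

# Domains a small, domain-tuned model is trusted to handle (assumed values).
EDGE_DOMAINS = {"recharge", "channel_search"}

def route(req: Request) -> str:
    """Return which model tier should serve this request."""
    if req.domain in EDGE_DOMAINS and req.latency_budget_ms <= 1000:
        return "edge_small_model"   # fast path: specific, domain-focused use case
    return "cloud_llm"              # general path: higher latency, broader model

print(route(Request("recharge my plan", "recharge", 800)))  # edge_small_model
print(route(Request("plan my holiday", "travel", 800)))     # cloud_llm
```

The point of the sketch is the decision itself: "not everything needs to go to LLMs" becomes a cheap, deterministic check made before any model is invoked.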
Sagar Gaonkar, CTO of Eloelo, described his team’s approach to live content moderation, separating edge decisions from cloud decisions, with human review reserved for the boundary cases. “A good 80% of the cases are very black and white. That last 20% is where you want the human in the loop.” The end-to-end cycle runs in under 10 seconds.
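The 80/20 split Gaonkar describes is, at its core, a confidence-threshold triage. A minimal sketch, with entirely illustrative threshold values (Eloelo's actual cutoffs are not public):

```python
def triage(confidence_unsafe: float) -> str:
    """Route a moderation decision: auto-block, auto-allow, or escalate.

    `confidence_unsafe` is a model's estimated probability that the content
    violates policy. The 0.95 / 0.05 thresholds are assumptions for
    illustration, not Eloelo's values.
    """
    if confidence_unsafe >= 0.95:
        return "block"         # clearly violating: decided automatically
    if confidence_unsafe <= 0.05:
        return "allow"         # clearly safe: decided automatically
    return "human_review"      # boundary case: the ~20% with a human in the loop

print(triage(0.99))   # block
print(triage(0.50))   # human_review
```

Widening or narrowing the band between the two thresholds is the lever: a wider band sends more boundary cases to humans, a narrower one automates more of the tail.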
Pranav Tiwari, Head of Engineering APAC at Postman, offered a wider lens on what agentic AI is doing to the connectivity layer. “What used to connect applications very deterministically is changing fundamentally. Connectivity is transforming from plumbing between two applications to something with inference in the middle, business logic blended in, and a series of conversations that eventually get a task done.” A show of hands confirmed that a significant portion of the room already has agent-written code running in production.
Building at India’s cost constraints
If the panel addressed how AI runs in production, the fireside chat with Mohit Saxena, Co-Founder and CTO of InMobi and GlanceAI, moderated by Adam Karon, Akamai’s COO, addressed what it costs — and what it takes to bring that cost down to a point where the business makes sense.
Saxena was direct about what India demands. “When we launch a product in the US, our price point is very different. We can buy whatever GPU we want, put in all kinds of infrastructure. But when we launch the same service in India, we have to rethink our engineering.” When Glance first started generating AI images, the cost was $30 per image, which was unviable for a price-sensitive market. The team brought it down to $1.50, then to a few cents. “If you make it at three cents, the same product that you launch in the US, then India is sorted.”
The method: “Almost 60% of queries are repetitions. You don’t need to call the LLM for every one of them.” Combined with batch processing and multiple specialised models, Glance reduced its effective model invocations substantially. On engineering talent, Saxena pushed back against easy conclusions. “AI has reduced the bar of being an average engineer. But it has really raised the bar of a good engineer. The average is not good enough anymore.” Today, roughly 70% of code at Glance is written by AI in the IDE, but the integration work, he said, still requires the best engineers.
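If 60% of queries are repetitions, the cheapest optimisation is a cache in front of the model. A minimal sketch of that idea, assuming exact-match lookup on a normalised query (production systems often add semantic matching and TTLs; `call_llm` here stands in for any model invocation):

```python
import hashlib

_cache: dict[str, str] = {}

def answer(query: str, call_llm) -> str:
    """Serve repeated queries from a cache instead of re-invoking the model."""
    key = hashlib.sha256(query.strip().lower().encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_llm(query)  # only novel queries pay for inference
    return _cache[key]

# Usage with a stand-in for a real model call:
calls = 0
def fake_llm(q: str) -> str:
    global calls
    calls += 1
    return f"answer to: {q}"

answer("What is my balance?", fake_llm)
answer("what is my balance?  ", fake_llm)  # normalised repeat, served from cache
print(calls)  # 1
```

Even this naive version converts repeated traffic into dictionary lookups; combined with batching and smaller specialised models, it is the kind of layering that turns $30 per output into cents.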
Engineering for Bharat scale
Kiran Kumar Katreddi, VP of Platform Engineering at Meesho, extended the cost conversation into even more constrained territory. Meesho’s model (zero commission for sellers, with revenue driven entirely by advertising) only works if AI keeps operational costs low across every layer of the business.
With over 200 million users, many of them first-time internet users in Tier-III and Tier-IV towns, the engineering constraints are specific: a 14MB app size, voice and image search in eight Indian languages, personalisation that updates within a 500-millisecond session window, and AI-assisted address resolution for deliveries to locations described as “opposite the previous sarpanch’s house.” At peak Diwali sale volumes, when order volumes hit 3–4x normal, commercial inference platforms kept breaking. Meesho built its own Bharat ML stack, which it open-sourced in 2025 after 18 months of development. “Most of our innovation,” Katreddi said, “exists because of the AI investments we’ve made over the last four or five years.” The platform now handles 3–4 trillion inferences and 1 million queries per second on model inference alone.
The security picture
Vijay Kolli, Akamai’s Regional VP for Enterprise Security, shifted the evening’s focus from cost and scale to what happens when AI systems expand the attack surface. The number that got the room’s attention: API growth on Akamai’s network is no longer 100% annually. “It’s literally 1,000% and more.” AI agents accessing internal databases and inheriting permissions without judgment have changed the threat model in ways that legacy architectures were not designed for.
The panel featuring Mukesh Solanki, CISO at KreditBee, and Sujatha Iyer, Head of AI Security at Zoho Corp, covered the attacker side as much as the defender side. Solanki, whose company processes a million loans a month, was frank: “Hackers will get much more sophisticated with generative AI tools, and will find better ways to poison data so that someone who isn’t eligible for a loan ends up getting one.” Iyer made the case for deterministic models where explainability is non-negotiable: “If your monitoring solution is telling you there’s an 80% chance your server is going to face an outage, it has to come with an explanation.” Her closing line was unambiguous: “Security — imbibe it right from day one of software development. It’s not an afterthought anymore.”
Sovereign AI and the voice problem
Ganesh Gopalan, CEO of Gnani.ai, closed the formal sessions with a grounded take on sovereign AI. He said the commercial rationale is straightforward: enterprises want to retain ownership of the intelligence embedded in their systems. Gnani’s response is to own every layer of its voice stack: ASR, TTS, turn-taking, denoising, and a small language model tuned for voice. “Unless you develop that tech, you’re going to struggle – firstly in terms of protecting your customers’ data, secondly about superior experiences, and thirdly about cost,” he said.
The structural problem with current voice pipelines, he argued, is that converting voice to text and back again loses emotional information that matters. Gnani is building a voice-to-voice model that preserves it. The company currently processes around 30,000 concurrent voice streams. On guardrails, his observation was pointed: "A couple of years back, we very proudly told customers that 55% of our prompting was guardrails. Today, if you say that to a customer, they will throw you out of the room. The benchmark now is that a minimum of 75 to 80% needs to be guardrails."
What it takes to run AI in the real world
What tied the evening together was less a single theme and more a shared way of thinking about AI at scale. Whether the discussion was about inference economics, voice AI pipelines, API security, or citizen services, the speakers were drawing from experience—often hard-won.
As Vijay Kolli put it while opening the security session, borrowing from Marvel by his own admission: "with great power comes great responsibility." The room laughed, but nobody disagreed.
For enterprises building AI systems at scale, that responsibility extends across infrastructure design, cost optimisation, and security architecture. The discussions at the summit suggested that the next phase of AI adoption will not be defined solely by advances in models, but by the engineering discipline required to run them reliably in the real world.
And in markets like India, where platforms must serve hundreds of millions of users while maintaining cost efficiency, that discipline may ultimately determine which AI systems succeed in production.