Enterprises are rethinking AI infrastructure as inference costs rise

By admin | November 24, 2025

AI spending in Asia Pacific continues to rise, yet many companies still struggle to get value from their AI projects. Much of this comes down to the infrastructure that supports AI: most systems are not built to run inference at the speed or scale real applications need. Industry studies show many projects miss their ROI goals even after heavy investment in GenAI tools, largely because of this infrastructure gap.

    The gap shows how much AI infrastructure influences performance, cost, and the ability to scale real-world deployments in the region.

    Akamai is trying to address this challenge with Inference Cloud, built with NVIDIA and powered by the latest Blackwell GPUs. The idea is simple: if most AI applications need to make decisions in real time, then those decisions should be made close to users rather than in distant data centres. That shift, Akamai claims, can help companies manage cost, reduce delays, and support AI services that depend on split-second responses.

    Jay Jenkins, CTO of Cloud Computing at Akamai, explained to AI News why this moment is forcing enterprises to rethink how they deploy AI and why inference, not training, has become the real bottleneck.

    Why AI projects struggle without the right infrastructure

    Jenkins says the gap between experimentation and full-scale deployment is much wider than many organisations expect. “Many AI initiatives fail to deliver on expected business value because enterprises often underestimate the gap between experimentation and production,” he says. Even with strong interest in GenAI, large infrastructure bills, high latency, and the difficulty of running models at scale often block progress.

    Jay Jenkins, CTO of Cloud Computing at Akamai.

    Most companies still rely on centralised clouds and large GPU clusters. But as use grows, these setups become too expensive, especially in regions far from major cloud zones. Latency also becomes a major issue when models have to run multiple steps of inference over long distances. “AI is only as powerful as the infrastructure and architecture it runs on,” Jenkins says, adding that latency often weakens the user experience and the value the business hoped to deliver. He also points to multi-cloud setups, complex data rules, and growing compliance needs as common hurdles that slow the move from pilot projects to production.

    Why inference now demands more attention than training

Across Asia Pacific, AI adoption is shifting from small pilots to real deployments in apps and services. Jenkins notes that as this happens, day-to-day inference – not the occasional training cycle – is what consumes most computing power. With many organisations rolling out language, vision, and multimodal models in multiple markets, the demand for fast and reliable inference is rising faster than expected.

This is why inference has become the main constraint in the region. Models now need to operate across different languages, regulations, and data environments, often in real time. That puts enormous pressure on centralised systems that were never designed for this level of responsiveness.

    How edge infrastructure improves AI performance and cost

    Jenkins says moving inference closer to users, devices, or agents can reshape the cost equation. Doing so shortens the distance data must travel and allows models to respond faster. It also avoids the cost of routing huge volumes of data between major cloud hubs.
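The effect of distance compounds because a single user request often chains several inference steps, each paying a network round trip. A rough back-of-envelope model makes the point; the distances, step counts, and compute times below are illustrative assumptions, not figures from Akamai.

```python
# Illustrative latency model: each inference step pays one network round
# trip, and light in fibre travels at roughly 200 km per millisecond.

def round_trip_ms(distance_km: float) -> float:
    """Approximate round-trip time for a given one-way distance."""
    fibre_speed_km_per_ms = 200.0  # ~2/3 the speed of light in vacuum
    return 2 * distance_km / fibre_speed_km_per_ms

def request_latency_ms(distance_km: float, steps: int, compute_ms: float) -> float:
    """Total latency when a request chains several inference steps."""
    return steps * (round_trip_ms(distance_km) + compute_ms)

# A user calling a cloud region 3,000 km away vs an edge site 50 km away,
# with 4 chained inference steps of 20 ms compute each (hypothetical numbers).
central = request_latency_ms(distance_km=3000, steps=4, compute_ms=20)
edge = request_latency_ms(distance_km=50, steps=4, compute_ms=20)
print(f"centralised: {central:.0f} ms, edge: {edge:.0f} ms")
```

Even this toy model shows why chained, real-time workloads feel distance acutely: the network share of each step shrinks from dominant to negligible when inference moves near the user.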

Physical AI systems – robots, autonomous machines, or smart city tools – depend on decisions made in milliseconds. When inference runs in a distant region, these systems cannot respond reliably.

    The savings from more localised deployments can also be substantial. Jenkins says Akamai analysis shows enterprises in India and Vietnam see large reductions in the cost of running image-generation models when workloads are placed at the edge, rather than centralised clouds. Better GPU use and lower egress fees played a major role in those savings.

    Where edge-based AI is gaining traction

    Early demand for edge inference is strongest from industries where even small delays can affect revenue, safety, or user engagement. Retail and e-commerce are among the first adopters because shoppers often abandon slow experiences. Personalised recommendations, search, and multimodal shopping tools all perform better when inference is local and fast.

    Finance is another area where latency directly affects value. Jenkins says workloads like fraud checks, payment approval, and transaction scoring rely on chains of AI decisions that should happen in milliseconds. Running inference closer to where data is created helps financial firms move faster and keeps data inside regulatory borders.
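A chain of AI decisions under a millisecond deadline can be sketched as a pipeline with a hard latency budget. The stage functions and the 50 ms budget below are hypothetical stand-ins for real edge-hosted model calls, not anything described by Jenkins.

```python
import time

# Sketch of a chained transaction-scoring pipeline with a latency budget.
# Each stage stands in for a local (edge-hosted) model call; the scores
# and the budget are invented for illustration.

def device_check(txn):    return 0.1
def velocity_check(txn):  return 0.2
def anomaly_score(txn):   return 0.3

def score_transaction(txn, budget_ms=50.0):
    """Run the decision chain, aborting if the latency budget is exhausted."""
    deadline = time.monotonic() + budget_ms / 1000.0
    total = 0.0
    stages = (device_check, velocity_check, anomaly_score)
    for stage in stages:
        if time.monotonic() > deadline:
            raise TimeoutError("budget exhausted; fall back to a default rule")
        total += stage(txn)
    return total / len(stages)  # mean risk score across stages

print(score_transaction({"amount": 120.0, "country": "SG"}))
```

The design point is that the deadline is checked between stages: when the network path is long, the budget is spent on transit before the models ever run, which is exactly the failure mode local inference avoids.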

    Why cloud and GPU partnerships matter more now

    As AI workloads grow, companies need infrastructure that can keep up. Jenkins says this has pushed cloud providers and GPU makers into closer collaboration. Akamai’s work with NVIDIA is one example, with GPUs, DPUs, and AI software deployed in thousands of edge locations.

    The idea is to build an “AI delivery network” that spreads inference across many sites instead of concentrating everything in a few regions. This helps with performance, but it also supports compliance. Jenkins notes that almost half of large APAC organisations struggle with differing data rules across markets, which makes local processing more important. Emerging partnerships are now shaping the next phase of AI infrastructure in the region, especially for workloads that depend on low-latency responses.

    Security is built into these systems from the start, Jenkins says. Zero-trust controls, data-aware routing, and protections against fraud and bots are becoming standard parts of the technology stacks on offer.

    The infrastructure needed to support agentic AI and automation

    Running agentic systems – which make many decisions in sequence – needs infrastructure that can operate at millisecond speeds. Jenkins believes the region’s diversity makes this harder but not impossible. Countries differ widely in connectivity, rules, and technical readiness, so AI workloads must be flexible enough to run where it makes the most sense. He points to research showing that most enterprises in the region already use public cloud in production, but many expect to rely on edge services by 2027. That shift will require infrastructure that can hold data in-country, route tasks to the closest suitable location, and keep functioning when networks are unstable.
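Routing a task to the closest suitable location while holding data in-country reduces to a constrained selection: among the sites legally allowed to process the data, pick the nearest healthy one. The site list, field names, and latency figures below are hypothetical; a real deployment would draw on live health and latency data.

```python
# Sketch of residency-aware routing: among edge sites permitted to hold
# the request's data, pick the lowest-latency healthy one. All values
# here are invented for illustration.

SITES = [
    {"id": "sg-1", "country": "SG", "latency_ms": 8,  "healthy": True},
    {"id": "jp-1", "country": "JP", "latency_ms": 35, "healthy": True},
    {"id": "in-1", "country": "IN", "latency_ms": 60, "healthy": False},
]

def route(request_country: str, allowed_countries: set) -> dict:
    """Route to the lowest-latency healthy site that satisfies residency rules."""
    candidates = [
        s for s in SITES
        if s["healthy"] and s["country"] in allowed_countries
    ]
    if not candidates:
        raise RuntimeError(f"no compliant site for data from {request_country}")
    return min(candidates, key=lambda s: s["latency_ms"])

# Data generated in Singapore that must stay in SG or JP.
print(route("SG", {"SG", "JP"})["id"])  # picks the nearest compliant site
```

Note that the health filter also covers the unstable-network case Jenkins raises: when the nearest site is down, the request degrades to the next compliant location rather than failing outright.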

    What companies need to prepare for next

    As inference moves to the edge, companies will need new ways to manage operations. Jenkins says organisations should expect a more distributed AI lifecycle, where models are updated across many sites. This requires better orchestration and strong visibility into performance, cost, and errors in core and edge systems.
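Fleet-wide visibility of that kind usually starts with rolling up per-site metric reports into one view. A minimal sketch, with metric names and sample values invented for illustration:

```python
from collections import defaultdict

# Aggregate request counts, error counts, and worst-case tail latency
# reported by each edge site into a per-site fleet summary.

def summarise(reports):
    """Roll up per-site metric reports into totals and worst-case p99."""
    summary = defaultdict(lambda: {"requests": 0, "errors": 0, "p99_ms": 0.0})
    for r in reports:
        s = summary[r["site"]]
        s["requests"] += r["requests"]
        s["errors"] += r["errors"]
        # Taking the max of reported p99s is a conservative worst case,
        # not a true fleet-wide percentile.
        s["p99_ms"] = max(s["p99_ms"], r["p99_ms"])
    return dict(summary)

reports = [
    {"site": "sg-1", "requests": 1000, "errors": 3, "p99_ms": 42.0},
    {"site": "sg-1", "requests": 1200, "errors": 1, "p99_ms": 55.0},
    {"site": "jp-1", "requests": 800,  "errors": 0, "p99_ms": 61.0},
]
print(summarise(reports))
```

In practice a per-site cost column would sit alongside these fields, since cost is the third axis Jenkins lists; it is omitted here to keep the sketch short.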

    Data governance becomes more complex but also more manageable when processing stays local. Half of the region’s large enterprises already struggle with the variance in regulations, so placing inference closer to where data is generated can help.

    Security also needs more attention. While spreading inference to the edge can improve resilience, it also means every site must be secured. Firms need to protect APIs, data pipelines, and guard against fraud or bot attacks. Jenkins notes that many financial institutions already rely on Akamai’s controls in these areas.

    (Photo by Igor Omilaev)

    Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and co-located with other leading technology events. Click here for more information.

    AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.


