Kubernetes as the AI Operating System: What Latin American Leaders Need to Know
This is infrastructure reality
At KubeCon 2025, the message was clear: Kubernetes is positioning itself as the operating system for AI.
Not just a container orchestrator. Not just another DevOps tool. The foundational platform that AI stacks will assume exists.
For Latin American technology leaders, this shift creates both opportunity and urgency. The decisions you make about infrastructure today will determine whether you can compete in the AI era tomorrow.
I’ve been working with Kubernetes since its early days. I’m one of 1,500 Kubestronauts globally and a CNCF Ambassador. I’ve helped companies across Latin America build cloud-native infrastructure.
Here’s what you need to understand about this moment.
Why Kubernetes for AI?
The question I get most often: “Why does AI need Kubernetes? Can’t we just run models on VMs?”
You can. But you’ll lose.
AI workloads have characteristics that Kubernetes was built to handle:
Elastic resource demands. Training jobs need massive compute for hours, then nothing. Inference needs fluctuate with user demand. Traditional infrastructure either wastes money on idle resources or can’t scale when needed.
GPU orchestration. AI runs on GPUs and specialized accelerators. Kubernetes 1.35 advances Dynamic Resource Allocation (DRA), which finally gives the scheduler visibility into specific device attributes. You can now schedule based on GPU memory, compute cores, and topology.
Multi-tenant isolation. You’re not running one AI workload. You’re running dozens. Teams need isolation. Resources need fair allocation. Costs need attribution. Kubernetes provides the primitives.
Reproducibility. AI models need to be versioned, deployed, rolled back, and audited. The declarative nature of Kubernetes makes this manageable. Running models manually doesn’t scale.
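To make the reproducibility point concrete, here's a minimal sketch of a versioned model-serving Deployment (the registry, image, and version tag are hypothetical). Because the manifest is declarative, every model version becomes an auditable revision you can roll back to.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sentiment-model
spec:
  replicas: 2
  revisionHistoryLimit: 5            # keep prior ReplicaSets for rollback and audit
  selector:
    matchLabels:
      app: sentiment-model
  template:
    metadata:
      labels:
        app: sentiment-model
        model-version: "1.4.2"       # version label travels with every pod
    spec:
      containers:
      - name: server
        image: registry.example.com/models/sentiment:1.4.2   # hypothetical image
        ports:
        - containerPort: 8080
```

Rolling back to the previous model is a single `kubectl rollout undo`; promoting a new one is a one-line image change that Git can review.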
The CNCF’s new AI Conformance Program exists because AI vendors recognized they need a standard platform. They chose Kubernetes.
What Changed in Kubernetes 1.35
The December 2025 release of Kubernetes 1.35 “Timbernetes” wasn’t just incremental. It marked Kubernetes’ explicit pivot toward AI workloads.
Three features matter most:
In-Place Pod Resource Resize (GA)
You can now adjust CPU and memory on running pods without restarts. For AI training jobs that run for hours or days, this is transformative. You can give a pod extra CPU during initialization, then scale down. You can adjust memory as datasets change.
One architect I know described it as “finally treating pods like the living systems they are.”
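A minimal sketch of what this looks like in a pod spec (the image is hypothetical). The `resizePolicy` field declares which resources may change without a restart; the resize itself is applied later through the pod's resize subresource (for example, `kubectl patch --subresource resize` in recent kubectl versions).

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: trainer
spec:
  containers:
  - name: train
    image: registry.example.com/ml/trainer:2.1   # hypothetical training image
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired       # CPU can be resized in place, no restart
    - resourceName: memory
      restartPolicy: RestartContainer  # memory changes restart the container
    resources:
      requests:
        cpu: "8"        # generous CPU for data loading and initialization
        memory: 32Gi
      limits:
        cpu: "8"
        memory: 32Gi
```

Once initialization finishes, shrink the CPU request and the scheduler can hand that capacity to other workloads, without killing a job that's hours into training.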
Dynamic Resource Allocation (DRA) Enhancements
DRA graduated to beta with significant improvements. The old device plugin model could only tell the scheduler “there are 4 GPUs available.” DRA tells the scheduler “there are 4 GPUs available, here are their memory sizes, here are their compute capabilities, here’s their topology.”
This enables locality-aware scheduling. You can ensure pods that need to communicate land on GPUs connected by NVLink. You can optimize for data locality. You can reduce the latency that kills AI training performance.
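Here's a hedged sketch of a DRA claim that encodes such a constraint. The API group version, device class name, and attribute domain below are assumptions; real values depend on your cluster version and your accelerator vendor's DRA driver.

```yaml
apiVersion: resource.k8s.io/v1beta1        # group version varies by Kubernetes release
kind: ResourceClaimTemplate
metadata:
  name: big-gpu
spec:
  spec:
    devices:
      requests:
      - name: gpu
        deviceClassName: gpu.example.com   # hypothetical device class from the driver
        selectors:
        - cel:
            # only GPUs with at least 40Gi of device memory qualify
            # (the capacity domain/name is defined by the DRA driver)
            expression: device.capacity["gpu.example.com"].memory.compareTo(quantity("40Gi")) >= 0
```

A pod references the template under `spec.resourceClaims`, and the scheduler only places it where a matching device exists.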
Gang Scheduling (Alpha)
AI training jobs often need multiple pods to start simultaneously. If pod 1 starts but pod 2 can't be scheduled, pod 1 sits idle, wasting expensive GPU time.
Native gang scheduling ensures either all pods in a group start, or none do. This was previously only available through external projects like Volcano. Now it’s in core Kubernetes.
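The native API is alpha and still settling, so treat the following as a sketch of the semantics rather than the final schema; it uses the PodGroup shape from the scheduler-plugins coscheduling project, which Volcano's ecosystem popularized.

```yaml
apiVersion: scheduling.x-k8s.io/v1alpha1   # coscheduling CRD; the in-core alpha API may differ
kind: PodGroup
metadata:
  name: llm-training
spec:
  minMember: 8   # all 8 workers are scheduled together, or none are
```

Worker pods join the group via a label (`scheduling.x-k8s.io/pod-group: llm-training` in scheduler-plugins), and no GPU is claimed until the whole gang fits.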
The Latin American Opportunity
Here’s where it gets interesting for our region.
AI requires three things: compute, data, and models.
Compute: Latin America will never have hyperscale cloud presence to match the US. But we don't need it. Small language models (SLMs) increasingly match LLMs on enterprise use cases at a fraction of the compute. Kubernetes lets you run them efficiently on regional infrastructure.
Data: Data sovereignty requirements are pushing companies to keep data in-region. Financial services, healthcare, and government often can't send data to US clouds. Kubernetes running on regional infrastructure enables compliant AI deployment.
Models: Open-source models like Llama, Mistral, and DeepSeek are closing the gap with proprietary alternatives. The time from closed-source breakthrough to open-source match dropped from 140 days to 41 days in one year. You can run state-of-the-art AI without vendor lock-in.
The constraint of not having AWS regions everywhere becomes an advantage. Companies building on Kubernetes + open-source models + regional infrastructure own their AI destiny.
Practical Steps for LatAm Leaders
If you’re a CTO or IT leader in Latin America, here’s what I’d recommend:
1. Assess Your Kubernetes Maturity
Where are you on the journey?
Level 0: No Kubernetes
Level 1: Kubernetes for stateless workloads
Level 2: Kubernetes for databases and stateful apps
Level 3: Kubernetes as platform (internal developer platform)
Level 4: Kubernetes for AI/ML workloads
Most Latin American enterprises I work with are at Level 1 or 2. AI requires Level 3 minimum. That’s a gap to close.
2. Upgrade to Kubernetes 1.35+
The AI-relevant features require current versions. If you’re running Kubernetes 1.28, you’re missing in-place resize, DRA improvements, and gang scheduling.
Note: 1.35 deprecates cgroup v1. Make sure your underlying infrastructure supports cgroup v2 before you upgrade.
3. Invest in GPU Infrastructure Strategy
AI needs GPUs. But GPU economics are different from CPU economics. You’re not buying commodity compute. You’re managing scarce, expensive, specialized resources.
Questions to answer:
Buy or rent GPUs?
On-premises or cloud?
Which GPU families for which workloads?
How to share GPUs across teams? (see the quota sketch below)
Kubernetes provides the orchestration, but you need the strategy.
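On the sharing question above, Kubernetes has a concrete primitive: a per-namespace ResourceQuota on GPU requests. A minimal sketch, assuming the NVIDIA device plugin's `nvidia.com/gpu` resource name and one namespace per team (the namespace name is hypothetical):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-quota
  namespace: team-recommendations   # hypothetical team namespace
spec:
  hard:
    requests.nvidia.com/gpu: "4"    # this team can hold at most 4 GPUs at a time
```

Quotas don't answer buy versus rent, but they make whatever GPUs you do have shareable, with enforced fairness and clean cost attribution.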
4. Build Platform Engineering Capability
Kubernetes alone isn’t enough. You need an internal developer platform that makes AI deployment self-service.
The best organizations I see have platform teams that provide:
Pre-configured AI development environments
Model serving infrastructure (KServe, Seldon, etc.; see the sketch below)
Experiment tracking and MLOps tooling
Cost visibility and chargeback
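To show what self-service model serving can look like, here's a minimal KServe InferenceService sketch; the model name and storage URI are hypothetical, and field details vary across KServe versions.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sentiment-model
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn                            # runtime chosen by model format
      storageUri: s3://models/sentiment/1.4.2    # hypothetical artifact location
```

A data scientist writes those ten lines; the platform team owns everything underneath them: autoscaling, routing, GPU scheduling, and observability.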
This is where “Kubernetes for AI” becomes real.
5. Consider Regional Infrastructure Partners
The hyperscalers will sell you GPU instances. But for Latin American companies with data sovereignty requirements, compliance needs, or cost constraints, regional infrastructure matters.
I’m biased here—Cuemby exists to solve this problem—but the principle stands regardless of vendor. Understand where your AI workloads actually run.
The Certification Path
For technical professionals, credentials matter in this space.
The CNCF offers certifications that signal Kubernetes expertise:
KCNA (Kubernetes and Cloud Native Associate)
KCSA (Kubernetes and Cloud Native Security Associate)
CKA (Certified Kubernetes Administrator)
CKAD (Certified Kubernetes Application Developer)
CKS (Certified Kubernetes Security Specialist)
Holding all five makes you a Kubestronaut. There are only about 1,500 of us globally.
The certification path takes time. But it’s one of the clearest ways to demonstrate cloud-native expertise to employers and clients.
Latin America needs more Kubestronauts. The companies building AI infrastructure need people who understand the platform.
What’s Next
Kubernetes 1.36 is expected by April 2026. The trajectory is clear: deeper AI integration, better resource management, more sophisticated scheduling.
The organizations that build Kubernetes capability now will have options later. The organizations that wait will be catching up while competitors ship AI products.
This isn’t hype. This is infrastructure reality.
The AI era runs on Kubernetes. Latin America can either build on that foundation or watch from the sidelines.
I know which one I’m choosing.
Angel Ramirez is CEO and Co-Founder of Cuemby, one of 1,500 Kubestronauts globally, and a CNCF Ambassador. He founded Fundación Hispana de Cloud Native, Latin America’s largest cloud-native community with 5,000+ members.