Infrastructure

DevOps Engineer

Local LLM Focus

🇻🇳 Vietnam · Remote-friendly · English-first

About the role

Own the infrastructure for running local LLMs at scale — GPU clusters, inference servers, and AI-optimized deployment pipelines.

You'll ensure our self-hosted AI systems are fast, reliable, and cost-efficient across multiple client environments, both in our Singapore and Vietnam data centres and inside customer VPCs.

This role sits at the intersection of MLOps, classic SRE, and platform engineering. You ship Helm charts, Terraform, and observability dashboards rather than ad-hoc scripts.

What we're looking for

3+ years operating production Kubernetes
Experience with vLLM, TGI, Triton, or comparable LLM-serving stacks
Solid Linux + networking fundamentals; Terraform / Pulumi a plus
You believe a runbook is part of the deliverable