Strategic Deployment Engineer, Chanakya

Sarvam AI

Delhi | Engineering

About Sarvam

Sarvam is building the bedrock of Sovereign AI for India. The company is developing India's full-stack sovereign AI platform across research, models, infrastructure, and applications, with a singular focus on making AI genuinely work for India. Sarvam is backed by Lightspeed, Peak XV, and Khosla Ventures, and works with leading enterprises and public institutions, including brands such as Tata Capital, SBI Life, CRED, IDFC, and LIC.

About the Role

Strategic Deployment Engineers are Sarvam's forward-deployed technical assets. Embedded with clients, you own the full lifecycle of AI system deployments, both in air-gapped, classified, and on-prem environments and in complex enterprise accounts where standard playbooks don't apply. You are the technical single point of contact (SPOC) for your assigned accounts. Success is measured by whether the system works, the client trusts us, and the deployment creates durable capability, not by ticket closure. You will operate with autonomy and carry real accountability: for the system, for the relationship, and for outcomes.

What You'll Do

• Own end-to-end deployment of Sarvam's full AI stack in client environments — on-prem, air-gapped, classified infrastructure, and complex enterprise accounts

• Serve as technical SPOC for assigned accounts, from scoping and PoC through to steady-state operations

• Diagnose and resolve integration failures, model drift, inference issues, and infrastructure breakdowns without escalation ladders

• Surface field learnings that feed back into the product layer and a replicable deployment library

• Manage deployment pipelines, model serving, and environment configuration in non-standard, constrained settings

• Drive client-side adoption through documentation, training, and operational handover where required

• Own client satisfaction (CSAT, time-to-value, uptime) for your accounts; flag risks before they become escalations

What We're Looking For

• 3–6 years in software or ML engineering, with at least one full-cycle on-prem or enterprise deployment delivered end-to-end

• Production-grade experience in Python, Docker, Linux systems administration, REST APIs, and CI/CD pipelines

• Hands-on experience with LLM inference stacks — vLLM, TGI, Ollama, or equivalent — and RAG architectures and vector stores

• Experience deploying in constrained environments: air-gapped networks, limited connectivity, non-standard hardware, or complex regulatory requirements

• Full-stack debugging instinct — comfortable diagnosing across infrastructure, networking, and application layers without a specialist to hand

• Demonstrated ability to ship and maintain a working system end-to-end in environments where reliability was non-negotiable

• Proven ability to navigate ambiguous client requirements and make the call without explicit guidance

Signals We Look For

• You've shipped and maintained a working system end-to-end in environments where the