Kubernetes & Bare Metal Engineer, ISS
Everpure
About the role
We’re in an unbelievably exciting area of tech and are fundamentally reshaping the data storage industry. Here, you lead with innovative thinking, grow along with us, and join the smartest team in the industry.
This type of work—work that changes the world—is what the tech industry was founded on. So, if you're ready to seize the endless opportunities and leave your mark, come join us.
Kubernetes & Bare Metal Engineer – Member of Technical Staff
About Infrastructure Shared Services (ISS)
Infrastructure Shared Services (ISS) is responsible for Everpure's engineering infrastructure, development environments, and production-adjacent services across our global data centers and public cloud environments. We partner with internal engineering teams to deliver reliable, secure, and scalable platforms so they can focus on building high‑quality products.
Within ISS, the Bare Metal Kubernetes Platform team designs, builds, and operates large‑scale Kubernetes environments on bare metal servers, backed by Everpure arrays and Portworx, and integrated with ISS’s observability, CI/CD, and multi‑tenancy frameworks.
SHOULD YOU ACCEPT THIS CHALLENGE...
As a Kubernetes & Bare Metal Engineer, you will be a senior individual contributor responsible for designing, deploying, and operating large‑scale bare‑metal Kubernetes clusters and platform services in our on‑prem data centers.
You will:
Lead technical design and implementation for new cluster features and capabilities
Own critical areas of the platform (e.g., cluster lifecycle, networking, storage, observability, or multi‑tenancy)
Drive reliability, performance, and security of the Kubernetes platform used by multiple business units
Mentor other engineers and influence best practices across ISS and partner teams
Key Responsibilities
Platform Design & Architecture
Design and evolve bare‑metal Kubernetes architectures including control plane, worker nodes, networking, and storage integrations (Portworx on FlashArray/FlashBlade).
Define standards for cluster lifecycle management (provisioning, upgrades, decommissioning) using tools like Kubespray, Foreman, and internal CD pipelines.
Contribute to the design of secure, multi‑tenant clusters, including RBAC, OIDC/SSO, namespace isolation, and quota/limit strategies.
Implementation & Operations
Deploy, operate, and continuously improve large‑scale bare‑metal Kubernetes clusters across multiple data centers (dev, stg, prod).
Implement and maintain cluster networking: CNI (e.g., Cilium), BGP, load balancers, ingress, and multi‑rack/ToR topologies.
Build and maintain GitOps‑based workflows (e.g., ArgoCD) and CI/CD pipelines to manage cluster add‑ons, platform services, and tenant workloads.
Ensure observability of the platform (metrics, logs, traces) using Prometheus, the Elastic Stack, Grafana, and related tooling; define SLOs and alerts with SRE teams.
Participate in the “follow the sun” on‑call rotation for the production system. Lead or contribute to the incident man