DevOps Engineer II
About the role
Role Overview
We are seeking a motivated DevOps Engineer II to join our growing Platform SRE team. In this role, you will support the design, deployment, maintenance, and optimization of our multiregion, multicloud infrastructure, with a primary focus on Google Cloud Platform (GCP), including the infrastructure behind ML training, inference, and data-intensive pipelines. You’ll work closely with senior DevOps engineers and developers to ensure high availability, scalability, and reliability across distributed systems. This is an excellent opportunity for someone early in their career who is eager to deepen their expertise in Kubernetes, GitLab CI, ArgoCD, and cloud automation while gaining exposure to complex global infrastructure.
Responsibilities
- Assist in managing multiregion and multicloud infrastructure, ensuring resiliency, scalability, and performance.
- Support infrastructure provisioning and deployments primarily on GCP, while gaining exposure to other cloud providers.
- Design, deploy, and maintain agentic AI workflows and automation systems, integrating LLMs, orchestration frameworks, APIs, and observability tooling to improve operational efficiency, incident response, and developer productivity.
- Collaborate with development teams to design and maintain CI/CD pipelines in GitLab CI.
- Help manage Kubernetes clusters (GKE and potentially other managed Kubernetes offerings).
- Contribute to GitOps-based deployments using ArgoCD.
- Help automate infrastructure with Terraform and other infrastructure-as-code tools.
- Monitor system health and participate in the on-call rotation, contributing to incident response, troubleshooting, and root cause analysis to improve reliability.
- Document processes, create runbooks, and help improve operational practices.
Qualifications
- Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent practical experience).
- 3-6 years of hands-on experience in DevOps, SRE, or related roles.
- Solid foundation in Linux/Unix administration and basic networking.
- Hands-on experience with public cloud services (for example, Compute Engine, GKE, Cloud Storage, IAM, and VPC).
- Understanding of Kubernetes concepts.
- Familiarity with CI/CD concepts and ArgoCD for GitOps workflows.
- Interest in or exposure to multicloud and multiregion architectures.
- Strong analytical, troubleshooting, and problem-solving skills.
- Scripting experience with Python, Bash, or