Software Engineer II - Cloud Infrastructure Engineer
About the Role
Abnormal AI is an AI-native behavioral security platform that protects enterprises from advanced threats by analyzing and understanding communication patterns and access behavior at scale. We now protect more than 25% of the Fortune 500, and as we expand into new product lines and geographies, a scalable, reliable infrastructure foundation is critical to our next phase of growth.
The Platform & Infrastructure team is seeking a Cloud Infrastructure Engineer for our Cellular Infrastructure team. This team owns the full lifecycle of Abnormal’s cell-based deployment architecture—bootstrapping new cells, deploying our entire application and infrastructure stack onto them, and keeping every cell healthy, isolated, cost-efficient, and compliant. Engineers on this team wear multiple hats: infra engineering, application-layer debugging, and close collaboration with product and application teams to minimize overhead so those teams can stay focused on building.
What You Will Do
- Bootstrap new cells end-to-end: full infrastructure setup (compute, networking, IAM, etc.) and complete application stack deployment.
- Maintain and evolve cell lifecycle tooling to make provisioning repeatable, auditable, and operator-friendly—reducing manual steps and time-to-production.
- Partner with application and product teams to design and implement scalable, cell-native architecture approaches.
- Design, build, test, scale, monitor, and maintain secure, cost-efficient infrastructure in a multi-cloud environment (AWS and Azure).
- Triage and resolve complex cross-layer issues quickly, then drive root cause fixes that prevent recurrence.
- Drive down technical debt and toil through automation and systemic improvements to the cell deployment lifecycle.
- Participate in the on-call rotation with a learning-oriented mindset, identifying systemic gaps and driving long-term reliability improvements.
- Keep cross-team communication low-friction and high-signal by sharing updates proactively and with enough context.
- Contribute as a core member of an agile team through sprint planning, standups, and execution with a strong sense of ownership and teamwork.
Must Haves
- Bachelor’s degree in Computer Science or a related technical field.
- 4+ years of experience engineering cloud infrastructure for production microservice systems, with attention to performance, reliability, security, and cost.
- 2+ years of Python experience, including application-layer code (not just scripts).
- 1+ year of experience with Kubernetes and Helm.
- 1+ year of AWS experience (VPC, IAM, S3, Route 53, CloudFront, EKS, ECS, CloudWatch).
- 1+ year of Terraform and HCL experience.
- Comfort operating across infra and application engineering without hard boundaries.
- Experience with on-call rotations, incident response, and operating production-grade systems.
- Practical experience using Generative AI tools in day-to-day engineering workflows.
- Strong communication skills and the ability to thrive in a fast-paced, remote-first environment—balancing autonomy with collaboration, demonstrating a bias toward action, and maintaining a positive, constructive mindset.
Nice to Haves
- Experience with Bash, Golang, Terragrunt, and data infrastructure (Spark, Databricks).
- Hands-on experience with cell-based, multi-tenant, or multi-region infrastructure architectures.
- Familiarity with Generative AI developer tools such as Claude Code, and experience driving AI-first engineering workflows.
- Prior experience building large-scale IaC abstractions or internal developer platforms.
- AWS certifications.