Lead DevOps Engineer (m/f/d)
About the role
Bangalore or Fully Remote from India
Kayzen is a mobile demand-side platform (DSP) dedicated to democratizing programmatic advertising. We enable leading apps, agencies, media buyers, and brands to run programmatic customer acquisition, retargeting, and brand performance campaigns through our self-serve and managed service options. Built on the three core pillars of performance, transparency, and control, Kayzen powers the world’s best mobile marketing teams with bespoke solutions that fuel business growth and deliver a competitive advantage. With an unprecedented scale of 160B+ daily ad requests from 1.6B+ unique users worldwide, we serve up to 1B+ ads per day in 180 countries. Kayzen is accessible through our APIs and user interface.
The Team
The Platform Engineering team owns the real-time bidding (RTB) platform, a real-time budget system, real-time event processing, a stream data processing engine, and multiple other complex, large-scale distributed components and data pipelines. We work closely with data scientists, data analysts, and product and business teams. We are responsible for some of the most technically challenging work, for example:
- Handling ~2 million requests per second at sub-millisecond latency
- Handling and managing petabytes of data
- Managing distributed systems deployed across multiple data centers
- Tuning the JVM and the Linux kernel for optimal performance
- Managing our own data centers spread across the globe, consisting of thousands of powerful servers
- Working on some of the most challenging problems in ad tech
About the Role
We are looking for a Lead DevOps Engineer to build and scale the backbone of our global RTB platform. You will move beyond manual server management to engineer automated, self-healing infrastructure that handles petabytes of data with sub-millisecond latency. This role is about treating our private data centers like a programmable cloud.
Responsibilities
- Develop and maintain automated provisioning pipelines (PXE, ZTP) to deploy bare-metal servers at scale across global data centers.
- Research and recommend innovative and automated approaches for system administration tasks.
- Perform regular security monitoring to detect possible intrusions.
- Repair and recover from hardware or software failures, coordinating with impacted teams.
- Apply OS patches and upgrades on a regular basis, and upgrade administrative tools and utilities.
- Maintain data center environmental and monitoring equipment.
- Perform ongoing performance tuning, hardware upgrades, and resource optimization as required.
- Act as a technical lead and trusted point of contact for the infrastructure team, helping drive operational excellence and engineering best practices.
- Support mentoring and onboarding of engineers, improve team collaboration and communication, and contribute to scaling the team as the infrastructure organization grows.