Lead DevOps Engineer (7:00 AM to 3:00 PM Shift)
About the role
So, what’s the role all about?
The Lead DevOps Engineer is a hybrid, senior-level role at the intersection of operational reliability and software delivery automation. You will work as an integral part of a cross-functional engineering team, combining the proactive service management mindset of an Application Operations Engineer with the automation-first philosophy of a DevOps practitioner.
You will be responsible for keeping production environments healthy and performant, while simultaneously designing and maintaining the CI/CD pipelines, infrastructure-as-code frameworks, and tooling that enable rapid, high-quality software delivery. You are the connective tissue between engineering, platform, and operations: someone who is equally comfortable on an incident bridge call and in a sprint planning meeting.
How will you make an impact?
DevOps & Automation
- Design, build, and maintain continuous integration and continuous delivery (CI/CD) pipelines for rapid, quality-assured deployment of software deliverables.
- Build and manage Infrastructure as Code (IaC) using tools such as CloudFormation, Ansible, Terraform, Chef, or Puppet.
- Manage day-to-day operations of release pipelines, build tools, artifact repositories, and source control systems.
- Coordinate build and release activities with engineering, QA, product, and other stakeholders across the organisation.
- Identify, research, and prototype new technologies and practices to continuously improve DevOps processes and team efficiency.
- Maintain and upgrade DevOps systems in both production and non-production environments on an ongoing basis.
Cloud & Application Operations
- Proactively monitor infrastructure and application health — including CPU, memory, file systems, databases, batch jobs, and network performance — and respond swiftly to anomalies.
- Identify and resolve operational issues including infrastructure failures, batch processing errors, network disruptions, and client data feed problems.
- Troubleshoot and respond to production downtime, performance degradation, and security-related incidents in a timely, structured manner.
- Perform end-to-end operational duties covering application server health, service availability, and platform integrity in accordance with documented processes and runbooks.
- Review and manage client service request tickets in adherence to defined SLAs, ensuring accountability and timely resolution.
- Provide on-call, off-hours support as part of a structured rotation, including non-prime and weekend shift windows as required.
Documentation, Communication & Governance
- Maintain complete and accurate operational documentation including incident tracking, change logs, and runbooks.
- Produce metric reports and regular productivity/status updates for internal stakeholders and management.
- Communicate proactively and clearly, in writing and verbally, with internal teams, leadership, and customers every day.
- Liaise with management to share feedback on existing and new processes, methodologies, and best practices.