Enumerate

Senior Site Reliability Engineer

engineering, full-time, Latin America - Remote
SALARY: Not listed
WORK TYPE: remote
JOB TYPE: full-time
INDUSTRY: general

About the role

Key Responsibilities

Architecture & Infrastructure Ownership

  • Design, implement, and evolve cloud infrastructure architectures for high availability, reliability, security, and scale.
  • Define and maintain reference architectures and patterns for services, applications, and environments across the organization.
  • Develop workflow processes and standards for building, deploying, and maintaining applications within a distributed architecture.
  • Lead infrastructure modernization initiatives (e.g., containerization, Kubernetes adoption, infrastructure as code, platform consolidation).

Governance, Standards & Cost Management

  • Establish and enforce governance standards for infrastructure, CI/CD, observability, and operational practices.
  • Define and maintain policies for environment management, access control, configuration management, and change management.
  • Implement cost management practices (e.g., tagging, budget alerts, rightsizing, reservations/committed use, auto-scaling policies) to optimize cloud spend.
  • Partner with product and engineering leadership to balance performance, reliability, and cost-efficiency across environments.
  • Use DORA metrics and industry benchmarks to drive continuous improvement in delivery and operational performance.

CI/CD, Automation & Operations

  • Design, implement, and maintain CI/CD pipelines for multiple applications and environments using tools such as Git, Azure DevOps, GitLab, or Jenkins.
  • Develop and manage automation pipelines for deployment, configuration, and infrastructure management.
  • Build and maintain monitoring, alerting, and logging systems to ensure visibility, high availability, and performance of applications and services.
  • Manage cloud infrastructure resources and services to ensure reliability, security, and scalability.

Incident Management & Reliability

  • Lead incident response efforts, including triage, root cause analysis, and post-incident reviews.
  • Contribute to and maintain incident response processes, runbooks, and on-call practices.
  • Partner with engineering teams to design resilient systems and reduce mean time to recovery (MTTR).

Leadership, Mentorship & Cross-Functional Collaboration

  • Collaborate with software engineering, QA, product, and IT teams to determine the best way to tackle complex infrastructure, security, and delivery challenges.
  • Mentor engineers in DevOps and platform practices, tools, and standards across the organization.
  • Lead departmental initiatives related to DevOps, platform engineering, and infrastructure disciplines; present plans and progress to stakeholders.
  • Drive new department initiatives based on organizational needs and your expertise in modern technologies and industry trends.
  • Stay current on emerging technologies, tools, and best practices; evaluate their potential application within our tech stack.