Renaissancelearning Nam

Sr Site Reliability Engineer

Engineering | Full-time | Remote - US

SALARY

Not listed

WORK TYPE

remote

JOB TYPE

full-time

INDUSTRY

general

About the role

Job Description

Renaissance is looking for an experienced Sr Site Reliability Engineer to be part of the Engineering Enablement group’s Site Reliability Team with a focus on Application and Infrastructure Availability, Reliability, Observability & Security.

We are at a crossroads in evolving our current team and are looking for someone who has been involved in the SRE implementation journey at other companies. We want someone who will influence our SRE philosophy and practices, and who is a self-motivated problem solver, a strong communicator, and a team player. You will apply your technical expertise to build and scale our highly available distributed SaaS platform used by millions of K-12 students worldwide.

In this role as a Sr Site Reliability Engineer, you will:

Work with engineering, security & governance teams to improve the observability, reliability, resiliency, and auditability of our systems and to minimize or prevent downtime.
Contribute to infrastructure-as-code using Terraform & CloudFormation.
Support CI/CD pipelines that ensure the prompt release of high-quality software.
Collaborate with cross-functional teams to resolve infrastructure issues.
Perform Disaster Recovery exercises on our products.
Explore and integrate AI tooling into the SRE workflows.
Be part of an on-call rotation & support off-hours incidents & deployments.
Give constructive feedback through coaching, even without direct reports.

For this role as a Sr Site Reliability Engineer, you must have:

5+ years of experience focused on SRE.
Experience in managing & monitoring containerized cloud environments in production, preferably AWS EKS.
Experience with IaC, configuration management, and orchestration tools such as Terraform, Docker, and Ansible.
Hands-on experience with a programming or scripting language such as .NET/Java, Python, or JavaScript.
On-call experience & willingness to be on call outside working hours and on weekends.
Experience working in an agile environment.

Bonus points for:

BS in Information Systems or Computer Science, related field experience, or both.
Managing Kubernetes clusters (EKS) at scale using Helm.
Setting up GitLab & GitHub pipelines & workflows.
Experience setting up monitoring, logging, alerting & observability in tools such as New Relic, Datadog, Grafana, CloudWatch, and PagerDuty.
Experience with Teleport, HashiCorp Boundary, etc.
Experience with Redshift, OpenSearch/ZeroETL.
Experience running Disaster Recovery exercises.
Implementing service level objectives (SLOs/SLIs/SLAs) & error budgets.
