EarnIn

Site Reliability Engineer II

Engineering | Full-time | Mexico City, Mexico; Remote, Mexico
SALARY
Not specified
WORK TYPE
remote
JOB TYPE
full-time
INDUSTRY
fintech

About the role

About EarnIn

EarnIn is one of the pioneers of earned wage access. We are passionate about building products that deliver real-time financial flexibility for people living paycheck to paycheck. Our community members access their earnings as they earn them, with options to spend, save, and grow their money without mandatory fees, interest, or credit checks.

Position Summary

We have a real passion for delivering the best product experience for our community members. We work closely with all teams and share responsibility for rapidly delivering production-ready features to our community. We build or contribute to infrastructure, reliability tooling, and practices that help teams ship quickly and safely. We think a lot about things like good alert hygiene, friendly runbooks, clear SLOs, and how to make deployments feel boring in the best possible way. As an SRE, you are a well-rounded practitioner in designing, observing, and operating our systems in production. Rather than following established playbooks, you are starting to write them. You work with confidence across observability tooling, incident response, and infrastructure-as-code, and you know how to communicate tradeoffs clearly to the engineers and teams around you.

What You'll Do

  • Design systems with resilience, graceful degradation, and capacity in mind.
  • Define and measure SLOs and SLIs that actually reflect what our customers feel.
  • Use Datadog (logging, metrics, APM) together with CloudWatch to build signal-heavy, noise-light observability.
  • Configure alerting and routing that reach engineers through incident.io, where we run incident management and on-call, so that when a human gets paged, it really matters.
  • Continuously improve our incident lifecycle, from fast detection and solid triage, through clear communication, to blameless, actionable follow-ups.
  • Combine solid software fundamentals with reliability thinking so our systems are highly available, easy to debug, and a joy to work on. You know that the only good 2 a.m. alert is the one that never fires in the first place.

What We're Looking For

  • A bachelor's or master's degree in computer science, or equivalent industry experience.
  • 3+ years of experience in an SRE or Software Engineering role.
  • Hands-on coding experience in Python and/or Go.
  • Distributed Systems Expertise: Proven experience designing and operating large-scale distributed systems, shepherding them from design through production, and applying incident learnings that make on-call quieter over time.
  • Reliability Engineering Mindset: Deep fluency in SLOs, SLIs, error budgets, and MTTR, and a habit of using them to drive decisions and explain tradeoffs, not just decorate dashboards.