Honeycomb

Field Reliability Engineer - LATAM

Engineering | Full-time | Remote - Brazil

SALARY

Not listed

WORK TYPE

remote

JOB TYPE

full-time

INDUSTRY

general

About the role

What We’re Building

Honeycomb is a service for the near and present future, defining observability and raising expectations of what developer tools can do! We’re working with well-known companies such as HelloFresh, Slack, LaunchDarkly, and Vanguard, along with many more across a range of industries. This is an exciting time in our trajectory: we’ve closed Series D funding, scaled past the 200-person mark, and been named to Forbes’ America’s Best Startups of 2022 and 2023!

Who We Are

We come for the impact, and stay for the culture! We’re a talented, opinionated, passionate, fiercely inclusive, and responsible group of bees. We have conviction and we strive to live our values every day. We want our people to do what they truly love amongst a team of highly talented (but humble) peers.

How We Work

We are a fully distributed company, which means we believe it is not where you sit, but how you deliver that matters most. We invest in our people and care about how you orient to our culture and processes. At the same time, we extend a lot of trust, autonomy, and accountability from Day 1.

Platform Engineering - Managed Services & Infrastructure

Own and operate customer-facing managed infrastructure including Refinery as a Service (RaaS) and Honeycomb Private Cloud (HnyPC) deployments across multiple AWS accounts and regions.
Build and maintain Terraform modules, Helm charts, and deployment automation for provisioning and managing customer EKS clusters, collector pools, and Refinery instances.
Design and implement monitoring, alerting, and observability for managed service infrastructure - using Honeycomb to monitor Honeycomb.
Manage scaling, upgrades, and incident response for customer deployments, including capacity planning and cost optimization across AWS infrastructure.
Build autonomous deployment and management tooling for field-operated managed services.

Technical Escalation & Unblocking

Serve as the senior technical escalation point for our most challenging customer situations - production incidents, complex collector configurations, Refinery tuning, and architecture reviews that exceed the scope of standard technical roles.
Diagnose and resolve deep infrastructure and observability issues spanning distributed systems, Kubernetes clusters, AWS networking (ALBs, PrivateLink, NLBs, VPCs), and polyglot service meshes.
Partner directly with customer SRE, platform, and engineering teams to troubleshoot real-time production issues, often under time pressure and with direct revenue impact.
Participate in an on-call rotation for managed services (Refinery as a Service, Honeycomb Private Cloud), providing Tier 2 escalation support for customer-facing infrastructure issues.
Build and maintain SOPs, runbooks, and diagnostic frameworks that accelerate resolution for the broader field and support teams.

Open Source & Ecosystem

Contribute to and maintain OpenTelemetry distributions, collectors, exporters, and instrumentation libraries that our customers depend on.
Represent Honeycomb in the OpenTelemetry community - participating in SIGs, reviewing PRs, triaging issues, and driving adoption of best practices.
Build reference architectures, sample collector configurations, and integration guides that demonstrate effective instrumentation patterns for common customer scenarios.
