Nebius

Senior Support Engineer

Support | Full-time | Remote

SALARY

Not listed

WORK TYPE

remote

JOB TYPE

full-time

About the role

We're looking for a senior support engineer who can handle difficult technical issues in modern cloud environments. This is not a traditional support role. The work is hands-on and technical: debugging Linux and Kubernetes issues, investigating problems in cloud infrastructure, and helping customers running AI workloads, distributed systems, and GPU-based environments. You'll work closely with engineering on production issues, help improve internal tools and troubleshooting workflows, and act as an escalation point when problems are unclear or high impact.

The role includes a weekend rotation and incident response.

What you'll do

Investigate and resolve complex technical issues in customer environments
Troubleshoot across Linux, Kubernetes, cloud infrastructure, networking, storage, and GPU-related workloads
Support customers running containerized systems, inference workloads, training jobs, or other distributed platforms
Act as a senior escalation point for production incidents
Reproduce issues, narrow down root causes, and work with engineering on long-term fixes
Build or improve internal scripts, troubleshooting tools, and operational documentation
Help make support more scalable through better automation, observability, and process improvements
Communicate clearly with customers during active investigations and incidents
Take part in weekend coverage and urgent issue response

What we're looking for

Strong Linux troubleshooting skills
Strong Kubernetes and container experience
Solid understanding of cloud infrastructure in AWS, GCP, Azure, OpenStack, or similar environments
Good networking fundamentals
Ability to write scripts or small tools in Python, Bash, Go, or similar
Experience working on production issues that require structured debugging and cross-team collaboration
Ability to work independently and stay effective when the path to resolution is not obvious
Clear written communication, especially when explaining technical issues to customers and internal teams

Especially valuable

Experience with GPU-based infrastructure
Familiarity with AI/ML or LLM-related workloads
Understanding of inference and training pipelines
Experience improving observability, tooling, or operational workflows
History of building useful internal tools or automating repetitive work
Personal or open-source projects that show real technical depth

A strong candidate for this role usually:

enjoys debugging messy infrastructure problems
takes ownership without waiting to be told exactly what to do
thinks beyond the immediate ticket
works well with engineering
looks for ways to reduce repeated operational pain, not just close cases
