Bugcrowd

Reinforcement Learning Infrastructure (Cybersecurity)

Engineering | Full-time | Remote - US

SALARY

Not listed

WORK TYPE

remote

JOB TYPE

full-time

INDUSTRY


About the role

Job Summary

The Bugcrowd RL and Reasoning Team focuses on pushing the boundaries of autonomous cybersecurity by building authentic reinforcement learning environments for foundational model companies. As a Staff Engineer, you will advance the frontier of AI Reinforcement Learning development and delivery. You will build the infrastructure and tooling that transforms real-world vulnerability research into large-scale reinforcement learning environments used to train next-generation AI systems.

This role is unique. You will help create the training environments that teach AI systems how to hack and defend software. Your work will directly influence the capabilities of the next generation of AI models. Instead of building a single application, you will build the infrastructure that generates thousands of environments used to train frontier AI systems.

Our team works at the intersection of AI, security research, and systems engineering, building environments that allow models to learn skills such as vulnerability discovery, exploitation, and remediation.

Essential Duties and Responsibilities

If you enjoy building high-performance systems that power cutting-edge AI research, this role is for you.

This role focuses on building the systems that generate RL environments, not just the environments themselves. You will design pipelines that ingest software projects, analyze them with Bugcrowd’s Mayhem platform, and automatically construct training environments used by frontier AI labs including Anthropic, OpenAI, and Cohere.

The ideal candidate is a strong systems engineer with:

An understanding of reinforcement learning workflows
Experience building clean, reproducible Linux ML environments (containers, MCP, etc.)
A system security background in binary exploitation, including buffer overflows, fuzzing, and x86/x86-64
Experience developing applications in Python and C, with Rust a plus

Education, Experience, Knowledge, Skills, and Abilities

Understanding of RL training workflows used by modern LLM systems
Experience with DevOps pipelines (e.g., GitHub Actions) and reproducible builds (Docker, BuildKit, Nix)
Proficiency in Python and C. Other languages (especially Rust) are a plus.
Understanding of software vulnerabilities, fuzzing, or program analysis
Experience with build systems and large open-source codebases
Comfort working with Linux systems and low-level debugging
Experience working with benchmark environments (CTFs, SWE-bench, security challenges, etc.) is a plus

Working Conditions and Physical Requirements

The ideal candidate must be able to complete all physical requirements of the job with or without reasonable accommodation.

Sitting and/or standing: must be able to remain in a stationary position 50% of the time
