Snowflake

AI System Research and Development Engineer - Optimization

Engineering | Full-time | US-CA-Menlo Park

SALARY

Not listed

WORK TYPE

remote

JOB TYPE

full-time

About the role

About Snowflake

At Snowflake, we are powering the era of the agentic enterprise. To usher in this new era, we seek AI-native thinkers across every function who are energized by the opportunity to reinvent how they work. You don’t just use tools; you possess an innate curiosity, treating AI as a high-trust collaborator that is core to how you solve problems and accelerate your impact. We look for low-ego individuals who thrive in dynamic and fast-moving environments and move with an experimental mindset — who rapidly test emerging capabilities to discover simpler, more powerful ways to deliver results. At Snowflake, your role isn't just to execute a function, but to help redefine the future of how work gets done.

Role

We are looking for talented System Developers and Researchers to join the Snowflake AI Research team and contribute to the development and optimization of LLM inference and training systems, as well as agentic systems. Our mission is to build the most efficient and scalable generative AI systems.

Responsibilities

Analyze and optimize GPU kernel performance for training and inference of LLMs.
Develop and implement strategies to enhance the efficiency and scalability of deep learning systems.
Profile and benchmark deep learning systems using tools and techniques to identify bottlenecks.
Design and implement optimizations to reduce latency and improve resource utilization for training and inference.
Stay updated with the latest advancements in GPU kernel optimization, deep learning, and LLM system development.
Contribute to the development of agentic frameworks and applications for LLM-driven workflows, enhancing automation, reasoning, and decision-making capabilities.
Open-source and publish innovations, optimizations, and engineering practices in technical blogs, top-tier conferences, and journals.

Requirements

Bachelor’s degree in Computer Science, Electrical Engineering, or a related field. A Master’s degree or PhD is preferred.
5 years of experience in GPU kernel optimization, deep learning system optimization, or high-performance computing (HPC).
Proficiency in deep learning frameworks such as PyTorch, TensorFlow, or JAX.
Strong understanding of GPU architectures and experience with CUDA or similar frameworks.
Experience with libraries and frameworks such as CUTLASS, Triton, and cuDNN.
Experience with profiling tools (e.g., nvprof, Nsight) and performance analysis methodologies.
Solid problem-solving skills and ability to debug complex performance issues.
Excellent communication skills and ability to work effectively in a cross-functional team environment.
