Machine Learning Engineer II, Computer Vision Applied Science
About Pinterest
Millions of people around the world come to our platform to find creative ideas, dream about new possibilities and plan for memories that will last a lifetime. At Pinterest, we’re on a mission to bring everyone the inspiration to create a life they love, and that starts with the people behind the product.
Discover a career where you ignite innovation for millions, transform passion into growth opportunities, celebrate each other’s unique experiences and embrace the culture to do your best work. Creating a career you love? It’s Possible.
At Pinterest, AI isn't just a feature, it's a powerful partner that augments our creativity and amplifies our impact, and we’re looking for candidates who are excited to be a part of that. To get a complete picture of your experience and abilities, we’ll explore your foundational skills and how you collaborate with AI. Through our interview process, what matters most is that you can always explain your approach, showing us not just what you know, but how you think.
About the role
The Pinterest Labs group is dedicated to the development and research of applied machine learning. Our initiatives span a diverse range of AI/ML fields, including fundamental computer vision, multimodal large language models, multimodal representation learning, generative modeling, heterogeneous graph neural networks, and recommender systems. By building foundation ML models that utilize our extensive knowledge graph and billions of Pins, we aim to significantly enhance the core Pinterest product.
Our visual modeling team is currently seeking new members to focus on the advancement of vision-centric LLMs. We are building VLMs capable of perceiving intricate visual details and understanding user aesthetics to facilitate communication through visual assets using tools like multimodal search and text-to-image models. This role offers the opportunity to work with Pinterest's unique visual-text datasets to develop large-scale generative models for production. You will join the core visual pod, a collaborative group of approximately six engineers and a product prototyping team, to create specialized evaluation benchmarks and contribute to the broader research community.
What you’ll do
- Prototype new model architectures for Pinterest VLMs. We’re looking for hands-on experience fine-tuning open-source LLMs and improving their visual perception and tool-use capabilities.
- Develop new evaluation benchmarks tailored to vision-centric capabilities, such as fashion style recommendations.
- Read research papers, participate in group discussions, and help brainstorm our overall visual generative strategy at the company.
- Help collect relevant visual training data for Pinterest Canvas, particularly for RLHF, targeted fine-tuning, etc.
- Publish and publicize your work via conferences, paper submissions, blog posts, etc.
- Mentor more junior researchers or research interns within the Pinterest Labs organization.
What we’re looking for
- Research engineers and scientists who have experience working with generative computer vision models, preferably various forms of visual encoders and LLMs.
- 2+ years of industry computer vision experience.
- M.S. or Ph.D. in Machine Learning, Computer Science, or related areas.
Nice to have
- Publications at top ML conferences.
- Experience using Cursor, Copilot, Codex, or similar AI coding assistants for development, debugging, testing, and refactoring.
- Familiarity with LLM-powered productivity tools for documentation search, experiment analysis, and SQL/data exploration.