Senior Applied Machine Learning Engineer - Catalogue Intelligence
About the Role
We’re building a more intelligent, scalable product catalogue across multiple markets. Core capabilities like auto-categorisation and brand detection already exist, but they are not yet connected into a system that consistently drives quality, discovery, and growth.
This role owns that system. As Senior Applied ML Engineer - Catalogue Intelligence, you will build the machine learning systems that power OnBuy’s catalogue decisioning engine.
Working in partnership with the Head of Seller Solutions, who defines catalogue rules and commercial logic, you will design and deploy production-grade systems that automatically improve:
- Product categorisation
- Product data quality and completeness
- Pricing competitiveness insights
- Catalogue coverage and selection
- Product discoverability
This is a hands-on, production-focused role where outputs directly modify the live catalogue and materially impact GMV, conversion performance, search, and discovery.
Core Mission
Turn catalogue rules and commercial logic into automated, data-driven systems that continuously improve discovery, data quality, pricing competitiveness, and revenue outcomes.
What You’ll Be Responsible For
You’ll take ownership of how product data is structured, validated, and used across the platform. This includes:
- Improving how we classify and understand products at scale
- Raising the overall quality of catalogue data and defining what “good” looks like
- Ensuring product data supports effective search, filtering, and discovery
- Identifying gaps in our catalogue and surfacing opportunities for growth
- Improving how our catalogue performs across external channels
You’ll build and evolve the systems and decision logic that enable this, and iterate based on real-world performance and data. You’ll work across:
- Structured data (catalogue attributes, GTINs, taxonomy)
- Unstructured data (text and images)
- Behavioural data (search, clicks, conversions)
How You’ll Work
You’ll build directly using SQL and Python on top of:
- BigQuery
- Airbyte
- Google Datastream
You’ll be working across data pipelines, information extraction, and production ML systems, combining rules, heuristics, and ML/LLMs where appropriate. The focus is on shipping practical systems quickly, validating them with real data, and improving them over time. You’ll work closely with engineering, product, and analytics, but you’ll be expected to own and deliver the core logic yourself.
This role is not focused on research or offline modelling. You’ll be expected to build systems that operate in production and directly influence how products appear and perform on the platform.