Principal Data Engineer
About Sovrn
Every interesting company solves important problems for other people. Sovrn is a Software and Data business that helps Open Web businesses be and remain independent. We help them understand their business better, operate more efficiently, and make & keep more money.
- We believe in the freedom and free-flow of information.
- We believe the Open Web is the largest source of this information.
- We believe in helping Open Web businesses be and remain Independent.
Through Software products and Data solutions we help our customers:
- Understand their business better, so they can make better decisions
- Operate their business more efficiently, so they can invest in what matters most
- Make (and Keep) more money, so they control their own destiny
About the Role
We're looking for a Principal Software Engineer (Data) with deep roots in adtech data infrastructure and a genuine conviction about what AI-native data engineering looks like in practice. This is a specialized principal-level engineering role — one that carries all the architectural ownership and technical leadership expectations of a Principal Software Engineer, focused on Sovrn's Data Collective.
From a generative/agentic AI capabilities standpoint, we already use LLMs and agentic tooling across our data stack, and we're looking for someone who can help us take that from general adoption to intentional practice: someone who has strong opinions about where AI creates real leverage in a high-throughput adtech environment, and who can bring the rest of the engineering organization along with them.
Languages/components/tools in our stack: Python, PySpark, Kafka, Databricks, AWS
What you'll be doing:
Data Platform & Architecture
- Own the design and evolution of data platform systems that operate at exchange scale: high-throughput, real-time streaming and always-on batch pipelines
- Lead architectural decisions across data infrastructure: pipeline design, data modeling, lakehouse architecture, and data services layers
- Specify data platform components and configurations required for pipeline implementation; define pipeline observability to understand and improve performance at massive scale
- Research, implement, and evolve methods to process and democratize data across the organization
- Drive technical standards, design reviews, and engineering best practices across a senior team
- Partner with product, data science, and platform teams to ship end-to-end
AI & Agentic Engineering Leadership
- Establish and champion AI engineering practices across the team, from prompt engineering and RAG patterns to agentic workflow design, LLM evaluation, and progressive implementation of agentic design patterns
- Identify high-leverage opportunities to apply AI in our data stack: intelligent pipeline optimization, anomaly detection, automated data quality, forecasting, and LLM-powered data services
- Lead the evolution of our existing LLM and agentic tooling from passive use to intentional, well-architected integration within our data platform
- Set standards for how we evaluate, trust, and operate AI-powered systems in production, including observability, fallback behavior, and model governance
- Help the broader engineering team build fluency and confidence with AI tooling, not just tolerance of it
Collaboration & Mentorship
- Provide domain expertise across the organization to enable business growth through data services and data models