Staff AI Engineer
About the Role
The VDC Intelligence team is Veeam’s AI and data intelligence engine, operating across the full platform to deliver threat detection, agentic infrastructure, and AI Trust capabilities at enterprise scale. As a Staff AI Engineer, you will design, build, and own production AI systems that help Veeam’s customers extract meaningful insights from their enterprise data, leading technical decisions on agentic workflows, threat detection, LLM-powered assistants, and the MCP infrastructure that ties it all together. This is a software engineering role at its core: you will own capabilities end to end, from API and service design through AI integration, cloud infrastructure, and production operations. You will also mentor peers, drive architectural decisions, and partner closely with product and platform teams to deliver customer-facing impact.
What You’ll Do
- Lead agentic AI development, including multi-agent orchestration patterns, agent-to-agent protocols, and reliable tool use at production scale
- Own prompt engineering and evaluation workflows, including structured outputs, hallucination reduction, and behavioral consistency
- Build and own MCP server infrastructure that exposes backup data to AI agents via the Model Context Protocol, enforcing tenant-aware RBAC, query constraints, and safe tool boundaries
- Define AI quality benchmarks for retrieval relevance, summarization accuracy, and agent reliability, and drive systematic improvements through eval-driven iteration
- Champion security and safety in AI systems, including adversarial prompt hardening, jailbreak resistance, data boundary enforcement, and OWASP LLM Top 10 awareness
- Tune AI workflows for performance, cost, latency, and observability across billions of documents and multiple global regions
- Mentor engineers on the team, raise the technical bar, and contribute to architecture reviews and design decisions
Technologies You’ll Work With
Azure OpenAI Service, Kubernetes, Cosmos DB, Blob Storage, Event Bus, Model Context Protocol (MCP), OAuth 2.0, OIDC, Azure AD / Entra ID
What You’ll Bring
- Proven experience integrating AI/ML services into production backend systems, including APIs, async pipelines, and cloud infrastructure, treating models and inference endpoints as components in a larger service architecture
- Hands-on experience shipping LLM-powered capabilities end to end, such as embeddings pipelines, RAG, summarization, or LLM-powered assistants, with a strong understanding of failure modes
- Experience designing and operating multi-step agentic workflows with real tool use, including strategies for reliability, observability, and recovery
- Working knowledge of Model Context Protocol (MCP), including building MCP servers, designing tool exposure contracts, or integrating MCP into agent workflows
- Experience with prompt engineering and evaluation including structured outputs, hall