Lead Data Scientist
About AppOmni
AppOmni prevents SaaS data breaches by delivering end-to-end SaaS security. Our platform gives security teams clear visibility into posture, access, third-party connections, and AI-related activity, with built-in discovery to identify unsanctioned SaaS and Shadow AI tools. Backed by continuous monitoring and real-time threat detection, AppOmni helps enterprises identify and resolve risks early, keeping their SaaS applications secure.
Recognized as a Frost Radar™ 2025 Leader and Great Place To Work®, AppOmni continues to set the standard for innovation and customer value in SaaS security. The largest and fastest-growing global enterprises across industries trust AppOmni to secure their SaaS applications.
About the Role
AppOmni is looking for a Lead Data Scientist to help define and build scalable, production-grade data pipelines and intelligent analytics capabilities within our SaaS platform.
In this role, you will apply data science, statistical modeling, batch and real-time analytics, and large-scale data engineering to transform complex datasets into actionable product insights and customer-facing capabilities. You will work on pipelines spanning a broad range of technical domains, including ETL, statistical modeling, machine learning (supervised and unsupervised), and LLMs, as well as monitoring, governance, visualization, and production modeling systems.
We are looking for a highly versatile engineer-scientist: someone who has worked across different layers of the modern data stack and enjoys solving a wide variety of technical problems. This role is ideal for someone whose background spans data engineering, infrastructure, analytics applications, statistical modeling, and operational production systems.
You will be responsible for end-to-end data workflows, from ingestion and transformation through analytics implementation, orchestration, monitoring, governance, and production operations. This is a hands-on individual contributor role with technical leadership responsibilities, partnering closely with Product and Engineering to build reliable, scalable, and intelligent data-driven systems.
What You’ll Do
- Design and implement scalable batch and real-time data processing systems across large and complex datasets.
- Build and optimize ETL and streaming data pipelines using modern GCP big data technologies.
- Lead development decisions around model choices, data architecture, data modeling, pipeline orchestration, analytics infrastructure, and production systems.
- Develop statistical models and analytics capabilities that support product intelligence and operational insights.
- Design and maintain production-grade data workflows using technologies such as Airflow, Dataflow, Pub/Sub, and PySpark.
- Contribute across multiple areas of the data ecosystem, including data engineering, monitoring and governance, visualization, and analytics tooling.
- Establish monitoring, observability, and governance practices for data quality, pipeline reliability, and production health.
- Partner closely with Engineering to operationalize scalable data infrastructure and analytics systems.
- Collaborate with Product to shape intelligent, data-driven product capabilities and user experiences.
- Act as a technical leader and thought partner across data engineering, analytics, infrastructure, and applied modeling initiatives.
- Help evolve internal tooling and frameworks that improve scalability, reliability, and operational efficiency across the platform.
What We’re Looking For
- 7–10+ years of experience as a Data Scientist, Applied Scientist, or in a similar role, with a strong emphasis on production systems and data engineering.