Sayari
Data Engineer (Remote, US)
Engineering, full-time, Remote - US
SALARY
$90k – $120k/yr
WORK TYPE
remote
JOB TYPE
full-time
INDUSTRY
ai
About the role
POSITION DESCRIPTION
As a Data Engineer at Sayari, you will be the engine behind the world's most comprehensive commercial world model. You will join a high-autonomy team responsible for building and scaling the complex orchestration systems that transform billions of primary-source records into actionable intelligence. This is a role for a "builder" who respects the complexity of large-scale ETL and graph databases and is "PhD-curious" about the future of AI-native data products and modern orchestration.
JOB RESPONSIBILITIES
- Design, build, and maintain scalable data pipelines using Python, Spark, and Airflow to support our core data acquisition and entity resolution engines.
- Collaborate cross-functionally with AI/ML and Product teams to implement new features and AI-native products.
- Proactively identify and resolve bottlenecks in our complex ETL processes, bringing a fresh perspective to refine and optimize our existing codebase.
- Contribute to a robust engineering culture through rigorous code reviews, unit testing, and clear communication of design decisions.
- Own the end-to-end delivery of roadmap tasks within two-week sprints, ensuring work meets high standards for quality, documentation, and performance.
- Participate in roadmap planning and story refinement, eventually taking ownership of major epics that drive our long-term product defensibility.
SKILLS & EXPERIENCE
Required
- Professional proficiency in Python and experience contributing to shared codebases using Git (branching, PRs, code reviews).
- Demonstrated experience working with relational databases (PostgreSQL/BigQuery) and an interest in or familiarity with graph databases.
- Familiarity with distributed computing (Spark) or a strong desire to master it.
- Strong collaborative skills and the ability to work effectively in an Agile, sprint-based environment.
- A "self-directed" orientation: ability to move tasks from "assigned" to "complete" with high autonomy and clear communication.
Preferred
- Experience with Django, Scala, or Scrapy.
- Hands-on experience with workflow orchestration tools like Airflow.
- Experience or strong interest in LLM tuning, deployment, and AI engineering best practices.
- Experience working with international or non-English datasets.
- Prior experience working with high-scale, complex data pipelines.
Benefits
- 100% fully paid medical, vision, and dental for employees and their dependents