Sayari

Data Engineer (Remote, US)

Engineering | Full-time | Remote - US

SALARY

$90k – $120k/yr

WORK TYPE

remote

JOB TYPE

full-time

About the role

POSITION DESCRIPTION

As a Data Engineer at Sayari, you will be the engine behind the world's most comprehensive commercial world model. You will join a high-autonomy team responsible for building and scaling the complex orchestration systems that transform billions of primary-source records into actionable intelligence. This is a role for a "builder" who respects the complexity of large-scale ETL and graph databases and is "PhD-curious" about the future of AI-native data products and modern orchestration.

JOB RESPONSIBILITIES

Design, build, and maintain scalable data pipelines using Python, Spark, and Airflow to support our core data acquisition and entity resolution engines.
Collaborate cross-functionally with AI/ML and Product teams to implement new features and AI-native products.
Proactively identify and resolve bottlenecks in our complex ETL processes, bringing a fresh perspective to refine and optimize our existing codebase.
Contribute to a robust engineering culture through rigorous code reviews, unit testing, and clear communication of design decisions.
Own the end-to-end delivery of roadmap tasks within two-week sprints, ensuring work meets high standards for quality, documentation, and performance.
Participate in roadmap planning and story refinement, eventually taking ownership of major epics that drive our long-term product defensibility.

SKILLS & EXPERIENCE

Required

Professional proficiency in Python and experience contributing to shared codebases using Git (branching, PRs, code reviews).
Demonstrated experience working with relational databases (PostgreSQL/BigQuery) and an interest in or familiarity with graph databases.
Familiarity with distributed computing (Spark) or a strong desire to master it.
Strong collaborative skills and the ability to work effectively in an Agile, sprint-based environment.
A "self-directed" orientation: ability to move tasks from "assigned" to "complete" with high autonomy and clear communication.

Preferred

Experience with Django, Scala, or Scrapy.
Hands-on experience with workflow orchestration tools like Airflow.
Experience or strong interest in LLM tuning, deployment, and AI engineering best practices.
Experience working with international or non-English datasets.
Prior experience working with high-scale, complex data pipelines.

Benefits

100% fully paid medical, vision, and dental for employees and their dependents
