Senior Data Engineer
About the role
Your Role:
We are seeking a Senior Data Platform Engineer to design and develop the data management layer for our platform and keep it scalable as we expand to larger customers and new jurisdictions. At Alpaca, data engineering encompasses financial transactions, customer data, API logs, system metrics, augmented data, and third-party systems that impact decision-making for both internal and external users. We process hundreds of millions of events daily, and that number grows as we onboard new customers.
We prioritize open-source solutions in our data management approach, leveraging a Google Cloud Platform (GCP) foundation for our data infrastructure. This includes batch/stream ingestion, transformation, and consumption layers for BI, internal use, and external third-party sinks. Additionally, we oversee data experimentation, cataloging, and monitoring and alerting systems.
Our team is 100% distributed and remote.
Responsibilities:
- Design and oversee key forward- and reverse-ETL patterns to deliver data to relevant stakeholders.
- Develop scalable patterns in the transformation layer to ensure repeatable integrations with BI tools across various business verticals.
- Expand and maintain the constantly evolving components of the Alpaca Data Lakehouse architecture.
- Collaborate closely with sales, marketing, product, and operations teams to address key data flow needs.
- Operate the data platform in production and resolve production issues promptly.
Must-Haves:
- 7+ years of experience in data engineering, including 2+ years of building scalable, low-latency data platforms capable of handling >100M events/day.
- Proficiency in at least one programming language, with strong working knowledge of Python and SQL.
- Experience with cloud-native technologies like Docker, Kubernetes, and Helm.
- Strong hands-on experience with relational database systems and with open table formats on object storage, such as Apache Iceberg.
- Strong hands-on experience with Google Cloud Platform.