Innodatainc

Events & Community Growth Intern

data · full-time · Remote - Washington
SALARY: Not listed
WORK TYPE: remote
JOB TYPE: full-time
INDUSTRY: ai

About the role

Scope of the Role:

We are looking for a curious and driven Data Engineering Intern to join our Data & AI team. You will primarily focus on building and maintaining robust data pipelines and infrastructure, while also contributing to applied AI projects involving Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems.

This is a hands-on role. You will work alongside senior engineers and data scientists and contribute to production-grade systems.

The role is roughly 65% Data Engineering and 35% Data Science / Applied AI.

What You’ll Own:

Data Engineering

  • Design, build, and maintain scalable ETL/ELT data pipelines using tools like Apache Airflow, dbt, or Spark
  • Work with structured and unstructured data from various sources — APIs, databases, event streams
  • Write optimized SQL queries and data transformation logic for analytical and ML use cases
  • Maintain and improve data quality, schema management, and pipeline monitoring
  • Collaborate on data warehouse and data lake architecture (e.g., Snowflake, BigQuery, Delta Lake)
  • Document data flows, lineage, and schema definitions

Data Science & Applied AI

  • Build and evaluate RAG pipelines — chunking, embedding, indexing, and retrieval
  • Work with vector databases (e.g., Pinecone, Weaviate, pgvector) for semantic search
  • Integrate LLM APIs (OpenAI, Anthropic, open-source models) into data products or internal tools
  • Help with prompt engineering, evaluation frameworks, and fine-tuning experiments
  • Support exploratory data analysis and feature engineering for ML workflows

You’ll Thrive in This Role If You Have:

  • Current enrollment in a degree program in Computer Science, Data Science, Engineering, or a related field
  • Solid foundation in Python — comfortable writing clean, modular, production-quality code
  • Hands-on experience with SQL (query optimization, CTEs, window functions)
  • Familiarity with at least one cloud platform — AWS, GCP, or Azure
  • Understanding of data pipeline concepts: batch vs streaming, orchestration, idempotency
  • Strong analytical mindset with attention to data quality and correctness
  • Experience with workflow orchestrators: Apache Airflow, Prefect, or Dagster
  • Exposure to dbt for data transformation and testing

The expected hourly rate for this position is $20/hour.
