Innodata Inc.

AI/ML Research Engineer, LLM Post-Training & Evaluation

Engineering | Full-time | Remote - United States

SALARY

Not listed

WORK TYPE

remote

JOB TYPE

full-time

About the role

Scope of the Role:

Innodata is expanding its team of technical experts in LLM training, post-training, and evaluation systems. As an AI/ML Research Engineer, LLM Post-Training & Evaluation, you will build and optimize the technical foundations that power model improvement for foundation model builders and leading labs.

This role is ideal for someone who has hands-on experience fine-tuning and evaluating large language models (and ideally multimodal models), and who can bridge research and engineering in real-world customer environments. You will work closely with Language Data Scientists, Applied Research Scientists, data engineers, and client technical stakeholders to design and implement robust training/evaluation pipelines using both human-in-the-loop and AI-augmented methods.

The ideal candidate brings a strong computer science / machine learning engineering background, experience with modern LLM post-training workflows, and the ability to engage credibly with technical counterparts at leading AI organizations.

What You’ll Own:

As an AI/ML Research Engineer, LLM Post-Training & Evaluation, you will design and implement the pipelines and tooling that connect data, evaluation, and post-training. You will help customers and internal teams move from evaluation findings to measurable model improvements.

Your work may include building fine-tuning workflows (e.g., supervised fine-tuning and preference-based optimization), integrating evaluation harnesses into model development loops, improving experiment reliability and throughput, and supporting advanced evaluation scenarios such as long-context, cross-modal, and dynamic multi-turn interactions.

You will also contribute to Innodata’s internal R&D efforts, including benchmark datasets, evaluation frameworks, and reusable infrastructure for model assessment and post-training experimentation.

Additional responsibilities include, but are not limited to:

Lead or co-lead technically complex ML engineering projects from initial customer discussions through implementation and delivery
Design, build, and improve LLM training and post-training pipelines, including data ingestion, preprocessing, fine-tuning, evaluation, and experiment tracking
Implement and optimize evaluation systems for LLMs and multimodal models, including offline benchmarks and task-specific test harnesses
Integrate human-in-the-loop and AI-augmented evaluation signals into model development workflows
Build robust infrastructure and tooling for reproducible experimentation, metrics logging, and regression monitoring
Diagnose model behavior and pipeline failures, including data issues, training instability, metric inconsistencies, and evaluation drift
Collaborate with Language Data Scientists and Applied Research Scientists to translate evaluation frameworks into executable systems
Work closely with customer technical stakeholders to understand goals, constraints, and success criteria; propose and implement technically sound solutions
Contribute to internal research and platform development, including benchmark frameworks, evaluation tooling, and post-training workflow improvements
