Cantina
Machine Learning Engineer, Core Data
data | full-time | Remote
SALARY
Not listed
WORK TYPE
remote
JOB TYPE
full-time
INDUSTRY
ai
About the Role
An ML Engineer focused on data quality who will own the datasets that power speech systems. The work is hands-on with audio and text data: auditing, denoising, filtering, and labeling, plus building the tooling and models that turn large-scale data into reliable training corpora for text-to-speech (TTS) and adjacent tasks.
What You'll Do
- Dataset ownership: define specs; audit and curate large-scale audio/text; close corpus gaps and fix sample-level issues
- Quality instrumentation: build automated gates/metrics (SNR, clipping, VAD, WER, SV/LID, safety) with dashboards; validate against listening tests
- Classifiers and filters: train lightweight models to tag, score, and filter data
Your work will directly improve model performance, robustness, and cost by driving the model ↔ data ↔ eval flywheel from the data side.