Cantina
Machine Learning Engineer, Core Data
data, full-time, Remote
Salary: Not specified
Work type: Remote
Job type: Full-time
Industry: AI
About the Role
This role is for an ML Engineer focused on data quality who will own the datasets that power speech systems. The work is hands-on with audio and text data: auditing, denoising, filtering, labeling, and building the tooling and models that turn large-scale data into reliable training corpora for TTS and adjacent tasks.
What You'll Do
- Dataset ownership: define dataset specs; audit and curate large-scale audio and text data; close corpus gaps and fix sample-level issues
- Quality instrumentation: build automated quality gates and metrics (SNR, clipping, VAD, WER, SV/LID, safety), surface them in dashboards, and validate them against listening tests
- Classifiers and filters: train lightweight models to tag, score, and filter data
Your work will directly improve model performance, robustness, and cost by driving the model ↔ data ↔ eval flywheel from the data side.