AI QA Trainer - LLM Evaluation - Freelance Project
Role Overview
Are you an AI QA expert eager to shape the future of AI? Large language models are evolving from clever chatbots into enterprise-grade platforms. With rigorous evaluation data, tomorrow’s AI can democratize world-class education, keep pace with cutting-edge research, and streamline workflows for teams everywhere. That quality begins with you: we need your expertise to harden model reasoning and reliability.
Responsibilities
- Challenge advanced language models on tasks like hallucination detection, factual consistency, prompt-injection and jailbreak resistance, bias/fairness audits, chain-of-reasoning reliability, tool-use correctness, retrieval-augmentation fidelity, and end-to-end workflow validation
- Document every failure mode so we can raise the bar
- Converse with the model on real-world scenarios and evaluation prompts
- Verify factual accuracy and logical soundness
- Design and run test plans and regression suites
- Build clear rubrics and pass/fail criteria
- Capture reproducible error traces with root-cause hypotheses
- Suggest improvements to prompt engineering, guardrails, and evaluation metrics (e.g., precision/recall, faithfulness, toxicity, and latency SLOs)
- Partner on adversarial red-teaming, automation (Python/SQL), and dashboarding to track quality deltas over time
Qualifications
- Bachelor’s, master’s, or PhD in computer science, data science, computational linguistics, statistics, or a related field
- Experience shipping QA for ML/AI systems
- Safety/red-team experience
- Experience with test automation frameworks (e.g., pytest)
- Hands-on work with LLM eval tooling (e.g., OpenAI Evals, RAG evaluators, W&B)
- Skills: evaluation rubric design, adversarial testing/red-teaming, regression testing at scale, bias/fairness auditing, grounding verification, prompt and system-prompt engineering, test automation (Python/SQL), and high-signal bug reporting
- Clear, metacognitive communication—'showing your work'—is essential
Compensation & Details
Pay range: $6 to $65 per hour, with the exact rate determined after evaluating your experience, expertise, and geographic location. Final offer amounts may vary from the pay range listed above. As a contractor, you’ll supply a secure computer and high-speed internet; company-sponsored benefits such as health insurance and PTO do not apply.