Senior Data Scientist, LLM
About the role
About Xometry
Xometry (NASDAQ: XMTR) powers the industries of today and tomorrow by connecting the people with big ideas to the manufacturers who can bring them to life. Xometry’s digital marketplace gives manufacturers the critical resources they need to grow their businesses while also making it easy for buyers at Fortune 1000 companies to tap into global manufacturing capacity.
Job Description
Xometry is seeking a Senior Data Scientist to join our Generative AI team. The candidate will focus on training and fine-tuning Visual Language Models (VLMs) for multimodal document understanding. The ideal candidate will leverage their expertise in machine learning and computer vision to advance Xometry's capabilities in processing and extracting structured data from complex documents and images. This is a one-year contract position.
Responsibilities
- Develop, fine-tune, and evaluate Visual Language Models (VLMs) to enhance document understanding, focusing on multimodal data such as text, images, and technical drawings.
- Design and implement data preparation, cleaning, and augmentation processes tailored to multimodal model training, ensuring high-quality data pipelines for VLMs.
- Leverage transfer learning and pre-trained models to accelerate model development and optimize performance on Xometry’s specific data.
- Use cloud resources (e.g., Amazon Web Services) to scale training and fine-tuning processes for VLMs efficiently.
- Collaborate with data engineering and machine learning operations (MLOps) teams to deploy VLMs into production and monitor their performance.
- Interpret model outputs and improve model accuracy and robustness by applying data analysis and visualization tools (such as Python, Jupyter Notebooks, and SQL).
- Experiment with and implement state-of-the-art model architectures, continuously optimizing VLM performance in a fast-paced, iterative environment.
- Work within a team-oriented setting, participating in peer reviews, sharing insights, and contributing to an environment of continuous learning and improvement.
Qualifications
- A bachelor’s degree is required; an advanced degree (M.S. or Ph.D.) in computer science, data science, or a related field is preferred.