Senior Infrastructure Engineer
About the company:
The mining industry has steadily become worse at finding new ore deposits, requiring >10X more capital to make discoveries compared to 30 years ago. The easy-to-find, near-surface deposits have largely been found, and the industry has chronically under-invested in new exploration technology, relying on the manual techniques of yesteryear – even as demand accelerates for copper, lithium, and other metals to build electric vehicles, renewable energy, and data centers.
KoBold builds AI models for mineral exploration and deploys those models, alongside our novel sensors, to guide decisions on KoBold-owned-and-operated exploration programs. In the six years since its founding, KoBold has become by far the largest independent mineral exploration company and the largest exploration technology developer. Our data scientists and software engineers, who come from leading technology companies, jointly lead exploration programs with our renowned exploration geologists.
KoBold has proven its first discovery with materially less capital than the industry average and found one of the best copper deposits ever discovered: the copper is far more concentrated than the global average of copper mines, and this asset alone is expected to generate meaningful revenue for decades. KoBold has a portfolio of more than 60 other projects, each of which has the potential for another high-quality discovery.
KoBold is privately held; investors include institutional asset managers T. Rowe Price and Canada Pension Plan Investments; technology venture capitalists Andreessen Horowitz, Breakthrough Energy Ventures, BOND Capital, and Standard Investments; and natural resources companies Equinor, BHP, and Mitsubishi.
About the role:
In this role, you will partner with exploration and engineering teams to build reliable, scalable infrastructure that makes it easier to turn data and models into real-world exploration insights. You will improve observability, streamline MLOps workflows, and maintain shared tools like JupyterHub that enable faster experimentation and collaboration. Your work will help create a solid foundation for scientists and engineers to focus on discovery instead of infrastructure.
Responsibilities
- Design, build, and operate compute infrastructure that is both scalable and reliable to support critical services.
- Work closely with engineering teams to embed observability, reliability, and security throughout the software development process.
- Create and maintain automation for monitoring, deployments, and incident response to keep operations efficient and predictable.
- Lead or support capacity planning, performance reviews, and system tuning to ensure stable and efficient systems.
- Join the on-call rotation and take part in incident response, troubleshooting, and resolution.
- Develop and refine monitoring and alerting to catch issues early and reduce downtime.
- Establish and maintain disaster recovery and business continuity practices that protect the organization against failures.
- Regularly review and improve our tools and processes to strengthen system visibility and reliability.
- Investigate points of fragility in distributed systems and understand how complex systems behave under stress in order to improve resilience.
- Continually learn about mineral exploration through reading, discussions with exploration team members, periodic rotation on an exploration team and time in the field with geo