Infrastructure Operations Engineer
About the role
What We're Looking For
Lightning AI is seeking an experienced Infrastructure Operations Engineer to help scale and operate our next-generation AI infrastructure platform. Our InfraOps team sits at the center of reliability, automation, and operational scale for GPU infrastructure. This team owns break/fix operations, incident response, customer provisioning, observability, and the automation systems that keep complex infrastructure running efficiently.
In this role, you’ll work hands-on with large-scale GPU environments, Linux systems, bare metal infrastructure, provisioning workflows, and platform reliability. You’ll partner closely with Infrastructure Engineering, Network Operations, and Software Platform teams to troubleshoot issues, improve operational efficiency, and build automation that reduces manual toil over time.
We’re flexible on location for this team. This role can work hybrid out of one of our US-based hubs (Seattle, NYC, or SF) or fully remote within the U.S., with occasional c