Veeam Software
Senior Platform Engineer (Cloud Workloads)
Engineering | Full-time | Pune, India
Salary: Not listed
Work type: Remote
Job type: Full-time
Industry: AI
About the Role
We are looking for a Senior Platform Engineer to join the Workload team within the Veeam R&D Department. You will own critical observability infrastructure, drive incident response maturity, and help scale proactive support capabilities as part of the team's operational accountability.
What You’ll Do
- Design, build, and maintain observability pipelines using the Elastic Stack (Elasticsearch, Kibana, Fleet) across Azure and AWS workloads
- Develop and own SLO/SLI dashboards and error budget reporting for BaaS platform services
- Respond to incidents and lead incident response for distributed, multi-tenant cloud workloads; own runbook creation, maintenance, and continuous improvement
- Build and refine proactive support tooling, including pattern analysis, tenant correlation dashboards, and baseline deviation alerting, to reduce reactive support burden
- Manage and maintain Elastic Fleet agent policies, enrollment health, and log streaming pipelines across Azure and AWS worker fleets
- Partner with SRE, R&D, and Proactive Support teams to close observability gaps, including tenant identification workflows and admin portal integrations
Technologies we work with
- Elastic Stack — Elasticsearch, Kibana, Elastic Fleet, KQL, Query DSL
- Azure Kubernetes Service (AKS), Azure Container Apps, VMs
- Azure Security — Entra ID, Managed Identities (user/system assigned), App Registrations, Key Vault
- Infrastructure as Code — Azure Bicep, Terraform, or Pulumi
- CI/CD — Azure DevOps, GitHub Actions
- ITSM tooling — ServiceNow, Salesforce, Jira, Incident.io (for tenant and incident workflows)
What You’ll Bring
- 5+ years of experience in cloud platform engineering, SRE, or infrastructure roles supporting commercial SaaS products
- Deep hands-on experience with the Elastic Stack: building dashboards, writing KQL/Query DSL, and managing Fleet
- Proven experience operating and troubleshooting distributed, multi-tenant workloads on Azure and/or AWS
- Strong understanding of Azure cloud services: AKS, Entra ID, Key Vault, Service Bus, Cosmos DB, Private Endpoints, etc.
- Experience with incident response in production cloud environments, including runbook development and post-incident review
- Experience with IaC tools (Azure Bicep, Terraform) and CI/CD pipelines (Azure DevOps, GitHub Actions)
- Strong scripting