Vice President – Site Reliability Engineering, Data Centers

Remote · USA Full-time

Job Description:

  • Oversee a specialized SRE team focused on the design, deployment, and maintenance of automation toolsets as well as the systems they interact with.
  • Establish and enforce infrastructure-as-code (IaC) standards to ensure consistent, repeatable, and secure deployments across the entire infrastructure ecosystem. Strong proficiency in Terraform is required.
  • Lead the strategy for automated configuration and state management, ensuring Ansible playbooks and Packer image pipelines are optimized for Windows, Linux, and ESXi platforms.
  • Manage the monitoring and health of the automation platforms themselves. Implement SLIs/SLOs to ensure the 'tools that build the servers' are highly available and performant.
  • Drive the automated lifecycle of both physical and virtual assets, from initial template creation/deployment to automated patching, scaling, and decommissioning.
  • Lead the development of custom scripts and internal providers (Python, Go, PowerShell, Bash) to provide better insights and tooling for our systems.
  • Collaborate with the rest of the Datacenter team beyond the automation group, building shared workflows and supporting the needs of the team as a whole.
  • Analyze system behavior and resource utilization in virtual environments to optimize the performance of automated deployments.
  • Provide technical guidance and career mentorship to SREs, fostering a culture of 'automate-first' and continuous improvement.

Requirements:

  • 6-10 years’ experience in Infrastructure, SRE or DevOps, specifically focused on infrastructure automation at scale.
  • Deep proficiency with Terraform (providers, modules, state management) and Ansible (roles, playbooks, Tower/AWX).
  • Hands-on experience with image creation (e.g. Packer, Ansible, SCCM) to build standardized, hardened images for both Windows and Linux in hybrid environments.
  • Strong experience managing and automating virtual platforms such as VMware (vSphere/vCenter) as well as cloud providers such as Azure and AWS.
  • High-level scripting skills in languages such as Python, Go, PowerShell, and Bash.
  • Experience with observability tools (Splunk, ELK, Prometheus, or Grafana) to monitor infrastructure health and automation telemetry.
  • Good understanding of network topology and design, as well as experience with platforms such as Juniper Networks or Palo Alto.
  • Strong mastery of Git (branching strategies, PR workflows) and CI/CD platforms (Jenkins, GitLab CI, or GitHub Actions).
  • Equal comfort managing, troubleshooting, and tuning performance for both Windows Server and Linux.

Benefits:

  • Flexible work arrangements
  • Professional development opportunities

