SRE Director / SRE Programs Manager

Black Rock Solutions Corporation
Location: Miami, FL, USA
Published: 6/14/2022
Technology
Full Time

Job Description

Job Summary

We are seeking an experienced SRE Director / SRE Programs Manager to lead and manage enterprise-level SRE initiatives at Verizon. The role involves driving reliability programs, leading cross-functional teams, defining best practices, and ensuring high availability, scalability, and performance across critical production systems.

Key Responsibilities

  • Lead SRE strategy, roadmap, and execution for large-scale distributed systems.
  • Partner with engineering, operations, and product teams to implement reliability and observability best practices.
  • Drive automation, monitoring, incident management, and capacity planning processes.
  • Establish KPIs/SLIs/SLOs to measure and improve system reliability.
  • Manage and mentor global SRE teams, ensuring strong collaboration and delivery.
  • Oversee incident reviews, root cause analysis, and problem resolution.
  • Champion reliability-focused culture and advocate for continuous improvement.
  • Manage stakeholder communication with Verizon leadership and business units.

Required Skills and Experience

  • 12+ years of IT experience, including at least 6 years in Site Reliability Engineering leadership roles.
  • Proven experience managing large-scale SRE or Production Engineering programs.
  • Strong expertise in cloud platforms (AWS, GCP, Azure), container orchestration (Kubernetes, Docker), and CI/CD.
  • Hands-on knowledge of observability tools (Prometheus, Grafana, ELK, Splunk, Datadog, etc.).
  • Solid understanding of automation, scripting (Python, Go, Shell, etc.), and configuration management (Ansible, Terraform, etc.).
  • Excellent leadership, stakeholder management, and program governance skills.
  • Strong problem-solving and decision-making ability under pressure.

Nice-to-Have Skills

  • Experience in the telecom/ISP domain.
  • Certifications: AWS/GCP/Azure Architect, Kubernetes CKA/CKAD, ITIL, PMP.
  • Exposure to AI/ML for reliability automation.