- Company Name: Blackpoint Cyber
- Job Title: Director of SRE
- Job Description:
**Job Title:** Director of Site Reliability Engineering (SRE)
**Role Summary:** Lead and scale the organization’s cloud infrastructure, ensuring high availability, reliability, and cost efficiency for mission-critical services. Drive SRE best practices, automation, and financial optimization while mentoring a high-performing SRE team and collaborating cross-functionally with engineering, security, and product groups.
**Expectations:**
- 10+ years in SRE/DevOps or cloud infrastructure roles.
- 5+ years managing SRE or DevOps teams.
- Proven track record of designing, deploying, and operating highly available cloud environments (AWS, Azure, or GCP).
- Demonstrated success in cloud cost optimization and financial stewardship.
- Strong leadership, communication, and problem‑solving skills.
**Key Responsibilities:**
- Design, implement, and manage scalable, highly available cloud infrastructure.
- Establish and enforce SRE practices: monitoring, incident response, capacity planning, performance tuning, and reliability metrics (SLA/SLO/SLI).
- Drive automation via IaC (Terraform, CloudFormation, Pulumi) and CI/CD pipelines.
- Lead a team of SREs, applying coaching and accountability frameworks.
- Optimize cloud spend by right‑sizing resources, using spot or similar discounted, interruptible instances, and applying other cost‑saving strategies.
- Collaborate with engineering to embed reliability into development workflows.
- Work with security and compliance teams to maintain security hygiene and disaster‑recovery readiness.
- Conduct post‑incident reviews and continuous improvement cycles.
**Required Skills:**
- Deep expertise in AWS, Azure, or GCP, including cost‑management and scaling strategies.
- Proficiency with IaC tools (Terraform, CloudFormation, Pulumi).
- Hands‑on experience with CI/CD and container orchestration (Kubernetes).
- Advanced monitoring/observability skills (Prometheus, Grafana, Datadog, Splunk).
- Strong leadership, mentoring, and stakeholder engagement.
- Familiarity with SLA/SLO/SLI frameworks.
- Ability to improve engineering efficiency through self‑service tooling.
**Required Education & Certifications:**
- Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
- Certifications such as AWS Certified Solutions Architect, Google Cloud Professional Cloud Architect, or equivalent are preferred.