- Company Name
- Change Digital – Digital & Tech Recruitment
- Job Title
- Cloud Engineering Lead
- Job Description
-
**Job Title:** Cloud Engineering Lead
**Role Summary:**
Lead a high‑performing Cloud Operations team that manages 24×7 AWS production environments across multiple regions. Own availability, security, and operational excellence, driving an automation‑first culture where all maintenance, compliance, and incident response are delivered through code and CI/CD pipelines.
**Expectations:**
- Minimum of 5 years of leadership in Cloud Operations, with proven experience managing production workloads in AWS, ideally across multiple regions.
- Strong background in Infrastructure as Code (Terraform) and CI/CD for infrastructure, including policy‑as‑code, automated testing, and safe rollback strategies.
- Deep knowledge of AWS networking, DNS (Route 53, resolvers), high‑availability design, and disaster recovery.
- Solid security foundation (IAM, KMS, patching, vulnerability management) and hands‑on incident/on‑call experience with blameless post‑mortems.
- Proficiency in scripting (Python or Bash) and a pragmatic approach to cost, capacity, and reliability management.
**Key Responsibilities:**
- Lead and mentor a Cloud Ops team, setting goals and SLOs and fostering a culture of continuous improvement.
- Design and implement resilient, fault‑tolerant architectures (networking, DNS, failover, DR) across multi‑region estates.
- Codify runbooks, policies, and compliance controls; enforce changes via Git‑based CI/CD pipelines with automated testing and approvals.
- Reduce MTTR by implementing comprehensive observability and alerting for all workloads.
- Partner with Security to define and enforce guardrails (IAM, KMS, patching, vulnerability management).
- Automate routine maintenance tasks (patching, certificate rotation, backup/restore validation, EOL upgrades).
- Develop reusable building blocks and standards; collaborate with Architecture, Security, Development, Data, and Operations teams.
**Required Skills:**
- Leadership of Cloud Ops teams in AWS (24×7, multi‑region).
- Terraform, CloudFormation, or equivalent IaC tooling.
- Git‑based workflow, CI/CD for infrastructure (CodePipeline, GitHub Actions, Jenkins).
- AWS networking: VPC, subnets, routing, Transit Gateway/Cloud WAN, Route 53, health checks.
- IAM, KMS, secrets management, patch baselines, vulnerability scanning.
- Incident management, on‑call rotations, and post‑mortems.
- Python or Bash scripting.
- Cost and capacity optimization practices.
**Required Education & Certifications:**
- Bachelor’s degree in Computer Science, Engineering, or equivalent technical experience.
- AWS Certified Solutions Architect – Professional or equivalent cloud certification highly preferred.