- Company Name
- Iris Software Inc.
- Job Title
- SRE - Technical Lead
- Job Description
-
Job Title: SRE – Technical Lead
Role Summary: Lead site reliability engineering initiatives for a Fortune 100 client, focusing on Azure cloud operations, Kubernetes (AKS) environments, and production reliability, while driving continuous improvement of monitoring, incident response, and automation practices.
Expectations:
- Design, implement, and maintain highly available and scalable cloud infrastructure on Azure.
- Ensure operational excellence and reliability of production services, proactively addressing incidents and planning capacity.
- Develop and enforce SRE processes and best practices across teams.
- Mentor engineering staff on SRE methodologies and tooling.
Key Responsibilities:
- Architect and manage Azure resource groups, AKS clusters, and associated networking/security components.
- Configure and maintain the observability stack (e.g., Dynatrace, Prometheus, Grafana) to provide real‑time monitoring and alerting.
- Lead incident management, including post‑mortem analysis and root‑cause investigations.
- Automate deployment pipelines using Terraform and Azure DevOps (or equivalent), and manage configuration with Ansible.
- Write Python scripts to support operational tasks.
- Track and improve reliability metrics (MTTR, MTBF, SLA adherence) and report trends to stakeholders.
- Collaborate with development teams to integrate reliability into the CI/CD lifecycle.
Required Skills:
- 5+ years of experience in cloud operations, with deep knowledge of Azure services and Azure Kubernetes Service (AKS).
- Proven SRE experience: monitoring, incident response, reliability engineering, capacity planning.
- Expertise with observability tools (Dynatrace, Prometheus, Grafana, or similar).
- Strong scripting in Python; familiarity with Terraform, Ansible, and CI/CD pipelines.
- Excellent analytical, problem‑solving, and communication skills.
- Experience in banking or payments systems is an asset, but not mandatory.
Required Education & Certifications:
- Bachelor’s degree in Computer Science or Engineering, or equivalent practical experience.
- Relevant Azure certifications (e.g., AZ-104, AZ-500, or AZ-400) strongly preferred.