- Company Name
- SimCorp
- Job Title
- Lead Site Reliability Engineer
- Job Description
**Job Title:** Lead Site Reliability Engineer
**Role Summary:**
Lead the design, implementation, and operation of reliability, scalability, and performance strategies for cloud-based products. Drive end‑to‑end SRE practices across product teams, automate infrastructure, and manage incident response while mentoring junior engineers.
**Expectations:**
- At least 5–8 years of experience in SRE or cloud infrastructure leadership, preferably with exposure to financial technology.
- Deep experience with Microsoft Azure, Azure DevOps, Kubernetes, Docker, and CI/CD pipelines.
- Proven track record of building monitoring, alerting, and self‑healing systems, including machine‑learning‑based anomaly detection.
- Strong incident‑management and root‑cause‑analysis skills following ITIL principles.
- Ability to lead technical teams, mentor engineers, and influence architecture decisions.
**Key Responsibilities:**
- Architect and deploy SRE solutions (monitoring, alerting, anomaly detection, self‑healing, reliability testing).
- Design capacity‑planning, resource‑management, and automation initiatives for systems and pipelines.
- Collaborate with product development to optimize application performance and infrastructure.
- Manage on‑call rotations, lead incident response, troubleshoot outages, and document procedures.
- Promote continuous improvement and best practices in change management, observability, and operational excellence.
- Mentor junior SREs and cultivate a culture of knowledge sharing.
**Required Skills:**
- Microsoft Azure production‑grade design and operations.
- Infrastructure‑as‑Code: Bicep, Terraform, ARM templates, Ansible.
- Monitoring and incident‑management frameworks, with experience in tools such as Azure Monitor, Prometheus, and Grafana.
- Kubernetes, Docker, CI/CD, API integration.
- Scripting (PowerShell, Bash, Python) and SQL; familiarity with Cosmos DB, MongoDB Atlas.
- ITIL knowledge (problem, change, incident management).
- Strong communication, collaboration, and leadership abilities.
**Required Education & Certifications:**
- Bachelor’s or Master’s degree in Computer Science or a related field.
- Certifications such as Microsoft Certified: Azure Solutions Architect Expert, Azure DevOps Engineer Expert, or equivalent are preferred.