- Company Name
- Optomi
- Job Title
- Site Reliability Engineer
- Job Description
Job title: Site Reliability Engineer
Role Summary:
Responsible for enhancing the reliability, scalability, and performance of customer‑facing applications. Owns the design, deployment, and operation of distributed systems on Azure, using infrastructure as code (IaC), container orchestration, CI/CD pipelines, and observability tools to achieve high availability and operational excellence.
Expectations:
- Deliver SRE practices on a 6‑month contract‑to‑hire basis, working a hybrid schedule with 2 days onsite.
- Collaborate with development, cloud engineering, and platform teams to ensure seamless service delivery.
- Own the end‑to‑end architecture of Azure resources, CI/CD pipelines, and monitoring frameworks.
Key Responsibilities:
1. Design, implement, and maintain Azure infrastructure (AKS, Service Bus, Event Hub, Azure SQL, Function Apps, App Services).
2. Build and evolve reusable services, automation scripts, and platform frameworks to improve stability and developer velocity.
3. Provision and manage infrastructure with Terraform (IaC); keep all configurations under version control.
4. Containerize applications using Docker; deploy and manage Kubernetes workloads on AKS, including scaling, upgrades, and reliability enhancements.
5. Collaborate with .NET developers; optimize CI/CD pipelines in Azure DevOps (ADO).
6. Design and enforce observability: set up monitoring, alerting, and logging using Splunk Observability Cloud (or equivalent).
7. Define SLOs, SLIs, and error budgets; implement auto‑scaling, failover, and disaster‑recovery strategies.
8. Investigate incidents, perform root‑cause analysis, and implement long‑term fixes to reduce MTTR.
9. Tune application, database, and cloud service performance; drive improvements in uptime, latency, throughput, and cost efficiency.
Required Skills:
- 3–5+ years of SRE experience focused on cloud, observability, and automation.
- Proficiency with Azure services: AKS, Service Bus, Event Hub, Azure SQL, Function Apps, App Services.
- Terraform for IaC.
- Docker and Kubernetes container deployment.
- Azure DevOps (CI/CD pipeline configuration).
- Monitoring/observability tools (Splunk Observability Cloud preferred).
- Understanding of .NET ecosystem and development fundamentals.
Required Education & Certifications:
- Bachelor’s degree in Computer Science, Engineering, or a related technical field (equivalent experience accepted).
- Master’s degree preferred.
- No mandatory certifications required.