- Company Name
- ideaHelix
- Job Title
- Site Reliability Engineer - Burnaby, BC, Canada
- Job Description
**Job Title**
Site Reliability Engineer
**Role Summary**
Design, build, and operate scalable, secure, and highly available infrastructure and application platforms. Leverage IaC, Kubernetes, and automation tools to deliver self‑service DevOps capabilities while ensuring robust monitoring, observability, and rapid issue resolution across Linux environments and virtualized infrastructures.
**Expectations**
- Continuously improve infrastructure automation and reliability processes.
- Participate in the on‑call rotation for critical services.
- Communicate status, incidents, and improvements clearly to technical and non‑technical stakeholders.
**Key Responsibilities**
- Develop and maintain IaC frameworks (Terraform, Ansible) for provisioning OS, applications, and network resources.
- Architect, deploy, and manage global Kubernetes clusters with integrated security controls.
- Administer Linux servers (Ubuntu, RHEL, OEL) in production environments.
- Monitor system, VM, application, and network performance using Zabbix, SNMP, Prometheus, Grafana, and the ELK stack.
- Design and maintain automation scripts/tools (Bash, PowerShell, Python, Go) to enable self‑service operations.
- Troubleshoot issues across security appliances, servers, storage, Kubernetes, and networks.
- Apply security patches at scale across large VM fleets.
- Collaborate with DevOps, security, and networking teams to implement best practices and standards.
**Required Skills**
- Linux administration (RHEL, Ubuntu) – 5+ years.
- Virtualized environments (VMware, LXD, MicroK8s, OpenStack) – 3+ years.
- IaC & automation (Ansible, Terraform) – 3+ years.
- Monitoring & observability (Zabbix, SNMP, Prometheus, Grafana, ELK).
- Networking protocols (TCP/IP, DNS, SMTP, LDAP, HTTPS, SAML, STP, OSPF, BGP).
- CI/CD & DevOps tools (GitLab CI/CD, AWX, Foreman, Docker).
- Scripting (Bash, PowerShell, Python, Go).
- Security standards and patch management.
- Excellent problem‑solving, multi‑tasking, and communication skills.
**Required Education & Certifications**
- Master’s or Bachelor’s degree in Computer Technologies or a related field.
- Relevant certifications (e.g., Red Hat Certified Engineer, AWS Certified DevOps Engineer, Certified Kubernetes Administrator) are a plus.