**Company Name**
Fivetran
**Job Title**
Senior Staff Site Reliability Engineer
**Role Summary**
Design, operate, and continuously improve the reliability, scalability, and security of Fivetran’s cloud-native data platform. Own end‑to‑end infrastructure health, orchestrate automated deployments, and lead incident response and preventive measures in a fast‑moving data‑pipeline environment.
**Expectations**
* Demonstrate deep expertise in managed Kubernetes and multi‑cloud operations.
* Drive 100% availability and performance targets by proactively eliminating the causes of outages.
* Mobilize cross‑functional teams (engineering, product, support, sales, security) to prioritize and resolve critical issues swiftly.
* Mentor junior SREs, share best practices, and influence platform architecture decisions.
**Key Responsibilities**
* Monitor infrastructure for availability, capacity, and throughput; identify and remediate bottlenecks.
* Define, build, and maintain CI/CD and infrastructure-as-code pipelines (ArgoCD, Terraform, Pulumi, Buildkite) for rapid, reliable deployments across GCP, AWS, and Azure.
* Implement and enforce security hardening, vulnerability scanning, and remediation in collaboration with security teams.
* Lead root‑cause analysis and post‑mortem cycles for outages; translate findings into actionable improvements.
* Coordinate urgent bug fixes and incident‑driven changes with support and sales teams.
* Own infrastructure scalability, cost optimization, and capacity planning.
**Required Skills**
* Managed Kubernetes (EKS, AKS, GKE) – design, deployment, troubleshooting, scaling.
* Cloud platforms (AWS, Azure, GCP) and their networking services (VPC, VPN, PrivateLink, Private Service Connect).
* IaC & automation: Terraform, Pulumi, Ansible, Buildkite, ArgoCD.
* Scripting & programming: Python, Go (preferred), Java, Bash/Shell.
* Linux system administration – kernel, networking, process management.
* Database experience with PostgreSQL.
* Monitoring & observability tools such as Grafana.
* Strong incident management and root‑cause analysis skills.
**Required Education & Certifications**
* Bachelor’s or higher degree in Computer Science, Engineering, or related field (or equivalent experience).
* Relevant cloud or Kubernetes certifications (CKA, AWS Solutions Architect, Azure Solutions Architect, GCP Professional Cloud Architect) are advantageous but not mandatory.