- Company Name
- FactSet
- Job Title
- Lead Site Reliability Engineer - (DevOps, AWS Focus, PostgreSQL, CI/CD Systems) - Hybrid
- Job Description
**Job Title:** Lead Site Reliability Engineer
**Role Summary:** Lead the architecture and continuous improvement of cloud infrastructure and database systems to ensure scalability, reliability, and security for mission-critical applications.
**Expectations:**
- Architect and optimize AWS/EKS-based infrastructure with PostgreSQL databases for high availability and performance.
- Implement CI/CD pipelines, GitOps workflows, and Kubernetes cluster management to streamline deployment and scalability.
- Establish monitoring, logging, and incident resolution practices aligned with security and compliance standards (SOC 2, GDPR).
- Mentor engineering teams on DevOps/SRE methodologies and foster cross-functional collaboration.
**Key Responsibilities:**
- Design, deploy, and manage scalable, secure cloud environments on AWS (EKS, EC2, S3, RDS, IAM).
- Oversee PostgreSQL database tuning, backup, disaster recovery, and observability.
- Develop and maintain Terraform workflows for infrastructure-as-code with CI/CD integration.
- Build and scale CI/CD pipelines using GitLab CI, GitHub Actions, or equivalent tools.
- Implement GitOps practices (Argo CD, Flux) to automate deployment workflows.
- Deploy and manage production-grade Kubernetes clusters with robust security and scaling policies.
- Establish monitoring solutions (Datadog, Prometheus, Grafana) for system observability.
- Apply security best practices and ensure compliance with regulatory requirements.
**Required Skills:**
- 10+ years in SRE, DevOps, or cloud-native roles.
- Expertise in AWS services (EC2, EKS, S3, RDS, IAM).
- Advanced PostgreSQL administration (tuning, backup, HA).
- Proficiency in Terraform (modular design, CI/CD integration).
- Experience with CI/CD tools (GitLab CI, GitHub Actions).
- Hands-on experience running GitOps workflows (Argo CD, Flux).
- Strong Kubernetes knowledge (deployment, networking, security).
- Familiarity with monitoring/logging stacks (Prometheus, Grafana, ELK).
- Ability to design and communicate a strategic infrastructure roadmap.
- Strong written/verbal communication and team mentoring skills.
**Required Education & Certifications:**
- Bachelor’s degree in computer science or a related field.
- Preferred certifications: AWS Solutions Architect, Kubernetes, Terraform.
- Relevant experience in fintech/SaaS or high-compliance industries.