Stability AI

Senior Site Reliability Engineer

Remote

United States

Senior

Full Time

21-01-2026

Skills

Cloud Security, CI/CD, Kubernetes, Monitoring, Networking, Organization, AWS, Software Development, CI/CD Pipelines, Terraform, Grafana, Infrastructure as Code

Job Specifications

Remote - United States

Job Description:

Stability AI’s Engineering Operations team is looking for a Senior Site Reliability Engineer (SRE) to join our growing team and play a pivotal role in shaping and improving our cloud infrastructure. The successful candidate will work closely with engineering, IT, security, and product teams to drive innovation and reliability in an evolving environment. Candidates should have the initiative to build and improve a maturing cloud landscape.

Responsibilities:

Developing and enforcing SRE best practices and standards across the organization
Architecting and managing scalable systems in AWS and other cloud environments, focusing on high availability and resilience
Implementing and maintaining infrastructure as code using Terraform
Setting up and refining monitoring, logging, and alerting systems
Driving incident management and root cause analysis to improve system reliability
Championing SRE principles and mentoring junior team members

Qualifications:

Experience collaborating with development teams to enhance CI/CD pipelines
Experience scaling resource-intensive systems, whether storage, networking, or compute
Knowledge and experience with Kubernetes or other container scaling solutions
Background in software development or automation scripting
Knowledge and experience with Grafana, ELK stack, or similar tools
Cloud security experience

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

About the Company

Stability AI is the enterprise-ready creative partner for teams and creators, delivering professional-grade generative AI tools and solutions for media generation and editing across image, video, 3D, and audio to enable creative production at scale. Stability AI sparked the generative AI revolution with the release of Stable Diffusion in August 2022, putting generative technology in the hands of millions of creators globally and cementing its position as a leader in the field. Stable Diffusion models have since been download...