**Company Name:** JPS Tech Solutions
**Job Title:** Senior DevOps/SRE Engineer
**Job Description:**
**Role Summary:**
Lead the design, build, and operation of highly available, secure, and cost‑efficient AWS and Kubernetes environments. Drive SRE principles (SLOs, error budgeting), secure CI/CD pipelines, observability, and self‑service automation to enable rapid, reliable delivery for mission‑critical applications.
**Expectations:**
- 5+ years in DevOps, SRE, or Platform Engineering with proven leadership in production automation.
- 3+ years managing high‑availability AWS production workloads.
- Senior‑level, hands‑on experience with Kubernetes, Docker, and Linux system administration.
**Key Responsibilities:**
- Define and maintain SLOs/SLIs, manage error budgets, and lead incident response and blameless post‑mortems.
- Architect secure, scalable IaC using Terraform, Ansible, and CloudFormation for repeatable deployments.
- Design CI/CD pipelines (GitHub Actions, Jenkins) with automated rollbacks, canary releases, and blue‑green deployments.
- Build and manage full‑stack observability pipelines (Prometheus, Grafana, Datadog, or ELK), dashboards, and alerting.
- Enforce security‑as‑code: integrate SAST/DAST, secrets scanning, and SBOM validation into the SDLC.
- Monitor cloud spend, implement right‑sizing, and drive FinOps initiatives.
- Develop shared playbooks, reusable automation modules, and self‑service tools to accelerate developer velocity.
- Mentor cross‑functional teams and establish organization‑wide best practices for fault tolerance and operational readiness.
**Required Skills:**
- Cloud: AWS (IAM, networking, compute services such as EC2, RDS, EKS).
- Containerization: Kubernetes, Docker, Helm.
- IaC: Terraform, CloudFormation, Ansible.
- CI/CD: GitHub Actions, Jenkins, Bitbucket Pipelines, Argo CD.
- Observability: Prometheus, Grafana, Datadog, ELK stack.
- Security: SAST/DAST tools, secrets scanning, SBOM, Amazon GuardDuty, AWS Security Hub.
- Scripting: Python, Go, or Bash.
- Operating Systems: Linux administration (Ubuntu, CentOS).
- Operational Practices: Chaos engineering, capacity modeling, incident management, blameless post‑mortems.
- Communication: Clear documentation of architecture and playbooks; effective collaboration with engineering teams.
**Required Education & Certifications:**
- Bachelor’s degree in Computer Science, Engineering, or related technical field.
- Relevant industry certifications (e.g., AWS Certified Solutions Architect, Certified Kubernetes Administrator, or equivalent) strongly preferred but not mandatory.
District of Columbia, United States
On‑site
Senior
17-01-2026