Job Specifications
Job Title: Sr. DevOps Engineer
Job Location: Chicago, IL (Locals only)
Job Type: Contract
Job Description
As a Site Reliability Engineer (SRE), you will ensure the reliability, scalability, and performance of mission-critical applications and infrastructure. This role requires deep technical expertise, strong coding skills, and close collaboration with development and DevOps teams to enhance observability, automation, and production stability in hybrid environments.
Required Skills & Experience
5+ years in Site Reliability Engineering or related roles.
Strong experience with non-functional testing (NFT) frameworks and performance testing tools.
Hands-on expertise in Red Hat OpenShift and container orchestration.
Solid understanding of Linux systems administration and hybrid cloud integration.
Proficiency in monitoring tools (Prometheus, Grafana, ELK stack) and incident management (a minimal instrumentation sketch follows this list).
Experience with CI/CD and DevOps tooling (Jenkins, GitLab CI).
Advanced coding skills in Java, Python, and one additional language (Go, Scala, or Shell).
Deep knowledge of distributed systems and data platforms (Hadoop ecosystem, Kafka, NiFi).
Excellent problem-solving and communication skills.
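For illustration, a minimal sketch of the instrumentation side of the monitoring work above, using the prometheus_client Python library to expose metrics for Prometheus to scrape. The metric names, port, and simulated workload are assumptions for the example, not details of the actual stack.

```python
import random
import time

from prometheus_client import start_http_server, Counter, Histogram

# Example metrics; names are illustrative placeholders.
REQUESTS = Counter("app_requests_total", "Total requests handled", ["status"])
LATENCY = Histogram("app_request_latency_seconds", "Request latency in seconds")

@LATENCY.time()
def handle_request():
    # Simulated work standing in for real request handling.
    time.sleep(random.uniform(0.01, 0.2))
    REQUESTS.labels(status="200").inc()

if __name__ == "__main__":
    start_http_server(8000)  # serves /metrics on port 8000
    while True:
        handle_request()
```

Pointing a Prometheus scrape job at port 8000 then feeds the counter and latency histogram into Grafana dashboards and alert rules.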
Key Responsibilities
Non-Functional Testing (NFT): Design and execute NFT strategies to validate system performance, resilience, and scalability before production deployments (a load-test sketch follows this list).
Production Deployment Management: Oversee and coordinate production releases, ensuring zero-downtime deployments and rollback strategies.
OpenShift Expertise: Manage and optimize workloads on Red Hat OpenShift Container Platform (OCP), including cluster configuration and troubleshooting.
Hybrid Integration: Integrate on-premises Linux VMs with cloud services, ensuring secure and seamless connectivity in hybrid environments.
Monitoring & Incident Response: Implement advanced monitoring solutions, proactively detect anomalies, and lead incident response and root cause analysis.
Preventive Measures: Develop automation and guardrails to prevent outages and improve system health.
Collaboration: Partner with DevOps and development teams to enhance CI/CD pipelines, improve observability, and embed reliability practices into the SDLC.
Distributed Data Processing: Build and maintain data pipelines using Hadoop, HBase, Kafka, and NiFi for large-scale data processing and streaming (a consumer sketch follows this list).
Programming & Automation: Write robust, maintainable code in Java, Python, and at least one additional programming language (e.g., Go, Scala, or Shell scripting) for automation, tooling, and system integrations.
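For illustration of the NFT responsibility above, a minimal load-test sketch using the Locust Python framework; the endpoints, host, and traffic mix are hypothetical assumptions, not the client's actual services.

```python
from locust import HttpUser, task, between

class CheckoutUser(HttpUser):
    # Simulated think time between tasks, in seconds.
    wait_time = between(1, 3)

    @task(3)
    def browse(self):
        # Hypothetical read-heavy endpoint (3x weight vs. checkout).
        self.client.get("/api/products")

    @task(1)
    def checkout(self):
        # Hypothetical write path exercised under load.
        self.client.post("/api/checkout", json={"cart_id": "demo"})
```

A run such as `locust -f loadtest.py --host https://staging.example.com --users 500 --spawn-rate 50 --headless` would ramp 500 simulated users against a staging host before a production deployment.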
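Similarly, for the distributed data processing responsibility, a minimal Kafka consumer sketch using the confluent-kafka Python client; the broker address, consumer group, and topic name are placeholder assumptions.

```python
from confluent_kafka import Consumer

conf = {
    "bootstrap.servers": "broker1:9092",  # placeholder broker address
    "group.id": "sre-demo",               # placeholder consumer group
    "auto.offset.reset": "earliest",      # start from the oldest offset
}

consumer = Consumer(conf)
consumer.subscribe(["events"])  # placeholder topic

try:
    while True:
        msg = consumer.poll(timeout=1.0)  # block up to 1s for a record
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        # In a real pipeline this record might be routed on to NiFi,
        # HBase, or a stream processor rather than printed.
        print(msg.value().decode("utf-8"))
finally:
    consumer.close()  # commit offsets and leave the group cleanly
```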
Thanks & regards,
Akram Khan
About the Company
Our primary goal is to provide recruitment solutions to our corporate clients. We help businesses and candidates improve their growth prospects through our premium range of services. As a reputable company, we are always here to help you develop effective procedures for selecting the best people from a large pool. Most importantly, we help you narrow the field to qualified candidates and run productive training sessions.