Softcom Systems Inc

Reliability Engineer

On-site

Plano, United States

Freelance

25-02-2026


Skills

Python, Java, Bash, Incident Response, DevOps, Docker, Kubernetes, Monitoring, Scripting and Automation, Agile methodologies, Architecture, Azure, AWS, cloud platforms, Agile, GCP, Microservices

Job Specifications

RELIABILITY ENGINEER – PLANO, TX

Provide consulting services to design, implement, and manage robust monitoring and alerting systems to proactively identify issues and execute timely incident response.

3+ years’ experience working with cloud platforms and services (AWS, Azure, GCP, etc.) in a production environment.

Solid understanding of monitoring and logging tools, such as Datadog and CloudWatch.

Solid knowledge of containerization technologies (Docker, Kubernetes) and microservices architecture.

Familiarity with DevOps practices and Agile methodologies.

Strong scripting and automation skills (e.g., Python, Bash) to facilitate operational tasks.

OOP experience and the ability to perform code reviews (Java).

Strong analytical and troubleshooting skills in complex environments.

Understanding of observability concepts and monitoring experience.

About the Company

Softcom Systems Inc is headquartered in Princeton, NJ, and has served the US market for the last 20 years. Apart from the United States, we also serve the global market from our Canada, London and India offices. Our service offerings include:

* Application Development, Maintenance and Support
* Digital Consulting - Social, Mobile, Big Data, Cloud, Agile
* Business Intelligence and Analytics
* Digital Testing, which includes Independent, Managed & Specialized testing services
* Enterprise solutions - SAP
* CRM solutions - Salesforce S...