- Company Name
- FastTek Global
- Job Title
- Cloud Operations Specialist
- Job Description
**Job Title**
Cloud Operations Specialist
**Role Summary**
Specialist responsible for designing, deploying, and operating cloud‑native data messaging services on GCP Pub/Sub and Confluent Kafka. Focuses on infrastructure as code, site reliability engineering, automation, CI/CD, monitoring, incident response, and disaster recovery to ensure highly available, secure, and compliant messaging pipelines.
**Expectations**
- 5+ years in IT operations with DevOps/Automation focus.
- Minimum 2 years hands‑on experience with GCP Pub/Sub, Confluent Kafka, and other cloud data services.
- 2+ years of Terraform IaC and 2+ years of scripting (Python, PowerShell, or Bash).
- 2+ years CI/CD pipeline experience (Jenkins, Cloud Build, Tekton).
- Strong grasp of SRE principles, IAM, and authentication and directory technologies (OAuth2, AD, LDAP, ADFS, SSL/TLS).
- Willingness to be on call during weekends/after hours.
- Detail‑oriented and collaborative, with excellent communication and problem‑solving skills.
**Key Responsibilities**
- Create, maintain, and document Terraform modules for Pub/Sub topics/subscriptions, Kafka clusters, and related networking.
- Automate deployment, scaling, and management of messaging services across public and private clouds.
- Develop and refine CI/CD pipelines, quality gates, and automated testing for messaging applications.
- Implement SRE practices: monitoring, alerting, incident response, post‑mortems, and capacity planning.
- Design and validate disaster‑recovery and backup strategies; conduct periodic failover tests.
- Ensure security, compliance, and efficient configuration of cloud resources.
- Collaborate with DevOps, application teams, and vendors to streamline integration and improve developer experience.
- Identify and advocate for new data streaming technologies and patterns to meet evolving needs.
**Required Skills**
- Cloud platforms: GCP (Pub/Sub, Cloud Run, Dataflow), Confluent Cloud (Kafka).
- IaC: Terraform (modules, state management).
- Automation & scripting: Python, PowerShell, or Bash.
- CI/CD: Jenkins, Google Cloud Build, Red Hat Tekton.
- Monitoring & observability: Grafana, Dynatrace.
- Identity & access management: IAM, OAuth2, AD/LDAP, ADFS, SSL/TLS.
- Networking fundamentals: NAT, firewalls, routing, load balancing.
- SRE principles: reliability, performance, observability, incident management.
- Strong verbal & written communication, analytical reasoning, teamwork.
**Required Education & Certifications**
- Bachelor’s degree in Computer Science, Information Technology, or related field.
- Certifications: None required (GCP or Terraform certifications considered a plus).