- Company Name
- Crusoe
- Job Title
- Senior Cloud Support Engineer
- Job Description
-
Job Title: Senior Cloud Support Engineer
Role Summary: Deliver first‑line and advanced technical support for a sustainable GPU cloud platform, ensuring high availability, rapid issue resolution, and customer satisfaction. Act as the primary escalation point, collaborate cross‑functionally with SRE, networking, and storage teams, and contribute to knowledge base and automation initiatives.
Expectations:
- 5+ years of experience in customer or technical support within cloud, storage, or networking environments.
- Proven ability to manage high‑volume, critical incidents 24/7 and meet SLA targets.
- Excellent command of Linux CLI, Git, and cloud orchestration tools.
- Strong analytical, communication, and customer‑service skills.
Key Responsibilities:
- Provide exceptional technical support to customers via Zendesk; maintain CSAT above 95% and meet SLA metrics.
- Participate in a 24/7 on‑call rotation, ensuring timely incident triage and resolution.
- Diagnose and resolve issues with VMs, hardware, scaling tests, containers (Kubernetes), workload managers (Slurm), infrastructure‑as‑code tooling (Terraform), and monitoring tools (Grafana).
- Manage alert triage, prepare for maintenance windows, and conduct node delivery tests.
- Work closely with SRE, networking, and storage teams from initial triage through root‑cause analysis (RCA) and delivery of RCA reports.
- Develop and maintain onboarding/training materials, knowledge‑base articles, and SOPs for support processes.
- Collaborate with global teams to adhere to ticketing and handoff procedures.
- Contribute to automation scripts and tools that improve support efficiency.
Required Skills:
- Linux command‑line proficiency (bash, ssh, systemd).
- Version control with Git (branching, pull requests).
- Container orchestration (Kubernetes), workload management (Slurm), and infrastructure as code (Terraform).
- Monitoring and alerting (Grafana, Prometheus, internal tools).
- Public cloud fundamentals (AWS, Azure, GCP).
- HPC knowledge: InfiniBand, RDMA, RoCE, SDN.
- Strong problem‑solving, analytical, and troubleshooting abilities.
- Excellent verbal and written communication.
Required Education & Certifications:
- Bachelor’s degree in IT, Computer Science, Engineering, or related field, **or** 4+ years of equivalent technical experience.
- Certifications (optional but preferred): CKA, CKAD, CKS, KCNA, AWS ML‑Specialty, AWS Solutions Architect – Professional, NVIDIA AI Infrastructure, Linux Foundation IT Associate and System Administrator.
---