- Company Name
- Xplore Inc.
- Job Title
- Platform Engineering
- Job Description
-
**Job Title:** Director of Platform Engineering
**Role Summary:**
Lead the design, implementation, and operation of shared compute, data, and automation platforms that underpin all value streams across the organization. Own the Kubernetes, VM, internal developer platform (IDP), and hybrid cloud infrastructure; orchestrate data pipelines and internal application development; ensure secure, reliable, scalable, and cost‑optimized platforms that enable network engineering, SRE, analytics, and digital teams.
**Expectations:**
- Drive digital transformation through a cohesive platform strategy.
- Deliver high‑availability, resilient, and standardized services that reduce complexity and accelerate delivery.
- Align with IT Platform Engineering on cloud governance, shared tooling, and enterprise data architecture.
- Mentor and grow cross‑functional engineering teams.
- Provide measurable platform performance and cost insights to executive leadership.
**Key Responsibilities:**
- Define and execute platform engineering strategy; develop roadmaps for cloud, Kubernetes, VM clusters, automation, observability, and data platforms.
- Lead high‑performing teams across platform engineering, Site Reliability Engineering, data engineering, infrastructure engineering, and internal application development.
- Own the compute substrate (Kubernetes, VMs, IDPs), ensuring secure, consistent environments for network tooling and enterprise workloads.
- Standardize platform services; reduce delivery complexity; align with enterprise data (bronze → gold) pipelines.
- Lead on‑prem and hybrid cloud infrastructure; integrate on‑prem compute with Azure or other public clouds.
- Build, maintain, and govern data ingestion, transformation, and quality pipelines that deliver gold‑tier datasets.
- Architect internal developer platforms (IDPs): CI/CD, IaC, secrets management, observability, telemetry.
- Embed SRE principles (SLIs, SLOs, error budgets, automated remediation) across all platform services.
- Own incident management, blameless post‑incident reviews, and continuous improvement of platform operations.
- Accelerate DevOps & automation: cloud‑native app development, reusable frameworks, and platform‑level tooling.
- Collaborate with cross‑functional teams (NOC, Service Assurance, Engineering) to promote platform‑driven operations.
**Required Skills:**
- Cloud-native architecture (Kubernetes, Docker, Helm, Argo CD/Flux, CI/CD).
- Hybrid and multi‑cloud experience (Azure, AWS, GCP).
- Infrastructure as Code: Terraform, Pulumi, or ARM templates.
- Observability & monitoring: Prometheus, Grafana, ELK (Elasticsearch, Logstash, Kibana), OpenTelemetry.
- Data pipeline design: Azure Data Factory, Databricks, Spark, SQL, data governance.
- Strong SRE fundamentals: SLIs, SLOs, error budgets, incident response.
- Automation scripting: Python, Go, Bash, PowerShell.
- Security best practices: secrets management, IAM, network segmentation, compliance.
- Leadership & stakeholder management; excellent communication and mentoring skills.
**Required Education & Certifications:**
- Bachelor’s degree in Computer Science, Engineering, or related field (Master’s preferred).
- Certifications: Certified Kubernetes Administrator (CKA) or Certified Kubernetes Security Specialist (CKS); Azure Solutions Architect Expert, AWS Certified Solutions Architect – Professional, or Google Cloud Professional Cloud Architect; DevOps Foundation or CDK8S Practitioner (optional).
---