- Company Name: Fractal
- Job Title: Data Architect – Azure & Databricks
- Job Description:
**Job Title:**
Data Architect – Azure & Databricks
**Role Summary:**
Design, build, and govern scalable lakehouse data platforms on Azure Databricks for enterprise clients, focusing on data modernization, advanced analytics, and AI/ML in the healthcare payer domain. Serve as a technical leader who delivers end‑to‑end data pipelines while ensuring high performance, security, compliance, and stakeholder alignment.
**Expectations:**
- Deliver robust, production‑grade data architectures that support multi‑layer models (Bronze, Silver, Gold).
- Lead cross‑functional workshops to translate business requirements into technical solutions.
- Optimize Spark workloads for cost and performance, maintain observability, and enforce security and governance.
- Champion the adoption of AI copilot tools and agentic workflows to accelerate development.
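The multi‑layer (Bronze, Silver, Gold) model referenced above follows the medallion pattern: land raw data, then cleanse, then aggregate for consumption. A minimal pure‑Python sketch of that flow is shown below; real Databricks implementations use PySpark and Delta tables, and the record fields here (`claim_id`, `amount`, `status`) are hypothetical.

```python
# Illustrative sketch of medallion (Bronze/Silver/Gold) layering.
# Plain Python dicts stand in for Delta tables so the flow is easy to follow.

raw_events = [
    {"claim_id": "C1", "amount": "120.50", "status": "PAID"},
    {"claim_id": "C2", "amount": "bad",    "status": "PAID"},   # malformed row
    {"claim_id": "C3", "amount": "80.00",  "status": "DENIED"},
]

def to_bronze(records):
    """Bronze: land raw records as-is, tagging each with ingestion metadata."""
    return [{**r, "_ingested": True} for r in records]

def to_silver(bronze):
    """Silver: cleanse and conform -- drop rows whose amount fails to parse."""
    silver = []
    for r in bronze:
        try:
            silver.append({"claim_id": r["claim_id"],
                           "amount": float(r["amount"]),
                           "status": r["status"]})
        except ValueError:
            continue  # quarantine/drop malformed records
    return silver

def to_gold(silver):
    """Gold: business-level aggregate -- total claim amount per status."""
    totals = {}
    for r in silver:
        totals[r["status"]] = totals.get(r["status"], 0.0) + r["amount"]
    return totals

gold = to_gold(to_silver(to_bronze(raw_events)))
print(gold)  # {'PAID': 120.5, 'DENIED': 80.0}
```

Each layer only reads from the one below it, which keeps lineage simple and lets the raw Bronze data be replayed if Silver/Gold logic changes.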
**Key Responsibilities:**
- Architect and maintain Databricks Lakehouse platforms, including Delta Lake, Unity Catalog, and DLT pipelines.
- Define data models (star/snowflake, dimensional) and create data marts via Databricks SQL warehouse.
- Integrate data from ERP, POS, CRM, e‑commerce, and third‑party sources using Azure Data Lake Storage Gen2.
- Maintain and tune Delta tables (OPTIMIZE, VACUUM, Z-ORDER, Time Travel) and configure cluster sizing for performance and cost.
- Implement monitoring, alerting, and performance tuning using Databricks Observability and native cloud tools.
- Design secure, compliant data architectures with RBAC, encryption, and data lineage.
- Lead data governance initiatives (Data Fitness Index, quality scores, metadata cataloging).
- Mentor and collaborate with data engineers and data scientists to deliver end‑to‑end pipelines.
- Evaluate new Databricks features and generative AI tools, proposing pilots for platform enhancement.
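Among the Delta optimizations listed above, Z-ordering is the least self-explanatory: it clusters rows by interleaving the bits of several column values (a Morton/Z-order code) so that rows close in all clustered dimensions land in the same files, letting data skipping prune files for multi-column filters. A minimal sketch of that interleaving idea, not Delta's actual implementation:

```python
# Sketch of the idea behind Delta Lake's Z-ORDER clustering: interleave the
# bits of two column values into one Morton code, then sort rows by that code
# so neighbours in both dimensions end up stored near each other.

def z_order_key(x: int, y: int, bits: int = 8) -> int:
    """Interleave the low `bits` bits of x and y into one Morton code."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # x bits at even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # y bits at odd positions
    return key

# Rows sorted by the interleaved key cluster (x, y) neighbours together:
rows = [(3, 5), (0, 0), (1, 1), (7, 7)]
print(sorted(rows, key=lambda r: z_order_key(*r)))
# [(0, 0), (1, 1), (3, 5), (7, 7)]
```

In Databricks this is simply `OPTIMIZE my_table ZORDER BY (col_a, col_b)`; the sketch only shows why multi-column clustering beats sorting by a single column.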
**Required Skills:**
- 12–18 years of data engineering experience, including 5+ years on Azure Databricks/Apache Spark.
- Proficient in PySpark, SQL, Delta Lake, DLT, Databricks Workflows, and MLflow.
- Strong background in lakehouse architecture, bronze/silver/gold layering, and ETL/ELT pipelines.
- Expertise in data modeling (star/snowflake, dimensional), Delta Lake optimization, and performance tuning.
- Knowledge of Azure Data Lake Storage Gen2, ingestion from structured/unstructured sources, and REST APIs.
- Experience with Unity Catalog, RBAC, encryption, and data governance practices.
- Familiarity with BI tools (Power BI, Tableau) and Databricks SQL warehouse.
- Ability to use AI code assistants (GitHub Copilot, Databricks Assistant) and advocate for agentic workflows.
**Required Education & Certifications:**
- Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.
- Certifications in Azure Data Engineering, Databricks Certified Professional, or similar are preferred.