Job Specification
Lead Data Engineer – Databricks
Location: Onsite – Toronto, Canada
Type: Contract
About the Role
We are looking for a Lead Data Engineer with 8–10 years of experience who can quickly assess complex data problems, make sound technical decisions, and drive solutions end to end. This role requires deep hands-on experience with Databricks and modern lakehouse architectures, along with the ability to operate in regulated environments such as banking, finance, or insurance. Experience supporting or modernizing established BI platforms such as Power BI is a strong plus, as the organization continues its transition toward a Databricks-centric analytics platform.
Key Responsibilities
• Lead the design, development, and ownership of scalable ETL/ELT pipelines using Databricks and Apache Spark for both batch and streaming workloads.
• Architect and maintain Delta Lake lakehouse solutions, implementing the Medallion architecture (Bronze, Silver, and Gold layers) to support analytics and machine learning use cases.
• Serve as a technical lead who can independently figure things out, evaluate trade-offs, and make pragmatic decisions in fast-moving, ambiguous situations.
• Collaborate closely with solution architects, analysts, and business stakeholders to translate regulatory and business requirements into reliable data solutions.
• Implement and enforce data governance, security, and access controls using Unity Catalog, IAM, encryption, and audit-ready data practices.
• Integrate Databricks with cloud services across AWS, Azure, or GCP, including cloud storage, ingestion frameworks, and orchestration tools.
• Design and automate workflows using orchestration tools such as Airflow or cloud-native schedulers, and transformation frameworks such as dbt, ensuring reliability and observability.
• Tune Spark jobs and Databricks clusters for performance and cost efficiency while meeting enterprise SLAs.
• Apply DevOps and DataOps best practices, including CI/CD pipelines, version control, and automated testing for data workloads.
• Support downstream reporting and analytics, including integration with existing Power BI dashboards during platform modernization efforts.
• Provide technical guidance and mentorship to other data engineers, setting standards and best practices across the team.
Required Qualifications
• 8–10 years of experience in data engineering or large-scale data platform development.
• Strong hands-on expertise with Databricks, Apache Spark, and Delta Lake in production environments.
• Prior experience working in regulated industries such as banking, finance, insurance, or similar enterprise environments.
• Proficiency in Python or Scala, with advanced SQL skills and experience in performance tuning and data modeling.
• Experience with at least one major cloud platform: AWS, Azure, or GCP.
• Hands-on experience with data pipeline orchestration tools such as Airflow, AWS Step Functions, or equivalent, as well as transformation frameworks such as dbt.
• Strong problem-solving mindset, the ability to work independently, and availability to work onsite in Toronto.
Preferred Qualifications
• Experience with Unity Catalog, MLflow, or data quality frameworks.
• Power BI experience, particularly in supporting or migrating legacy reporting platforms.
• Familiarity with Terraform, Docker, and Git-based CI/CD workflows.
• Databricks or cloud data engineering certifications.