Skills

Python, SQL, Data Governance, Data Engineering, Apache Spark, Apache Airflow, GitHub, CI/CD, Docker, Kubernetes, Monitoring, AWS Lambda, Architecture, Data Architecture, Machine Learning, AWS, AWS Cloud, Analytics, CI/CD Pipelines, PySpark, Kafka, Terraform, Infrastructure as Code, GitHub Actions

Job Specifications

We Are Hiring – Senior Data Engineer (PySpark & AWS)

Salary: 300 outside IR35

Location: Remote in UK

We are looking for an experienced and highly skilled Senior Data Engineer to join our growing data engineering team. This role is ideal for a passionate engineer who thrives in building scalable data platforms, designing robust pipelines, and working with cutting-edge cloud technologies.

About The Role

As a Senior Data Engineer, you will be responsible for designing, developing, and optimizing large-scale data pipelines that power analytics, reporting, and machine learning initiatives. You will work closely with data scientists, analysts, and platform teams to ensure data is reliable, secure, and available in both real-time and batch processing environments.

Key Responsibilities

Design, build, and maintain scalable data pipelines using PySpark and Python for high-volume, high-velocity data processing.
Develop and manage ETL/ELT workflows, ensuring data accuracy, consistency, and performance.
Orchestrate complex workflows using Apache Airflow, including scheduling, dependency management, and failure handling.
Architect and implement cloud-native data solutions on AWS, following best practices for performance, scalability, and security.
Work extensively with AWS services such as API Gateway, AWS Lambda, Amazon Redshift, AWS Glue, Amazon CloudWatch, Amazon S3, EMR, and IAM.
Use Terraform to provision and manage AWS infrastructure as code, ensuring reproducible and reliable environments.
Build and maintain CI/CD pipelines using GitHub Actions to automate testing, deployment, and infrastructure changes.
Optimize Spark jobs, tune performance, and troubleshoot production issues across distributed systems.
Collaborate with cross-functional teams to define data architecture, governance, and best practices.

Required Qualifications

6+ years of hands-on experience in data engineering or related roles.
Strong expertise in Python, PySpark, and SQL with experience in writing optimized, production-grade code.
In-depth knowledge of Apache Spark internals and Apache Airflow.
Proven experience designing and implementing ETL pipelines for large-scale data platforms.
Strong hands-on experience with AWS cloud services, especially API Gateway, Lambda, Redshift, Glue, CloudWatch, S3, and EMR.
Experience provisioning infrastructure using Terraform.
Practical experience building CI/CD pipelines using GitHub Actions.

Preferred Qualifications

Experience with real-time data streaming using Kafka, Kinesis, or similar technologies.
Familiarity with containerization tools such as Docker and Kubernetes.
Knowledge of data governance, data quality frameworks, and monitoring strategies.

Why Join Us?

Work on large-scale, high-impact data platforms.
Opportunity to shape modern data architecture in a cloud-first environment.
Collaborative, innovative, and growth-focused culture.
Competitive compensation and benefits.

About the Company

Welcome to Athsai, for technical and automotive sector hiring.

Our Journey: Athsai emerged from a shared vision to bridge the gap between exceptional tech professionals and companies on the cutting edge of innovation. We recognised that the most groundbreaking solutions arise when diverse minds collaborate, and that’s where our journey began.

Our Mission: Our mission is simple yet transformative: to propel the tech industry forward by connecting the brightest minds with groundbreaking opportunities. We’re not just matc...