- Company Name
- VE3
- Job Title
- Senior Data Engineer | AWS Glue & Kafka | Outside IR35
- Job Description
Job Title: Senior Data Engineer (Contract) | AWS Glue & Kafka
Role Summary: A senior data engineer with ~10 years of experience designing and delivering production data pipelines on AWS. The role covers lake/lakehouse architectures, batch (Glue/EMR, PySpark) and streaming (Kafka) pipelines, data contracts, quality gates, CI/CD, observability, and security hardening.
Expectations: Deliver production‑ready pipelines, maintain data quality and observability, secure data assets, mentor junior team members, document processes and runbooks, and engage stakeholders to drive continuous improvement.
Key Responsibilities:
- Architect and implement lake/lakehouse data flows on AWS S3, Glue, and EMR.
- Build and maintain Kafka consumers/producers, manage schema evolution, resilience, and DLQs.
- Develop PySpark transformations, CDC merges, partitioning, and performance optimization.
- Implement data quality tests, monitoring, alerting, and basic lineage.
- Harden security with IAM least privilege, KMS, and private networking.
- Create runbooks, architecture diagrams, and handover documentation.
Required Skills:
- Deep expertise in AWS services: Glue, S3, EMR, RDS, IAM, KMS, CloudWatch.
- Strong Kafka proficiency (MSK/Confluent, schema registry, consumer group tuning).
- Production Python/PySpark development with automated tests and CI/CD pipelines.
- Data modelling for bronze/silver/gold layers, CDC, SCD2, and data contracts.
- Infrastructure as Code (Terraform or CDK) and cost/performance tuning.
- Excellent communication and stakeholder engagement.
Required Education & Certifications:
- Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
- AWS Certified Data Analytics – Specialty or AWS Certified Big Data – Specialty (preferred).
- Kafka certification (Confluent or equivalent) is a plus.