Skills

Communication, Python, Scala, SQL, Data Governance, Data Engineering, Encryption, Salesforce, Workday, CI/CD, DevOps, Architecture, Machine Learning, Apache, AWS, Cloud Platforms, Analytics, Spark, PySpark, Terraform

Job Specifications

Role: Lead AWS Data Engineer

Location: Houston, TX (Onsite)

Duration: Long Term Contract

Minimum Requirements:

Bachelor’s or Master’s degree in Computer Science, Information Systems, Engineering, or a related discipline, plus at least 12 years of hands-on data engineering experience, or a demonstrated equivalent combination of experience and education.
5+ years in a technical-lead or team-lead capacity delivering enterprise-grade solutions.
Deep expertise in AWS data and analytics services (e.g., S3, Glue, Redshift, Athena, EMR/Spark, Lambda, IAM, and Lake Formation).
Proficiency in Python/PySpark or Scala for data engineering, along with advanced SQL for warehousing and analytics workloads.
Demonstrated success designing and operating large-scale ELT/ETL pipelines, data lakes, and dimensional/columnar data warehouses.
Experience with workflow orchestration (e.g., Airflow, Step Functions) and modern DevOps practices: CI/CD, automated testing, and infrastructure-as-code (e.g., Terraform or CloudFormation).
Experience with data lakehouse architecture and frameworks (e.g., Apache Iceberg).
Experience integrating with enterprise systems, both on-premises and SaaS (Oracle E-Business Suite, Salesforce, Workday).
Strong communication, stakeholder-management, and documentation skills; aptitude for translating business needs into technical roadmaps.

Preferred Qualifications:

Solid understanding of data modeling, data governance, security best practices (encryption, key management), and compliance requirements.
Experience working within large, complex organizations.
Experience building integrations for enterprise back-office applications.
AWS Certified Data Analytics – Specialty or AWS Solutions Architect certification (or equivalent); experience with other cloud platforms is a plus.
Proficiency in modern data storage formats and table management systems, with a strong understanding of Apache Iceberg for managing large-scale datasets and Parquet for efficient, columnar data storage.
In-depth knowledge of data cataloging, metadata management, and lineage tools (AWS Glue Data Catalog, Apache Atlas, Amundsen) to bolster data discovery and governance.
Knowledge of how machine learning models are developed, trained, and deployed, as well as the ability to design data pipelines that support these processes.
Experience migrating on-premises data sources to AWS.
Experience building high-quality data products.

About the Company

We are a dynamic and forward-thinking software development and staff augmentation firm committed to delivering innovative solutions tailored to our clients' unique needs. We are passionate about leveraging technology and talent to drive your business forward, whether you have a specific project in mind or require ongoing support.