Skills

Communication, Python, Java, Go, Scala, SQL, NoSQL, Big Data, Data Warehousing, Data Governance, Data Engineering, Apache Spark, MySQL, MongoDB, Cassandra, PostgreSQL, Monitoring, Roadmap Planning, Problem-Solving, Architecture, Data Architecture, Database Management, Machine Learning, Programming, Databases, Azure, AWS, Cloud Platforms, Analytics, GCP, Snowflake, Data Science, Hadoop, Kafka, Flink

Job Specifications

Position: Data Engineer

Location: Albany, NY (Onsite/Remote)

Experience: 13+ Years

Work type: C2C only

Visa Type: H-1B, GC, and US Citizen

PETADATA is currently hiring a Data Engineer for one of its clients.

Roles & Responsibilities

Data Pipeline Development

Design, build, and maintain scalable data pipelines for ingesting, processing, and transforming data
Develop batch and real-time data processing workflows
Ensure data pipelines are reliable, efficient, and fault-tolerant

Data Architecture & Modeling

Design and implement data architectures (data lakes, data warehouses, lakehouses)
Create and maintain data models, schemas, and metadata
Optimize data storage and retrieval strategies

Data Integration

Integrate data from multiple sources (databases, APIs, streaming platforms, third-party tools)
Handle structured, semi-structured, and unstructured data
Implement data validation and quality checks

Big Data & Processing Frameworks

Work with big data tools such as Apache Spark, Hadoop, Kafka, Flink
Build streaming and real-time data processing systems
Optimize large-scale data processing performance

Cloud & Platform Services

Use cloud platforms such as AWS, Azure, or GCP
Work with services like:
AWS: S3, Glue, Redshift, EMR, Athena
Azure: Data Factory, Synapse
GCP: BigQuery, Dataflow
Manage cloud-based data infrastructure

Database Management

Manage and optimize SQL and NoSQL databases
Implement indexing, partitioning, and performance tuning
Support data access for analytics, AI, and reporting teams

Data Quality, Security & Governance

Implement data quality checks, monitoring, and alerts
Enforce data governance, lineage, and metadata management
Ensure data security, privacy, and compliance (GDPR, HIPAA, etc.)

Automation & Orchestration

Automate data workflows using tools like Airflow, Luigi, Prefect
Schedule, monitor, and troubleshoot data jobs
Reduce manual processes through automation

Collaboration & Support

Collaborate with data scientists, analysts, and AI engineers
Support analytics, BI, and machine learning initiatives
Translate business requirements into data solutions

Monitoring & Optimization

Monitor pipeline performance and data freshness
Identify and resolve data issues and bottlenecks
Optimize costs and resource usage

Optional / Senior-Level Responsibilities

Define data engineering standards and best practices
Lead data platform design and modernization efforts
Mentor junior data engineers
Drive data strategy and roadmap planning

Required Qualifications

Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field
Proven experience as a Data Engineer or in a similar role
Strong programming skills in Python and SQL (Scala or Java preferred)
Hands-on experience with big data frameworks such as Spark, Kafka, Hadoop
Experience designing and maintaining data pipelines and ETL/ELT processes
Strong understanding of data modeling, data warehousing, and lakehouse architectures
Experience with cloud platforms: AWS, Azure, or GCP
Hands-on experience with SQL and NoSQL databases
Experience with workflow orchestration tools (Airflow, Prefect, Luigi)
Knowledge of data quality, governance, and security best practices
Strong analytical, troubleshooting, and problem-solving skills
Excellent communication and collaboration abilities

Key Skills & Tools

Languages: Python, SQL, Scala, Java
Big Data: Spark, Kafka, Hadoop
Databases: PostgreSQL, MySQL, MongoDB, Cassandra
Cloud: AWS, Azure, GCP
Orchestration: Airflow, Prefect
Data Warehouses: Redshift, Snowflake, BigQuery

Educational Qualification

Bachelor’s or Master’s degree in Computer Science or related field

We offer a professional work environment and every opportunity to grow in the information technology world.

Note

Candidates are required to attend phone, video, or in-person interviews. After selection, the candidate must pass background checks on education and experience.

Please email your resume to: keshini@petadata.co

After carefully reviewing your experience and skills, one of our HR team members will contact you about the next steps.

About the Company

PETADATA is an Information Technology & Services company that specializes in delivering technology solutions to meet the needs of our clients. PETADATA maintains a fundamental commitment to excellence that is evident in everything we do. Our mission is to understand and meet the needs of both our clients and consultants by delivering quality, value-added solutions. PETADATA is a leading provider of Information Technology services and business solutions. The company has international offices in America and, as well as offs...
