Job Specifications
Position: Data Engineer
Location: Albany, NY (Onsite/Remote)
Experience: 13+ Years
Work type: C2C only
Visa Type: H-1B, GC, and US Citizen
PETADATA is currently hiring a Data Engineer for one of its clients.
Roles & Responsibilities
Data Pipeline Development
Design, build, and maintain scalable data pipelines for ingesting, processing, and transforming data
Develop batch and real-time data processing workflows
Ensure data pipelines are reliable, efficient, and fault-tolerant
Data Architecture & Modeling
Design and implement data architectures (data lakes, data warehouses, lakehouses)
Create and maintain data models, schemas, and metadata
Optimize data storage and retrieval strategies
Data Integration
Integrate data from multiple sources (databases, APIs, streaming platforms, third-party tools)
Handle structured, semi-structured, and unstructured data
Implement data validation and quality checks
Big Data & Processing Frameworks
Work with big data tools such as Apache Spark, Hadoop, Kafka, Flink
Build streaming and real-time data processing systems
Optimize large-scale data processing performance
Cloud & Platform Services
Use cloud platforms such as AWS, Azure, or GCP
Work with services like:
AWS: S3, Glue, Redshift, EMR, Athena
Azure: Data Factory, Synapse
GCP: BigQuery, Dataflow
Manage cloud-based data infrastructure
Database Management
Manage and optimize SQL and NoSQL databases
Implement indexing, partitioning, and performance tuning
Support data access for analytics, AI, and reporting teams
Data Quality, Security & Governance
Implement data quality checks, monitoring, and alerts
Enforce data governance, lineage, and metadata management
Ensure data security, privacy, and compliance (GDPR, HIPAA, etc.)
Automation & Orchestration
Automate data workflows using tools like Airflow, Luigi, Prefect
Schedule, monitor, and troubleshoot data jobs
Reduce manual processes through automation
Collaboration & Support
Collaborate with data scientists, analysts, and AI engineers
Support analytics, BI, and machine learning initiatives
Translate business requirements into data solutions
Monitoring & Optimization
Monitor pipeline performance and data freshness
Identify and resolve data issues and bottlenecks
Optimize costs and resource usage
Optional / Senior-Level Responsibilities
Define data engineering standards and best practices
Lead data platform design and modernization efforts
Mentor junior data engineers
Drive data strategy and roadmap planning
Required Qualifications
Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field
Proven experience as a Data Engineer or similar role
Strong programming skills in Python and SQL (Scala or Java preferred)
Hands-on experience with big data frameworks such as Spark, Kafka, Hadoop
Experience designing and maintaining data pipelines and ETL/ELT processes
Strong understanding of data modeling, data warehousing, and lakehouse architectures
Experience with cloud platforms: AWS, Azure, or GCP
Hands-on experience with SQL and NoSQL databases
Experience with workflow orchestration tools (Airflow, Prefect, Luigi)
Knowledge of data quality, governance, and security best practices
Strong analytical, troubleshooting, and problem-solving skills
Excellent communication and collaboration abilities
Key Skills & Tools
Languages: Python, SQL, Scala, Java
Big Data: Spark, Kafka, Hadoop
Databases: PostgreSQL, MySQL, MongoDB, Cassandra
Cloud: AWS, Azure, GCP
Orchestration: Airflow, Prefect
Data Warehouses: Redshift, Snowflake, BigQuery
Educational Qualification
Bachelor’s or Master’s degree in Computer Science or related field
We offer a professional work environment and every opportunity to grow in the information technology world.
Note
Candidates are required to attend phone, video, or in-person interviews. After selection, the candidate must go through background checks on education and experience.
Please email your resume to: keshini@petadata.co
After carefully reviewing your experience and skills, one of our HR team members will contact you about the next steps.