Dfbooking Recruitment Services

ML Engineer focused on Data Engineering

On site

San Francisco, United States

Full Time

24-12-2025

Skills

Python, Data Engineering, Research, Training, Machine Learning, Organization

Job Specifications

Job brief

Join our San Francisco office as an ML Engineer focused on Data Engineering. Visa sponsorship is available for global talent.
This organization is looking for a machine learning engineer with an emphasis on data engineering to manage and improve its massive datasets, which include hundreds of millions of photos and videos. The position is essential to supporting our research and development teams by ensuring effective data access, storage, and versioning. Your contributions will support state-of-the-art research and development initiatives while directly improving the efficiency of our data handling procedures.

What you'll do:
Implement and maintain distributed storage solutions to provide seamless data access across all training machines.
Develop strategies to reduce data storage costs while ensuring high availability and reliability.
Optimize input/output operations to accelerate data retrieval and processing during training and inference phases.
Build and maintain backend systems to track and manage dataset versions, ensuring reproducibility and consistency in experiments.

Our culture:
We work full time and in person at our waterfront office in San Francisco.
We believe that demonstrated interest in the creative space is key: our team includes musicians, designers, visual artists and more.

Example tacit skills we're looking for:
Experience with distributed storage systems including deployment, configuration, and optimization.
Strong skills in Python and (ideally) C++ for developing data processing pipelines and integrating storage solutions.
Experience in building and maintaining data pipelines capable of handling large-scale datasets efficiently.
Experience with Kubernetes (K8s).
Generalist backend experience with familiarity in ETL and infrastructure.

What we offer:
Openness to sponsoring international candidates (e.g. STEM OPT, OPT, H1B, O1, E3)
Work alongside a world-class team developing the future of AI tooling
Significant impact on our market presence and growth
Competitive compensation (75th percentile of market rates) with significant equity upside

Required Skills:
ETL
Data Engineering
Organization
Operations
High Availability
Data Processing
Output Pipelines
Compensation
Storage
Machine Learning
Infrastructure
Availability
C++
Research
Engineering
Python
Training

About the Company

Founded in 2020, DFBOOKING Recruitment Services is a trusted international recruitment and education consulting agency helping people build better lives and careers abroad. Through our platform dfbooking.com, we connect skilled job seekers and students from Africa and Southeast Asia with visa-sponsored employment opportunities and world-class education programs in countries like the USA, Canada, UK, Ireland, Australia, and Germany. We specialize in: Visa-sponsored jobs in healthcare, education, and other skilled professions...