Tubi

Principal, ML Infrastructure Engineer

Hybrid

Toronto, Canada

Senior

Full Time

20-11-2025

Skills

Communication, Leadership, Problem Solving, Python, Java, Scala, Stakeholder Management, Research, Architecture, Machine Learning, Programming, Databases, Organization, AWS, Project Management, Cloud Platforms

Job Specifications

About Tubi:

Boldly built for every fandom, Tubi is a free streaming service that entertains over 100 million monthly active users. Tubi offers the world's largest collection of Hollywood movies and TV shows, thousands of creator-led stories and hundreds of Tubi Originals made for the most passionate fans. Headquartered in San Francisco and founded in 2014, Tubi is part of Tubi Media Group, a division of Fox Corporation.

About the Role:

As a Principal Engineer on the ML Infrastructure team, you will be a technical leader and visionary, driving the evolution of our machine learning platform. You will tackle the most complex and impactful technical challenges, shaping the architecture and technology choices that enable our ML capabilities to scale and deliver exceptional user experiences. You will be a key influencer, bridging the gap between engineering and product, and a mentor to senior engineers, fostering a culture of technical excellence and continuous improvement. Your work will be used by millions of users.

This is a hybrid role in our Toronto office.

What You'll Do:

Technical Leadership & Strategy:

Vision & Influence: Define and champion the long-term vision for ML infrastructure, aligning it with company goals and industry best practices. Influence technical direction and technology selection across the ML platform
Strategic Roadmap: Develop and maintain a roadmap (6-12 months) for the ML Infra team, anticipating future needs and proactively addressing emerging trends
Innovation & Optimization: Identify opportunities to improve ML infrastructure efficiency, scalability, and performance. Research and advocate for new technologies and approaches to optimize the ML development lifecycle

Architecture, Design & Engineering:

System Design: Lead the architecture and design of complex ML systems, ensuring scalability, reliability, security, and maintainability
Distributed Systems Expertise: Design and build scalable, high-throughput, and/or low-latency distributed systems using Scala and related technologies. This includes expertise in areas like distributed databases, message queues, and stream processing
Quality & Standards: Champion and enforce engineering best practices, including code quality, testing, and documentation. Contribute to the development and implementation of ML infrastructure standards

Problem Solving & Delivery:

Technical Problem Solving: Resolve critical and complex technical challenges related to ML infrastructure, demonstrating expertise in debugging, performance optimization, and system troubleshooting
Project Execution: Lead and deliver complex ML infrastructure projects, effectively managing scope, timelines, and dependencies. Mentor engineers on project management best practices
Collaboration & Mentorship: Foster a collaborative environment and provide technical mentorship to other engineers, enabling their growth and development

Communication & Collaboration:

Cross-functional Partnership: Collaborate effectively with data scientists, ML engineers, and product managers to understand their needs and translate them into infrastructure solutions
Stakeholder Management: Communicate effectively with stakeholders at all levels, including senior leadership. Clearly articulate technical concepts, progress updates, and roadblocks
Knowledge Sharing: Promote knowledge sharing and best practices across the organization through documentation, presentations, and mentorship

Your Background:

10+ years of experience in software engineering, with a significant focus on building and scaling large-scale distributed systems
Bachelor's or Master's degree in Computer Science or a related field
Proven experience as a technical leader, architecting and designing complex systems, preferably in the ML infrastructure domain
5+ years of experience with databases, caching technologies, and message brokers
Expertise in the Scala, Java, and Python programming languages
Extensive experience with cloud platforms (preferably AWS)

Bonus:

Experience in the media or streaming industry
Contributions to open-source projects related to ML infrastructure

Tubi is a division of Fox Corporation, and the FOX Employee Benefits summary covers the majority of US employee benefits. The distinctions below outline the differences between Tubi and FOX benefits:

For US-based non-exempt Tubi employees, the FOX Employee Benefits summary accurately captures the Vacation and Sick Time.
For all salaried/exempt employees, in lieu of the FOX Vacation policy, Tubi offers a Flexible Time off Policy to manage all personal matters.
For all full-time, regular employees, in lieu of FOX Paid Parental Leave, Tubi offers a generous Parental Leave Program, which allows parents twelve (12) weeks of paid bonding leave within the first year of birth, adoption, surrogacy, or foster placement of a child, in addition to applicable government leave program(s) and FOX’s short-term disability policy. This time is 100% paid.

About the Company

Tubi is the most watched free TV and movie streaming service in the U.S., dedicated to providing all people access to all the world's stories. As a leading ad-supported video-on-demand service, the company engages diverse audiences through a personalized experience and the world's largest content library of over 275,000 movies and TV episodes, a growing collection of Tubi Originals, and nearly 250 FAST channels. Tubi is part of the Tubi Media Group, a division of Fox Corporation that oversees the company's digital businesses.