Job Specifications
Your New Company
A leading Canadian retailer at the forefront of food and health innovation, committed to delivering quality, value, and sustainability. With a diverse portfolio of trusted brands and a strong presence across grocery, pharmacy, and wellness, this organization plays a vital role in serving millions of Canadians every day. Their focus on community, environmental responsibility, and long-term growth makes them a cornerstone of the Canadian retail landscape.
Your Role
We are seeking a Data Engineer / Python Developer to lead the data acquisition and processing efforts for a high-stakes, agentic AI chatbot in the healthcare domain. This is not a traditional BI or ETL role; you will not be building dashboards or moving data for analytics. Instead, you will architect a robust, modular engine capable of crawling, parsing, and normalizing large volumes of structured and unstructured healthcare data from diverse sources, ranging from JavaScript-rendered websites and PDFs to proprietary vendor formats.
Your work will serve as the foundational data layer for a Retrieval-Augmented Generation (RAG) system, ensuring our AI agent has access to accurate, up-to-date, and secure information regarding medications and personalized health data.
Agentic Systems & ML Services
Architect Multi-Agent Workflows: Design and implement a two-layer supervisor-router graph with subgraphs, using LangGraph and LangChain, to coordinate complex task execution.
Reasoning & Planning: Implement ReAct patterns to enable the AI to perform autonomous chain-of-thought reasoning, strategic planning, and tool-based actions.
Tool Integration: Develop and maintain function calling capabilities and Model Context Protocol (MCP) servers to allow the agent to interact with external APIs and databases.
Safety & Guardrails: Build and deploy rigorous guardrail systems to detect and mitigate malicious inputs, handle medical crisis queries, and prevent inappropriate or biased outputs.
Evaluation Frameworks: Build and maintain a comprehensive evaluation service to measure LLM performance, grounding accuracy, and agentic reliability.
Data Engineering & Infrastructure
Data Acquisition: Develop scalable web scrapers and data collection pipelines using Scrapy and BeautifulSoup.
Pipeline Orchestration: Manage complex ETL/ELT workflows using Apache Airflow to process and ingest healthcare data.
Hybrid Data Storage: Architect and optimize data ingestion into Weaviate (Vector DB) for semantic search and, as a possible future use case, into Neo4j (Knowledge Graph) for structured relationship mapping.
RAG Optimization: Build and refine a full Retrieval-Augmented Generation (RAG) pipeline to ensure all LLM responses are grounded in verified healthcare data sources.
Minimum Qualifications
Core Skills
Expert Python: Deep experience in backend Python development with a focus on data processing.
Web Scraping Stack: Mastery of Scrapy, BeautifulSoup, and tools for handling dynamic content (e.g., Selenium, Playwright, or other headless-browser tooling).
Orchestration: Professional experience building and monitoring pipelines in Apache Airflow.
Data Formatting: Proficiency in handling diverse serialization formats (JSONL, Parquet, XML) and unstructured data (PDF parsing).
Experience & Qualifications
Healthcare Domain: Prior experience working with sensitive data, including PII/PHI, and adhering to security and compliance requirements (e.g., HIPAA).
Cloud Platforms: Strong preference for candidates with hands-on experience in GCP (BigQuery, Cloud Functions, GCS).
Collaborative Mindset: Proven ability to work in a team-oriented environment, collaborating closely with AI and Backend engineers.
Best Practices: Strong grasp of software engineering principles (DRY, SOLID) and data engineering patterns.
Experience & Knowledge
RAG & Grounding: Deep understanding of embedding models, retrieval strategies, and grounding techniques to minimize hallucinations.
Agentic Patterns: Practical experience implementing ReAct, Plan-and-Execute, or similar agentic reasoning patterns.
Safety & Ethics: Experience implementing LLM safety layers and handling sensitive user queries (preferably in a regulated domain like healthcare).
API Development: Strong experience building and consuming RESTful APIs and implementing tool-calling interfaces.
What You'll Get in Return
Competitive rate.
A challenging and rewarding work environment.
About the Company
We are leaders in specialist recruitment and workforce solutions, offering advisory services such as learning and skill development, career transitions and employer brand positioning.
As the Leadership Partner to our customers, we invest in lifelong partnerships that empower people and businesses to succeed. We help you achieve your career goals and deliver your business needs by combining meaningful innovation with our global scale and insights.
Last year we helped over 280,000 people find their next career. Join the mill...