Handshake

AI Research Scientist - Evaluation, Handshake AI

Hybrid

New York, United States

Full Time

10-09-2025

Skills

Leadership, Python, Dynamics, Research, Training, Coaching, Synthetic Data, Machine Learning, PyTorch, Large Language Models

Job Specifications

About Handshake AI

Handshake is building the career network for the AI economy. Our three-sided marketplace connects 18 million students and alumni, 1,500+ academic institutions across the U.S. and Europe, and 1 million employers to power how the next generation explores careers, builds skills, and gets hired.

Handshake AI is a human data labeling business that leverages the scale of the largest early career network. We work directly with the world's leading AI research labs to build a new generation of human data products. From PhDs in physics to undergrads fluent in LLMs, Handshake AI is the trusted partner for domain-specific data and evaluation at scale.

This is a unique opportunity to join a fast-growing team shaping the future of AI through better data, better tools, and better systems--for experts, by experts.

Now's a great time to join Handshake. Here's why:

Leading the AI Career Revolution: Be part of the team redefining work in the AI economy for millions worldwide.
Proven Market Demand: Deep employer partnerships across Fortune 500s and the world's leading AI research labs.
World-Class Team: Leadership from Scale AI, Meta, xAI, Notion, Coinbase, and Palantir, just to name a few.
Capitalized & Scaling: $3.5B valuation from top investors including Kleiner Perkins, True Ventures, Notable Capital, and more.

About The Role

Design and conduct original research in LLM understanding, evaluation methodologies, and the dynamics of human-AI knowledge interaction
Develop novel evaluation frameworks and assessment techniques that reveal deep insights into model capabilities and limitations
Collaborate with engineers to transform research breakthroughs into scalable benchmarks and evaluation systems
Pioneer new approaches to measuring model understanding, reasoning capabilities, and alignment with human knowledge
Write high-quality code to support large-scale experimentation, evaluation, and knowledge assessment workflows
Publish findings in top-tier conferences and contribute to advancing the field's understanding of AI capabilities
Work with cross-functional teams to establish new standards for responsible AI evaluation and knowledge alignment

Desired Capabilities

PhD or equivalent research experience in machine learning, computer science, cognitive science, or a related field with focus on AI evaluation or understanding
Strong background in LLM research, model evaluation methodologies, interpretability, or foundational AI assessment techniques
Demonstrated ability to independently lead post-training and evaluation research projects from theoretical framework to empirical validation
Proficiency in Python and deep experience with PyTorch for large-scale model analysis and evaluation
Experience designing and conducting experiments with large language models, benchmark development, or systematic model assessment
Strong publication record in post-training, AI evaluation, model understanding, interpretability, or related areas that advance our comprehension of AI capabilities
Ability to clearly communicate complex insights about model behavior, evaluation methodologies, and their implications for AI development

Extra Credit

Experience with RL, agent modeling, or AI alignment
Familiarity with data-centric AI approaches, synthetic data generation, or human-in-the-loop systems
Understanding of the challenges in scaling foundation models (e.g., training stability, safety, inference efficiency)
Contributions to open-source AI libraries or research tooling
Interest in shaping the societal impact, deployment ethics, and governance of frontier models

Perks

Handshake delivers benefits that help you feel supported--and thrive at work and in life.

The benefits below are for full-time US employees.

Ownership: Equity in a fast-growing company

Financial Wellness: 401(k) match, competitive compensation, financial coaching

Family Support: Paid parental leave, fertility benefits, parental coaching

Wellbeing: Medical, dental, and vision; mental health support; $500 wellness stipend

Growth: $2,000 learning stipend, ongoing development

Remote & Office: Stipends for home office setup, internet, commuting, and free lunch/gym in our SF office

Time Off: Flexible PTO, 15 holidays + 2 flex days, winter #ShakeBreak where our whole office closes for a week!

Connection: Team outings & referral bonuses

Explore our mission, values, and comprehensive US benefits at joinhandshake.com/careers.

About the Company

Handshake is the career platform for Gen Z. With a community of over 17 million students, alumni, employers, and career educators, Handshake's network is where career advice and discovery turn into first, second, and third jobs. Nearly 1 million companies use Handshake to build their future workforce--from Fortune 500 to federal agencies, school districts to startups, healthcare systems to small businesses. Handshake is built for where you're going, not where you've been.