Skills

Python, Go, Bash, PostgreSQL, CI/CD, DevOps, Docker, Kubernetes, Monitoring, Databases, AWS, Agile, Snowflake, Redis, Kafka, Prometheus, Grafana, Loki

Job Specifications

The Client

We are working with a leading multi-strategy Hedge Fund where engineering plays a critical role in the core business rather than operating as a support function. Technical teams are given real ownership, work on challenging and meaningful problems, and collaborate closely with end users, ensuring their work has a clear impact.

They are seeking a Senior SRE to join their London HQ and work hands-on with cloud and on-prem platforms. They need someone to strengthen system reliability and improve performance across every part of their trading infrastructure.

What You'll Get

The chance to join a group that focuses on modern platforms, high engineering standards, and rewarding strong performance.
An environment where technologists can grow, innovate, and see the results of their contributions.
Long-term career progression as the firm continues to grow.
The chance to cultivate the firm's SRE philosophy, processes, and technologies from the ground up.
The scope to drive standards and foster adoption within your core team, whilst partnering closely with the firm's DevOps and Cloud teams.
An opportunity to be instrumental in evolving the firm's operations and boosting performance across diverse systems and platforms.

What You'll Do

Define and embed SRE principles, creating processes and standards that support scalable and reliable infrastructure.
Design and maintain comprehensive monitoring and observability using Prometheus, Grafana, Loki, and Tempo to ensure clear insights into system and application performance.
Participate in the team’s on-call rotation, sharing responsibility for approximately one week per month.
Set and maintain reliability requirements for applications running in Kubernetes, balancing performance, cost efficiency, and system resilience.
Develop tools and automation to improve deployment pipelines, system health checks, and recovery procedures.
Work closely with development teams to improve service stability, scalability, and fault tolerance, applying best practices such as SLOs and blameless post-mortems.

What You'll Need

5+ years in SRE or similar roles with complex, distributed systems
Degree in engineering, computer science, or equivalent experience
Expert in Prometheus, Grafana, Loki, Tempo (OTEL) and observability tooling
Skilled with Kubernetes, Docker, and containerised environments
Hands-on with cloud (AWS preferred) and on-prem infrastructure
Proficient in Python, Bash, or Go for automation and pipelines
Solid grasp of CI/CD, DevOps, and agile workflows
Self-starter with a passion for reliability and operational excellence
Strong communicator, able to translate technical concepts across teams

Bonus Skills: Experience with databases (PostgreSQL, Redis, Snowflake), messaging systems (Kafka, Solace), or workflow orchestration (Airflow)

If you’re a motivated Senior SRE looking to step into an ideas-driven, high-performance environment, we’d love to hear from you.

About the Company

Tempest Vane Partners is a specialist recruitment business delivering to Investment Management, Digital Assets and related FinTech businesses in London and New York City. With 20 years of experience covering the financial markets industry, TVP offers deep recruitment expertise across Technology, Operations & Quantitative Analytics.