Skills

Communication, Big Data, Problem-solving, Research, Attention to detail, Training, Machine Learning, Databases, Software Development

Job Specifications

About The Team

Part of Ensembl, a world-leading provider of genomics data resources and bioinformatics software tools, the HAVANA team produces reference-quality gene annotation for the human and mouse genomes. As part of the GENCODE project, we leverage cutting-edge technologies and develop novel approaches to ensure GENCODE provides foundational reference genome annotation for human and mouse. A Global Core Biodata Resource, GENCODE has a wide range of users, from internationally significant consortia and resources to individual researchers. Our role in the Primary Annotated Resources to Advance Discovery In Genomic Medicine (PARADIGM) consortium is to ensure that gene annotation is optimised to support genomic medicine, for example through our collaboration with NCBI RefSeq to produce the Matched Annotation from NCBI and EMBL-EBI (MANE) transcript set.

Our team comprises both genomic data analysts and bioinformatics developers, working together to incorporate cutting-edge scientific knowledge and data into gene annotation. Hands-on annotation of genes and transcripts by expert scientists is directly incorporated into the GENCODE and PARADIGM annotation products and also informs the development of manually supervised automated annotation pipelines that support the scaling of human-expert–quality annotation.

We are looking for highly motivated candidates who possess scientific curiosity and initiative. You are a bioinformatics developer looking to apply your skills and knowledge to software development, large-scale computation, big data, pipeline workflows, and automation to generate gene annotation for GENCODE or PARADIGM that will be used worldwide in applications from fundamental research to clinical genomics.

Duties & Responsibilities

Your main responsibility will be the development and deployment of annotation systems to produce and validate reference-quality gene annotations. More specifically, you will:

Work alongside genomic data analysts in the team to produce and improve reference-quality, evidence-based annotation, including protein-coding genes and noncoding RNA genes.
Integrate cutting-edge primary data to create and validate gene structure and function annotation.
Develop pipelines to extend GENCODE annotation from a single reference genome to the human and mouse pangenomes.
Develop pipelines to generate training sets for machine learning initiatives.
Maintain a version-controlled codebase.
Query databases and help maintain and expand associated schemas.
Operate in a release-based environment and coordinate with other teams.
Collaborate with international partners on genome projects.
Participate in training users in our annotation methods and workflows.

You have (Requirements)

You should hold an MSc, PhD, or equivalent experience in Computer Science, Bioinformatics, Genetics, or a related field.

You will be expected to write, understand, and maintain complex code. You should also have domain experience in some of the following areas:

Genome annotation
Methods for DNA/RNA sequencing and sequence alignment
Relational databases
Scaling and optimization
Machine learning concepts

Behaviours We Look For In Our Team

You are curious and motivated, with strong attention to detail and an interest in genome annotation and genomic medicine. You enjoy working both independently and as part of a collaborative, international team. We also look for:

Strong communication and interpersonal skills
Ability to manage your own time and balance multiple tasks
Excellent attention to detail and problem-solving skills
Ability to work effectively in a team
Willingness to learn, develop, and adapt in a fast-moving research environment
Confidence communicating both biological and computational concepts, orally and in writing

You might also have (Desirable)

Previous experience processing large biological data sets in a production environment is advantageous. This includes an understanding of compute clusters, pipeline workflows, software design, and automation.
Evidence of working in a dynamic, team-based environment or contributing to a large, shared codebase is desirable. Experience with Riboseq data is also highly desirable.

Other helpful information

To apply: Please submit an application with a personalised cover letter and CV. Incomplete applications will not be considered.

Hybrid Working: At EMBL-EBI we are pleased to offer hybrid working options for all our employees. A dedicated desk will be available every day; our team works three days on site and two from home.

Interviews: We plan to hold introductory meetings with selected candidates remotely.

Contract length: 3 years (Grant Based Contract)

Salary: Grade 5, starting at £3,303 per month after tax but excluding pension and insurance contributions, plus generous benefits.

Why join us

Do something meaningful

At EMBL-EBI you can apply your talent and passion to accelerate science and tackle some of

About the Company

Working at EMBL-EBI gives you the opportunity to focus your energy and skills on something that really matters: using technology to contribute to discoveries that benefit humankind. We empower researchers everywhere to realise the potential of 'big data' in biology, and build sophisticated tools for exploring life at the atomic level.