Lead Site Reliability Engineer

Job description

We are looking for a site reliability engineer to join our development team and lead the introduction of SRE into this young, rapidly growing business. You know and value your craft and want to apply it to a mission that matters. You will have significant influence on the business as you build this critical function and instil a site reliability mindset across the whole team.

 
You’ll be as happy working with developers on architectural decisions to drive performance and reliability as you are working on infrastructure automation, monitoring and problem diagnosis. You will be an expert voice in team discussions and a key stakeholder in operational and quality processes and tactics.
 
This role will also champion the SRE way of thinking to drive quality across our stack, from our user-facing client applications to the control of robots in the lab and all services in between. You will help both our software and wet lab teams embed observability and reliability into everything we build.

The Company:

Named by the World Economic Forum as one of the world’s 30 Technology Pioneers 2016, Synthace is re-imagining how we work with biology, exponentially improving the speed and quality of the final results. This is made possible through Antha, our high-level language and operating system for labs, which is already changing how scientists work with biology in major companies like Dow, Merck and GSK.

Synthace now has a proven product in the market and is starting to grow its customer base more rapidly. This is the right time for the company to instil practices, processes and tools that will ensure performance and reliability as we enter the next phase for the business.

We will grow this function with the business, and the person in this role is expected to build and lead the team.
 
The Role: 
Reporting to the VP Engineering, you’ll be working within a tight-knit development team on exciting projects with plenty of technical challenge to get your teeth into. You will:


  • Develop and execute plans for building an SRE function that is right for Synthace
  • Ensure systems scale effectively through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
  • Harden our existing production environments and introduce good practice and processes for continuous improvement
  • Troubleshoot, test, and maintain high-performance and scalable microservices in a global, federated container cluster
  • Master our current technology stack, which includes Kubernetes Federation, Go, Google Cloud, Microsoft Azure, Docker, and Ansible
  • Be an advocate for SRE practices across the development team and provide practical guidance
  • Monitor and maintain our production system
  • Help own (with the development team) the processes and automation infrastructure that ensure high-quality continuous releases of reliable infrastructure for our customers


Requirements

Key requirements:

  • BA/BS degree in computer science or equivalent work experience
  • Prior experience in a relevant role (SRE, DevOps)
  • Well versed in at least one of the following programming languages: C, C++, Java, Python, or Go
  • Knowledge of high-availability (HA) and distributed systems
  • Experience with Kubernetes, Google Cloud, backup systems, and monitoring

In addition, you are committed to delivering high-quality software to a tight schedule. Excellent communication skills are also a must in our fast-paced, multidisciplinary environment. Some background and interest in biology is a major bonus.