Hiring
So things are going well: you've built trust, you have the approval to hire, and you know the company's needs and how you can help accelerate meeting them. Congrats!
If you can, hire for a title other than "bioinformatician" to attract the right person for the role; "bioinformatician" means something different to everyone.
Choosing the right role
Here are several roles that I've seen labeled as "bioinformatician", and more specific alternatives.
Requires training in biology and computer science
Computational biologist
Deep training in both biology and computer science, often including a PhD and a postdoc. Can develop core tooling and novel algorithms for analyzing genomic data, if needed.
An aside: salaries
For better or worse, salaries are determined by supply and demand – not training – and more people want to be computational biologists than there are openings for them. As a result, software engineers are typically paid more than computational biologists, even if the software engineer has only a bachelor's and the computational biologist has a PhD and a postdoc.
Requires training in biology, but not much computer science
Biostatistician
Analyzes statistics related to biological data. This term usually isn't used for those who work directly with raw NGS data and its analysis.
Data Scientist
Analyzes data, but might not have as strong a background in either biology or algorithms as a computational biologist. Strong understanding of statistics.
Research Scientist
An ambiguous title. Usually used for those with more biology training and a wet lab role. They're usually able to write at least simple code.
Requires training in computer science, but not much biology
Data Engineer
Someone who builds the infrastructure for running pipelines or analyzing data. For example, setting up AWS for pipelines and ensuring their successful completion. They often have no understanding of the underlying biology.
Variants:
- A "bioinformatics data engineer" is a data engineer with the quirks of bioinformatics data.
- A "DevOps engineer" is someone who bridges the gap between software engineering and infrastructure.
- A "reliability engineer" or "SRE" is responsible for the reliability of production pipelines.
Software Engineer
Someone who writes code for web apps and APIs, and can set up some infrastructure.
An aside: hiring software and data engineers
Hiring software and data engineers can give your team superpowers, but it's also dangerous.
Bioinformatics pipelines are different from normal data pipelines. (See Ben Siranosian's blog post on the topic.) This isn't widely known, and as a result software and data engineers struggle:
- The tooling they're used to doesn't work well for bioinformatics
- The gap between code and infrastructure is wider
- Testing is less prevalent and more computationally expensive
- Iteration cycles take longer
- Validation requirements are higher
That can lead to poor outcomes:
- Burnout
- Inefficient pipelines
- Siloed pipelines that the non-engineers are unable to edit
- Pipelines that hit edge cases of standard developer tooling and return corrupt data as a result
Biotech also pays less than tech and respects software and data engineers less – compounding these challenges. These engineers can double their salary and have a more positive career outlook if they move outside of biology. As a result, it's really hard to recruit and retain great software engineers.
As an alternative to hiring a team of software or data engineers to work alongside a bioinformatics team, consider FlowDeploy. It augments your existing team to help bridge the gap between bioinformatics and compute infrastructure, with the help of standard open-source workflow managers like Nextflow and Snakemake. Disclaimer: this is a biased recommendation (what you're reading is a FlowDeploy guide).
Fallback: bioinformatician
So who should actually take the title "bioinformatician"? Someone who understands both the biology and the software to some degree, but doesn't fit into any of the titles above.