Data Scientist - Data Pipeline and Predictive Model Development

Cambridge, Massachusetts, United States | Full-time | Partially remote


Manus Bio is a next-generation industrial biotechnology company based in Cambridge, Massachusetts, that produces plant-based ingredients. We use a variety of patented and proprietary technologies to engineer microbes for the production of specialty chemicals, such as food ingredients, agricultural chemicals, and pharmaceuticals.

We are seeking a data scientist to develop and evaluate predictive models using a range of data science techniques (statistical analysis, machine learning, deep learning, etc.). The candidate will prepare data that exists in a variety of formats (structured and unstructured) by querying in-house databases and developing data workflows, and will also assist in the development of data visualization platforms. This person will be an integral part of Manus Bio’s R&D team.

Why work at Manus Bio:

  • Opportunity – For motivated, results-oriented team members, our growth creates opportunities for personal and professional advancement.
  • Accountability – You are given the resources you need to succeed and the freedom to make it happen; in return, we hold each other accountable for our high expectations.
  • Passion – We love what we do and enjoy working with others who feel the same way. We embrace the challenge and hard work that come with working on the cutting edge.


Responsibilities:

  • Develop predictive models from a variety of data streams to accelerate protein engineering, strain engineering, and fermentation process optimization
  • Perform statistical analysis on large omics datasets to extract as much insight as possible
  • Stay at the cutting edge of data science concepts and algorithms
  • Develop data workflows and dashboards that extend our digital infrastructure

Required qualifications:

  • Ph.D. or M.S. degree in Data Science, Statistics, Computer Science, Computational Science, or a similar discipline
  • Working knowledge of Python and R
  • ML/DL model development experience

Preferred qualifications:

  • Working knowledge of databases and SQL
  • Experience with web technologies and frameworks (e.g., HTML5, CSS3, PHP, JavaScript)
  • Experience translating unstructured data into serialized (JSON, XML, YAML) and/or structured (database) formats