As the costs of diagnostic and sequencing technologies have dropped sharply in recent years, researchers have amassed an unprecedented volume of data about diseases and biology. Unfortunately, scientists aiming to convert that data into new treatments often need help from people skilled in software development.
Currently, Watershed Bio is assisting researchers and bioinformaticians in executing experiments and gaining insights through a platform that allows users to analyze intricate datasets irrespective of their computational expertise. The cloud-hosted platform offers workflow templates and an adaptable interface to assist users in exploring and sharing various data types, including whole-genome sequencing, transcriptomics, proteomics, metabolomics, high-content imaging, protein folding, and beyond.
“Researchers seek to understand the software and data science facets of the field, but they aren’t looking to transform into software engineers writing code merely to comprehend their data,” says co-founder and CEO Jonathan Wang ’13, SM ’15. “With Watershed, they don’t need to.”
Watershed is used by large and small research teams across industry and academia to drive discovery and decision-making. When new advanced analytical methods are described in scientific publications, they can be added to Watershed’s platform promptly as templates, making cutting-edge tools more accessible and collaborative for researchers from all backgrounds.
“The data in biology is expanding rapidly, and the sequencing technologies producing this data are continually improving and becoming more affordable,” Wang notes. “Coming from MIT, this challenge was right within my expertise: It’s a difficult technical issue. It’s also a significant hurdle because these individuals are dedicated to treating illnesses. They recognize that this data holds value, yet struggle to leverage it. We aim to assist them in revealing more insights at a quicker pace.”
No-code discovery
Wang expected to major in biology at MIT, but he soon became excited by the possibility of using computer science to build solutions that could scale to millions of users. He ultimately earned both his bachelor’s and master’s degrees from the Department of Electrical Engineering and Computer Science (EECS). Wang also interned at a biology lab at MIT, where he was struck by how slow and labor-intensive experiments were.
“I perceived the distinction between biology and computer science, where you experience these dynamic environments [in computer science] that allow you to receive immediate feedback,” Wang shares. “Even as a solitary programmer, you possess so much at your disposal to experiment with.”
While working on machine learning and high-performance computing at MIT, Wang also co-founded a high-frequency trading firm with several classmates. His team employed researchers with PhD degrees in areas like mathematics and physics to devise new trading strategies, but they soon faced a bottleneck in their workflow.
“Progress was sluggish because the researchers were accustomed to developing prototypes,” Wang explains. “These were small approximations of models that they could execute locally on their machines. To bring those methods into production, they required engineers to facilitate their functionality in a high-throughput manner on a computing cluster. However, the engineers did not grasp the nature of the research, leading to much back and forth. It resulted in concepts that should have been implemented in a day taking weeks.”
To address the challenge, Wang’s team built a software layer that made building production-ready models as easy as building prototypes on a laptop. Then, a few years after graduating from MIT, Wang noticed that technologies such as DNA sequencing had become inexpensive and widespread.
“The bottleneck was no longer sequencing; therefore, people said, ‘Let’s sequence everything,’” Wang reminisces. “The limiting factor shifted to computation. Individuals were unsure how to manage the vast amounts of data generated. Biologists were relying on data scientists and bioinformaticians for assistance, but those professionals didn’t always grasp the biology deeply enough.”
The scenario felt familiar to Wang.
“It mirrored what occurred in finance, where researchers were attempting to collaborate with engineers, yet the engineers never fully comprehended, resulting in tremendous inefficiency with people relying on the engineers,” Wang recalls. “Meanwhile, I discovered that biologists are eager to conduct these experiments, but there is such a significant gap that they feel compelled to become software engineers or solely concentrate on the science.”
Wang officially established Watershed in 2019 with physician Mark Kalinich ’13, a former MIT classmate who is no longer involved in the daily operations of the company.
Since then, Wang has heard from executives in the biotech and pharmaceutical sectors about the increasing complexity of biological research. Gaining new insights increasingly requires analyzing data from entire genomes, population studies, RNA sequencing, mass spectrometry, and more. Developing personalized treatments or selecting patient populations for clinical studies can also require very large datasets, and new methods for analyzing data are continually being published in scientific journals.
Today, companies can run large-scale analyses on Watershed without having to set up their own servers or cloud computing accounts. Researchers can use ready-made templates that cover the most common data types to speed up their work. Well-known AI tools like AlphaFold and Geneformer are also available, and Watershed’s platform makes it easier to share workflows and dig deeper into results.
“The platform occupies an ideal balance of usability and customizability for individuals from all backgrounds,” Wang states. “No scientific endeavor is ever truly identical. I shy away from the term product because that suggests you deploy something and then just run it at scale indefinitely. Research doesn’t operate that way. Research is about formulating an idea, testing it, and using the findings to develop another idea. The quicker you can design, implement, and execute experiments, the sooner you can proceed to the next one.”
Accelerating biology
Wang believes Watershed is aiding biologists in keeping up with the latest developments in biology and dramatically speeding up scientific discoveries in the process.
“If you can assist scientists in unveiling insights not just a bit quicker, but 10 to 20 times faster, it can truly make a difference,” Wang asserts.
Watershed is used by researchers in academia and at companies of all sizes. Executives at biotech and pharmaceutical companies also use Watershed to make decisions about new experiments and drug candidates.
“We’ve observed success in all these domains, and the common thread is individuals comprehending research but not being experts in computer science or software engineering,” Wang notes. “It’s thrilling to witness this industry evolve. For me, coming from MIT and now being back in Kendall Square where Watershed operates is fantastic. This is where so much of the groundbreaking advancement is occurring. We’re striving to contribute to the future of biology.”
