Knowledge from biological ontologies, and the structure they impose, combines with multiple imaging formats and other data to fuel the development of the next generation of Knowledge-Guided Machine Learning models that are more interpretable, transferable, robust, and label- and sample-efficient, expanding the role of ML in addressing the most challenging biology problems and forming a virtuous cycle.
The traits that characterize living organisms (in particular, their morphology, physiology, behavior and genetic make-up) enable them to cope with the forces of the physical, biological and social environments that impinge on them. Moreover, since function follows form, traits provide the raw material upon which natural selection operates, thus shaping evolutionary trajectories and the history of life. Interestingly, most living organisms, from microbes to charismatic megafauna, reveal themselves visually and are routinely captured in copious images taken by humans from all walks of life. The resulting massive amount of image data has the potential to further our understanding of how multifaceted traits of organisms shape the behavior of individuals, collectives, populations, and the ecological communities they live in, as well as the evolutionary trajectories of the species they comprise. Images are increasingly the currency for documenting the details of life on the planet, and yet traits of organisms, known or novel, cannot be readily extracted from them. Just as with genomic data two decades ago, our ability to collect data currently far outstrips our ability to extract biological insight from it. The Institute will establish a new field of IMAGEOMICS, in which biologists utilize machine learning (ML) algorithms to analyze vast stores of existing image data, especially publicly funded digital collections from national centers, field stations, museums and individual laboratories, to characterize patterns and gain novel insights into how function follows form in all areas of biology, expanding our understanding of the rules of life on Earth and how it evolves.
This Institute will introduce structured knowledge from the biological sciences to guide and structure ML algorithms to enable biological trait discovery from images, establishing the field of Imageomics. With images captured and annotated by scientists and the public serving as the basis for the work, the Institute’s convergent approach uses structured biological knowledge to provide scientifically validated inductive biases and rich supervision for ML, and ML will in turn enrich the body of biological knowledge. The resulting ML models and tools will help make visible what was hidden, so that scientists from a wide range of biological communities can discover and infer the traits of organisms; assess similarities and differences between individuals, populations and species; and come to see the world in new ways. Imageomics will accelerate and transform the biomedical, agricultural and basic biological sciences as they seek to understand and control genes that relate to particular phenotypes, and it will enable an overarching understanding of how the genome evolved in tandem with the organismal phenome. Because traits are the essential links between genes and the environment, using ML to help characterize them will lead to emergent understandings of how they function. Harnessing the insights that arise from these new visualizations will stimulate the use of new genetic technologies, such as CRISPR, and more nuanced ecological practices, such as modified land use schemes that emerge from better understanding the connections between individual decision-making within species and its impact on population dynamics. With the emergence of new and better targeted practices that generate fewer unintended consequences, the new linkages resulting from a better understanding of traits and their consequences will bolster the nation’s bioeconomy.
In addition, by leveraging and expanding existing diverse, inclusive and intellectually wide-ranging collaborative networks, the Institute will educate the next generation of scientists and engage the broader public in scientific inquiry and knowledge discovery, so that Imageomics can transform and democratize science for the public good.