Standardisation and intelligent querying for chemical biology screening experiments
On Thursday, 8th March, I visited the Center for Computational Science at the University of Miami. Note, not computer science but computational science: the center is a broad-ranging interdisciplinary think-tank for projects that need to use innovative computing in order to tackle challenging questions at the frontiers of research in the life sciences, social sciences, physical sciences, and other fields. The center brings together interdisciplinary teams of experienced computer scientists, domain science researchers and software engineers to create computational solutions and new methods.
Being there for only one day meant that I didn't get to enjoy the Miami beaches as I would have liked, but I did enjoy some good views of downtown Miami and the fresh sea breeze from the hotel. The reason for the visit was to learn more about the BioAssay Ontology being developed by an interdisciplinary team headed by Dr Stephan Schürer.
The BioAssay Ontology (BAO) was developed to address the standardisation of assay descriptions in the PubChem BioAssay database. Bioassays in PubChem are deposited by screening centers performing high-throughput and high-content screening experiments across a wide variety of scientific topics, technology platforms and experimental design strategies. For historical reasons, the description of the assay experiments in PubChem was largely free text. As a result, similar experiments were frequently described differently, and it was nearly impossible to compare and aggregate results across experiments originating from different screening centers, each of which may apply its own internal methodologies and standards.
BAO provides standardised terminology for all aspects of describing chemical biology screening experiments, organised into a hierarchy and supplemented by OWL axioms that encode additional semantics and allow more advanced automated reasoning across assays annotated with BAO. It is therefore a prime candidate for adoption within the EU-OPENSCREEN project. The challenges faced by EU-OPENSCREEN will be similar -- aggregation and comparison of results between experiments originating in different screening centers -- but given the pan-European nature of the project and the infrastructure it requires, ontology-backed standardisation will be an essential component from the very beginning, in order to deal elegantly with the issues arising from different national languages and local or national operational paradigms and constraints.
However, I had some concerns with an earlier version of BAO that was presented at ICBO 2011. One concern was that the classification hierarchy allowed incorrect inferences to be made. For example, 'small molecule' was classified as subClassOf 'perturbagen', which means that all small molecules are perturbagens. Now, I am pretty sure I can think of some very inert small molecules that certainly do not perturb any biological systems (diamonds?). Other concerns were the lack of alignment to upper-level ontologies and to other OBO ontologies such as OBI. However, I was delighted to discover that the new version of BAO, 2.0, currently under development, includes extensive refactoring and alignment to BFO. In the new version, small molecules that are active in experiments are much more sensibly encoded as 'small molecule' and has_role some 'perturbagen role'. BAO 2.0 has not been released yet, but will be soon.
Other updates that we enjoyed during the day-long workshop included a sneak preview of new developments in the intelligent ontology-based search interface BAOSearch, and a fascinating presentation of an ongoing interdisciplinary project to discover agents that can stimulate neuronal regeneration as treatments for spinal cord injuries, in which novel forms of high-content image-based screening technology are being developed as a core component of the scientific methodology.
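To make the difference between the two ways of modelling 'perturbagen' concrete, here is a minimal toy sketch in Python. It is not BAO's actual axiom set and it does not use a real OWL reasoner; the compound names (`inert_compound`, `active_compound`) and the helper `superclasses` are hypothetical and exist only for illustration. It simply shows that a subClassOf link forces every small molecule to be inferred as a perturbagen, whereas a role assertion applies only to the individuals that actually bear the role.

```python
# Toy illustration only: hypothetical names, not BAO's real axioms or an OWL reasoner.

def superclasses(cls, subclass_of):
    """Return cls together with every class reachable through subClassOf links."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen

# Two hypothetical compounds, both asserted to be small molecules.
asserted_type = {
    "inert_compound": "small molecule",
    "active_compound": "small molecule",
}

# Earlier-style modelling: 'small molecule' subClassOf 'perturbagen'.
old_hierarchy = {"small molecule": ["perturbagen"]}
old_perturbagens = {
    name
    for name, cls in asserted_type.items()
    if "perturbagen" in superclasses(cls, old_hierarchy)
}

# Role-based modelling: only compounds asserted to bear the role qualify.
has_role = {"active_compound": ["perturbagen role"]}
new_perturbagens = {
    name
    for name in asserted_type
    if "perturbagen role" in has_role.get(name, ())
}

print(sorted(old_perturbagens))  # both compounds, including the inert one
print(sorted(new_perturbagens))  # only the active compound
```

Under the subclass modelling there is no way to state that a small molecule is not a perturbagen without contradicting the hierarchy; under the role modelling, being a perturbagen becomes a fact about a molecule's part in a particular experiment rather than about what kind of thing it is.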
The new version of BAO is therefore the leading contender for adoption in the EU-OPENSCREEN project for standardisation of assay descriptions in a centralised EU-wide chemical biology database. This would mean that European and US-based chemical biology assays will be automatically integrated, and scientists will be able to compare and contrast results across a rapidly widening collection of openly available experiments with an increasingly global perspective.