#EUvsVirus: COVID-19 Bioassays in the Open Research Knowledge Graph

The #EUvsVirus Pan-European Hackathon was organized as a fully remote event, coordinated over Slack channels, on the weekend of 24 to 26 April 2020. The two-day event aimed to sketch solutions to fight COVID-19. Its scale was unforeseen, even record-breaking: more than 20,000 people formed teams, created product ideas, and helped each other to fight the pandemic all over the world. Over 2,000 projects were submitted for the 37 challenges that the hackathon presented. These challenges were organized in six broad categories: Health & Life, Business Continuity, Social & Political Cohesion, Remote Working & Education, Digital Finance, and Other. ‘Health & Life’ was the most popular category, with over 800 submissions for its challenges. We participated in this category under the ‘Research’ challenge.

Our team, ‘TIB ORKG’, consisted of six people who met online in one of the 500 Slack channels created for the hackathon. We came together around a shared conviction: structuring scholarly articles in the Open Research Knowledge Graph (ORKG) is a great way to help researchers easily comprehend the articles’ content.

The team comprised three domain specialists in biochemistry and neuroscience at the PhD and postdoc level, a software development expert, and two specialists in Artificial Intelligence and Natural Language Processing at the PhD and postdoc level. In the 48 hours of the hackathon, we carried out our tasks seamlessly: one member curated the six gold-standard bioassays in the ORKG, which was a great contribution to our team submission; two members built the machine learning model; another member verified a portion of the automatically annotated data; the software developer worked on a feature to enhance the ORKG Contribution display user interface; and one member handled marketing and pitch creation.

Our hackathon challenge proposal was “COVID-19 Bioassays in the Open Research Knowledge Graph.” Its aim was: “Allowing scientists to use their annotated bioassays to easily search for similar assays as well as compare various semantically structured bioassays in the datastore based on their similar features.”

Part of what we created is this structured comparison of COVID-19 bioassays, based on key properties and values taken from scholarly articles:

https://www.orkg.org/orkg/comparison/R38481

Without the ORKG method of distilling comparisons to the key properties and values, as shown above, the corresponding scholarly articles would have to be compared as below.

In general, searching the literature for relevant scientific articles is a tedious job. Even in a very specific scientific domain, it is often time-consuming to search for and catalogue the most pertinent papers, in spite of the existence of structured databases such as PubMed, Scopus, and Web of Science. In this regard, semantic structuring protocols offer a more user-tailored approach to cataloguing and comparing multiple studies or articles based on specific search terms. The Open Research Knowledge Graph (ORKG) project represents scientific articles in a content-dependent manner, making the underlying information machine-readable and suitable for automated processing.
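To make the idea of machine-readable, comparable content a little more concrete, here is a minimal Python sketch. It assumes each paper's contribution can be written down as a set of property-value pairs and then pivoted into a property-by-paper table. The dictionary layout, the function name `build_comparison`, and all paper labels, properties, and values are placeholders invented for illustration; this is not the ORKG data model or its API.

```python
def build_comparison(contributions):
    """Pivot {paper: {property: value}} into a property-by-paper table."""
    papers = list(contributions)

    # Collect properties in first-seen order so the row order is stable.
    properties = []
    for annotations in contributions.values():
        for prop in annotations:
            if prop not in properties:
                properties.append(prop)

    # One row per property; None marks a paper that does not state it.
    rows = {
        prop: [contributions[paper].get(prop) for paper in papers]
        for prop in properties
    }
    return papers, rows


# Placeholder data, invented for illustration only.
contributions = {
    "Paper A": {"assay format": "format 1", "detection method": "method 1"},
    "Paper B": {"assay format": "format 1", "target": "target 1"},
}

papers, rows = build_comparison(contributions)
for prop, values in rows.items():
    print(prop, values)
```

Once the content is held as explicit properties and values rather than free text, lining papers up side by side becomes a simple lookup, which is the kind of automated processing the paragraph above refers to.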

In scholarly articles, descriptions of bioassays are among the most structured content because of their intrinsic association with underlying standard protocols and methodologies. However, comparing several bioassays is a challenging problem in today’s context-dependent structures, where semantically organized information is critically lacking. This problem is compounded by the volume of research and assays being published and approved, especially in extremely dynamic research domains such as COVID-19 research.

The COVID-19 pandemic has provoked a response of unprecedented magnitude from the research community across scientific domains. The development of novel bioassays has not been left behind, with new and innovative assays for diagnostics and therapeutics added every day. In such a scenario, it is critical for the developers and users of these assays to carefully examine and compare the relevant ones both before and after development in order to benefit from them in practice.

Our Solution

We applied the ORKG method to create a semantic comparison approach for the relevant bioassays in current COVID-19 research, aiming to fill the gap left by the lack of structured, content-dependent approaches to querying bioassay information. We realized that the development of such an approach would bolster the scholarly literature search for COVID-19-associated bioassays and contribute to the development and application of appropriate bioassays in this domain.

What it did

  • Presented structured, semantified, and comparable COVID-19 bioassay protocols in the Open Research Knowledge Graph infrastructure, converting the text-based protocols into machine-readable and comparable elements.
  • Specifically, our expert-curated data within the ORKG presented a tabulated summary comparing bioassays along the specific dimensions of the data properties defined in the BioAssay Ontology from the Semantic Web community. Our curated data presented a summary of the state of the art in COVID-19 bioassay research.
  • Further, we implemented a moodbar feature that highlights the values of selected semantified properties in the bioassay contribution comparison table based on their usage: the more widely used methods and instruments are marked in shades of green, while the less common ones are marked in shades of blue. The logic behind this feature is to highlight the most used materials and methods in the creation of bioassays and, conversely, the least used ones.
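The moodbar logic described in the last point can be sketched roughly as follows. This is a minimal illustration in Python, not the code of the ORKG user interface: the function name `moodbar`, the 0.5 threshold that separates green from blue, and the intensity formula are all assumptions made for this example, and the values in the sample row are placeholders.

```python
from collections import Counter


def moodbar(values, threshold=0.5):
    """Return {value: (hue, intensity)} for one row of a comparison table.

    A value is "green" when its share of the row reaches `threshold`
    (an assumed cut-off), otherwise "blue". The intensity, in [0, 1],
    grows the further the share lies from the opposite extreme.
    """
    counts = Counter(values)
    total = len(values)
    shades = {}
    for value, count in counts.items():
        share = count / total
        if share >= threshold:
            # Widely used: a stronger green for a larger share.
            shades[value] = ("green", round(share, 2))
        else:
            # Rarely used: a stronger blue for a smaller share.
            shades[value] = ("blue", round(1 - share, 2))
    return shades


# Placeholder row: one property across six hypothetical bioassays.
row = ["method 1", "method 1", "method 1", "method 1", "method 2", "method 3"]
for value, (hue, intensity) in moodbar(row).items():
    print(value, hue, intensity)
```

In this sketch, "method 1" appears in four of six cells and is therefore shaded green, while the two methods that appear once are shaded blue. A real implementation would still need to map the hue and intensity to actual colors in the table.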

How we built it

For the comparisons, the annotations were manually entered into the Open Research Knowledge Graph by the domain expert via its user-friendly interface for adding papers. Six bioassays were selected from the PubChem library with COVID-19 as the search criterion. All the assays aim to characterize the inhibitory activity of different compounds against coronavirus infection. The reference papers were chosen because the assay results demonstrated the presence of an active molecule at the end of the study, thus presenting a potential inspiration for future research. The resulting data for the six bioassays, which we structured, semantified, and made comparable, were shown earlier in this post and are here.

Accomplishments that we were proud of

  • Excellent team cooperation
  • Successfully implementing the idea we envisioned within the timeframe of the hackathon

Many platforms have been developed with the common purpose of presenting COVID-19 research outcomes. However, an open-access platform for sharing information specifically about experimental lab processes in the battle against COVID-19 does not exist. Researchers should not waste valuable time troubleshooting their experiments. In some cases, they may want to use a technique in their study design that they are not familiar with or that lies outside their particular specialty, and so they lose time navigating the literature to find the right method or protocol to follow. This is where ‘COVID-19 Bioassays in the Open Research Knowledge Graph’ comes into play, offering the scientific community an open platform to share various techniques for research undertaken against COVID-19 and to access bioassays on demand. We are proud to have implemented this idea during the hackathon.

What we learned

Hackathons are excellent venues to test skills and platforms and to build collaborations.

References

  1. Zhang, Han-Zhong, et al. “Design and synthesis of dipeptidyl glutaminyl fluoromethyl ketones as potent severe acute respiratory syndrome coronovirus (SARS-CoV) inhibitors.” Journal of medicinal chemistry 49.3 (2006): 1198-1201.
  2. Chuck, Chi-Pang, et al. “Design, synthesis and crystallographic analysis of nitrile-based broad-spectrum peptidomimetic inhibitors for coronavirus 3C-like proteases.” European journal of medicinal chemistry 59 (2013): 1-6.
  3. Wen, Chih-Chun, et al. “Specific plant terpenoids and lignoids possess potent antiviral activities against severe acute respiratory syndrome coronavirus.” Journal of medicinal chemistry 50.17 (2007): 4087-4095.
  4. Liu, Wei, et al. “Synthesis, modification and docking studies of 5-sulfonyl isatin derivatives as SARS-CoV 3C-like protease inhibitors.” Bioorganic & medicinal chemistry 22.1 (2014): 292-302.
  5. Dooley, Andrea J., et al. “From genome to drug lead: identification of a small-molecule inhibitor of the SARS virus.” Bioorganic & medicinal chemistry letters 16.4 (2006): 830-833.
  6. Chen, Li-Rung, et al. “Synthesis and evaluation of isatin derivatives as effective SARS coronavirus 3CL protease inhibitors.” Bioorganic & medicinal chemistry letters 15.12 (2005): 3058-3062.

Additional Information