On 20 March 2018, TIB organized an expert workshop on the realization of an Open Research Knowledge Graph (ORKG).
The workshop aimed at discussing requirements, framework architecture and first steps toward an implementation of an ORKG infrastructure for semantically representing the content of scientific publications, as outlined in TIB’s recent position paper.
Overall, more than 40 participants joined the workshop from a wide range of organizations including GESIS Leibniz Institute for the Social Sciences, ZB MED – Information Centre for Life Sciences, FIZ Karlsruhe – Leibniz Institute for Information Infrastructure, ZPID – Leibniz Institute for Psychology Information, the University of Münster, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), German Research Center for Artificial Intelligence (DFKI), Paderborn University, L3S Research Center, Chemnitz University of Technology (TU Chemnitz), Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS, Research Group Agile Knowledge Engineering and Semantic Web (AKSW) at Universität Leipzig and the University of Bonn.
Following short presentations by TIB-internal and external participants, the afternoon was organized in parallel workgroups that developed concrete topics as follows:
- ORKG core ontology / conceptualization of Research Contribution: This subgroup discussed the relevant components of research contributions and possible ways to conceptualize research processes and their outcomes, with the goal of developing a data model on which the ORKG can be based. The subgroup agreed on reusing an existing ontology and developing it collaboratively into a suitable ORKG core ontology by aligning it with the purposes and use cases of the ORKG. The subgroup also briefly explored different options for a governance strategy for the continued evolution of the core ontology, based on input and feedback from users in the various research domains that we intend to address.
- Architecture, Storage, REST API: The subgroup discussed several aspects of the technology currently used to implement graphs, such as Labeled Property Graphs (LPGs) and RDF triple stores. Storage, scalability and the development of a general data model were discussed, and several options were evaluated. Storing provenance information efficiently was also explored. The main discussion revolved around how to export RDF if an LPG is used on the storage side, as well as the advantages and drawbacks of this approach.
- Frontend and ORKG user interface: A key message of this subgroup was that visualization has a purpose; hence, a UI addresses a purpose. The subgroup discussed three broad UI types, namely the Submission UI, the Visualization UI and the Curation UI. The Submission UI supports feeding the ORKG with facts, and the subgroup discussed how to make this easy for users, how to integrate it with existing systems (e.g. paper submission systems), and how to improve the quality of submitted facts (e.g. by integrating with the review process). The Visualization UI supports the exploration and navigation of ORKG facts. The subgroup suggested that the infrastructure should leverage user authentication in order to take the user and her preferences into account. Finally, the Curation UI is designed to support experts in quality assurance and general curation of the ORKG.
- Pilot applications, testbeds and use cases, cooperation and governance: This group discussed use cases and horizontal topics. As pilot domains, computer science (especially Semantic Web research) and mathematics appear to be promising due to the existing awareness of semantic and conceptual structures. The licence for ORKG representations should be as open as possible, but at the same time ensure the sustainability of the infrastructure. Since there is currently no distinct funding programme, the development of the ORKG infrastructure needs to be integrated into existing projects and initiatives, e.g. the German National Research Data Initiative. For governance, an independent legal structure should be created at some point, but TIB should be a steward of the ORKG for the time being.
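To make the RDF export question from the architecture subgroup more concrete, here is a minimal sketch of how nodes and edges of a property graph could be serialized as N-Triples. It is an illustration only and not a design that was presented or agreed on at the workshop. The `Node` and `Edge` classes, the `http://example.org/orkg/` namespace and the choice of standard RDF reification for edge properties (e.g. provenance) are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from urllib.parse import quote

# Hypothetical namespace, chosen only for this sketch.
BASE = "http://example.org/orkg/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


@dataclass
class Node:
    """A property-graph node: an id, labels and key/value properties."""
    id: str
    labels: list
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A property-graph edge; its properties could hold provenance."""
    id: str
    source: str
    target: str
    type: str
    properties: dict = field(default_factory=dict)


def iri(kind, local):
    """Build an IRI in the hypothetical namespace, percent-encoding the local part."""
    return f"<{BASE}{kind}/{quote(str(local), safe='')}>"


def literal(value):
    """Serialize a value as a plain N-Triples string literal with basic escaping."""
    text = str(value)
    for raw, escaped in (("\\", "\\\\"), ('"', '\\"'), ("\n", "\\n"), ("\r", "\\r")):
        text = text.replace(raw, escaped)
    return f'"{text}"'


def export_ntriples(nodes, edges):
    """Map nodes and edges to a list of N-Triples lines."""
    lines = []
    for node in nodes:
        subject = iri("resource", node.id)
        # Each label becomes an rdf:type statement.
        for label in node.labels:
            lines.append(f"{subject} <{RDF}type> {iri('class', label)} .")
        for key, value in node.properties.items():
            lines.append(f"{subject} {iri('property', key)} {literal(value)} .")
    for edge in edges:
        s = iri("resource", edge.source)
        p = iri("predicate", edge.type)
        o = iri("resource", edge.target)
        lines.append(f"{s} {p} {o} .")
        # Plain triples cannot carry edge properties, so this sketch
        # attaches them to a reified rdf:Statement (one option among several).
        if edge.properties:
            stmt = iri("statement", edge.id)
            lines.append(f"{stmt} <{RDF}type> <{RDF}Statement> .")
            lines.append(f"{stmt} <{RDF}subject> {s} .")
            lines.append(f"{stmt} <{RDF}predicate> {p} .")
            lines.append(f"{stmt} <{RDF}object> {o} .")
            for key, value in edge.properties.items():
                lines.append(f"{stmt} {iri('property', key)} {literal(value)} .")
    return lines


if __name__ == "__main__":
    nodes = [Node("P1", ["Paper"], {"title": "An example paper"}),
             Node("C1", ["Contribution"])]
    edges = [Edge("S1", "P1", "C1", "hasContribution", {"createdBy": "alice"})]
    print("\n".join(export_ntriples(nodes, edges)))
```

The sketch also hints at one of the trade-offs discussed: an edge with properties is a single record in an LPG, but expands into several triples once it is reified for RDF export.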
Predictably, the workshop raised more questions than it answered. This is surely to be expected at this early stage of the project. The workshop demonstrated the importance of developing the research knowledge graph in an open manner. In addition to developing a first prototype, it will thus be critical for TIB to engage stakeholders, including partner libraries. The community plans to organize and communicate through the mailing list at: https://groups.google.com/forum/#!forum/orkg
Another ORKG workshop is planned for the fall, in conjunction with the DILS conference.
Another workshop report by Ricardo Usbeck from DICE @ Uni Paderborn can be found here.