Finally persistently identifying academic events and event series

by Julian Franken and Philip Strömert

We have minted the (probably) first DOI for a conference! It took only a little, not overly twisted, interpretation of the metadata schema.

There are several good reasons why academic events and event series should receive PIDs and, with them, persistent metadata. One of them is that conferences are easy to mix up, since many share similar names or even the same abbreviation. PIDs for academic events and event series would enable many use cases for different user groups. For example, the quality of a conference or of a conference series could be judged on the basis of its descriptive metadata, even long after the original conference website has shut down and information about it has become hard to find. A lot of work has already been done by the DataCite and Crossref working group on PIDs for conferences.

Building on the working group’s efforts, and with considerable support and advice from DataCite, the ConfIDent project minted its first DOIs for the VIVO Conference 2021 and its conference series as a proof of concept (view in DataCite Commons). The organizers of the VIVO conference were kind (and brave) enough to let us use their conference as our first real-life test candidate.

In this blog post we give a brief overview of how we used the current DataCite Metadata Schema for this purpose and provide an outlook on what is next. The following list covers some of the DataCite Metadata Schema properties that needed some interpretation or that are noteworthy for other reasons.

title: In addition to their often rather long titles or names, many academic events and series use acronyms that are sometimes more common than the full name. In this example the conference title is “12th Annual / International VIVO Conference 2021” and its acronym is “VIVO 2021”. The latter is further specified with the titleType subproperty using the value “AlternativeTitle”. In our cooperation with DataCite, we suggested adding another term to the controlled list, something like “abbreviation” or “short title”, which would fit better and might also be useful for other resources, such as projects.

creator: In contrast to most other entities, determining who (or what, in the case of an institution) is the creator of an academic event or series leaves room for debate. We decided to understand the creator of an academic event or event series as the entity that is responsible for its organization. In the case of the VIVO 2021 conference this is the “VIVO 2021 team”, as stated on the conference website, because VIVO cannot really be understood as an institution in its own right but rather as a project. The affiliations of those VIVO 2021 team members that could be identified with ROR IDs (persistent identifiers for organizations) were also provided. In the case of the VIVO conference series, we interpreted the creator in a similar way and used “The VIVO Project”, as the series is realized by this open-source project.

publisher: Determining the publisher of a conference is as problematic as determining its creator, because academic events and event series are not really published works but rather the realization of planned processes. Unfortunately for our purpose, publisher is one of the few mandatory fields, so we had to think a bit more creatively about how to interpret it. For now, our solution is simply to enter the entity that is responsible for the organization of the conference again (publisher = creator). Although “the publisher of a conference” is sometimes understood as the “publisher of the conference proceedings”, we decided against interpreting DataCite’s publisher field this way, because not all conferences publish proceedings and this interpretation would be too far-fetched.

resourceType: DataCite has two properties with which we can specify the type of an academic event. First, there is the mandatory resourceTypeGeneral property, based on a controlled list, which specifies the general type of a resource. At the moment “Event” is the only general type we can use in our case. The second property, resourceType, is a free-text field that lets us describe the event type in more detail, in this case “Online Conference”. To describe an academic event series, we use the current DataCite schema as follows. As there is no suitable entry in the controlled list of resourceTypeGeneral yet, we also use “Event” for a series, but refine this with the resourceType property, where we can specify e.g. “Conference Series”. DataCite mentioned that, in the course of aligning with Crossref for the PID Graph, it is considering adding a term like “Series” to the controlled list of resourceTypeGeneral in order to better describe book or journal series as well. Once this is implemented, we could use it for academic event series too, whereas the resourceType would remain the same, specifying in more detail what kind of series it is. Interestingly, the resource type “Event” is used in DataCite for a range of different scientific entities, most of which so far are not academic conferences. A quick search on DataCite Commons showed that scientific cruises (e.g. https://commons.datacite.org/doi.org/10.7284/908875), some lectures (e.g. https://commons.datacite.org/doi.org/10.3929/ethz-b-000470116), short biology video clips (e.g. https://commons.datacite.org/doi.org/10.7282/t3zw1jb7) and conference presentation slides (e.g. https://commons.datacite.org/doi.org/10.3929/ethz-b-000483136) all use this label. There may be even more kinds of entities categorized as events in DataCite that we have not discovered yet.

subject: The scientific subject or academic field a conference belongs to is one of the most important pieces of information for a use case that ConfIDent needs to address: researchers searching for suitable conferences to participate in. We are planning to use the Dewey Decimal Classification (DDC) as a high-level classification. We had many discussions about which classification would serve our purpose best. The Fields of Science and Technology (FOS) classification, which is used by DataCite, was an option we considered for a long time. In the end, choosing a subject classification is always a compromise. Without going into too much detail about the pros and cons of each option, the compelling reason for choosing the DDC was that, in contrast to the FOS, it is well known among our librarian colleagues. These colleagues support us, now and in the future, with the hard work of curating and validating metadata for events and series, and most of them said that the DDC describes the subjects they are responsible for fairly well. In addition, some mapping work from the DDC to our in-house classification has already been done, which we intend to reuse. This mapping is a prerequisite if we want to take advantage of the vast amount of metadata on conference publications that TIB is constantly creating.

contributor: One of the core goals of the ConfIDent project is to name the individual people who contributed to the realization of an academic event or series, and thus to lay the basis for a more systematic acknowledgement of this type of effort. Making use of ORCID iDs in this context was an obvious choice. As interpreting most of the possible contributor types in DataCite’s schema would entail a rather strong semantic stretch, we chose “Other” for the program committee members in this example. The only types we can use without such a semantic stretch are “HostingInstitution” and “Sponsor”, as can be seen in this example for the three VIVO 2021 sponsors TIB, SIGMA AIE and Clarivate. However, for the description of academic events and series we need more specific contributor types, which can be seen in the contributor role branch of ConfIDent’s ontology AEON. This is a perfect example of why we need services such as ConfIDent: to provide more metadata about academic events and series than is needed for their persistent identification. Generally speaking, more metadata allows services that build upon it to cover more use cases.

relatedIdentifier: This property is used to relate an academic event to its series and vice versa, using the relationTypes “IsPartOf” and “HasPart”. One advantage of this rather abstract many-to-many relationship is that it is supported by both DataCite and Crossref and allows us to cover cases that are less exotic than one would expect, such as a conference being part of two different series. We also use it to provide links to other platforms that hold metadata about the event, in this case OpenResearch.org, using the relationType “HasMetadata”. If conference proceedings are published, their identifier will be linked to the event with relatedIdentifier and the relationType “IsDocumentedBy”.
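To make the interpretations in the list above more tangible, here is a minimal Python sketch of how the event and its series might look as DataCite-style JSON records. It is an illustration, not the record we actually registered: the key names (titles, creators, types, contributors, relatedIdentifiers and their subkeys) follow the DataCite JSON representation as we understand it and should be checked against the current documentation, controlled values are capitalized as in DataCite’s lists, and all DOIs, the ROR ID, the ORCID iD, the person, the affiliation name and the metadata URL are placeholders invented for this example.

```python
import json

# Placeholder identifiers (hypothetical, NOT the real DOIs of the VIVO records).
EVENT_DOI = "10.1234/example-event"
SERIES_DOI = "10.1234/example-series"

event_record = {
    "titles": [
        {"title": "12th Annual / International VIVO Conference 2021"},
        # The acronym is recorded as an alternative title.
        {"title": "VIVO 2021", "titleType": "AlternativeTitle"},
    ],
    "creators": [
        {
            "name": "VIVO 2021 team",
            "nameType": "Organizational",
            # Affiliation with a ROR ID; name and ID are placeholders.
            "affiliation": [
                {
                    "name": "Example University",
                    "affiliationIdentifier": "https://ror.org/00example0",
                    "affiliationIdentifierScheme": "ROR",
                }
            ],
        }
    ],
    # publisher = creator, as explained under "publisher".
    "publisher": "VIVO 2021 team",
    "types": {"resourceTypeGeneral": "Event", "resourceType": "Online Conference"},
    "contributors": [
        {"name": "TIB", "nameType": "Organizational", "contributorType": "Sponsor"},
        {"name": "SIGMA AIE", "nameType": "Organizational", "contributorType": "Sponsor"},
        {"name": "Clarivate", "nameType": "Organizational", "contributorType": "Sponsor"},
        {
            # Program committee member; person and ORCID iD are placeholders.
            "name": "Doe, Jane",
            "nameType": "Personal",
            "contributorType": "Other",
            "nameIdentifiers": [
                {
                    "nameIdentifier": "https://orcid.org/0000-0000-0000-0000",
                    "nameIdentifierScheme": "ORCID",
                }
            ],
        },
    ],
    "relatedIdentifiers": [
        {"relatedIdentifier": SERIES_DOI, "relatedIdentifierType": "DOI",
         "relationType": "IsPartOf"},
        # Stand-in for the OpenResearch.org page about the event.
        {"relatedIdentifier": "https://example.org/vivo-2021-metadata",
         "relatedIdentifierType": "URL", "relationType": "HasMetadata"},
    ],
}

series_record = {
    "titles": [{"title": "VIVO conference series"}],  # placeholder title
    "creators": [{"name": "The VIVO Project", "nameType": "Organizational"}],
    "publisher": "The VIVO Project",
    # "Event" is used for lack of a better general type; resourceType refines it.
    "types": {"resourceTypeGeneral": "Event", "resourceType": "Conference Series"},
    "relatedIdentifiers": [
        {"relatedIdentifier": EVENT_DOI, "relatedIdentifierType": "DOI",
         "relationType": "HasPart"},
    ],
}


def related(record, relation_type):
    """Return all related identifiers of a record with the given relationType."""
    return [
        item["relatedIdentifier"]
        for item in record["relatedIdentifiers"]
        if item["relationType"] == relation_type
    ]


if __name__ == "__main__":
    print(json.dumps(event_record, indent=2, ensure_ascii=False))
    print(related(event_record, "IsPartOf"))
    print(related(series_record, "HasPart"))
```

Because relatedIdentifiers is a list, a conference that belongs to two series would simply carry two “IsPartOf” entries. Only the properties discussed above are shown; a registrable record needs the remaining mandatory properties as well.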

Following this first proof of concept, we plan to mint a few more DOIs for academic events and series. Once we have sufficiently shown that the current DataCite schema allows different conferences to be described comprehensively, we plan to automate the DOI minting process and to motivate others to mint DOIs for “their” academic events and series. As with identifiers for other entities, the usefulness of academic event PIDs will increase with their adoption rate.
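As a rough idea of what one step of such an automation could look like, the following Python sketch checks a metadata record for the mandatory DataCite properties before wrapping it in a request body. It is not our actual tooling: the helper names, the example record and the prefix “10.1234” are hypothetical, and the JSON:API envelope ("data" with type "dois" and "attributes") reflects our understanding of the DataCite REST API, which should be verified against its documentation. The sketch deliberately sends nothing over the network.

```python
import json

# Mandatory descriptive properties, using the JSON key names assumed in this sketch.
# The DOI itself is also mandatory; here it is left to be generated from the prefix.
MANDATORY_KEYS = ("creators", "titles", "publisher", "publicationYear")


def missing_mandatory(attributes):
    """List the mandatory properties that are absent or empty in a record."""
    missing = [key for key in MANDATORY_KEYS if not attributes.get(key)]
    types = attributes.get("types") or {}
    if not types.get("resourceTypeGeneral"):
        missing.append("types.resourceTypeGeneral")
    return missing


def build_payload(attributes, prefix):
    """Wrap a complete record in the envelope assumed for a DOI creation request."""
    missing = missing_mandatory(attributes)
    if missing:
        raise ValueError("missing mandatory properties: " + ", ".join(missing))
    return {"data": {"type": "dois", "attributes": {"prefix": prefix, **attributes}}}


# Hypothetical record for illustration only.
example = {
    "titles": [{"title": "Example Conference 2021"}],
    "creators": [{"name": "Example Conference team", "nameType": "Organizational"}],
    "publisher": "Example Conference team",  # publisher = creator
    "publicationYear": 2021,
    "types": {"resourceTypeGeneral": "Event", "resourceType": "Conference"},
}

if __name__ == "__main__":
    print(json.dumps(build_payload(example, "10.1234"), indent=2))
```

Running the check before any request is made keeps incomplete event records from reaching the registration step at all.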

Authors:

Julian Franken
Research Assistant in the Lab Non-Textual Materials at TIB

more about Julian Franken in the TIB research information system

Philip Strömert
Research Assistant in the TIB Open Science Lab

more about Philip Strömert in the TIB research information system