Microsoft Academic – a major data source dries up

Microsoft has announced that the Microsoft Academic Graph will be discontinued at the end of 2021. What does this mean? Why is this important for researchers and research institutions?

The Microsoft Academic Graph is the database behind Microsoft Academic, one of the most comprehensive academic search engines. However, Microsoft does not use the data only for this in-house search service. In contrast to its well-known competitor Google Scholar, it decided to make the data available to third parties, and that data has been reused many times in recent years. One reason for this is certainly the size of Microsoft Academic. A recent study by the Centre for Science and Technology Studies (CWTS) compared various bibliographic data sources, and the following figure speaks for itself, regardless of some limitations put forward by the authors regarding comparability.

Overlap of documents between Scopus, Dimensions, Crossref, and Microsoft Academic
Martijn Visser, Nees Jan van Eck, Ludo Waltman; Large-scale comparison of bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic. Quantitative Science Studies 2021; 2 (1): 20–41. doi: https://doi.org/10.1162/qss_a_00112

Microsoft Academic is thus a huge data source that is also available via an interface and as a dump. In addition, it is prepared and documented for bibliometric analyses. The relevance of this data for science is readily apparent. The two publications describing this service have been cited (according to Microsoft Academic) more than 500 and 1700 times respectively.

In the announcement of the discontinuation of the service, Microsoft mentions some alternatives. A closer look shows that some of them, e.g. Semantic Scholar or The Lens, are at least partly based on Microsoft Academic themselves. The existing data will presumably continue to be used, also thanks to initiatives such as MAKG, which makes a copy available together with interfaces. However, the data will no longer be updated for the foreseeable future.

This announcement has major implications for everyone who has relied on Microsoft Academic as a data source. It clearly shows that relying on the continued availability of a proprietary data source is not a good idea.

At TIB, we are therefore involved in various initiatives to improve open data sources, such as the Initiative for Open Abstracts. In the recently launched OPTIMETA project, we will support open access journals in making open citation data and spatiotemporal metadata freely available. In the TAPIR project, we are testing the use of open research information for research reporting, and with our partners from the State Scientific-Technical Library of Ukraine (SSTL) we are discussing FAIR Research Information in Open Infrastructures. For research institutions, libraries and also researchers, the current incident can be a reminder that it is worth working on open alternatives to proprietary research infrastructures.

Author:

Christian Hauschke
Head of Lab Open Research Information within the Open Science Lab


Article image by Aleksandar Cvetanović on Pixabay