Bibliographic databases supported by Bibliometrix and Biblioshiny: characteristics and differences

 

Key points

  • Bibliographic databases are essential tools for researchers, providing access to millions of scientific publications, patents, and other scholarly works.
  • Web of Science (WoS) and Scopus are two of the most widely used bibliographic databases, providing access to over 36 million and 20 million articles respectively.
  • Newer databases such as Dimensions.ai and OpenAlex are also worth considering for researchers looking for open access articles and patents, as well as advanced features for analyzing the data.
  • Our academic spin-off K-synth has developed additional packages for querying the PubMed, Dimensions, and OpenAlex databases, so that the results can be analyzed with bibliometrix and biblioshiny.

 

Bibliographic databases are essential tools for researchers, providing access to millions of scientific publications, patents, and other scholarly works. There are several main bibliographic databases that are widely used in various fields of research. Each of these databases has its own characteristics and differences, and it is important for researchers to understand these in order to select the best database for their needs.

One of the most well-known generalist bibliographic databases is Web of Science (WoS), which is maintained by Clarivate Analytics and provides access to over 36 million articles from the sciences, social sciences, and humanities.

For researchers in the field of social sciences, the WoS Social Science Citation Index (SSCI) is a useful source. It is maintained by Clarivate Analytics and provides access to over 3 million articles from the social sciences, including sociology, psychology, and political science. 

Scopus is another widely used generalist bibliographic database that is worth mentioning. It is maintained by Elsevier and provides access to over 20 million articles from a wide range of fields including sciences, technology, medicine, social sciences, and the arts. It also provides access to conference proceedings, book series, and patents.

Scopus has a strong emphasis on the natural sciences and engineering, and it's considered one of the most comprehensive databases for these fields. It also includes several advanced features that make it a valuable tool for researchers, such as citation analysis, author identification, and document similarity search. Scopus also provides a range of tools for data visualization and analysis, such as Scopus Insights and SciVal, which can be used to identify trends and key players in a field.

WoS and Scopus are the two databases most widely used by researchers. Many universities subscribe to them, and they are considered standard tools for bibliometric analysis and research evaluation.

An important specialized bibliographic database is PubMed, which is maintained by the National Library of Medicine and provides access to over 30 million citations from the biomedical literature. PubMed is particularly useful for researchers in the fields of medicine, nursing, and other health-related disciplines.

K-synth, our academic spin-off, provides a tool for querying PubMed, called pubmedR. Its goal is to gather metadata about publications, grants, MeSH terms, and clinical trials from the PubMed database using the NCBI REST APIs. Inside biblioshiny, the web app of bibliometrix, PubMed can be accessed through these APIs.
(Aria, M. and Cuccurullo, C. (2020). pubmedR: Gathering Metadata About Publications, Grants, Clinical Trials from 'PubMed' Database. R package version 0.0.3. https://CRAN.R-project.org/package=pubmedR).
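To give a feel for the kind of REST request involved, here is a minimal Python sketch that builds a search URL for the NCBI E-utilities ESearch endpoint. It is not how pubmedR is implemented; the function name `build_esearch_url` and the example query are hypothetical and used only for illustration. The sketch only assembles the URL and does not send any request.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (returns PubMed IDs matching a query)
ESEARCH_ENDPOINT = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(query, max_records=20, api_key=None):
    """Assemble an ESearch URL for PubMed (hypothetical helper)."""
    params = {
        "db": "pubmed",         # search the PubMed database
        "term": query,          # the search expression
        "retmax": max_records,  # maximum number of IDs to return
        "retmode": "json",      # ask for a JSON response
    }
    if api_key:
        # An optional NCBI API key; omitted when not provided
        params["api_key"] = api_key
    return ESEARCH_ENDPOINT + "?" + urlencode(params)


# Example query chosen only for illustration
url = build_esearch_url("bibliometrics", max_records=5)
print(url)
```

In practice, pubmedR takes care of building the requests, downloading the records, and converting them into a data frame that bibliometrix can read.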

Another important database for biomedical research is Cochrane, which is maintained by Wiley and provides access to over 32 million articles from the biomedical and pharmacological literature.

In addition to the main bibliographic databases described earlier, there are also newer databases that are worth mentioning. One of these is Dimensions.ai, which provides access to over 200 million scholarly articles, patents, clinical trials, and policy documents. It also offers a range of advanced features based on natural language processing and machine learning, which help researchers discover new research, track trends, and identify key players in their field.

The dimensionsR package, developed by our academic spin-off K-synth, allows users to download data from Dimensions.ai for bibliometric analysis.

(Aria, M. and Cuccurullo, C. (2020). dimensionsR: Gathering Bibliographic Records from 'Digital Science Dimensions' Using 'DSL' API. R package version 0.0.3. https://CRAN.R-project.org/package=dimensionsR).

Another database worth mentioning is OpenAlex, an open-access database that provides access to over 100 million scientific papers, patents, and other scholarly works. It allows researchers to search, browse, and download article metadata for free through its API. OpenAlex aims to make scientific knowledge more accessible to researchers around the world by removing barriers to access, such as paywalls and limited library subscriptions. OpenAlex was launched in January 2022 as a timely replacement for the retired Microsoft Academic.

Our openalexR package gathers bibliographic metadata about publications, authors, venues, institutions, and concepts from OpenAlex through its API.

(Aria, M., Le, T., Cuccurullo, C., Belfiore, A., Choe, J. (2023). openalexR: An R-tool for collecting bibliometric data from OpenAlex. The R Journal. https://github.com/massimoaria/openalexR).
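As an illustration of what a free metadata query to OpenAlex can look like, the following Python sketch assembles a URL for the OpenAlex works endpoint with a full-text search and optional filters. It is a sketch under assumptions, not a description of how openalexR works internally: the helper name `build_openalex_works_url` and the example search term and filter are hypothetical. No request is sent.

```python
from urllib.parse import urlencode

# OpenAlex endpoint for scholarly works
OPENALEX_WORKS = "https://api.openalex.org/works"


def build_openalex_works_url(search, filters=None, per_page=25):
    """Assemble a works query URL (hypothetical helper)."""
    params = {"search": search, "per-page": per_page}
    if filters:
        # OpenAlex combines several filters with commas, e.g.
        # publication_year:2022,type:article
        params["filter"] = ",".join(
            f"{name}:{value}" for name, value in filters.items()
        )
    return OPENALEX_WORKS + "?" + urlencode(params)


# Example values chosen only for illustration
works_url = build_openalex_works_url(
    "bibliometrics", {"publication_year": 2022}, per_page=10
)
print(works_url)
```

With openalexR, the same kind of query is expressed through R function arguments, and the results can be converted into a format suitable for bibliometrix.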

Both Dimensions.ai and OpenAlex have the advantage of being open access, which means that the metadata about articles and patents are available for free. This can be especially valuable for researchers from developing countries or from institutions with limited resources. Moreover, both databases have a strong focus on advanced search features, which can help researchers quickly identify key areas of research and stay up to date with the latest developments in their field. However, because these databases are relatively new, their coverage of articles and patents might not be as extensive as that of established databases, and the quality of the data may not be as reliable. Researchers should therefore cross-check their results against other sources and use multiple databases to get a more comprehensive view of the literature.
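One simple way to cross-check two sources is to compare the DOIs of the records each one returns. The Python sketch below is only an illustration of that idea: the function names are hypothetical, the identifiers are placeholders rather than real DOIs, and records without a DOI are simply ignored, which a real comparison would need to handle (for example by matching on title and year).

```python
def normalize_doi(doi):
    """Lowercase a DOI and strip common prefixes so variants compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def compare_sources(dois_a, dois_b):
    """Split DOIs into those found in both sources and in only one of them."""
    a = {normalize_doi(d) for d in dois_a if d}
    b = {normalize_doi(d) for d in dois_b if d}
    return {"both": a & b, "only_a": a - b, "only_b": b - a}


# Placeholder identifiers, not real DOIs
source_a = ["10.1000/example.1", "https://doi.org/10.1000/EXAMPLE.2"]
source_b = ["doi:10.1000/example.2", "10.1000/example.3"]

result = compare_sources(source_a, source_b)
for group, dois in result.items():
    print(group, sorted(dois))
```

Records that appear in only one source are the ones worth inspecting by hand, since they may reflect either a real coverage difference or a metadata problem.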

As the following graph shows, Scopus and Web of Science overlap to a very high degree. Dimensions contains more records than Scopus, but the largest number of papers is found in the former Microsoft Academic, which has since been replaced by OpenAlex.

Overlap of documents among data sources.

Source: Visser, M., van Eck, N. J., & Waltman, L. (2021). Large-scale comparison of bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic. Quantitative Science Studies, 2(1), 20-41.

 

Another new free database is lens.org. The platform provides access to a large collection of scientific articles, patents, and other research-related materials, which are freely accessible to users around the world. It covers a wide range of scientific disciplines, including medicine, engineering, physics, chemistry, and biology, among others. One of the key features of lens.org is its focus on open access to scientific literature: the platform is committed to making research publications freely available to researchers, scientists, and the public, thereby promoting greater transparency and collaboration in scientific research.