Introduction to searching with APIs

Searching for scientific articles using an API (application programming interface) allows you to extract data from publisher platforms and databases. With an API, you can create programmatic searches of a citation database, extract statistical data, or query and manipulate your results within a Notebook.

In this chapter, we will look at two APIs: Crossref and arXiv.

1. What is Crossref?

Crossref is a non-profit organization that helps provide access to scientific literature. According to its website, Crossref “makes research outputs easy to find, cite, link, and assess”.

Crossref data on scientific publications essentially consists of three elements:
1) Metadata about a publication
2) A URL link to the article
3) A digital object identifier (DOI)

At the time of writing, Crossref contains information on 80 million scientific publications, including articles, books, and book chapters.
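As a rough illustration of the three elements listed above, the sketch below builds a topic query URL for the public Crossref REST API (`https://api.crossref.org/works`) and pulls the title, URL, and DOI out of a single record. The helper names `build_query_url` and `summarize_work` are hypothetical, and `sample_item` is a hand-written stand-in shaped like one entry of a Crossref response, not real data.

```python
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def build_query_url(topic, rows=5):
    """Return a Crossref /works URL for a free-text topic query."""
    return CROSSREF_WORKS + "?" + urlencode({"query": topic, "rows": rows})


def summarize_work(item):
    """Pick out the metadata title, URL, and DOI from one work record."""
    # Crossref stores titles as a list; fall back if it is missing or empty.
    titles = item.get("title") or ["(untitled)"]
    return {
        "title": titles[0],
        "url": item.get("URL"),
        "doi": item.get("DOI"),
    }


# Made-up record, shaped like one entry of response["message"]["items"].
sample_item = {
    "DOI": "10.1234/example.5678",
    "URL": "https://doi.org/10.1234/example.5678",
    "title": ["An illustrative article title"],
}

print(build_query_url("machine learning"))
print(summarize_work(sample_item))
```

To run this against live data, one option is to fetch the URL with an HTTP client (for example `requests.get(url).json()`) and loop over the `["message"]["items"]` list, passing each entry to `summarize_work`.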

In the accompanying exercise, you will explore the Crossref dataset and learn to:

  • Create a topic query and explore the structure of Crossref data

  • Visualize the results of your query to determine which journals are most likely to publish articles on your topic

  • Determine which articles are most cited

  • Pair multiple queries to understand topic trends
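The tasks above can be sketched on a small scale without any plotting. The example below assumes each record carries the Crossref fields `container-title` (the journal or book title, stored as a list) and `is-referenced-by-count` (the citation count). The records in `sample_items` are made up for illustration, and the function names are hypothetical; this is not the code from the accompanying exercise.

```python
from collections import Counter

# Made-up records shaped like Crossref work items.
sample_items = [
    {"title": ["Paper A"], "container-title": ["Journal X"], "is-referenced-by-count": 12},
    {"title": ["Paper B"], "container-title": ["Journal Y"], "is-referenced-by-count": 40},
    {"title": ["Paper C"], "container-title": ["Journal X"], "is-referenced-by-count": 3},
    {"title": ["Paper D"], "is-referenced-by-count": 7},  # no journal recorded
]


def journal_counts(items):
    """Count how many records each journal (container title) contributes."""
    counts = Counter()
    for item in items:
        for name in item.get("container-title", []):
            counts[name] += 1
    return counts


def most_cited(items, n=3):
    """Return (title, citation count) pairs for the n most cited records."""
    ranked = sorted(
        items,
        key=lambda it: it.get("is-referenced-by-count", 0),
        reverse=True,
    )
    return [
        ((it.get("title") or ["(untitled)"])[0], it.get("is-referenced-by-count", 0))
        for it in ranked[:n]
    ]


print(journal_counts(sample_items).most_common())
print(most_cited(sample_items, n=2))
```

Running the same two functions on the results of two different topic queries gives a simple way to compare where each topic is published and which of its articles are cited most.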

Find notebook here

2. What is arXiv?

arXiv is a free e-print service and open-access archive for scholarly articles, often used in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics.

Searching arXiv lets you:

  • track new research and trending topics

  • find open versions of works that may be published later behind a paywall

  • understand how articles are versioned and updated
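As a minimal sketch of the points above, the example below builds a query URL for the arXiv API (`http://export.arxiv.org/api/query`, which returns an Atom feed) and reads the version number from the `vN` suffix of each entry's identifier. The helper names are hypothetical, and `SAMPLE_FEED` is a hand-written placeholder with a made-up identifier, not a real arXiv record.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM = "{http://www.w3.org/2005/Atom}"  # XML namespace used by Atom feeds


def build_arxiv_url(terms, max_results=5):
    """Return an arXiv API URL that searches all fields for the given terms."""
    params = {"search_query": f"all:{terms}", "start": 0, "max_results": max_results}
    return "http://export.arxiv.org/api/query?" + urlencode(params)


def parse_entries(feed_xml):
    """Extract title, identifier, version, and dates from an Atom feed."""
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.iter(ATOM + "entry"):
        abs_url = entry.findtext(ATOM + "id", default="")
        match = re.search(r"v(\d+)$", abs_url)  # trailing "v2" means version 2
        results.append({
            # Collapse line breaks and repeated spaces inside the title.
            "title": " ".join(entry.findtext(ATOM + "title", default="").split()),
            "id": abs_url,
            "version": int(match.group(1)) if match else None,
            "published": entry.findtext(ATOM + "published"),
            "updated": entry.findtext(ATOM + "updated"),
        })
    return results


# Made-up feed with one entry, shaped like an arXiv API response.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v2</id>
    <published>2020-01-01T00:00:00Z</published>
    <updated>2020-02-01T00:00:00Z</updated>
    <title>An illustrative
      preprint title</title>
  </entry>
</feed>"""

print(build_arxiv_url("electron"))
for record in parse_entries(SAMPLE_FEED):
    print(record)
```

In this sample, `published` and `updated` differ and the identifier ends in `v2`, which is how a revised preprint would typically appear. To work with live results, the feed could be downloaded with `urllib.request.urlopen(url).read()` and passed to `parse_entries`.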