ArXiv API#

The arXiv API allows programmatic access to arXiv’s e-print content and metadata. “The goal of the interface is to facilitate new and creative use of the vast body of material on the arXiv by providing a low barrier to entry for application developers.” https://arxiv.org/help/api

The API’s user manual (https://arxiv.org/help/api/user-manual) provides helpful documentation for using the API and retrieving article metadata.

Our examples below will introduce you to the basics of querying the arXiv API.
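
Under the hood, the API is a plain HTTP endpoint that accepts the query parameters described in the user manual. As a minimal sketch (the helper name build_query_url is ours and not part of any library), the following builds a request URL without sending it, so you can see which parameters a query carries:

```python
from urllib.parse import urlencode

# Base URL of the arXiv API query interface, as given in the user manual.
API_ENDPOINT = "http://export.arxiv.org/api/query"


def build_query_url(search_query, start=0, max_results=10,
                    sort_by="submittedDate", sort_order="descending"):
    """Build (but do not send) an arXiv API request URL.

    build_query_url is a hypothetical helper written for this example.
    """
    params = {
        "search_query": search_query,  # e.g. "all:graphene"
        "start": start,                # index of the first result to return
        "max_results": max_results,    # number of results to return
        "sortBy": sort_by,             # relevance, lastUpdatedDate, or submittedDate
        "sortOrder": sort_order,       # ascending or descending
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


url = build_query_url("all:graphene")
print(url)
```

The arxiv package used in the rest of this tutorial builds and sends requests like this for you and parses the Atom feed that comes back.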

Install Packages#

import urllib
import arxiv
import requests
import json
import csv
import pandas as pd
from collections import Counter, defaultdict
import numpy as np # for array manipulation
import matplotlib.pyplot as plt # for data visualization
%matplotlib inline 
import datetime

Query the API#

Perform a simple query for “graphene.” We’ll limit results to the titles of the 10 most recent papers.

search = arxiv.Search(
  query = "graphene",
  max_results = 10,
  sort_by = arxiv.SortCriterion.SubmittedDate
)

# Search.results() is deprecated; fetch results through a Client instead.
client = arxiv.Client()
for result in client.results(search):
  print(result.title)
Electrical Spectroscopy of Polaritonic Nanoresonators
Quantum nuclear motion in silicene: Assessing structural and vibrational properties through path-integral simulations
Classification of mass terms in kagome semimetals
Spin Seebeck Effect in Graphene
Kekulé distortions in graphene on cadmium sulfide
Topological chiral superconductivity
High- and low-energy many-body effects of graphene in a unified approach
Coupled Spin and Valley Hall Effects Driven by Coherent Tunneling
Multichannel joint-polarization-frequency-modulation encrypted metasurface in secure THz communication
Detection of terahertz radiation using topological graphene micro-nanoribbon structures with transverse plasmonic resonant cavities
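
The query string passed to arxiv.Search follows the search syntax in the API user manual. That syntax supports field prefixes such as ti (title), au (author), and cat (subject category), combined with AND, OR, and ANDNOT. The sketch below assembles such a string; field_query and all_of are hypothetical helper names introduced only for illustration.

```python
def field_query(field, phrase):
    """Return a field-prefixed search term, quoting multi-word phrases.

    Hypothetical helper for this example; `field` is an arXiv prefix
    such as "ti", "au", "abs", "cat", or "all".
    """
    if " " in phrase:
        phrase = f'"{phrase}"'
    return f"{field}:{phrase}"


def all_of(*terms):
    """Combine terms so that every one of them must match."""
    return " AND ".join(terms)


query = all_of(
    field_query("ti", "quantum dots"),
    field_query("cat", "cond-mat.mes-hall"),
)
print(query)
```

The resulting string can be passed as the query argument of arxiv.Search in place of a bare keyword.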

Do another query for the topic “quantum dots,” but note that you could swap in a topic of your liking.

You can define a custom arXiv API client with specialized pagination behavior. This time we’ll process each paper as it’s fetched rather than exhausting the result-generator into a list; this is useful for running analysis while the client sleeps.

Because this arxiv.Search doesn’t bound the number of results with max_results, it will fetch every matching paper (more than 12,000 when this example was run). This may take several minutes.

results_generator = arxiv.Client(
  page_size=1000,
  delay_seconds=3,
  num_retries=3
).results(arxiv.Search(
  query='"quantum dots"',
  id_list=[],
  sort_by=arxiv.SortCriterion.Relevance,
  sort_order=arxiv.SortOrder.Descending,
))

quantum_dots = []
for paper in results_generator:
  # You could do per-paper analysis here; for now, just collect them in a list.
  quantum_dots.append(paper)
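
Because the client returns a lazy generator, you can also stop early while experimenting instead of waiting for every page. The sketch below shows the idea with itertools.islice. It uses a stand-in generator (fake_results, a hypothetical name) rather than a real API call, so it runs offline.

```python
from itertools import islice


def take(results, limit):
    """Consume at most `limit` items from a lazy iterator of results."""
    return list(islice(results, limit))


def fake_results(total, log):
    """Stand-in for client.results(...): yields lazily and records each fetch."""
    for i in range(total):
        log.append(i)
        yield f"paper-{i}"


fetched = []
sample = take(fake_results(10_000, fetched), 5)
print(sample)        # the first five items only
print(len(fetched))  # how many items the generator actually produced
```

With the real client, the same pattern would look like take(results_generator, 5). Setting max_results on the arxiv.Search is the more direct way to cap a query when you know the limit in advance.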

Organize and analyze your results#

Create a dataframe to better analyze your results. This example uses Python’s vars built-in function to convert search results into Python dictionaries of paper metadata.

qd_df = pd.DataFrame([vars(paper) for paper in quantum_dots])
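
To see what vars contributes here, consider a small stand-in class (Paper is hypothetical and only mimics the shape of a result object). vars returns the instance's attribute dictionary, and a list of such dictionaries is a form pd.DataFrame accepts, with one column per key.

```python
class Paper:
    """Hypothetical stand-in for a search result with two attributes."""

    def __init__(self, title, year):
        self.title = title
        self.year = year


papers = [Paper("Paper A", 2003), Paper("Paper B", 2020)]

# vars(obj) returns the object's attribute dictionary (obj.__dict__).
rows = [vars(paper) for paper in papers]
print(rows)
```

Attributes whose names start with an underscore are included as well, which is why the dataframe built from real results has a _raw column.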

We’ll look at the first 10 results.

qd_df.head(10)
entry_id updated published title authors summary comment journal_ref doi primary_category categories links pdf_url _raw
0 http://arxiv.org/abs/cond-mat/0310363v1 2003-10-15 20:15:59+00:00 2003-10-15 20:15:59+00:00 Excitonic properties of strained wurtzite and ... [Vladimir A. Fonoberov, Alexander A. Balandin] We investigate exciton states theoretically in... 18 pages, accepted for publication in the Jour... J. Appl. Phys. 94, 7178 (2003) 10.1063/1.1623330 cond-mat.mes-hall [cond-mat.mes-hall] [http://dx.doi.org/10.1063/1.1623330, http://a... http://arxiv.org/pdf/cond-mat/0310363v1 {'id': 'http://arxiv.org/abs/cond-mat/0310363v...
1 http://arxiv.org/abs/2008.11666v1 2020-08-26 16:48:21+00:00 2020-08-26 16:48:21+00:00 A two-dimensional array of single-hole quantum... [F. van Riggelen, N. W. Hendrickx, W. I. L. La... Quantum dots fabricated using techniques and m... 7 pages, 4 figures None 10.1063/5.0037330 cond-mat.mes-hall [cond-mat.mes-hall] [http://dx.doi.org/10.1063/5.0037330, http://a... http://arxiv.org/pdf/2008.11666v1 {'id': 'http://arxiv.org/abs/2008.11666v1', 'g...
2 http://arxiv.org/abs/cond-mat/0411742v1 2004-11-30 02:56:15+00:00 2004-11-30 02:56:15+00:00 Polar optical phonons in wurtzite spheroidal q... [Vladimir A. Fonoberov, Alexander A. Balandin] Polar optical-phonon modes are derived analyti... 11 pages J. Phys.: Condens. Matter 17, 1085 (2005) 10.1088/0953-8984/17/7/003 cond-mat.mes-hall [cond-mat.mes-hall] [http://dx.doi.org/10.1088/0953-8984/17/7/003,... http://arxiv.org/pdf/cond-mat/0411742v1 {'id': 'http://arxiv.org/abs/cond-mat/0411742v...
3 http://arxiv.org/abs/1403.4790v1 2014-03-19 13:03:49+00:00 2014-03-19 13:03:49+00:00 Group-velocity slowdown in quantum-dots and qu... [Stephan Michael, Weng W. Chow, Hans Christian... We investigate theoretically the slowdown of o... Physics and Simulation of Optoelectronic Devic... None 10.1117/12.2042412 cond-mat.mes-hall [cond-mat.mes-hall, cond-mat.mtrl-sci] [http://dx.doi.org/10.1117/12.2042412, http://... http://arxiv.org/pdf/1403.4790v1 {'id': 'http://arxiv.org/abs/1403.4790v1', 'gu...
4 http://arxiv.org/abs/cond-mat/0403328v1 2004-03-12 18:28:06+00:00 2004-03-12 18:28:06+00:00 A new method to epitaxially grow long-range or... [J. Bauer, D. Schuh, E. Uccelli, R. Schulz, A.... We report on a new approach for positioning of... None None None cond-mat.mes-hall [cond-mat.mes-hall] [http://arxiv.org/abs/cond-mat/0403328v1, http... http://arxiv.org/pdf/cond-mat/0403328v1 {'id': 'http://arxiv.org/abs/cond-mat/0403328v...
5 http://arxiv.org/abs/cond-mat/0411484v1 2004-11-18 16:47:14+00:00 2004-11-18 16:47:14+00:00 Giant optical anisotropy in a single InAs quan... [I. Favero, Guillaume Cassabois, A. Jankovic, ... We present the experimental evidence of giant ... submitted to Applied Physics Letters None 10.1063/1.1854733 cond-mat.other [cond-mat.other] [http://dx.doi.org/10.1063/1.1854733, http://a... http://arxiv.org/pdf/cond-mat/0411484v1 {'id': 'http://arxiv.org/abs/cond-mat/0411484v...
6 http://arxiv.org/abs/1003.2350v1 2010-03-11 15:52:09+00:00 2010-03-11 15:52:09+00:00 Linewidth broadening of a quantum dot coupled ... [Arka Majumdar, Andrei Faraon, Erik Kim, Dirk ... We study the coupling between a photonic cryst... 5 pages, 4 figures None 10.1103/PhysRevB.82.045306 quant-ph [quant-ph] [http://dx.doi.org/10.1103/PhysRevB.82.045306,... http://arxiv.org/pdf/1003.2350v1 {'id': 'http://arxiv.org/abs/1003.2350v1', 'gu...
7 http://arxiv.org/abs/1201.1258v1 2012-01-05 18:56:21+00:00 2012-01-05 18:56:21+00:00 Photoluminescence from In0.5Ga0.5As/GaP quantu... [Kelley Rivoire, Sonia Buckley, Yuncheng Song,... We demonstrate room temperature visible wavele... None None 10.1103/PhysRevB.85.045319 quant-ph [quant-ph, physics.optics] [http://dx.doi.org/10.1103/PhysRevB.85.045319,... http://arxiv.org/pdf/1201.1258v1 {'id': 'http://arxiv.org/abs/1201.1258v1', 'gu...
8 http://arxiv.org/abs/1206.2674v1 2012-06-12 21:00:22+00:00 2012-06-12 21:00:22+00:00 Effective microscopic theory of quantum dot su... [U. Aeberhard] We introduce a quantum dot orbital tight-bindi... 9 pages, 6 figures; Special Issue: Numerical S... Optical and Quantum Electronics 44, 133 (2012) 10.1007/s11082-011-9529-9 cond-mat.mes-hall [cond-mat.mes-hall, cond-mat.mtrl-sci] [http://dx.doi.org/10.1007/s11082-011-9529-9, ... http://arxiv.org/pdf/1206.2674v1 {'id': 'http://arxiv.org/abs/1206.2674v1', 'gu...
9 http://arxiv.org/abs/1405.1981v1 2014-05-08 15:51:52+00:00 2014-05-08 15:51:52+00:00 A single quantum dot as an optical thermometer... [Florian Haupt, Atac Imamoglu, Martin Kroner] Resonant laser spectroscopy of a negatively ch... 11 pages, 4 figures Phys. Rev. Applied 2, 024001 (2014) 10.1103/PhysRevApplied.2.024001 cond-mat.mes-hall [cond-mat.mes-hall] [http://dx.doi.org/10.1103/PhysRevApplied.2.02... http://arxiv.org/pdf/1405.1981v1 {'id': 'http://arxiv.org/abs/1405.1981v1', 'gu...

Next, we’ll create a list of all of the columns in the dataframe to see what else is there:

list(qd_df)
['entry_id',
 'updated',
 'published',
 'title',
 'authors',
 'summary',
 'comment',
 'journal_ref',
 'doi',
 'primary_category',
 'categories',
 'links',
 'pdf_url',
 '_raw']

We have 14 columns overall. We’ll add two derived columns (the name of the first listed author and a reference to the original arxiv.Result object), then narrow the dataframe to paper titles, published dates, and first authors to run some analysis of publishing patterns over time.

# Add a first_author column: the name of the first author among each paper's list of authors.
qd_df['first_author'] = [authors_list[0].name for authors_list in qd_df['authors']]
# Keep a reference to the original results in the dataframe: this is useful for downloading PDFs.
qd_df['_result'] = quantum_dots

# Narrow our dataframe to just the columns we want for our analysis.
qd_df = qd_df[['title', 'published', 'first_author', '_result']]
qd_df
title published first_author _result
0 Excitonic properties of strained wurtzite and ... 2003-10-15 20:15:59+00:00 Vladimir A. Fonoberov http://arxiv.org/abs/cond-mat/0310363v1
1 A two-dimensional array of single-hole quantum... 2020-08-26 16:48:21+00:00 F. van Riggelen http://arxiv.org/abs/2008.11666v1
2 Polar optical phonons in wurtzite spheroidal q... 2004-11-30 02:56:15+00:00 Vladimir A. Fonoberov http://arxiv.org/abs/cond-mat/0411742v1
3 Group-velocity slowdown in quantum-dots and qu... 2014-03-19 13:03:49+00:00 Stephan Michael http://arxiv.org/abs/1403.4790v1
4 A new method to epitaxially grow long-range or... 2004-03-12 18:28:06+00:00 J. Bauer http://arxiv.org/abs/cond-mat/0403328v1
... ... ... ... ...
12426 Electric Field Control of Molecular Charge Sta... 2024-05-29 08:10:14+00:00 Dhaneesh Kumar http://arxiv.org/abs/2405.18855v1
12427 Experimental single-photon quantum key distrib... 2024-06-04 07:28:15+00:00 Yang Zhang http://arxiv.org/abs/2406.02045v1
12428 Assessment of error variation in high-fidelity... 2023-03-07 17:50:32+00:00 Tuomo Tanttu http://arxiv.org/abs/2303.04090v3
12429 Functional light diffusers based on hybrid CsP... 2023-07-12 14:36:29+00:00 Lena M. Saure http://arxiv.org/abs/2307.06197v1
12430 Interferometric Single-Shot Parity Measurement... 2024-01-17 19:05:29+00:00 Morteza Aghaee http://arxiv.org/abs/2401.09549v4

12431 rows × 4 columns

Visualize your results#

Get a sense of how your topic has trended over time. When did research on your topic take off? Create a bar chart of the number of articles published in each year.

qd_df["published"].groupby(qd_df["published"].dt.year).count().plot(kind="bar")
[Figure: bar chart of the number of papers published per year; x-axis label: published]
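
If you want the per-year counts as numbers rather than a chart, the same tally can be sketched with collections.Counter. The four dates below are copied from the first rows of the dataframe shown earlier; with the real data you would iterate over qd_df["published"] instead.

```python
from collections import Counter
from datetime import datetime, timezone

# A few publication timestamps taken from the sample rows above.
published = [
    datetime(2003, 10, 15, 20, 15, 59, tzinfo=timezone.utc),
    datetime(2004, 11, 30, 2, 56, 15, tzinfo=timezone.utc),
    datetime(2004, 3, 12, 18, 28, 6, tzinfo=timezone.utc),
    datetime(2014, 3, 19, 13, 3, 49, tzinfo=timezone.utc),
]

# Tally papers by publication year.
papers_per_year = Counter(stamp.year for stamp in published)

for year in sorted(papers_per_year):
    print(year, papers_per_year[year])
```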

Explore authors to see who is publishing on your topic. Group by first author, then sort and select the top 20 authors.

qd_authors = qd_df.groupby(qd_df["first_author"])["first_author"].count().sort_values(ascending=False)
qd_authors.head(20)
first_author
Bing Dong                 27
Constantine Yannouleas    23
Y. Alhassid               20
Vidar Gudmundsson         17
Akira Oguri               16
David M. -T. Kuo          16
B. Szafran                16
Xuedong Hu                15
Rafael Sánchez            14
Kicheon Kang              14
Massimo Rontani           14
Ulrich Hohenester         14
P. W. Brouwer             13
C. W. J. Beenakker        13
David M T Kuo             13
G. Giavaras               12
Rui Li                    12
Tetsufumi Tanamoto        12
O. Entin-Wohlman          12
Piotr Trocha              12
Name: first_author, dtype: int64
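
Note that the list contains both “David M. -T. Kuo” and “David M T Kuo”, which look like two spellings of one name counted separately. One possible way to merge such variants, sketched here under the assumption that punctuation and letter case are the only differences, is to normalize names before counting. normalize_name is a hypothetical helper, and this kind of normalization can also merge distinct people who share a name, so treat the result as a rough guide.

```python
import re
from collections import Counter


def normalize_name(name):
    """Reduce a name to a comparison key: no periods or hyphens, single spaces, lowercase.

    Hypothetical helper; assumes spelling variants differ only in
    punctuation, spacing, and case.
    """
    cleaned = re.sub(r"[.\-]", " ", name)
    return " ".join(cleaned.split()).casefold()


# A tiny illustrative sample of first-author strings.
first_authors = ["David M. -T. Kuo", "David M T Kuo", "Bing Dong", "David M. -T. Kuo"]

counts = Counter(normalize_name(name) for name in first_authors)
print(counts.most_common())
```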

Identify and download papers#

Let’s download the oldest paper about quantum dots with Piotr Trocha as first author:

qd_Trocha_sorted = qd_df[qd_df['first_author']=='Piotr Trocha'].sort_values('published')
qd_Trocha_sorted
title published first_author _result
820 Dicke-like effect in spin-polarized transport ... 2007-11-22 16:11:11+00:00 Piotr Trocha http://arxiv.org/abs/0711.3611v2
2507 Kondo-Dicke resonances in electronic transport... 2008-03-28 15:49:07+00:00 Piotr Trocha http://arxiv.org/abs/0803.4154v1
3642 Negative tunnel magnetoresistance and differen... 2009-11-02 11:45:03+00:00 Piotr Trocha http://arxiv.org/abs/0911.0291v1
5845 Beating in electronic transport through quantu... 2010-04-11 16:20:04+00:00 Piotr Trocha http://arxiv.org/abs/1004.1819v2
2664 Orbital Kondo effect in double quantum dots 2010-08-17 14:13:23+00:00 Piotr Trocha http://arxiv.org/abs/1008.2902v2
2712 The influence of spin-flip scattering on the p... 2011-05-08 20:12:41+00:00 Piotr Trocha http://arxiv.org/abs/1105.1550v1
6016 Large enhancement of thermoelectric effects in... 2011-08-11 14:49:51+00:00 Piotr Trocha http://arxiv.org/abs/1108.2422v2
8545 The role of the indirect tunneling processes a... 2011-09-12 20:51:49+00:00 Piotr Trocha http://arxiv.org/abs/1109.2621v1
2858 Spin-polarized Andreev transport influenced by... 2014-09-14 23:54:35+00:00 Piotr Trocha http://arxiv.org/abs/1409.4122v1
8632 Spin-resolved Andreev transport through double... 2015-08-24 19:02:49+00:00 Piotr Trocha http://arxiv.org/abs/1508.05915v1
6741 Spin-dependent thermoelectric phenomena in a q... 2017-05-02 14:55:38+00:00 Piotr Trocha http://arxiv.org/abs/1705.01007v1
6872 Cross-correlations in a quantum dot Cooper pai... 2018-07-23 22:28:01+00:00 Piotr Trocha http://arxiv.org/abs/1807.08850v1
# Use the arxiv.Result object stored in the _result column to trigger a PDF download.
qd_Trocha_oldest = qd_Trocha_sorted.iloc[0]
qd_Trocha_oldest._result.download_pdf()
'./0711.3611v2.Dicke_like_effect_in_spin_polarized_transport_through_coupled_quantum_dots.pdf'

Confirm that the PDF has downloaded!
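
One way to confirm this from Python, rather than by browsing the folder, is to check that the file exists and begins with the PDF header bytes. This is a minimal sketch: looks_like_pdf is a hypothetical helper, and it checks only the file's first bytes, not whether the whole document is intact.

```python
from pathlib import Path


def looks_like_pdf(path):
    """Return True if `path` is an existing file that starts with the PDF header.

    Hypothetical helper for this example.
    """
    path = Path(path)
    if not path.is_file():
        return False
    with path.open("rb") as handle:
        return handle.read(5) == b"%PDF-"


# File name as returned by download_pdf() in the cell above.
pdf_path = "./0711.3611v2.Dicke_like_effect_in_spin_polarized_transport_through_coupled_quantum_dots.pdf"
print(looks_like_pdf(pdf_path))
```

The call prints True only if the download above has already been run in the same working directory.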
