Miscellaneous material not directly related to another HEPData repository
This Jupyter notebook uses the INSPIRE API to count all INSPIRE records that have an associated HEPData record.
It also counts publications per LHC experiment per year and shows what fraction of them have a HEPData record.
Written by Graeme Watt (Project Manager for HEPData) on 11th October 2017. Updated for new INSPIRE API on 15th April 2021.
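The counting approach described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the helper names are hypothetical, and the example query string is an assumption about how INSPIRE marks records that have a HEPData entry. The sketch only builds the request URL and reads the total from a decoded JSON response, so it runs without network access.

```python
from urllib.parse import urlencode

INSPIRE_API = "https://inspirehep.net/api/literature"


def build_count_url(query):
    """Return an INSPIRE literature search URL that requests a minimal response."""
    # Only the total number of hits is needed, so ask for a single record
    # and a single field to keep the response small.
    params = {"q": query, "size": 1, "fields": "control_number"}
    return f"{INSPIRE_API}?{urlencode(params)}"


def total_hits(payload):
    """Extract the total number of matching records from a decoded JSON response."""
    return payload["hits"]["total"]


def hepdata_fraction(n_with_hepdata, n_all):
    """Fraction of publications with a HEPData record (0.0 if there are none)."""
    return n_with_hepdata / n_all if n_all else 0.0


# Assumed query syntax for "has a HEPData record"; check it against the INSPIRE docs.
url = build_count_url("external_system_identifiers.schema:HEPData")
```

In practice the URL would be fetched with an HTTP client of your choice, the JSON body passed to `total_hits`, and the same done for an experiment-and-year query to obtain the denominator for `hepdata_fraction`.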
This Jupyter notebook makes some plots showing the number of HEPData
submissions in progress and finished by different experiments, including
a breakdown by the different ATLAS and CMS physics groups. The input is
a CSV file obtained by running the command
hepdata submissions write-stats-to-files
in the production environment.
This Jupyter notebook makes a plot showing the number of "version 1"
HEPData submissions per month, with a linear fit overlaid. It also plots
the number of HEPData submissions per year. Again, the
input is a (different) CSV file obtained by running the command
hepdata submissions write-stats-to-files
in the production environment.
This Jupyter notebook investigates the delay between arXiv publication, indicated by creation of an INSPIRE record, and release of the first version of a corresponding HEPData record. A histogram is plotted of these delays for a given time period and experimental collaboration.
Written by Graeme Watt on 25th November 2022.
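The delay calculation can be illustrated with a short sketch. The function names, the ISO date inputs and the 30-day bin width are all assumptions made for this example; the notebook's own data handling and plotting may differ.

```python
from datetime import date


def delay_in_days(inspire_created, hepdata_released):
    """Days between INSPIRE record creation and first HEPData release (ISO dates)."""
    return (date.fromisoformat(hepdata_released) - date.fromisoformat(inspire_created)).days


def histogram(delays, bin_width=30):
    """Bin delays into fixed-width bins, keyed by the lower edge of each bin."""
    bins = {}
    for d in delays:
        lower = (d // bin_width) * bin_width  # floor division also handles negative delays
        bins[lower] = bins.get(lower, 0) + 1
    return dict(sorted(bins.items()))
```

A negative delay would mean the HEPData record was released before the INSPIRE record was created; such values fall into bins with negative lower edges.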
This Jupyter notebook investigates the consistency of the INSPIRE record numbers obtained from INSPIRE with those obtained from HEPData. Discrepancies usually occur because an INSPIRE record has changed its record number.
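A consistency check of this kind reduces to comparing two sets of record numbers. The sketch below is a hypothetical helper, not the notebook's code, and assumes both lists have already been retrieved.

```python
def compare_record_ids(inspire_ids, hepdata_ids):
    """Report record numbers that appear in one source but not the other."""
    inspire_ids, hepdata_ids = set(inspire_ids), set(hepdata_ids)
    return {
        "only_in_inspire": sorted(inspire_ids - hepdata_ids),
        "only_in_hepdata": sorted(hepdata_ids - inspire_ids),
    }
```

Record numbers that appear only on the HEPData side are the natural candidates for INSPIRE records whose number has since changed.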
This Jupyter notebook compares the LaTeX to Unicode conversions obtained with three different Python packages: latex2text, unicodeit and unicodeitplus. Test data is obtained from the publication titles of all finished HEPData records. The intended application is to tweet these titles when the HEPData records are first released or later revised.
This Jupyter notebook gets the access counts of all HEPData records from the
ATLAS experiment. It first makes a paginated search for
"collaborations:ATLAS"
to find the INSPIRE IDs. It then retrieves the JSON format of each record
to get the access counts and associated metadata, using the light=true
option to reduce the size of the
JSON by removing data tables. See JSON Endpoints. The output
is written to a CSV file for possible further analysis with Python, Excel, etc.
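The two-step workflow (paginated search, then one light JSON request per record) might look like the sketch below. The URL patterns, the `page` and `size` parameters and the CSV column names are assumptions for illustration and should be checked against the JSON Endpoints documentation; the record ID used in any example call is hypothetical. Only URL construction and CSV writing are shown, so the code runs offline.

```python
import csv
from urllib.parse import urlencode

HEPDATA = "https://www.hepdata.net"


def search_url(page, size=25):
    """URL for one page of the ATLAS search in JSON format (assumed parameters)."""
    params = {"q": "collaborations:ATLAS", "format": "json", "page": page, "size": size}
    return f"{HEPDATA}/search/?{urlencode(params)}"


def record_url(inspire_id):
    """URL for a single record in JSON format, with light=true to omit data tables."""
    params = {"format": "json", "light": "true"}
    return f"{HEPDATA}/record/ins{inspire_id}?{urlencode(params)}"


def write_csv(rows, handle):
    """Write one row per record; the column names here are hypothetical."""
    writer = csv.DictWriter(handle, fieldnames=["inspire_id", "access_count"])
    writer.writeheader()
    writer.writerows(rows)
```

Each search page yields a batch of INSPIRE IDs, each ID is turned into a record URL, and the values extracted from the responses are collected into rows for `write_csv`.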