Research

Research Data Management

I am the lead developer of PyRDM (Research Data Management with Python), a Python-based library which facilitates the automated publication of scientific software and data in online repositories such as those provided by Figshare. The Digital Object Identifier (DOI) that is minted for each repository enables a more formal way of citing research software and data (compared to, say, citing a generic user manual with no information about the specific revision of the software used to produce a particular data set).

An overview of the typical actions that PyRDM performs is shown below for the case of software.

Typical actions performed by PyRDM when publishing software source code with Figshare. Image by Christian Jacobs, first presented at the 10th International Digital Curation Conference.

The conference poster is available on Figshare.

Git is first interrogated to obtain the revision of the source code currently in use. PyRDM then searches Figshare via its API to see whether that same revision of the code has already been published. If it has, the DOI can be reused; otherwise, the code is uploaded to a new repository on Figshare. PyRDM tags the new repository with the revision identifier (in the form of a SHA-1 Git commit hash), and adds author information obtained from the software's AUTHORS file, if present.
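The workflow above can be sketched in a few lines of Python. This is only an illustration of the reuse-or-publish logic, not PyRDM's actual interface: the `FakeRepositoryService` class, its `find_by_tag` and `create` methods, the `publish_software` function and the placeholder DOI format are all hypothetical names introduced here, and an in-memory list stands in for Figshare and its API. The only real external call is `git rev-parse HEAD`, which prints the hash of the commit currently checked out.

```python
import subprocess


def current_git_revision(repo_path="."):
    """Return the hash of the commit currently checked out in repo_path."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


class FakeRepositoryService:
    """Hypothetical in-memory stand-in for an online service such as Figshare."""

    def __init__(self):
        # Each record is a dict with "doi", "tags" and "authors" keys.
        self._records = []

    def find_by_tag(self, tag):
        """Return the first record carrying the given tag, or None."""
        for record in self._records:
            if tag in record["tags"]:
                return record
        return None

    def create(self, tags, authors):
        """Store a new record and return it with a placeholder (not real) DOI."""
        doi = "10.0000/example.{}".format(len(self._records) + 1)
        record = {"doi": doi, "tags": list(tags), "authors": list(authors)}
        self._records.append(record)
        return record


def publish_software(service, revision, authors=()):
    """Reuse the DOI if this revision is already published; otherwise publish it.

    Returns a (doi, newly_created) tuple.
    """
    existing = service.find_by_tag(revision)
    if existing is not None:
        return existing["doi"], False
    # Tag the new record with the revision so that later searches can find it.
    record = service.create(tags=[revision], authors=authors)
    return record["doi"], True
```

Inside a Git working directory, `publish_software(service, current_git_revision(), authors)` would publish the checked-out revision on the first call and return the same DOI on any later call for that revision. Passing the service in as an argument keeps the decision logic testable without network access.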

An in-depth description of the PyRDM library and its applications can be found in the following papers:

  • C. T. Jacobs, A. Avdis, G. J. Gorman, M. D. Piggott (2014). PyRDM: A Python-based library for automating the management and online publication of scientific software and data. Journal of Open Research Software, 2(1):e28, DOI: 10.5334/jors.bj

  • C. T. Jacobs, A. Avdis, S. L. Mouradian, M. D. Piggott (2015). Integrating Research Data Management into Geographical Information Systems. In Proceedings of the 5th International Workshop on Semantic Digital Archives, held in Poznań, Poland on 18 September 2015. Handle: 10044/1/28557

  • C. T. Jacobs, A. Avdis (2016). Git-RDM: A research data management plugin for the Git version control system. The Journal of Open Source Software, 1(2), DOI: 10.21105/joss.00029

  • S. L. Mouradian, A. Avdis, M. D. Piggott, C. T. Jacobs, C. Villaret, D. R. de Mijolla, J. Lietava (2016). TELEMAC model archive: Integrating open-source tools for the management and visualisation of model data. In Proceedings of the 23rd TELEMAC-MASCARET User Club, held in Paris, France on 11-13 October 2016. Pre-print: http://eprints.soton.ac.uk/405307/