Tell us about the software preservation program at your organization.
We have a few projects around the Libraries that look at software preservation in various forms, but no formal software preservation program as part of our core digital archiving services. Three software preservation projects are currently underway:
1. Saving Data Journalism: an IMLS-funded project to create an emulation-based web archiving prototype to capture complex interactive websites.
2. Enhancing Services to Preserve New Forms of Scholarship: an Andrew W. Mellon Foundation-funded project to look at preserving new forms of scholarship.
3. Investigating & Archiving the Scholarly Git Experience: an Alfred P. Sloan Foundation-funded project to write an archival specification for Git, a version control system for software, and its data format.
What has your organization accomplished recently that you're proud of, big or small?
We recently released the code for an emulation-based web archiving tool, built as an extension to ReproZip, that is currently in testing and development. This represents the completion of our 2018-2019 IMLS planning grant. The tool was built specifically to capture dynamic data journalism websites. We announced the launch at the annual conference of the National Institute for Computer-Assisted Reporting (NICAR) and received a lot of positive initial feedback from that community, which has been energizing.
We have just received an award to explore the means by which academic publishers can better preserve new forms of scholarship. We will be working with NYU Press, the University of Minnesota Press, Michigan Publishing, CLOCKSS, Portico, and several other publishing, preservation, and emulation services. Over the next 16 months, we plan to test different formats and develop guidelines so that publishers can advise authors at the beginning of a project about the long-term “preservability” of different approaches.
Tell us about a challenge that your organization is facing in its software preservation work or that the field is facing as a whole.
The biggest challenge to the field of software preservation is copyright and related legal concerns. In the United States, software is considered a ‘literary work’, so copyright protects not only its literal elements (e.g. source code) but also its non-literal elements (e.g. code sequence, unique application of utilitarian methods).
While libraries specifically have an exception to copyright laws that “…allows them (libraries) to reproduce and distribute copyrighted works under certain specific conditions,” this has not been tested for creating an archival copy of copyrighted software. Most GLAMs (galleries, libraries, archives, and museums) err on the side of caution, and as a result fair use arguments for archiving software have largely gone untested in the courts. The legal framework tested through software preservation work has the potential to be transformative for GLAMs that want to archive critical software for their users (e.g. the first version of Photoshop, to view artwork in its original environment) but are afraid of the legal consequences of doing so.