The University of Virginia Library was interested in the Fostering a Community of Practice grant from the Software Preservation Network as a way to further develop and test our born digital archiving and preservation workflows for software and software-dependent digital materials. We were particularly interested in the grant and working with SPN as a way to test tools, systems, methods (and some of our own assumptions!) to accommodate the increasing number of digital materials that the Library and archives are expected to handle in the course of work for a large research institution.
Architectural materials are one of the collecting priorities for the Albert and Shirley Small Special Collections Library, and our “Emulation in the Archives” project focused on a single hybrid collection (meaning the collection contains both analog and born-digital materials) from a local Charlottesville architecture firm, Sheeran Architects, which includes commercial computer-aided design/building information modeling (CAD/BIM) software. This use case not only reflects many of the more recent hybrid collections in our archives and special collections, but also represents a category of use case previously identified by the SPN Curation-Ready working group in 2017: materials that depend on software, and on the preservation of that software, to provide ongoing access to unique data over time. Even though this unique collection consists of digital materials that are only 10-20 years old, many files were already incompatible or “too old”, as the error message above says, to even open in available modern viewers in 2020.
UVA’s project had several goals aimed at developing our institutional capacity to acquire, preserve, and provide access to software and born-digital materials dependent on software. As Dr. Acker observed in her field notes from her visit to UVA last summer, we focused as much as possible on a lightweight, “just in time” approach that we thought would suit the scope of the grant and reflect our institutional realities and resourcing.
Our project goals were:
- Develop a curatorial workflow focused on early intervention and information gathering that would allow us to make informed preservation and archival processing decisions about software and born-digital materials
- Create an updated technical workflow and system design to better accommodate our preservation work as well as access and archival discovery needs for born digital materials, software, and emulated digital collections, including leveraging existing linked data efforts like Wikidata as well as UVA metadata repositories
- Draft the first version of an archival description strategy for software and emulated materials in ArchivesSpace based on best practice
- Host a free “Emulation in the Archives” one day workshop to share our developing work and to highlight the work of others in this shared space
Over the past year, our grant work has developed in ways that sometimes aligned closely with our original intentions and at other points reframed and highlighted important gaps and opportunities in our understanding of workflows, scale, sustainability, and collaboration both within our institution and beyond.
Our focus on curatorial work for the grant included updating and reconfiguring our donor and outreach documentation, updating our Deed of Gift, and creating a new software questionnaire for donors and curators to use to aid earlier intervention and understanding around acquiring, processing, and preserving software and born-digital materials in Special Collections.
During the course of our work on this outcome, we were also fortunate to be able to reframe and strengthen some existing collaborations to help with the nitty gritty detail and work around building and adapting processing workflows for digital materials, architectural materials, and software. It is important to highlight these collaborations here in our near-wrap-up review, as they are absolutely necessary to the maintenance and further development of our work at UVA and for SPN, yet may not always fall under what might be identified as “software preservation” work at institutions thinking about their work with software and software-dependent materials. In fact, many people we relied on at different times were external to our grant team.
These collaborations included working with Ann Burns, a metadata librarian with a background in architectural history, to establish archival processing workflows for collection materials (both born digital and analog); refining and updating accessioning and preservation workflows around digital materials with various library and archives staff; enlisting the help of our Information Policy director, Brandon Butler, with legal questions about both the deed of gift and licenses for commercial software; working with UVA user experience (UX) team members to carry out some beta testing on ArchivesSpace access to emulated environments; and collaborating with Special Collections public services staff, as we developed our curatorial and access workflows for emulated materials, to leverage their knowledge of research and reading room practices.
These last two collaborations, with UX and public services, underscore one of the components we would like to add to our work at UVA, and would also encourage for broader SPN community collaboration and research: the user experience of researchers, staff, students, and community members discovering, using, and teaching with born-digital materials, software, and emulated environments. Though some scholarship has started to focus on this subject, a better understanding of user experience would allow us to adapt some of our existing workflows to actual user feedback where we previously had to make assumptions about access needs, which in turn could affect preservation approaches and other upstream workflows.
What’s next for our curatorial work at UVA: We will continue working to finish the processing of the architectural collection for eventual use in the reading room, and plan to conduct ongoing testing of the curatorial documentation and software questionnaire with other collections. We also plan to continue coordinating with public services staff to build out training and digital materials around access to digital and emulated materials for researchers, students, and the broader community, to conduct future user experience studies for born-digital and emulated materials, and to plan for remote reading room access protocols.
One area outside our current FCOP project scope that drew interest during the course of the grant is the use of emulation for appraisal purposes in library workflows. This could be an interesting continuation of FCOP cohort and community work and research that might better suit some current legal realities for community partners, especially in the international community.
In developing the system design for our project goals, we started from the ideal vision for access and then worked backwards to determine what was needed to support it. That vision was that when researchers came across descriptions of these digital materials, they would be presented with enough information to determine their relevance, along with instructions on how to access them. For files that rely on obsolete software and platforms, like those in the architectural collection for the FCOP grant, access would be provided through emulation.
The primary access system for descriptions of our archival collections is a local instance of the public interface for ArchivesSpace. We had already established that digital content (as well as some software necessary to interpret that content) would undergo initial processing using Archivematica. Archivematica is an open source digital preservation system that UVA is in the process of implementing as a component of our preservation and archival processing workflows. It allows us to do things like create automatic processing routines that produce technical metadata about digital assets and package those assets into Archival Information Packages (as defined in the OAIS reference model).
Having established our ideal presentation, we just needed to figure out where to store all the bits of metadata necessary to achieve it, then build or implement any missing pieces.
ArchivesSpace has little support for digital objects, except to link out to other systems (for example, this digitized audio recording links out to a streaming copy in our AV system). Neither Archivematica (used as part of UVA archival and preservation workflows) nor ArchivesSpace is ideally suited for machine-actionable metadata assertions about emulation environments. In some cases the information we might need (for instance, to display information about a software package) already exists as linked open data (like this Wikidata record for the Vectorworks software), so there was some desire to leverage that resource rather than reproduce that description. In other cases, we believed we would need to store assertions locally. This included information about which files (using identifiers assigned when processing the files in Archivematica) were available in which emulation environments.
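To make the idea of locally stored assertions more concrete, here is a minimal sketch of one way they could be modeled. It is an illustration under stated assumptions, not a description of UVA's systems: the class names, field names, identifiers, and the placeholder Wikidata QID are all hypothetical. The only external convention it relies on is the standard Wikidata entity URI pattern. Software is referenced by a link to its existing linked-data description, while the file-to-environment assertions are kept locally.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class SoftwareRef:
    """Pointer to an existing linked-data description of a software title."""
    label: str
    wikidata_qid: str  # hypothetical placeholder values are used below

    @property
    def uri(self) -> str:
        # Wikidata entity URIs follow this pattern.
        return f"http://www.wikidata.org/entity/{self.wikidata_qid}"


@dataclass
class EmulationEnvironment:
    """A locally described environment (hypothetical structure)."""
    env_id: str
    description: str
    software: list[SoftwareRef] = field(default_factory=list)


class AssertionStore:
    """Local record of which files are available in which environments."""

    def __init__(self) -> None:
        self._envs: dict[str, EmulationEnvironment] = {}
        self._file_to_envs: dict[str, set[str]] = {}

    def add_environment(self, env: EmulationEnvironment) -> None:
        self._envs[env.env_id] = env

    def assert_available(self, file_uuid: str, env_id: str) -> None:
        # file_uuid stands in for the identifier assigned during processing.
        if env_id not in self._envs:
            raise KeyError(f"unknown environment: {env_id}")
        self._file_to_envs.setdefault(file_uuid, set()).add(env_id)

    def environments_for(self, file_uuid: str) -> list[EmulationEnvironment]:
        # Sorted by environment id so results are stable between calls.
        env_ids = sorted(self._file_to_envs.get(file_uuid, set()))
        return [self._envs[env_id] for env_id in env_ids]


# Usage with made-up identifiers.
store = AssertionStore()
store.add_environment(
    EmulationEnvironment(
        env_id="env-example-1",
        description="Example environment with CAD software installed",
        software=[SoftwareRef("Vectorworks", "Q00000")],  # placeholder QID
    )
)
store.assert_available("example-file-uuid", "env-example-1")
for env in store.environments_for("example-file-uuid"):
    print(env.env_id, [s.uri for s in env.software])
```

Keeping only the QID locally, rather than a copy of the software description, reflects the preference stated above for reusing existing linked open data where it exists.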
What’s next for us on the technical front at UVA, now that we have outlined our technical systems draft, is to fully test our assumptions and integrations in our systems workflows, starting with continuing on with the processing workflow for the Sheeran collection in Archivematica and adding metadata information to the UVA metadata repository, Apollo. Future work that will need to be prioritized likely includes building the “Curio” view (custom view) outlined in the workflow that will pull metadata and information from various sources, and finding ways to link to that view from ArchivesSpace and other systems.
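As a rough illustration of what a view that pulls metadata from various sources might involve, the sketch below merges per-source records for a single file into one display record. Everything here is an assumption made for illustration: the function name, the source names, and the record shapes are hypothetical, and the lookups are plain in-memory stand-ins rather than calls to ArchivesSpace, Apollo, or any other real system.

```python
from typing import Callable, Dict, Optional

Record = Dict[str, object]
Lookup = Callable[[str], Optional[Record]]


def build_view(file_uuid: str, sources: Dict[str, Lookup]) -> Record:
    """Merge records from several sources into one display record.

    `sources` maps a source name to a lookup function that returns a
    dict for the file, or None when that source has nothing to offer.
    Sources are consulted in insertion order.
    """
    view: Record = {"file_uuid": file_uuid, "sources": []}
    for name, lookup in sources.items():
        record = lookup(file_uuid)
        if record is None:
            continue
        view["sources"].append(name)
        for key, value in record.items():
            # Keep the first value seen, so earlier sources take precedence.
            view.setdefault(key, value)
    return view


# Usage with in-memory stand-ins for the real systems.
description_source = {"example-file-uuid": {"title": "Example drawing set"}}
environment_source = {"example-file-uuid": {"environment": "env-example-1"}}

view = build_view(
    "example-file-uuid",
    {
        "archival_description": description_source.get,
        "emulation_assertions": environment_source.get,
    },
)
print(view)
```

Letting earlier sources take precedence is only one possible design choice; a real view would need an agreed policy for which system is authoritative for each field.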
Preservation and systems work to refine our digital preservation and archival processing is expected to continue along these next steps, which also include the ongoing work of determining how our local metadata and digital materials workflows fit within the developing EaaSI platform. We are also aiming to be as forward thinking as possible about potential future integrations with other types of collections in our dual role in the EaaSI project, including research and cultural data that may also rely on emulation. Of course, “just” testing and refining systems workflows for digital materials and software is an entire world unto itself, and continued collaboration with the FCOP cohort and other SPN and community members is essential to scaling this work.
Archival description strategy for software and emulated materials
Before embarking on descriptive work for our project focus, we needed to ensure alignment within the broader framework of community standards and best practices. This was especially relevant to the challenges involved with describing both born digital material and the software we relied on to access this content. We were fortunate to have access to published documents to inform our approaches:
- The University of California Guidelines for Born-Digital Archival Description helped us organize and describe our digital content within the context of an archival collection.
- The Software Preservation Network’s Metadata Model helped us prioritize core elements to include in our description.
We proceeded with experimental description in ArchivesSpace and developed different options to determine how best to meet our users’ information needs. As we stepped through our prototype descriptions, we learned that both Wikidata and WorldCat were excellent sources for tracking down technical metadata for obscure or deprecated software titles. WorldCat, in particular, was useful for software titles that are commercial in nature, as was the case for the Sheeran collection of CAD/BIM software.
What’s next for our descriptive work at UVA: Using MARC as an initial jumping-off point for leveraging existing information for software description was the subject of lively debate during our workshop (see the Metadata Q&A section in the shared notes!), and we look forward to hearing more feedback from those who review our first draft of our Archival Description Strategies for Emulated Software.
Our next steps for description include scaling up description of emulated material and dependent software to test feasibility of our proposed approaches at UVA, and integrating any local and community feedback about our proposed approaches.
We are also interested in continuing to work on our descriptive practices as they relate to integration with EaaSI node work and additional types of collections outside of the archives, and how we might leverage our existing knowledge to aid development for description across the needs of Library users. This would be a great space within which we could continue collaboration with other members of FCOP as well as the broader SPN community.
We are proud of what we accomplished during our time as part of FCOP, but recognize there is still quite a bit to do, both locally at UVA and within the broader community of practice. Being part of a cohort has helped us leverage the skills and expertise of others and form community relationships, while contributing some of our own work back. We hope that these contributions help make this work just a bit more accessible to the broad community of practice, and welcome feedback!