(Knowledge graphs and archives, episode 3) - previous episodes: RiC-O converter, Sparnatural on FranceArchives.
The SAPA Foundation, Swiss Performing Arts Archive, is an archive and competence center for the preservation and valorization of the cultural heritage of the performing arts in Switzerland.
The foundation uses a knowledge graph to manage its collections. This graph is a lasagne with three "layers", plus one:
- The central and most important layer is the content description of the batches of documents held by the foundation. This part is maintained by archivists, and is modeled in RiC-O.
- The "lower" layer (conceptually speaking) is the physical description of the documents, including digital media. This part is maintained by curator-restorers. It is also based on RiC-O (these are Instantiations), but extended with specific metadata for media description, drawn notably from EBUCore and PREMIS.
- The "top" layer is the knowledge capture of contexts linked to collections: people, companies, productions, theatres, performances, works, etc. These entities are interrelated. This part is maintained by documentalists, and modeled with FRBRoo and CIDOC-CRM.
- The additional layer is the set of controlled vocabularies that support the description of the other entities.
These four parts are closely linked in the same graph, which keeps data compatible and flowing between professions: physical objects and media are linked to the intellectual resources of the collections, which are themselves indexed with the named entities of the knowledge part. The whole system rests on the transversal layer of controlled vocabularies.
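To make the linkage between layers concrete, here is a minimal, toy sketch in plain Python. It is not the SAPA data model: every identifier and property name (the `ex:` prefix, `ex:isInstantiationOf`, `ex:hasSubject`, and so on) is a hypothetical stand-in, and a real implementation would use an RDF store rather than a set of tuples. It only shows how one can walk from a physical carrier, through the content description, to a contextual entity and its vocabulary terms.

```python
# Toy in-memory graph: (subject, predicate, object) tuples.
# All ex: identifiers and property names are hypothetical illustrations,
# not actual RiC-O, FRBRoo, CIDOC-CRM or SAPA terms.
TRIPLES = {
    # "Lower" layer: physical description (carriers, digital media)
    ("ex:tape-42", "ex:isInstantiationOf", "ex:recording-7"),
    ("ex:tape-42", "ex:carrierType", "ex:vocab/vhs"),
    # Central layer: content description of the holdings
    ("ex:recording-7", "ex:isIncludedIn", "ex:fonds-3"),
    ("ex:recording-7", "ex:hasSubject", "ex:production-12"),
    # "Top" layer: contextual entities (productions, theatres, ...)
    ("ex:production-12", "ex:performedAt", "ex:theatre-5"),
    ("ex:production-12", "ex:genre", "ex:vocab/dance"),
}


def objects(subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def contexts_for_carrier(carrier):
    """Follow carrier -> described resource -> contextual entities."""
    results = []
    for record in objects(carrier, "ex:isInstantiationOf"):
        results.extend(objects(record, "ex:hasSubject"))
    return results


productions = contexts_for_carrier("ex:tape-42")
genres = [g for p in productions for g in objects(p, "ex:genre")]
```

Starting from the tape, the traversal crosses three layers and ends on a controlled-vocabulary term, which is the kind of cross-profession navigation a single graph makes possible.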
The following diagram, created by Baptiste de Coulon, data archivist at the SAPA Foundation, gives an idea of the structure of the knowledge graph:
The graph is browsable on the public SAPA platform and exposed through a SPARQL service. The data will also be available as a downloadable RDF dump.
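As an illustration of what querying such a SPARQL service can look like, the sketch below builds a SPARQL 1.1 Protocol GET request URL that asks for a few RiC-O Instantiations. The endpoint URL is a placeholder, not the actual SAPA address, and the query is an assumption about what the data contains (instances typed `rico:Instantiation`); the helper names are hypothetical too.

```python
from urllib.parse import urlencode

# Placeholder endpoint: replace with the real SPARQL service URL.
ENDPOINT = "https://example.org/sparql"

# RiC-O ontology namespace.
RICO = "https://www.ica.org/standards/RiC/ontology#"


def build_query(limit=10):
    """Build a SELECT query listing resources typed as rico:Instantiation."""
    return (
        f"PREFIX rico: <{RICO}>\n"
        "SELECT ?instantiation WHERE {\n"
        "  ?instantiation a rico:Instantiation .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a GET request."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = build_request_url(ENDPOINT, build_query(5))
```

The resulting URL can then be fetched with any HTTP client, typically with an `Accept: application/sparql-results+json` header to obtain JSON results.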
Sparna works with the SAPA Foundation on a number of projects:
- the re-documentation of data model layers, based on SHACL specifications. These are published at http://shapes.performing-arts.ch/
- the implementation of a simplified ("faceted") search interface, connected directly to the RiC-O Instantiation data. This simplified search interface is derived from Sparnatural.
- management and publication of controlled vocabularies used to describe other entities. These vocabularies are published on a dedicated site.
- continuous improvement of the knowledge graph, notably through quality control and cleaning.
More details can be found in Baptiste de Coulon's talk at the semweb.pro 2024 conference.