rico: RDF Prefix

International Council on Archives Records in Contexts Ontology (ICA RiC-O) version 0.2

Introduction

RiC-O (Records in Contexts-Ontology) is an OWL ontology for describing archival record resources. As the second part of the Records in Contexts standard, it is a formal representation of the Records in Contexts Conceptual Model (RiC-CM).

This version, v0.2, is the current official release. It is compliant with RiC-CM v0.2, which will be published soon after the release of RiC-O v0.2.

RiC-O design principles

The following design principles were followed when developing RiC-O.

RiC-O is a domain or reference ontology. It provides a generic vocabulary and formal rules for creating RDF datasets (or generating them from existing archival metadata) that describe any kind of archival record resource in a consistent way. It can support publishing RDF datasets as Linked Data, querying them using SPARQL, and making inferences using the logic of the ontology.

While some projects have built specific ontologies for describing archives, at this time no generic domain ontology exists for the specific needs of the archival community. This is why EGAD decided to develop RiC-O as the second part of the RiC standard.

Apart from this first, main target, RiC-O can also help archival institutions and engineers to design and develop other technical implementations of RiC-CM that represent record resources and their layers of contexts as directed, interconnected graphs.

Of course, other technical implementations may be developed later on, including XML models, or (hopefully) new versions of the EAD and EAC-CPF XML models.

As RiC-O is a generic domain ontology, it does not by itself address every specific need or expectation that may arise in every archival institution or project. It is rather a high-level framework: a project can either limit itself to a selection of components or add more subcomponents where needed.
As a domain ontology, RiC-O, at this stage at least, does not borrow any component from other existing ontologies (such as the cultural heritage models IFLA-LRM and CIDOC-CRM, PREMIS, or PROV-O). It should therefore be easier for an archival institution or project to understand, implement and maintain RiC-O within its system. Later on, RiC-O will be aligned with these existing models. This is of course essential for interconnecting RDF datasets conforming to RiC-O with other datasets, or for using parts of RiC-O in contexts other than the archival or records management realm.

RiC-O must be immediately usable. This is a key feature for a new model. In particular, it is very important that existing archival metadata, created or generated in current information systems, can be converted to RDF conforming to RiC-O without losing any data, structure, or partially implicit information. What is at stake here is that metadata conforming to the existing ICA standards can be processed successfully. During the ongoing development process, many successful tests have been carried out using XML/EAD finding aids and XML/EAC-CPF authority records that have been converted to RDF datasets, either by hand or using scripts. Conversion software is being developed and will soon be available.

While some existing metadata sets may have a very fine level of granularity and accuracy (already using, for example, controlled vocabularies, or describing curation events separately), these metadata often do not have the very precise structure that RiC-CM recommends. Even then, conversion should remain possible. To allow this, RiC-O sometimes provides several methods for representing information (as described below). From this point of view, the current official version of RiC-O may be considered a transitional ontology, in which some components may be deprecated later on.

The usability of a model also depends on its documentation.
That is why the current official release has been fully documented (this documentation will be continuously improved). RiC-O will also soon be accompanied by examples (RDF datasets). Some tutorials should also be written, and EGAD will organize practical workshops.

RiC-O has to provide a flexible framework. This is a very important principle too, and it is related to the usability principle described above. Moreover, archival description is flexible in essence. It is quite commonly noted that the level of granularity of information varies from one finding aid to another (or from one authority record to another), or even within the same finding aid. Some series or agents are described summarily because little is known about them and there is little time for extensive research, while other series, or even records, or agents are described in detail; some relations (e.g. those relating to provenance) may be described without any detail while others may be thoroughly documented, as ISAAR(CPF) and EAC-CPF allow.

For an OWL ontology, being flexible depends first on the polyhierarchical systems of classes and properties it provides. A superproperty or superclass, more general than its subproperties or subclasses, must be available for handling information, while at the same time more precise subcomponents must be there for handling more precise description. Also, RiC-O should provide several methods for expressing whether relations are well attested and certain or are more vague, as well as direct and short paths between entities alongside more complex ones.

RiC-O opens new potential for archival description. This means that Linked Data tools and interfaces should enable end users to navigate RDF/RiC-O graphs, to query them efficiently using SPARQL, and to consult archives and their contexts in new ways.
As an example, an end user should be able to ask "What are (according to your dataset) the corporate bodies that succeeded this given entity, from the end of its existence (by 1840) to the present day, as concerns this given activity?", "What instantiations of this photograph exist?" or "What are the existing copies of this original charter?", and get a list of these entities. In other words, institutions or projects that make the effort to implement RiC-O will gain new insight into the content and context of their archives that was not visible with the existing ICA standards. It becomes even more interesting if you can infer new assertions from the RDF datasets you have built and, of course, link your datasets to other resources outside your institution.

RiC-O should be extensible. Institutions are free to extend the ontology by adding new subclasses or subproperties if needed. RiC-O also has the potential to be usable in contexts other than purely archival ones. This implies that hierarchies of classes and properties are defined and that mappings are developed with other ontologies, as mentioned above. It may also imply that RiC-O should provide "hooks" enabling connections with, for example, existing SKOS vocabularies.

Understanding RiC-O: a quick overview of some features

From RiC-CM to RiC-O

In the system of classes of RiC-O, you can find a corresponding class for each RiC-CM entity. These classes are organized according to the same hierarchy as in RiC-CM. In some projects, you may need very few of them (e.g. Agent, Record Resource and Activity only), while in others you may need more (e.g. Corporate Body and Person; Record; Place; Provenance Relation).

Certain classes exist only in RiC-O and not in RiC-CM. These additional classes address special needs:

Some correspond to RiC-CM attributes, when it may be considered necessary to handle them as full entities.
This is the case for Type and its subclasses, which correspond to RiC-CM attributes that contain controlled values, and which can help to articulate RiC-O with external RDF resources like SKOS vocabularies; and also for Language, Name and Identifier, which can be considered full entities and key linking nodes in an RDF graph. All these classes have been grouped under a Concept class.

Some classes have been added in order to provide a more accurate definition and model for some entities. Place thus comes with a Physical Location class and a Coordinates class. A Place is considered both a geographical and a historical entity. As a historical entity, among other features, it has a history, and may be preceded or succeeded by other Places. A Place may also have zero to many Physical Locations through time (for instance, its boundaries may change if it is an administrative area or a country). Each Physical Location may be connected to zero to many Coordinates. This model is quite close to the Linked Places Format (https://github.com/LinkedPasts/linked-places). Another example of such an addition is the Proxy class, which represents (stands for) a Record Resource as it exists in a specific Record Set.

Finally, a system of classes helps to implement the Relations section of RiC-CM. While these relations are also represented as simple, binary object properties (e.g. 'hasProvenance', which corresponds to the RiC-R026 relation), you may need to assign different attributes to a relation, e.g. a date, certainty or description, as is already possible, and quite often done, in an XML/EAC-CPF file. One of the standard methods currently available for representing such a documented relation in RDF is to use a class.
The RDF* and SPARQL* specification, which is being developed by the W3C RDF-DEV Community Group, provides a far simpler method (it allows a triple to be treated as the subject or object of another triple; see https://w3c.github.io/rdf-star/) and is already being used by some tools; however, it is not yet a W3C standard. Thus, for example, an AgentOriginationRelation class exists in RiC-O. This class may connect one to many Agents to one to many created or accumulated Record Resources or Instantiations, and has some specific object properties (certainty, date, description, source).

Returning to the 'hasProvenance' object property, let us add that it is formally defined in RiC-O, using an OWL 2 property chain axiom (see https://www.w3.org/TR/owl2-new-features/), as a 'shortcut' for the longer path 'recordResourceOrInstantiationIsSourceOfAgentOriginationRelation/agentOriginationRelationHasTarget', where the intermediate node is an instance of Agent Origination Relation. A triplestore, with the appropriate configuration, may thus infer the direct 'hasProvenance' assertion from this long path.

Most of the datatype properties in RiC-O correspond to RiC-CM attributes that contain free, plain text. See for example rico:descriptiveNote, rico:history and rico:scopeAndContent.

In addition to these datatype properties, the Name and Identifier RiC-CM attributes also have corresponding classes (subclasses of rico:Appellation). A resource may have several Identifiers, each with different attributes (e.g. archival reference code, system number, digital object identifier); in this case the identifiers will be modelled using the class. In many simpler use cases it is sufficient to just use the identifier datatype property, typically for the archival reference code. The Location RiC-CM attribute also has a corresponding rico:PhysicalLocation class (for users who want to use Place, Physical Location and Coordinates for handling places).
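The 'hasProvenance' shortcut described above can be illustrated with a minimal sketch in plain Python, without any RDF library. Triples are modelled as tuples of strings. The https://example.org/ resources, the infer_shortcut helper and the variable names are hypothetical and exist only for illustration; a real triplestore would apply the OWL 2 property chain axiom itself, so this only approximates what such a reasoner derives.

```python
# Namespace taken from this page; everything under EX is a made-up example.
RICO = "https://www.ica.org/standards/RiC/ontology#"
EX = "https://example.org/"

IS_SOURCE_OF = RICO + "recordResourceOrInstantiationIsSourceOfAgentOriginationRelation"
HAS_TARGET = RICO + "agentOriginationRelationHasTarget"
HAS_PROVENANCE = RICO + "hasProvenance"


def infer_shortcut(triples, first_step, second_step, shortcut):
    """Return new (s, shortcut, o) triples implied by s -first_step-> n -second_step-> o."""
    # Index the second step by its subject, i.e. the intermediate relation node.
    targets_by_node = {}
    for subject, predicate, obj in triples:
        if predicate == second_step:
            targets_by_node.setdefault(subject, set()).add(obj)

    inferred = set()
    for subject, predicate, node in triples:
        if predicate == first_step:
            for target in targets_by_node.get(node, ()):
                inferred.add((subject, shortcut, target))
    # Only report triples that are not already asserted.
    return inferred - set(triples)


graph = {
    (EX + "fonds-1", IS_SOURCE_OF, EX + "origination-1"),
    (EX + "origination-1", HAS_TARGET, EX + "agent-1"),
    # A second record resource whose relation node has no target yet:
    # the chain is incomplete, so nothing is inferred for it.
    (EX + "file-7", IS_SOURCE_OF, EX + "origination-2"),
}

inferred = infer_shortcut(graph, IS_SOURCE_OF, HAS_TARGET, HAS_PROVENANCE)
for triple in sorted(inferred):
    print(triple)
```

The intermediate node (here origination-1) stays in the graph, so any attributes attached to the relation remain available alongside the inferred direct assertion.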
As already noted, every RiC-CM attribute that has 'controlled value' or 'rule-based' as a schema value has a corresponding class in RiC-O.

For these RiC-CM attributes that correspond to a RiC-O class, and because it is necessary to provide an immediately usable ontology, two supplementary datatype properties exist. They allow you not to use the classes, at least for a while, if you want to implement RiC-O and create RiC-O/RDF datasets from existing archival metadata without being able to handle URIs for the information you have. For example, you may not be able to handle and maintain URIs for some controlled values you use in EAD finding aids for carrier types: maybe your information system does not use a vocabulary for this, and for the time being you cannot treat these carrier types as full entities. Nevertheless, you want to produce RiC-O datasets in which every piece of information is kept, and you want to avoid blank nodes. If RiC-O only provided the Carrier Type class, this would be an issue for you. So RiC-O provides a rico:type datatype property, with range rdfs:Literal, which allows you to move forward.

Therefore, for the RiC-CM *Type attributes, you have a corresponding rico:type datatype property. For the RiC-CM Coordinates attribute, you also have the rico:geographicalCoordinates datatype property. These datatype properties have a skos:scopeNote which says (for example) "Provided for usability reasons. May be made deprecated or removed later on. Use only if you don't use Physical Location and Coordinates classes with Place."

The same key design principle (RiC-O must be immediately usable) led us to define some datatype properties that enable users to use RiC-O in simple use cases where they do not want to consider dates and rules as full entities. Thus, there are of course Date and Rule classes in RiC-O (since there are Date and Rule entities in RiC-CM), and you also have 'date' datatype properties, plus a rico:ruleFollowed datatype property.
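The choice between the Carrier Type class and the rico:type datatype property can be sketched as a small conversion helper in plain Python. This is only an illustration under stated assumptions: the local names hasCarrierType and CarrierType, the example.org URIs, the Literal wrapper and the carrier_type_triples function are hypothetical and are not taken from the ontology text above; only rico:type and the namespace come from this page.

```python
from typing import NamedTuple, Optional

RICO = "https://www.ica.org/standards/RiC/ontology#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Literal(NamedTuple):
    """A plain text value, kept distinct from a URI string."""
    value: str


def carrier_type_triples(instantiation: str, label: str,
                         vocabulary: Optional[dict] = None) -> list:
    """Describe a carrier type as an entity when a URI is known, else as a literal."""
    uri = (vocabulary or {}).get(label)
    if uri is None:
        # Transitional option: keep the value as text through rico:type.
        return [(instantiation, RICO + "type", Literal(label))]
    return [
        # ASSUMPTION: 'hasCarrierType' and 'CarrierType' are illustrative local names.
        (instantiation, RICO + "hasCarrierType", uri),
        (uri, RDF_TYPE, RICO + "CarrierType"),
    ]


item = "https://example.org/instantiation-1"  # hypothetical resource
carrier_vocabulary = {"parchment": "https://example.org/carrier/parchment"}

as_text = carrier_type_triples(item, "parchment")
as_entity = carrier_type_triples(item, "parchment", carrier_vocabulary)
```

Both outputs keep every piece of information and avoid blank nodes; the literal form can later be replaced by the entity form once a vocabulary with stable URIs is available.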
The same analysis led us to keep the rico:history datatype property in RiC-O (the same as the RiC-CM history attribute), while RiC-CM and RiC-O also provide the Event class. Using a series of Events may of course be a better method, easier to query, link and display, than simply using a prose history. The two methods may be used in parallel within the same dataset by an institution that, for example, decides to emphasize only the accession, appraisal and description events among the whole history of Record Resources. These datatype properties have the same kind of skos:scopeNote as above.

Finally, we have introduced a few datatype properties that do not correspond to any RiC-CM attribute.

Some are superproperties, and thus group datatype properties: rico:physicalOrLogicalExtent, with rico:carrierExtent, rico:instantiationExtent and rico:recordResourceExtent as subproperties; rico:textualValue, with rico:expressedDate and rico:normalizedValue as subproperties; rico:measure (and its subproperties); rico:referenceSystem, superproperty of rico:dateStandard and of other datatype properties that do not exist in RiC-CM.

Some are simply more specific properties: rico:accrualStatus; rico:recordResourceStructure and rico:instantiationStructure, subproperties of rico:structure; rico:title (subproperty of rico:name); rico:altitude, rico:latitude and rico:longitude (subproperties of rico:measure); rico:geodesicSystem and rico:altimetricSystem (subproperties of rico:referenceSystem).

In order to connect all the classes created, a significant number of object properties have been defined. While their 'flat' list is long, they are grouped hierarchically, so that one can use the upper- to intermediate-level ones for simplicity's sake, choose the most accurate and expressive ones, or easily extend the system by adding a subproperty. It is expected that, in most use cases, only a subset of these properties will be needed.
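The practical effect of these property hierarchies can be sketched in plain Python: a lookup on a superproperty also returns values asserted with its subproperties. The subproperty pairs below are the ones named in the text above; the is_subproperty_of and values helpers, the example.org resource and the sample values are hypothetical, and a real system would normally rely on RDFS or OWL reasoning in a triplestore instead.

```python
RICO = "https://www.ica.org/standards/RiC/ontology#"
EX = "https://example.org/"  # hypothetical namespace for sample data

# Subproperty -> superproperty pairs named in the text above.
SUBPROPERTY_OF = {
    RICO + "title": RICO + "name",
    RICO + "altitude": RICO + "measure",
    RICO + "latitude": RICO + "measure",
    RICO + "longitude": RICO + "measure",
    RICO + "expressedDate": RICO + "textualValue",
    RICO + "normalizedValue": RICO + "textualValue",
}


def is_subproperty_of(prop, ancestor):
    """True if prop equals ancestor or reaches it by following SUBPROPERTY_OF."""
    seen = set()
    while prop is not None and prop not in seen:
        if prop == ancestor:
            return True
        seen.add(prop)
        prop = SUBPROPERTY_OF.get(prop)
    return False


def values(triples, subject, prop):
    """Return the sorted objects of subject for prop or any of its subproperties."""
    return sorted(o for s, p, o in triples
                  if s == subject and is_subproperty_of(p, prop))


triples = [
    (EX + "record-1", RICO + "title", "Letter to the prefect"),
    (EX + "record-1", RICO + "name", "Correspondence, item 12"),
    (EX + "record-1", RICO + "scopeAndContent", "Invitation to a ceremony."),
]

names = values(triples, EX + "record-1", RICO + "name")
titles = values(triples, EX + "record-1", RICO + "title")
```

Querying the general property (rico:name) returns both values, while querying the specific one (rico:title) returns only the title, which is the trade-off between simplicity and precision that the hierarchy is meant to offer.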
As already said above, some of the object properties are also formally defined as shortcuts.

Finally, let us mention that we added six individuals to RiC-O, considering that they would address current and frequent needs:

Two (FindingAid and AuthorityRecord) are instances of both the Documentary Form Type class and skos:Concept. They can be used for categorizing Records, finding aids and authority records being considered as Records. A Record that has 'Finding Aid' as its Documentary Form Type can be connected to one to many Record Resources using the 'rico:describes' object property.

Four (Fonds, Series, File, and Collection) are instances of both the Record Set Type class and skos:Concept. Their definition is taken from the ISAD(G) glossary. They can be used for categorizing Record Sets.

In the future, we can imagine that many other categories of this kind will be defined by the archival community, forming at least rich (and hopefully multilingual) SKOS vocabularies. Considering the concepts thus defined as also being instances of some RiC-O classes may be of great interest for producing a richer description (for example, an instance of the Documentary Form Type class may have a history and some temporal relations to other documentary form types).

RiC-O documentation and annotation properties

Each class or property has at least an English label (rdfs:label) and description (rdfs:comment). Some have a skos:scopeNote or a skos:example.

When a RiC-O class or property corresponds in some way to a RiC-CM component, its description and scope note are either the same as, or derived from, the definition and scope note in RiC-CM.

We have created two annotation properties, subproperties of rdfs:comment, for handling:

Information about the corresponding RiC-CM component when applicable (rico:RiCCMCorrespondingComponent). Various phrasings are used in this property depending on the rule applied for defining the RiC-CM component.
Information, most often in prose text for now, about possible mappings with other models or ontologies (rico:closeTo, rarely used in this version).

Finally, in this official 0.2 release, any change in the definition of a class or property since December 2019 is documented using a skos:changeNote.

Next steps

The following is a non-exhaustive list of known issues, topics or tasks on which EGAD has begun to work and will continue to work in the coming months:

Providing more examples of known implementations of RiC-O in different institutions and contexts. The goal is to show different practices on how RiC-O is being implemented. We have begun to release such examples in the public RiC-O repository on GitHub. One can also have a look at the Projects and tools page on the RiC-O website.

Finishing the system of relations (where some subclasses are still missing).

Assessing, and changing in some cases, the tense of the verbs in some object properties (e.g. for the properties that correspond to some RiC-CM relations). This has been done, following RiC-CM v0.2 updates, where many relations have changed name so that they can be used for recording both past and present situations.

Articulating the Event and Activity classes, and the Relation system of classes.

Improving the names of object properties. This has been done, following RiC-CM v0.2 updates and applying a few naming rules, so that, for example, the same verb is used for naming a relation and the inverse relation when it exists.

Adding suggestions of mappings (in rico:closeTo) and OWL equivalences between some classes or properties and components in other models (among which, and this is not an exhaustive list, CIDOC-CRM, IFLA-LRM, PREMIS, PROV-O, Wikidata and Schema.org).

Documenting RiC-O in French and Spanish.

Providing users with some SPARQL constructs for inferring.

The https://www.ica.org/standards/RiC/ontology# namespace defines:

2 owl:AnnotationProperty
106 owl:Class
62 owl:DatatypeProperty
423 owl:ObjectProperty
15 owl:SymmetricProperty

0 other term