El Grito de Sunset Park Use Case

Step 5: SCHEMA DEVELOPMENT & DOCUMENTATION

What is a metadata schema?

After the data model is complete, the next step is to create a metadata schema. The official ISO definition of a schema is “a logical plan showing the relationships between metadata elements, normally through establishing rules for the use and management of metadata specifically as regards the semantics, the syntax and the optionality (obligation level) of values.” (ISO 23081).

In other words, a schema is similar to a data model, except that it also defines rules for implementing the data model in a database, such as how data should be entered (e.g. format, spellings, controlled vocabularies) and whether each field is mandatory or optional.

Two of the primary ways that metadata schemas are expressed or documented are:

  • Data dictionaries
  • XML schema definition (XSD)

Data Dictionaries and Schema Definitions

A data dictionary is a document, like a manual, for implementing a data model and schema. It explains and provides definitions for all of the data elements in a data model, and provides guidelines on how to create data that conforms to its structures and rules. Besides a narrative description that discusses the overall model and its design, a data dictionary will usually go through each attribute, or field, and define the following:

  • Entity: The larger entity that the element belongs to, if the data structure has one, e.g. Officers, Incidents, Videos.
  • Attribute Name or Label: The name of the attribute in the database, e.g. FirstName, LastName.
  • Definition: The meaning of the data element, e.g. “The number worn on a badge to identify a lower-level police officer.”
  • Rationale: What purpose this element serves, or why it is included in the schema.
  • Data Type: The type of data that the element holds, e.g. free-text, date, number, calculation.
  • Data Rules / Guidelines: Rules or guidelines for data entry, e.g. “Enter date as yyyy-mm-dd.” or “Shield numbers must be exactly 5 digits.” You can also point to any internal or external controlled vocabularies that should be used, e.g. “Use Berkeley Copwatch misconduct codes”.
  • Obligation to Use: Whether use of this element is mandatory, recommended, or optional.
  • Mapping to External Databases: If applicable, how this field maps to other schemas.
  • Example: An example of a valid entry.
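To make the components above concrete, here is a minimal Python sketch of how data dictionary entries could be held as structured data and used to check the obligation level of each field. The class name, the function name, the IncidentDate attribute, and the rationale wording are assumptions made for illustration; they are not part of the El Grito model or of any agreed standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DictionaryEntry:
    """One entry of a data dictionary, mirroring the components listed above."""
    entity: str
    name: str
    definition: str
    rationale: str
    data_type: str
    rules: str
    obligation: str  # "mandatory", "recommended", or "optional"
    example: str
    mapping: Optional[str] = None  # mapping to an external schema, if any


# Hypothetical entries; the wording is illustrative only.
SHIELD_NUMBER = DictionaryEntry(
    entity="Officers",
    name="ShieldNumber",
    definition="The number worn on a badge to identify a lower-level police officer.",
    rationale="Lets the same officer be identified across incidents.",
    data_type="number",
    rules="Shield numbers must be exactly 5 digits.",
    obligation="recommended",
    example="12345",
)

INCIDENT_DATE = DictionaryEntry(
    entity="Incidents",
    name="IncidentDate",
    definition="The date on which the incident took place.",
    rationale="Allows incidents to be sorted and searched by date.",
    data_type="date",
    rules="Enter date as yyyy-mm-dd.",
    obligation="mandatory",
    example="2019-05-01",
)


def missing_mandatory(entries, record):
    """Return the names of mandatory attributes that are absent or blank in a record."""
    missing = []
    for entry in entries:
        if entry.obligation != "mandatory":
            continue
        value = record.get(entry.name)
        if value is None or not str(value).strip():
            missing.append(entry.name)
    return missing
```

For example, `missing_mandatory([SHIELD_NUMBER, INCIDENT_DATE], {"ShieldNumber": "12345"})` reports that IncidentDate still needs to be filled in. Keeping the dictionary as data means the same entries can drive both the written manual and simple checks like this one.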


An XML schema definition is like a data dictionary, but meant for computers (although it is human-readable too). XML stands for Extensible Markup Language. It’s a tagged-text language similar to HTML. An XML schema definition (XSD) is a specific kind of XML document that defines a schema. It can be used to make and validate metadata records that are written in XML.
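Because an XSD is itself an XML document, it can be read with ordinary XML tools. The sketch below embeds a small XSD fragment in Python and lists the fields it declares along with their obligation level. The Officer structure and its element names are hypothetical, chosen only to echo the examples in this guide. Note that the Python standard library can read an XSD but cannot validate records against it; that requires a third-party library such as lxml.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical XSD fragment for an "Officer" record; element names are illustrative.
OFFICER_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Officer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="LastName" type="xs:string"/>
        <xs:element name="FirstName" type="xs:string" minOccurs="0"/>
        <xs:element name="ShieldNumber" minOccurs="0">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:pattern value="[0-9]{5}"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def list_fields(xsd_text):
    """Return (name, obligation) pairs for the fields declared in the XSD."""
    root = ET.fromstring(xsd_text)
    fields = []
    for element in root.iter(f"{{{XS}}}element"):
        if element.find(f"{{{XS}}}complexType") is not None:
            continue  # skip the container element (here, Officer) itself
        # In XSD, minOccurs defaults to 1, so an element is required unless it says 0.
        obligation = "optional" if element.get("minOccurs") == "0" else "mandatory"
        fields.append((element.get("name"), obligation))
    return fields
```

In this fragment, the `xs:pattern` restriction expresses the “exactly 5 digits” rule for shield numbers, and `minOccurs="0"` marks a field as optional.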

Here are some examples of XSDs for metadata schemas used in the archiving and library world:

Data Values and Rules

As mentioned above, a metadata schema can define rules for data values. These rules can specify the vocabulary, syntax, or format to use during data entry.

Rules help ensure that data is represented consistently. They can also make data entry easier: for example, a dropdown list of controlled vocabulary terms lets catalogers select a term rather than type it out manually.
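As a rough illustration of the three kinds of rules (vocabulary, syntax, and format), the Python sketch below checks one record against one rule of each kind. The field names, the function name, and the vocabulary terms are placeholders assumed for this example; a real project would substitute its own agreed vocabulary and fields.

```python
import re
from datetime import datetime

# Hypothetical controlled vocabulary; the terms are placeholders, not a published list.
OFFICER_ACTIONS = {"stop", "search", "arrest"}


def check_values(record):
    """Return a list of rule violations for one record (empty if it conforms)."""
    problems = []

    # Vocabulary rule: the value must come from the controlled list.
    action = record.get("OfficerActions", "")
    if action and action not in OFFICER_ACTIONS:
        problems.append(f"OfficerActions: '{action}' is not in the controlled vocabulary")

    # Syntax rule: shield numbers must be exactly 5 digits.
    shield = record.get("ShieldNumber", "")
    if shield and not re.fullmatch(r"[0-9]{5}", shield):
        problems.append("ShieldNumber: must be exactly 5 digits")

    # Format rule: dates are entered as yyyy-mm-dd and must be real calendar dates.
    date = record.get("IncidentDate", "")
    if date:
        well_formed = re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", date) is not None
        if well_formed:
            try:
                datetime.strptime(date, "%Y-%m-%d")
            except ValueError:
                well_formed = False
        if not well_formed:
            problems.append("IncidentDate: enter date as yyyy-mm-dd")

    return problems
```

Empty fields are skipped here on purpose: whether a field must be filled in is a question of obligation, which is separate from whether a supplied value follows the rules.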

El Grito Example

The goal of this process was to try to describe or express the information structure that emerged from working with El Grito and their video collection, and to create a model that can be built upon by others. A fully fledged metadata schema is beyond the scope of the project at this point, but it is a potentially worthwhile next step. Developing a metadata schema, along with other community standards like controlled vocabularies and definitions, would ideally be pursued in collaboration with others working on police accountability.

HUMAN RIGHTS CONTROLLED VOCABS

HURIDOCS has created a multilingual Degrees of Involvement micro-thesaurus for describing the ways that perpetrators are involved in acts of human rights abuse. This vocabulary could be used, for example, in the “Officer Actions” attribute in the El Grito data model.


HURIDOCS has also published other micro-thesauri for describing human rights, available as multilingual PDFs and as individual Google Sheets.