El Grito de Sunset Park Use Case
Step 4: DATA MODEL TEST
This is a very important step. A model is only useful if it actually works!
To test the data model, you will need to build a database based on the model and populate it with some real-life data.
You can create a database using a database application, such as:
- OpenOffice Base / LibreOffice Base – Free and open source, multi-platform.
- MS Access – Comes with Microsoft Office, Windows only.
- FileMaker Pro – Owned by Apple, but multi-platform. More expensive than Access.
- MySQL – Free and open-source, needs a front-end interface.
- Uwazi – Free, open-source, web-based platform for organizing, analyzing, and publishing human rights information. Developed by HURIDOCS.
Once you have a test database up and running and have added some records, try running some queries, views, or reports, and see what you get! Some questions to ask yourself are:
- Can I retrieve the information that is needed for the goals of the project?
- Is there any important information missing?
- Is any of the information I retrieve invalid or incorrect?
- Do I get any database errors? Does anything break?
- Does the model seem efficient, or are there ways to simplify?
- Is the model easily “readable,” i.e. could someone else look at the model and easily understand my information structure?
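As a rough illustration of this kind of test, here is a minimal sketch using SQLite through Python's built-in sqlite3 module. SQLite is not one of the applications listed above; it is used here only because it needs no installation. The table names, column names, and records are all hypothetical placeholders, not the actual El Grito model or data.

```python
import sqlite3

# Hypothetical two-table model; every name and record here is a placeholder.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce relationships
conn.executescript("""
CREATE TABLE officers (
    officer_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE incidents (
    incident_id   INTEGER PRIMARY KEY,
    incident_date TEXT NOT NULL,
    location      TEXT,
    officer_id    INTEGER REFERENCES officers(officer_id)
);
""")

# Populate the test database with a few sample records.
conn.executemany("INSERT INTO officers VALUES (?, ?)",
                 [(1, "Officer A"), (2, "Officer B")])
conn.executemany("INSERT INTO incidents VALUES (?, ?, ?, ?)", [
    (1, "2014-06-01", "Location X", 1),
    (2, "2014-09-14", None, 1),
    (3, "2015-01-20", "Location Y", 2),
])

# Can I retrieve the information I need? Example: incidents per officer.
incidents_per_officer = conn.execute("""
    SELECT o.name, COUNT(i.incident_id)
    FROM officers o
    LEFT JOIN incidents i ON i.officer_id = o.officer_id
    GROUP BY o.officer_id
    ORDER BY o.officer_id
""").fetchall()

# Is any important information missing? Example: incidents with no location.
missing_location = conn.execute(
    "SELECT incident_id FROM incidents WHERE location IS NULL"
).fetchall()

# Does anything break? An incident pointing at a nonexistent officer
# should be rejected by the database rather than silently stored.
try:
    conn.execute("INSERT INTO incidents VALUES (4, '2015-02-01', 'Location Z', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(incidents_per_officer)
print(missing_location)
print(orphan_rejected)
```

Each query maps to one of the questions above. If a question cannot be expressed as a query at all, that may be a sign the model is missing an entity or a relationship.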
El Grito Example
We built the test database in FileMaker Pro, because it’s what WITNESS’s Media Archive already uses, and what one of our team members could most easily use to build a test database. It may or may not be the best choice for the project in the long run. While it has a simple user interface and is easy to install and get running, it is also one of the more expensive options (~$330 for an individual installation) and is not open-source.
We are making the test database (with no records included) available for download, in case others want to work with it, adapt it, or improve it. Please keep in mind that this was built for testing purposes; we make no warranties that it is error-free!
As part of refining and testing this model, we shared it with a project advisor who is a data expert. He observed that an officer’s rank, salary, shield number, and precinct histories are not really one-to-many relationships — an officer only has one rank, salary, shield number, or precinct at a time.
Instead, he suggested that the entities are in fact “Officers” and “Time,” which was confusing at first, but came to make sense. We could see that Officers and Time have a many-to-many relationship: an officer’s shield number, rank, etc. describe that officer at a given point in time.
So instead of modeling the relationship like this:
It might look more like this:
Modeling the relationship this way would better enable a user to query the shield number, rank, salary, and precinct of an officer on a given date, such as the date of an incident. But would modeling this way make it more difficult to glimpse an officer’s historical timeline (e.g. start and end dates that an officer served at each rank or salary level)? This is something we look forward to examining further as we continue to test and seek feedback!
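To explore that question, here is one possible reading of the Officers-and-Time idea, sketched with Python's built-in sqlite3 module. It assumes a single hypothetical officer_status table in which each row describes an officer over a date range; this is our own interpretation for illustration, not the advisor's actual design or the structure of the FileMaker test database. All names, shield numbers, and dates are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE officers (
    officer_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
-- Hypothetical table: one row per officer per period of time.
CREATE TABLE officer_status (
    officer_id    INTEGER NOT NULL REFERENCES officers(officer_id),
    start_date    TEXT NOT NULL,  -- ISO dates compare correctly as text
    end_date      TEXT,           -- NULL means the period is still open
    rank          TEXT,
    shield_number TEXT,
    precinct      TEXT
);
""")
conn.execute("INSERT INTO officers VALUES (1, 'Officer A')")
conn.executemany("INSERT INTO officer_status VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "2010-01-01", "2013-06-30", "Police Officer", "1111", "Precinct 1"),
    (1, "2013-07-01", "2015-12-31", "Sergeant", "2222", "Precinct 1"),
    (1, "2016-01-01", None, "Sergeant", "2222", "Precinct 2"),
])

def status_on(officer_id, date):
    """Return (rank, shield_number, precinct) for an officer on a given date."""
    return conn.execute("""
        SELECT rank, shield_number, precinct
        FROM officer_status
        WHERE officer_id = ?
          AND start_date <= ?
          AND (end_date IS NULL OR end_date >= ?)
    """, (officer_id, date, date)).fetchone()

def rank_timeline(officer_id):
    """Return (rank, start, end) spans, merging consecutive rows with the same rank."""
    rows = conn.execute("""
        SELECT rank, start_date, end_date
        FROM officer_status
        WHERE officer_id = ?
        ORDER BY start_date
    """, (officer_id,)).fetchall()
    timeline = []
    for rank, start, end in rows:
        if timeline and timeline[-1][0] == rank:
            # Same rank as the previous period: extend that span.
            timeline[-1] = (rank, timeline[-1][1], end)
        else:
            timeline.append((rank, start, end))
    return timeline

print(status_on(1, "2014-09-14"))
print(rank_timeline(1))
```

Under these assumptions, the point-in-time lookup is a single query, and the historical timeline is still recoverable, though it takes an extra step to merge consecutive periods in which the rank did not change. Whether that trade-off holds for the real model is something further testing would need to confirm.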