Tutorial
This tutorial guides you through using MIK to create a set of Islandora basic image objects from metadata in a CSV file. When you finish the tutorial, you will be able to import the objects into Islandora.
To complete the tutorial, you will need a computer that has MIK installed on it.
You will also need a text editor. Any capable plain-text editor will do, although Windows Notepad is not a good choice. If you don't already have a text editor installed, check out Atom. It's free, it works on all major operating systems, and it's easy to use.
Finally, you will also need to know a little bit about Islandora. In particular, the tutorial below assumes that you know what Islandora objects are, and that you are familiar with some of the different types of Islandora objects, like Basic Image objects.
A zip file containing the sample images, metadata, and configuration files used in this tutorial can be downloaded here. It contains everything you need to create the Islandora import packages. When you unzip it, its contents should look like this:
To get ready to start the tutorial,
- unzip the file
- copy the files that aren't images (tutorial_config.ini, tutorial_mappings.csv, and tutorial_metadata.csv) into the same directory where MIK is installed, and
- edit tutorial_config.ini to define your input and output directories.
We will cover editing the tutorial_config.ini file in detail in Step 3, below.
Note that Step 1 ("Create your metadata CSV file") and Step 2 ("Create your mappings file") have already been done for you. You don't need to edit those two files in order to proceed with the tutorial. We include the steps here to represent a typical MIK workflow. In real life, if you weren't using prepackaged content like that included in this tutorial, you would need to complete those steps. In the Step 1 and Step 2 sections below, we'll describe what you would need to do if you were preparing your own content for use with MIK.
Even though you don't need to edit tutorial_metadata.csv to complete this tutorial, it will be useful to note a few things about the CSV metadata files that MIK can take as input:
- The first row of the CSV file must contain column labels/headings. These are the "fields" of the metadata that MIK will convert to MODS.
- All column headings must be unique, and the heading row cannot contain any empty cells.
- By default, fields are separated by a comma, and enclosed in double quotation marks. However, you can specify other delimiters and enclosure characters in the .ini file if you want.
- Each record in the CSV file corresponds to one Islandora object.
- One of the fields must contain a unique identifier for each row in the file. This field must be named in the [FETCHER] section's "record_key" configuration setting.
- One of the fields contains the name of the file that is to be used in each of the created objects. This field must be named in the [FILE_GETTER] section's "file_name_field" configuration setting.
tutorial_metadata.csv illustrates these attributes of CSV metadata files:
Identifier,File,Title,Creator,Date taken,Subjects,Note
"image01","IMG_1410.JPG","Small boats in Havana Harbour","Jordan, Mark","2015-03-08","Boats; water","Taken on vacation in Cuba."
"image02","IMG_2549.JPG","Manhatten Island","Jordan, Mark","2015-09-13","Cityscapes","Taken from the ferry from downtown New York to Highlands, NJ. Weather was windy."
"image03","IMG_2940.JPG","Looking across Burrard Inlet","Jordan, Mark","2011-08-01",,"View from Deep Cove to Burnaby Mountain. Simon Fraser University is visible on the top of the mountain in the distance."
"image04","IMG_2958.JPG","Amsterdam waterfront","Jordan, Mark","2013-01-17",,"Amsterdam waterfront on an overcast day."
"image05","IMG_5083.JPG","Alcatraz Island","Jordan, Mark","2014-01-14","Alcatraz Federal Penitentiary; islands","Taken from Fisherman's Wharf, San Francisco."
You can prepare your CSV metadata files in any application that can save data in a standard CSV format.
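If you are preparing your own metadata, the rules listed above can be checked before you run MIK. The following is a minimal Python sketch, not part of MIK: the function name check_metadata_csv is hypothetical, and the sketch assumes the default comma delimiter and double-quote enclosure.

```python
import csv
import io


def check_metadata_csv(text, record_key, file_name_field):
    """Return a list of problems found in CSV metadata text (empty if none)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    headings, records = rows[0], rows[1:]
    problems = []
    # The heading row cannot contain empty cells.
    if any(h.strip() == "" for h in headings):
        problems.append("heading row contains an empty cell")
    # All column headings must be unique.
    if len(set(headings)) != len(headings):
        problems.append("column headings are not unique")
    # The fields named in record_key and file_name_field must exist.
    for field in (record_key, file_name_field):
        if field not in headings:
            problems.append(f"missing column: {field}")
    # The record_key field must hold a unique value in every row.
    if record_key in headings:
        idx = headings.index(record_key)
        keys = [r[idx] for r in records if len(r) > idx]
        if len(set(keys)) != len(keys):
            problems.append(f"values in {record_key} are not unique")
    return problems


# A shortened copy of the tutorial metadata, used only for illustration.
sample = '''Identifier,File,Title
"image01","IMG_1410.JPG","Small boats in Havana Harbour"
"image02","IMG_2549.JPG","Manhatten Island"
'''
print(check_metadata_csv(sample, "Identifier", "File"))  # prints []
```

To check a real file, read its contents into a string first and pass the field names you use for "record_key" and "file_name_field" in your .ini file.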
The mapping file contains two columns - in fact, it is also a CSV file. The column on the left identifies the field names in the "source" metadata record, and the column on the right defines the "target" MODS XML snippet that takes the value of the corresponding source field. Some important things about the snippets:
- They must be well-formed XML (that is, opening and closing tags must match, and attributes must follow XML attribute syntax). You can check the well-formedness of your snippets by running the ./mik --config=foo.ini --checkconfig=snippets command. This command does not validate your snippets against a schema.
- They must include all XML from the first child of the root element down; that is, they are appended to the root element of the MODS XML.
- The first row of your mapping file should not contain any column headings.
- Snippets can contain the special %value% placeholder. MIK replaces this string with the value of the source metadata field. For example, if your metadata has a Title field and its value is "Amsterdam waterfront" and Title is mapped to the MODS snippet <titleInfo><title>%value%</title></titleInfo>, the resulting MODS markup will look like <titleInfo><title>Amsterdam waterfront</title></titleInfo>.
Title,"<titleInfo><title>%value%</title></titleInfo>"
Creator,"<name type=""personal""><namePart>%value%</namePart><role><roleTerm type=""text"">photographer</roleTerm></role></name>"
Date taken,"<originInfo><dateCreated encoding=""w3cdtf"" keyDate=""yes"">%value%</dateCreated></originInfo>"
Subjects,"<subject><topic>%value%</topic></subject>"
Identifier,"<identifier type=""local"" displayLabel=""Local identifier"">%value%</identifier>"
Note,"<note>%value%</note>"
null0,"<genre authority=""marcgt"">picture</genre>"
null1,"<typeOfResource>still image</typeOfResource>"
null2,"<physicalDescription><digitalOrigin>born digital</digitalOrigin></physicalDescription>"
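To make the placeholder mechanism concrete, here is a minimal Python sketch of the substitution described above. It is an illustration only and is not how MIK itself is implemented. The function names load_mappings and fill_snippet are hypothetical, and the sketch does not escape characters such as & or < in the metadata value.

```python
import csv
import io
import xml.etree.ElementTree as ET


def load_mappings(text):
    """Read two-column mapping CSV text into {source field: MODS snippet}."""
    reader = csv.reader(io.StringIO(text))
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def fill_snippet(snippet, value):
    """Put value in place of %value% and confirm the result is well formed."""
    filled = snippet.replace("%value%", value)
    # Wrap the snippet in a stand-in root element so it parses as one document.
    # ET.fromstring raises xml.etree.ElementTree.ParseError if it is malformed.
    ET.fromstring(f"<mods>{filled}</mods>")
    return filled


# Two lines from the tutorial mappings file, used only for illustration.
mappings = load_mappings(
    'Title,"<titleInfo><title>%value%</title></titleInfo>"\n'
    'null1,"<typeOfResource>still image</typeOfResource>"\n'
)
print(fill_snippet(mappings["Title"], "Amsterdam waterfront"))
# prints <titleInfo><title>Amsterdam waterfront</title></titleInfo>
```

A snippet without the placeholder, like the one mapped to null1, comes back unchanged, so it produces the same MODS element for every object.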
Time for you to start editing a file.
MIK uses a "toolchain", which is a group of MIK components that are brought together to convert a specific type of input (like CSV metadata) into a specific type of output (like import packages for Islandora Basic Image objects). A toolchain is defined in an MIK configuration file, also known as an .ini file since that's the format the files take. All the .ini file contains is groups of configuration settings for your toolchain. MIK configuration files can also contain comment lines that begin with a semicolon (;). These lines are ignored by MIK and function only as inline documentation within the .ini file. You can also comment out a line to disable a configuration setting.
The .ini file below is the one that we'll be using in this tutorial. Even though this section is titled "Create an .ini file", you will only need to edit this one to run MIK, not create a new one. Specifically, you will need to change:
- the path to your input directory,
- the path to your output directory,
- the path to your log file.
Different operating systems define paths differently. The .ini file below contains Linux paths, which look like this:
temp_directory = "/tmp/miktutorial_temp"
The values for the input_directory, output_directory, and path_to_log settings will need to be compatible with your operating system. For example, on Windows, paths look like this:
temp_directory = "c:\temp\miktutorial_temp"
whereas on a Mac they look like this:
temp_directory = "/Users/mark/miktutorial_temp"
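If you are unsure how a path should be written for a given operating system, Python's pathlib module can format it for you. This is a small illustrative sketch: the helper name ini_setting is hypothetical, and the directory names are the ones used in the examples above.

```python
from pathlib import PurePosixPath, PureWindowsPath


def ini_setting(name, path):
    """Format one .ini line with the path enclosed in double quotes."""
    return f'{name} = "{path}"'


# Linux and Mac paths use forward slashes.
print(ini_setting("temp_directory", PurePosixPath("/tmp") / "miktutorial_temp"))
# prints temp_directory = "/tmp/miktutorial_temp"

# Windows paths use a drive letter and backslashes.
print(ini_setting("temp_directory", PureWindowsPath("c:/temp") / "miktutorial_temp"))
# prints temp_directory = "c:\temp\miktutorial_temp"
```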
Here is the .ini file as it is provided in the tutorial sample data. Assuming that MIK is installed correctly on your computer, and that you have copied tutorial_config.ini, tutorial_mappings.csv, and tutorial_metadata.csv into the same directory where MIK is installed, you should be able to run MIK after you have updated tutorial_config.ini with your own paths.
; MIK configuration file for the MIK Tutorial.
[CONFIG]
config_id = MIK tutorial
last_updated_on = "2016-02-03"
last_update_by = "Mark Jordan"
[FETCHER]
class = Csv
input_file = "tutorial_metadata.csv"
temp_directory = "/tmp/miktutorial_temp"
record_key = Identifier
[METADATA_PARSER]
class = mods\CsvToMods
mapping_csv_path = "tutorial_mappings.csv"
[FILE_GETTER]
class = CsvSingleFile
input_directory = "/home/mark/Downloads/mik_tutorial_data"
temp_directory = "/tmp/miktutorial_temp"
file_name_field = File
[WRITER]
class = CsvSingleFile
preserve_content_filenames = true
output_directory = "/tmp/miktutorial_output"
; Note that you will need to adjust the path to your system's php executable.
postwritehooks[] = "/usr/bin/php extras/scripts/postwritehooks/validate_mods.php"
; During testing, we only want to create the MODS XML file.
; datastreams[] = "MODS"
[MANIPULATORS]
metadatamanipulators[] = "FilterModsTopic|subject"
[LOGGING]
path_to_log = "/tmp/miktutorial_output/mik.log"
- Open tutorial_config.ini in your text editor.
- In the [FETCHER] section, modify the value of "temp_directory" so that....
- In the [FILE_GETTER] section, modify the value of "temp_directory" so that it has the same value as...
- In the [WRITER] section, modify the value of "output_directory" so that.....
- In the [LOGGING] section, modify the value of "path_to_log" so that ...
- Save your file in the MIK installation directory.
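After editing, you may want to list the path settings in your .ini file to confirm they contain what you intended. The sketch below uses Python's configparser module, which is not the parser MIK uses, so treat it as a rough check only. The function name list_path_settings is hypothetical, and the sample text is a shortened copy of the tutorial configuration.

```python
import configparser

# The (section, setting) pairs in the tutorial .ini file that hold paths.
PATH_SETTINGS = [
    ("FETCHER", "temp_directory"),
    ("FILE_GETTER", "input_directory"),
    ("FILE_GETTER", "temp_directory"),
    ("WRITER", "output_directory"),
    ("LOGGING", "path_to_log"),
]


def list_path_settings(ini_text):
    """Return {(section, setting): path} with enclosing double quotes removed."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep setting names exactly as written
    parser.read_string(ini_text)
    found = {}
    for section, setting in PATH_SETTINGS:
        if parser.has_option(section, setting):
            found[(section, setting)] = parser.get(section, setting).strip('"')
    return found


sample = '''; MIK configuration file for the MIK Tutorial.
[FILE_GETTER]
class = CsvSingleFile
input_directory = "/home/mark/Downloads/mik_tutorial_data"
[WRITER]
output_directory = "/tmp/miktutorial_output"
[LOGGING]
path_to_log = "/tmp/miktutorial_output/mik.log"
'''
for (section, setting), path in list_path_settings(sample).items():
    print(f"[{section}] {setting} = {path}")
```

Settings that are missing from the file are simply left out of the result, so an absent entry in the printed list is a hint that a section or setting name was mistyped.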
php mik --config=tutorial_config.ini --checkconfig=all
php mik --config=tutorial_config.ini
Zip up the output from MIK. Be sure to remove the log files from the output directory first.
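Any zip tool will work for this step. As one option, the Python sketch below zips an output directory while skipping log files, rather than deleting them first. It assumes that log files end in .log, as mik.log does in the tutorial configuration. The function name zip_output and the file image01.xml in the demonstration are hypothetical stand-ins.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_output(output_dir, zip_path):
    """Zip every file under output_dir except *.log files; return names added."""
    output_dir = Path(output_dir)
    zip_path = Path(zip_path)
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(output_dir.rglob("*")):
            if not path.is_file() or path.suffix == ".log":
                continue
            # Skip the archive itself if it was created inside output_dir.
            if path.resolve() == zip_path.resolve():
                continue
            name = path.relative_to(output_dir).as_posix()
            archive.write(path, name)
            added.append(name)
    return added


# Demonstration with a throwaway directory standing in for MIK's output.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "output"
    out.mkdir()
    (out / "image01.xml").write_text("<mods/>")
    (out / "mik.log").write_text("log line")
    print(zip_output(out, Path(tmp) / "output.zip"))  # prints ['image01.xml']
```

To use it on real output, pass the directory named in your "output_directory" setting and the path where the zip file should be written.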
Now that you can run MIK and you know how to use its output, you may want to try some of the following activities.
- Modify the values in the metadata file (but for now, keep the same column structure) and rerun MIK. Open the XML files that MIK creates in your text editor to see your values in the MODS.
- Add a new field to the mappings file that will have the same value for all objects. For example, add the following line to the end of the file:
null3,"<accessCondition type=""use and reproduction"">Images are in the public domain.</accessCondition>"
- Add a new column to the CSV metadata file and populate it with different values for each image. Then add a mapping for the new field using the special
%value%
token so that your MODS will use the value of the new field for each image.
Content on the Move to Islandora Kit wiki is licensed under a Creative Commons Attribution 4.0 International License.