Mark Jordan edited this page Apr 24, 2018 · 45 revisions

This tutorial guides you through using MIK to create a set of Islandora basic image objects from metadata in a CSV file. When you finish the tutorial, you will be able to import the objects into Islandora.

To complete the tutorial, you will need a computer that has MIK installed on it.

You will also need a text editor. Any decent editor will do, which rules out Windows Notepad. If you don't already have a text editor installed, check out Atom. It's free, it works on all major operating systems, and it's easy to use.

And of course, you will also need to know a little bit about Islandora. In particular, the tutorial assumes that you know what Islandora objects are, and that you are familiar with at least one type of Islandora object, the Basic Image object.

Getting ready

The last thing you need to do is download the sample data used in the tutorial. It contains everything you need to create the Islandora import packages: sample images, metadata, and configuration files. When you unzip it, its contents should look like this:

MIK tutorial sample files

To get ready to start the tutorial,

  1. unzip the file
  2. move (not copy) the files that aren't images (tutorial_config.ini, tutorial_mappings.csv, and tutorial_metadata.csv) into the same directory where MIK is installed
  3. edit tutorial_config.ini to define your output and input directories.

The input directory will be the place where the image files that were in the zip file are located.

We will cover editing tutorial_config.ini in detail in Step 3, below.

One last thing before we begin

Good news - Step 1 ("Create your metadata CSV file") and Step 2 ("Create your mappings file") have already been done for you. You don't even need to edit those two files in order to proceed with the tutorial. We include the steps here to represent a typical MIK workflow, and to provide an overview of how those two files are structured. If you weren't using prepackaged content like that included in this tutorial, you would need to complete those two steps. In the Step 1 and Step 2 sections below, we'll describe what you would need to do if you were preparing your own content for use with MIK.

Step 1: Create your metadata CSV file

Even though you don't need to edit tutorial_metadata.csv to complete this tutorial, it may be useful to note a few things about the CSV metadata files that MIK can take as input:

  • The first row of the CSV file must contain column labels/headings. These are the "fields" of the metadata that MIK will convert to MODS.
  • All column headings must be unique, and the heading row cannot contain any empty cells.
  • By default, fields are separated by a comma, and enclosed in double quotation marks. However, you can specify other delimiters and enclosure characters in the .ini file if you want.
  • Each record in the CSV file corresponds to one Islandora object
  • One of the fields must contain a unique identifier for each row in the file. This field must be named in the [FETCHER] section's "record_key" configuration setting. In our file, it's the first column, "Identifier".
  • One of the fields contains the name of the file that is to be used in each of the created objects. This field must be named in the [FILE_GETTER] section's "file_name_field" configuration setting. In our file, it's the second column, "File".

tutorial_metadata.csv illustrates these requirements:

Identifier,File,Title,Creator,Date taken,Subjects,Note
"image01","IMG_1410.JPG","Small boats in Havana Harbour","Jordan, Mark","2015-03-08","Boats; water","Taken on vacation in Cuba."
"image02","IMG_2549.JPG","Manhatten Island","Jordan, Mark","2015-09-13","Cityscapes","Taken from the ferry from downtown New York to Highlands, NJ. Weather was windy."
"image03","IMG_2940.JPG","Looking across Burrard Inlet","Jordan, Mark","2011-08-01",,"View from Deep Cove to Burnaby Mountain. Simon Fraser University is visible on the top of the mountain in the distance."
"image04","IMG_2958.JPG","Amsterdam waterfront","Jordan, Mark","2013-01-17",,"Amsterdam waterfront on an overcast day."
"image05","IMG_5083.JPG","Alcatraz Island","Jordan, Mark","2014-01-14","Alcatraz Federal Penitentiary; islands","Taken from Fisherman's Wharf, San Francisco."

You can prepare your CSV metadata files in any application that can save data in a standard CSV format.
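
The requirements listed above can be checked before you hand a file to MIK. The following is a minimal sketch in Python, not part of MIK itself; the function name check_metadata_csv is hypothetical, and the sample text is two records copied from tutorial_metadata.csv. It only checks the heading row, the two required columns, and the uniqueness of the record key.

```python
import csv
import io

# Two records copied from tutorial_metadata.csv, used here as sample input.
SAMPLE_CSV = '''Identifier,File,Title,Creator,Date taken,Subjects,Note
"image01","IMG_1410.JPG","Small boats in Havana Harbour","Jordan, Mark","2015-03-08","Boats; water","Taken on vacation in Cuba."
"image04","IMG_2958.JPG","Amsterdam waterfront","Jordan, Mark","2013-01-17",,"Amsterdam waterfront on an overcast day."
'''


def check_metadata_csv(text, record_key="Identifier", file_name_field="File"):
    """Return a list of problems found; an empty list means the checks passed."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    problems = []
    headings, records = rows[0], rows[1:]
    # The heading row cannot contain empty cells.
    if any(heading.strip() == "" for heading in headings):
        problems.append("heading row contains an empty cell")
    # All column headings must be unique.
    if len(set(headings)) != len(headings):
        problems.append("column headings are not unique")
    # The columns named by record_key and file_name_field must exist.
    for field in (record_key, file_name_field):
        if field not in headings:
            problems.append(f"missing required column: {field}")
    # The record key must be unique across rows.
    if record_key in headings:
        index = headings.index(record_key)
        keys = [row[index] for row in records if len(row) > index]
        if len(set(keys)) != len(keys):
            problems.append(f"values in {record_key} are not unique")
    return problems


print(check_metadata_csv(SAMPLE_CSV))  # prints []
```

To check your own file, read it with open(path, newline="", encoding="utf-8").read() and pass the text to the function, along with the column names you set in record_key and file_name_field.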

Step 2: Create your mappings file

The mapping file contains two columns - in fact, it is also a CSV file. The column on the left identifies the field names in the "source" metadata record, and the column on the right defines the "target" MODS XML snippet that takes the value of the corresponding source field. Some important things about the snippets:

  • They must be well-formed XML (that is, opening and closing tags must match, and attributes must follow XML attribute syntax). You can check the well-formedness of your snippets by running the ./mik --config foo.ini --checkconfig snippets command. This command does not validate your snippets against a schema.
  • They must include all XML from the first child of the root element down; that is, they are appended to the root element of the MODS XML. In plain English, they can only contain elements listed as "top-level" in the MODS guidelines, but the elements can include any child elements that MODS allows.
  • The first row of your mapping file should not contain any column headings.
  • Snippets can contain the special %value% placeholder. MIK replaces this string with the value of the source metadata field. For example, if your metadata has a Title field and its value is "Amsterdam waterfront" and Title is mapped to the MODS snippet <titleInfo><title>%value%</title></titleInfo>, the resulting MODS markup will look like <titleInfo><title>Amsterdam waterfront</title></titleInfo>.

The mappings file used in this tutorial, tutorial_mappings.csv, looks like this:

Title,"<titleInfo><title>%value%</title></titleInfo>"
Creator,"<name type=""personal""><namePart>%value%</namePart><role><roleTerm type=""text"">photographer</roleTerm></role></name>"
Date taken,"<originInfo><dateCreated encoding=""w3cdtf"" keyDate=""yes"">%value%</dateCreated></originInfo>"
Subjects,"<subject><topic>%value%</topic></subject>"
Identifier,"<identifier type=""local"" displayLabel=""Local identifier"">%value%</identifier>"
Note, "<note>%value%</note>"
null0,"<genre authority=""marcgt"">picture</genre>"
null1,"<typeOfResource>still image</typeOfResource>"
null2,"<physicalDescription><digitalOrigin>born digital</digitalOrigin></physicalDescription>"
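
To make the %value% substitution concrete, here is a small Python sketch of the idea. It is an illustration only and not MIK's own code: the helper names fill_snippet and is_well_formed are hypothetical, and escaping the value before inserting it is a precaution added here, since this tutorial does not say how MIK treats characters such as & or <.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def fill_snippet(snippet, value):
    """Substitute an XML-escaped value for the %value% placeholder."""
    return snippet.replace("%value%", escape(value))


def is_well_formed(snippet):
    """Return True if the snippet parses as a single XML element."""
    try:
        ET.fromstring(snippet)
        return True
    except ET.ParseError:
        return False


snippet = "<titleInfo><title>%value%</title></titleInfo>"
filled = fill_snippet(snippet, "Amsterdam waterfront")
print(filled)                  # <titleInfo><title>Amsterdam waterfront</title></titleInfo>
print(is_well_formed(filled))  # True
```

Like the --checkconfig snippets command, this only tests well-formedness. It does not tell you whether the result is valid MODS.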

Step 3: Create an .ini file

Time for you to start editing a file.

MIK uses a "toolchain", which is a group of MIK components that are brought together to convert a specific type of input (like CSV metadata) into a specific type of output (like import packages for Islandora Basic Image objects). A toolchain is defined in an MIK configuration file, also known as an .ini file since that's the format the files take. All the .ini file contains is groups of configuration settings for your toolchain. MIK configuration files can also contain some comment lines that begin with a semicolon (;). These lines are ignored by MIK and really only function as inline documentation within the .ini file. You can also comment out a line to disable a configuration setting.

The .ini file below is the one that we'll be using in this tutorial. Even though this section is titled "Create an .ini file", you will only need to edit this one to run MIK, not create a new one. Specifically, you will need to change

  1. the path to your input directory (where the image files that were in the sample data zip file are),
  2. the path to your output directory,
  3. the path to your log file.

Different operating systems define paths differently. The .ini file below contains Linux paths, which look like this:

temp_directory = "/tmp/miktutorial_temp"

The values for the input_directory, output_directory, and path_to_log settings will need to be compatible with your operating system. For example, on Windows, paths look like this:

temp_directory = "c:\temp\miktutorial_temp"

whereas on a Mac they look like this:

temp_directory = "/Users/mark/miktutorial_temp"

Here is the .ini file as it is provided in the tutorial sample data. Assuming that MIK is installed correctly on your computer, and that you have moved tutorial_config.ini, tutorial_mappings.csv, and tutorial_metadata.csv into the same directory where MIK is installed, you should be able to run MIK after you have updated tutorial_config.ini with your own paths.

; MIK configuration file for the MIK Tutorial.

[CONFIG]
config_id = MIK tutorial
last_updated_on = "2016-02-03"
last_update_by = "Mark Jordan"

[FETCHER]
class = Csv
input_file = "tutorial_metadata.csv"
temp_directory = "/tmp/miktutorial_temp"
record_key = Identifier

[METADATA_PARSER]
class = mods\CsvToMods
mapping_csv_path = "tutorial_mappings.csv"

[FILE_GETTER]
class = CsvSingleFile
input_directory = "/home/mark/Downloads/mik_tutorial_data"
temp_directory = "/tmp/miktutorial_temp"
file_name_field = File

[WRITER]
class = CsvSingleFile
preserve_content_filenames = true
output_directory = "/tmp/miktutorial_output"
; Note that you will need to adjust the path to your system's php executable.
postwritehooks[] = "/usr/bin/php extras/scripts/postwritehooks/validate_mods.php"

[MANIPULATORS]
metadatamanipulators[] = "SplitRepeatedValues|Subjects|/subject/topic|;"

[LOGGING]
path_to_log = "/tmp/miktutorial_output/mik.log"
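
If you want a quick list of the path settings you are expected to review, the following Python sketch pulls them out of the configuration. It is a rough aid under stated assumptions: MIK itself is a PHP application, so Python's configparser is only a stand-in that happens to read this file's syntax, and the function name list_path_settings is hypothetical. The sample text is an abbreviated copy of the file above containing only the path settings.

```python
import configparser

# Abbreviated copy of tutorial_config.ini: only the settings that hold paths.
SAMPLE_INI = r'''
[FETCHER]
temp_directory = "/tmp/miktutorial_temp"

[FILE_GETTER]
input_directory = "/home/mark/Downloads/mik_tutorial_data"
temp_directory = "/tmp/miktutorial_temp"

[WRITER]
output_directory = "/tmp/miktutorial_output"

[LOGGING]
path_to_log = "/tmp/miktutorial_output/mik.log"
'''

# (section, option) pairs whose values are operating-system paths.
PATH_SETTINGS = [
    ("FETCHER", "temp_directory"),
    ("FILE_GETTER", "input_directory"),
    ("FILE_GETTER", "temp_directory"),
    ("WRITER", "output_directory"),
    ("LOGGING", "path_to_log"),
]


def list_path_settings(ini_text):
    """Return a dict mapping '[SECTION] option' to its unquoted path value."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)
    return {
        f"[{section}] {option}": parser.get(section, option).strip('"')
        for section, option in PATH_SETTINGS
    }


for name, value in list_path_settings(SAMPLE_INI).items():
    print(f"{name} = {value}")
```

Any value that still shows a path from the sample data, such as /home/mark/Downloads/mik_tutorial_data, is one you have not yet changed to match your own computer.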

Activity: Editing the .ini file

  1. Open tutorial_config.ini in your text editor.
  2. In the [FETCHER] section, modify the value of "temp_directory" so that....
  3. In the [FILE_GETTER] section, modify the value of "temp_directory" so that it has the same value as...
  4. In the [WRITER] section, modify the value of "output_directory" so that.....
  5. In the [LOGGING] section, modify the value of "path_to_log" so that ...
  6. Save your file in the MIK installation directory.

Step 4: Test (--checkconfig and MODS.xml)

To be honest, there's a lot that can go wrong here. MIK .ini files are a little complex.

While it is not absolutely necessary, you can (and in fact should) check your MIK configuration by running MIK and passing it the --checkconfig option in addition to telling it which .ini file to use. Running MIK with these two options looks like this:

Running MIK in checkconfig mode

As you can see, MIK provides some simple feedback indicating whether it encountered any problems with your .ini file.

Activity: Test your configuration

php mik --config tutorial_config.ini --checkconfig all

Step 5: Run MIK to create your import packages

If your configuration check didn't reveal any problems, you are ready to run MIK and generate your Islandora import packages:

Running MIK

When MIK finishes, it tells you where the packages are, where the MIK log is, and how long MIK took to run:

MIK after finished running

If you look in the output directory indicated in MIK's message, you will see an XML file corresponding to each of the images that we started with:

Output after MIK finishes running

This set of files is what you load into Islandora to create basic image objects. There is one thing you need to do before loading the files into Islandora, however: move (don't delete) the log files mik.log and problem_records.log. You don't want to load those into Islandora. After you move the log files, the only things that should be in your output directory are image files, each with a corresponding XML file:

Output after log files have been deleted
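
Moving the log files can be done by hand, but if you run MIK often a short script may be convenient. This is a minimal Python sketch, not something MIK provides; the function name move_logs and the destination directory used in the usage note are hypothetical.

```python
import shutil
from pathlib import Path


def move_logs(output_dir, destination_dir):
    """Move every *.log file out of output_dir and return the moved file names."""
    output_dir, destination_dir = Path(output_dir), Path(destination_dir)
    # Create the destination if it does not exist yet.
    destination_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for log_file in sorted(output_dir.glob("*.log")):
        shutil.move(str(log_file), str(destination_dir / log_file.name))
        moved.append(log_file.name)
    return moved
```

For example, move_logs("/tmp/miktutorial_output", "/tmp/miktutorial_logs") would move mik.log and problem_records.log out of the tutorial's output directory and leave the image and XML files in place.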

The XML files are MODS documents, which MIK has created from the original CSV metadata. Each MODS document describes one image.

Activity: Create your Islandora import packages

php mik --config tutorial_config.ini

If running MIK didn't cause any error messages to appear, and you see the message indicating where your packages are, you are done using MIK! The final two steps in the tutorial are less about MIK than about the content you created using MIK.

Step 6: Perform QA on packages

It is a very, very good idea to perform some quality assurance on the Islandora import packages before you import them into Islandora. At a minimum, you should:

  • open the problem_records.log file in your text editor to see if anything appears in it (it is normal for there to be a single line that indicates when MIK ran, etc.)
  • make sure there are no extra files or directories in the MIK output directory
  • open a random sample of MODS XML files to make sure they look like they should, e.g., are all the fields that were included in your mappings file in the MODS?
  • make sure your MODS documents are valid.
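
Some of the checks above can be partly automated. The Python sketch below is one possible approach, not a MIK feature, and the function name qa_output_directory is hypothetical. It assumes, as in this tutorial's output, that each content file and its MODS file share a base name (IMG_1410.JPG and IMG_1410.xml). It looks for unexpected subdirectories, unpaired files, and XML that is not well-formed. It ignores .log files, and it does not validate the MODS against a schema.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def qa_output_directory(output_dir):
    """Return a list of problems found in a MIK output directory."""
    entries = sorted(Path(output_dir).iterdir())
    problems = []
    xml_files = [p for p in entries if p.is_file() and p.suffix.lower() == ".xml"]
    # Everything that is neither XML nor a log is treated as a content file.
    content_files = [
        p for p in entries
        if p.is_file() and p.suffix.lower() not in (".xml", ".log")
    ]
    for entry in entries:
        if entry.is_dir():
            problems.append(f"unexpected directory: {entry.name}")
    xml_stems = {p.stem for p in xml_files}
    content_stems = {p.stem for p in content_files}
    for content_file in content_files:
        if content_file.stem not in xml_stems:
            problems.append(f"no MODS file for: {content_file.name}")
    for xml_file in xml_files:
        if xml_file.stem not in content_stems:
            problems.append(f"no content file for: {xml_file.name}")
        try:
            ET.parse(str(xml_file))  # well-formedness only, not validity
        except ET.ParseError as error:
            problems.append(f"{xml_file.name} is not well-formed: {error}")
    return problems
```

An empty list from qa_output_directory("/tmp/miktutorial_output") means none of these particular problems were found. Opening a sample of the MODS files by eye is still worthwhile.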

If your MIK configuration file included the line

postwritehooks[] = "/usr/bin/php extras/scripts/postwritehooks/validate_mods.php"

in its [WRITER] section (which it did unless you commented it out or removed it), MIK has already validated all of your MODS documents. Looking in the mik.log file will reveal whether any of the MODS files didn't validate.

Activity: Open mik.log to make sure your MODS files validated

Open mik.log in your text editor. You should see lines in it like

postwritehooks/validate_mods.php.INFO: MODS file validates {"MODS file":"/tmp/miktutorial_output/IMG_1410.xml"} []

If you see lines that indicate there was a problem, it is likely that the path to PHP defined in the post-write hook entry in your .ini file does not match the path to PHP on your computer. (@todo: create a cookbook entry on how to troubleshoot this and link it from here?)
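
With more than a handful of objects, reading mik.log line by line gets tedious. The Python sketch below summarizes the validation entries. It relies only on the success line shown above; the wording MIK uses for a failed validation is not given in this tutorial, so any other line mentioning validate_mods.php is simply set aside for you to read. The function name summarize_validation is hypothetical.

```python
import re

# Matches the success line shown in this tutorial and captures the file path.
VALIDATES = re.compile(
    r'validate_mods\.php\.INFO: MODS file validates \{"MODS file":"([^"]+)"\}'
)


def summarize_validation(log_text):
    """Return (validated_paths, other_lines) for validate_mods.php log entries."""
    validated, other_lines = [], []
    for line in log_text.splitlines():
        if "validate_mods.php" not in line:
            continue  # not written by the validation hook
        match = VALIDATES.search(line)
        if match:
            validated.append(match.group(1))
        else:
            other_lines.append(line)  # inspect these by hand
    return validated, other_lines


sample = (
    'postwritehooks/validate_mods.php.INFO: MODS file validates '
    '{"MODS file":"/tmp/miktutorial_output/IMG_1410.xml"} []'
)
validated, other_lines = summarize_validation(sample)
print(len(validated), "validated;", len(other_lines), "to inspect")
```

If the number of validated files is lower than the number of XML files in your output directory, or if there are no validation lines at all, check the path to PHP in the post-write hook entry first.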

Step 7: Load your packages into Islandora

The easiest way to load your content into Islandora is to zip it up and upload the resulting zip file using Islandora's web interface. When you create your zip file, make sure that your zip file contains only the output files and no subdirectories. Also note that you should move or delete all log files that MIK puts in the output directory before creating your zip file.
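
Any zip tool will do, but here is a minimal Python sketch that builds a flat zip file following the rules above. It is not part of MIK, and the function name zip_import_packages is hypothetical. It adds only the files that sit directly in the output directory, stores them without any directory path, and skips .log files. It also skips .zip files, so an earlier archive saved in the same directory is not packed into the new one.

```python
import zipfile
from pathlib import Path


def zip_import_packages(output_dir, zip_path):
    """Zip the files in output_dir (no subdirectories, no logs); return their names."""
    output_dir, zip_path = Path(output_dir), Path(zip_path)
    # Collect the files before creating the archive.
    files = [
        entry for entry in sorted(output_dir.iterdir())
        if entry.is_file() and entry.suffix.lower() not in (".log", ".zip")
    ]
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for entry in files:
            # arcname drops the directory part so the zip file has no folders.
            archive.write(entry, arcname=entry.name)
    return [entry.name for entry in files]
```

For example, zip_import_packages("/tmp/miktutorial_output", "/tmp/mik_tutorial.zip") would produce a zip file containing the five images and their five MODS files. The zip file name here is only an example.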

Once you have your zip file, you can upload it by going to your collection in Islandora, then to the 'Manage' tab, and then to the Collection subtab, where you will find the "Batch Import Objects" link. This is where you upload your zip file.

In the screenshot below, I have used "mik" and "mik tutorial" for the Islandora collection name, zip file name, and Islandora namespace. That's just a coincidence. You can use whatever collection name, zip file name, and namespace you want. Note however that if you are importing the packages created during this tutorial, you do need to import the objects into a collection that allows the "Islandora Basic Image Content Model".

Importing images

If you import the basic image objects created throughout this tutorial, they will look something like this:

Imported objects

That's it!

To summarize what we've covered in this tutorial:

  1. Create your metadata CSV file
  2. Create your mappings file
  3. Create an .ini file
  4. Test your configuration
  5. Run MIK to create Islandora import packages
  6. Perform quality checks on the packages
  7. Import your packages into Islandora

If you experienced problems with any of those tasks and need some help, open an issue or ask for help in the Islandora Google Group.

Things to try on your own

Now that you can run MIK and you know how to use its output, you may want to try some of the following activities.

  • Modify the values in the metadata file (but for now, keep the same column structure) and rerun MIK. Open the XML files that MIK creates in your text editor to see your values in the MODS.
  • Add a new field to the mappings file that will have the same value for all objects. For example, add the following line to the end of the file: null3,"<accessCondition type=""use and reproduction"">Images are in the public domain.</accessCondition>"
  • Add a new column to the CSV metadata file and populate it with different values for each image. Then add a mapping for the new field using the special %value% token so that your MODS will use the value of the new field for each image.
  • Browse the MIK Cookbook for solutions to specific types of tasks.
  • Learn about some of MIK's plugins, called "manipulators". The Random Set manipulator is really useful for testing your configuration and your source content, and the Piratize Abstract manipulator converts the content of <abstract> elements in your MODS XML to pirate speak. Not that you'd really want to do that, but the Piratize Abstract manipulator illustrates a potentially useful capability of MIK (using a web service to help enhance your metadata) and, well, because pirate speak.
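
When you add lines to the mappings file, as in the null3 activity above, the doubled quotation marks are easy to get wrong by hand. As a hedged illustration, the Python sketch below lets the standard csv module apply the quoting for you. The helper name mapping_row is hypothetical, and the snippet is the one from the activity above.

```python
import csv
import io


def mapping_row(source_field, snippet):
    """Return one mappings-file line with CSV quoting applied to the snippet."""
    buffer = io.StringIO()
    # The csv module doubles any quotation marks inside the snippet.
    csv.writer(buffer, lineterminator="\n").writerow([source_field, snippet])
    return buffer.getvalue()


row = mapping_row(
    "null3",
    '<accessCondition type="use and reproduction">'
    "Images are in the public domain.</accessCondition>",
)
print(row, end="")
# null3,"<accessCondition type=""use and reproduction"">Images are in the public domain.</accessCondition>"
```

You could append the returned line to tutorial_mappings.csv, or use it to check a line you typed yourself.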