Cookbook: Templated Metadata Parser
The Templated metadata parser creates MODS or DC XML from Twig templates. It differs from the InsertXmlFromTemplate metadata manipulator used with the CSV and CONTENTdm toolchains in that it generates an entire MODS or DC XML file, whereas the InsertXmlFromTemplate metadata manipulator only generates a single top-level MODS element.
The Templated metadata parser is a drop-in replacement for the mods\CsvToMods and mods\CdmToMods metadata parsers, and can be used with any toolchains those two metadata parsers can be. It does not use a mappings file, but instead inserts CSV or CONTENTdm metadata values directly into the template. This metadata parser has some advantages over mods\CsvToMods and mods\CdmToMods, but it also has some limitations:
Parser | Pros | Cons |
---|---|---|
mods\CsvToMods and mods\CdmToMods | Can use configure-and-run metadata manipulators | Require mappings files, which can be tricky to create |
mods\CsvToMods and mods\CdmToMods | Use simple one-to-one mappings between source and output metadata structures | |
templated\Templated | Avoids mappings files | Cannot use configure-and-run metadata manipulators (other than the SimpleReplaceTemplated manipulator) |
templated\Templated | Allows the use of Twig's control structures and filters | |
Using CSV input data like this:
Identifier,File,Title,Creator,Description
"image01","IMG_1410.JPG","Small boats in Havana Harbour on a sunney day","Jordan, Mark","Taken on vacation in Cuba."
"image02","IMG_2549.JPG","Manhattan Island","Jordan, Mark","Taken from the ferry from downtown New York to Highlands, NJ. Weather was windy."
"image03","IMG_2940.JPG","Looking across Burrard Inlet","Jordan, Mark","View from Deep Cove to Burnaby Mountain. Simon Fraser University is visible on the top of the mountain in the distance."
"image04","IMG_2958.JPG","Amsterdam waterfront in a picture","Jordan, Mark","Amsterdam waterfront on an overcast day."
"image05","IMG_5083.JPG","Alcatraz Island from Fisherman's Wharf","Jordan, Mark","Taken from Fisherman's Wharf, San Francisco."
and a Twig template like this:
<?xml version="1.0"?>
<mods xmlns="http://www.loc.gov/mods/v3" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<titleInfo>
<title>{{ Title }}</title>
</titleInfo>
<name type="personal">
<namePart>{{ Creator }}</namePart>
<role>
<roleTerm type="text">photographer</roleTerm>
</role>
</name>
<abstract>{{ Description }}</abstract>
<identifier type="local" displayLabel="Local identifier">{{ Identifier }}</identifier>
<typeOfResource>still image</typeOfResource>
</mods>
this metadata parser can generate MODS XML files like this:
<?xml version="1.0"?>
<mods xmlns="http://www.loc.gov/mods/v3" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<titleInfo>
<title>Small boats in Havana Harbour on a sunney day</title>
</titleInfo>
<name type="personal">
<namePart>Jordan, Mark</namePart>
<role>
<roleTerm type="text">photographer</roleTerm>
</role>
</name>
<abstract>Taken on vacation in Cuba.</abstract>
<identifier type="local" displayLabel="Local identifier">image01</identifier>
<typeOfResource>still image</typeOfResource>
</mods>
Twig features such as control structures, filters, functions, and whitespace control are available within the templates. For example, the template below uses Twig's if/elseif/else control structure and its length, trim, and slice filters. It tests for an empty title first (if not Title|length), because an empty string also satisfies the length < 256 condition and the [no title] branch would otherwise never be reached.
<?xml version="1.0"?>
<mods xmlns="http://www.loc.gov/mods/v3" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<titleInfo>
{% if not Title|length %}
<title>[no title]</title>
{% elseif Title|length < 256 %}
<title>{{ Title|trim }}</title>
{% else %}
<title>{{ Title|slice(0,255) | trim }} [...]</title>
{% endif %}
</titleInfo>
<name type="personal">
<namePart>{{ Creator }}</namePart>
<role>
<roleTerm type="text">photographer</roleTerm>
</role>
</name>
<abstract>{{ Description }}</abstract>
<identifier type="local" displayLabel="Local identifier">{{ Identifier }}</identifier>
<typeOfResource>still image</typeOfResource>
</mods>
Using this metadata parser requires a couple of specific settings in your .ini file:
- The [METADATA_PARSER] class value must be templated\Templated.
- Instead of the [METADATA_PARSER] mapping_csv_path setting used by the mods\CsvToMods and mods\CdmToMods metadata parsers, this one uses the [METADATA_PARSER] template setting, whose value is the path to the Twig template.
[FETCHER]
class = Csv
input_file = templated_metadata.csv
temp_directory = /tmp/templated_temp
record_key = Identifier
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; This is the only section of the .ini file that is
; specific to this metadata parser.
[METADATA_PARSER]
class = templated\Templated
template = templated_mods_twig.xml
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[FILE_GETTER]
class = CsvSingleFile
input_directory = "/home/mark/Downloads/mik_tutorial_data"
temp_directory = /tmp/templated_temp
file_name_field = File
[WRITER]
class = CsvSingleFile
preserve_content_filenames = false
output_directory = /tmp/templated_output
datastreams[] = MODS
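The two settings above can be checked before a run. The sketch below is one way to do that with PHP's built-in parse_ini_string(); the function name checkTemplatedSettings is hypothetical and is not part of MIK, and the check covers only the two [METADATA_PARSER] settings described on this page.

```php
<?php

/**
 * Hypothetical pre-flight check for the [METADATA_PARSER] section.
 *
 * @param array $settings Output of parse_ini_string() or parse_ini_file()
 *                        with sections enabled.
 *
 * @return string[] A list of problems; empty if none were found.
 */
function checkTemplatedSettings(array $settings)
{
    $problems = array();
    $parser = isset($settings['METADATA_PARSER'])
        ? $settings['METADATA_PARSER']
        : array();

    // The class value must name the templated parser.
    if (!isset($parser['class']) || $parser['class'] !== 'templated\\Templated') {
        $problems[] = '[METADATA_PARSER] class must be templated\\Templated';
    }
    // The template value must be a non-empty path to the Twig template.
    if (empty($parser['template'])) {
        $problems[] = '[METADATA_PARSER] template (path to the Twig template) is missing';
    }

    return $problems;
}

// Nowdoc syntax keeps the backslash in the class name literal.
$ini = <<<'INI'
[METADATA_PARSER]
class = templated\Templated
template = templated_mods_twig.xml
INI;

print_r(checkTemplatedSettings(parse_ini_string($ini, true)));
```

An empty array means both settings are present. The check does not confirm that the template file exists or that it is valid Twig.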
Metadata manipulators used by the mods\CsvToMods and mods\CdmToMods metadata parsers are not available to the templated\Templated metadata parser. The only metadata manipulator that is available is SimpleReplaceTemplated, which, as the name suggests, lets you perform simple search and replace operations on the generated XML. It is registered in the .ini file like this:
[MANIPULATORS]
metadatamanipulators[] = "SimpleReplaceTemplated|/Island/|Peninsula"
; This metadata manipulator logs its operations, so you must include the path
; to the log in your .ini file.
[LOGGING]
path_to_manipulator_log = /tmp/templated_output/manipulator.log
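To make the effect of that registration concrete, here is a minimal sketch of a regex search and replace over a generated XML string. It assumes that the second and third pipe-delimited parameters are a PCRE pattern and its replacement, applied with something like PHP's preg_replace(). The function name simpleReplaceSketch is hypothetical, and this is not MIK's actual implementation.

```php
<?php

/**
 * Hypothetical helper: split a "Name|/pattern/|replacement" registration
 * string and apply the pattern to an XML string with preg_replace().
 */
function simpleReplaceSketch($registration, $xml)
{
    // Limit to three parts so that a "|" in the replacement survives.
    $parts = explode('|', $registration, 3);
    if (count($parts) !== 3) {
        return $xml;
    }
    list(, $pattern, $replacement) = $parts;

    $result = preg_replace($pattern, $replacement, $xml);

    // preg_replace() returns null on error; fall back to the input.
    return $result === null ? $xml : $result;
}

$xml = '<titleInfo><title>Alcatraz Island from Fisherman\'s Wharf</title></titleInfo>';
echo simpleReplaceSketch('SimpleReplaceTemplated|/Island/|Peninsula', $xml);
```

Under these assumptions the pattern runs over the whole XML string, so it can also match element names and attribute values. Keeping patterns specific avoids unintended replacements.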
If you want to write a custom metadata manipulator that uses PHP's DOM interface (or SimpleXML, or XSLT) to modify the XML, SimpleReplaceTemplated can be used as a model. Your manipulator's ->manipulate() method would look like this:
/**
* @param string $input An XML file to be manipulated.
*
* @return string
* Manipulated string
*/
public function manipulate($input)
{
// Manipulate the XML using DOM here.

// Return the manipulated XML as a string.
return $input;
}
Put your manipulator class file in src/metadatamanipulators, run composer dump-autoload, and register the manipulator in your .ini file:
[MANIPULATORS]
metadatamanipulators[] = "MyCustomMetadataManipulator"
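As a rough illustration of the DOM approach, the sketch below shows one way a manipulate() method could modify generated MODS. It is a standalone class, not MIK code: the class name AddGenreManipulator and the genre value are hypothetical. A real manipulator would also need to follow MIK's own class and constructor conventions (SimpleReplaceTemplated is the model), which are not shown here.

```php
<?php

/**
 * Hypothetical standalone sketch. A real MIK manipulator would live in
 * src/metadatamanipulators and follow MIK's manipulator conventions.
 */
class AddGenreManipulator
{
    const MODS_NS = 'http://www.loc.gov/mods/v3';

    /**
     * @param string $input The generated MODS XML, as a string.
     *
     * @return string The manipulated XML, or the input unchanged on failure.
     */
    public function manipulate($input)
    {
        $dom = new DOMDocument();
        // Suppress parser warnings; loadXML() returns false on malformed XML.
        if (!@$dom->loadXML($input)) {
            return $input;
        }

        // Leave the document alone if it already has a <genre> element.
        if ($dom->getElementsByTagNameNS(self::MODS_NS, 'genre')->length > 0) {
            return $input;
        }

        // Append <genre>photograph</genre> (an illustrative value) to <mods>.
        $genre = $dom->createElementNS(self::MODS_NS, 'genre', 'photograph');
        $dom->documentElement->appendChild($genre);

        return $dom->saveXML();
    }
}

$mods = '<?xml version="1.0"?>'
    . '<mods xmlns="http://www.loc.gov/mods/v3">'
    . '<titleInfo><title>Looking across Burrard Inlet</title></titleInfo>'
    . '</mods>';

$manipulator = new AddGenreManipulator();
echo $manipulator->manipulate($mods);
```

Returning the input unchanged when parsing fails keeps one malformed record from stopping a run. Whether that is the right policy depends on how you want errors surfaced.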
Content on the Move to Islandora Kit wiki is licensed under a Creative Commons Attribution 4.0 International License.