
5. Metadata Files

Paulo Pinheiro edited this page Sep 2, 2019 · 22 revisions

HADatAc uses five types of metadata files, listed below. For each file type, this page gives its naming convention, ingestion restriction, description, and the error messages that may be generated during ingestion. Error messages are written to the file's log, which is displayed by selecting the "log" button next to the file in the Manage Data File Ingestion option (HADatAc Home > Manage Data File Ingestion).
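The naming conventions given for each file type on this page can be collected into a small lookup table and checked before upload. The sketch below is only an illustration, not part of HADatAc: the function name `metadata_file_type` is hypothetical, matching is assumed to be case-sensitive, and the SDD prefix is assumed to be "SDD-".

```python
from pathlib import Path

# Prefix and allowed extensions per metadata file type, as listed on this page.
# Assumption: the SDD prefix is "SDD-" and matching is case-sensitive.
NAMING_RULES = {
    "DPL": ("DPL-", {".xlsx"}),
    "STD": ("STD-", {".xlsx", ".csv"}),
    "SSD": ("SSD-", {".xlsx"}),
    "SDD": ("SDD-", {".xlsx"}),
    "STR": ("STR-", {".xlsx", ".csv"}),
}


def metadata_file_type(filename):
    """Return the metadata file type implied by the name, or None if no rule matches."""
    name = Path(filename).name      # ignore any directory part
    suffix = Path(name).suffix      # e.g. ".xlsx"
    for file_type, (prefix, extensions) in NAMING_RULES.items():
        if name.startswith(prefix) and suffix in extensions:
            return file_type
    return None


print(metadata_file_type("DPL-sensors.xlsx"))  # DPL
print(metadata_file_type("SSD-cohort.csv"))    # None (SSD files must end with ".xlsx")
```

A name that returns `None` here would not follow any of the five conventions.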

Deployment File (DPL)

Name convention: Starts with: "DPL-" and ends with ".xlsx"
Ingestion restriction: Can be ingested any time
Download template:
Download example:

Used to include platform, instrument, detector and deployment information into HADatAc. The information in this kind of file tends to be study independent in the sense that platforms, instruments and detectors can be used in multiple studies.

Study File (STD)

Name convention: Starts with: "STD-" and ends with ".xlsx" or ".csv"
Ingestion restriction: Can be ingested any time
Download template:
Download example:

Used to include basic information about one or more studies into HADatAc. Information in STD files includes the PI's name and address, and the study's aims, short description and long description.

Semantic Study Design File (SSD)

Name convention: Starts with: "SSD-" and ends with ".xlsx"
Ingestion restriction: Needs to be ingested after its corresponding STD file has been ingested
Download template:
Download example:

Used to specify and identify object collections related to studies. Subjects and samples are examples of such collections. Relationships among object collections are also specified in SSD files. For example, if the main group of subjects is infants and their parents form a second group, an SSD file can list the objects in both collections and specify how parents are mapped to infants. We suggest following these instructions to generate SSD files.
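To make the infant and parent example concrete, the sketch below models two object collections and the mapping between them in memory. It does not reflect the actual SSD spreadsheet layout: the class `ObjectCollection`, the identifiers, and the helper `unmapped_parents` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectCollection:
    """Hypothetical in-memory stand-in for one object collection of a study."""
    name: str
    object_ids: list = field(default_factory=list)


# Made-up identifiers, for illustration only.
infants = ObjectCollection("infants", ["INF-001", "INF-002"])
parents = ObjectCollection("parents", ["PAR-001", "PAR-002", "PAR-003"])

# Each parent is mapped to the infant it is related to.
parent_to_infant = {
    "PAR-001": "INF-001",
    "PAR-002": "INF-001",
    "PAR-003": "INF-002",
}


def unmapped_parents(parents, infants, mapping):
    """Return parent ids that are missing from the mapping or point to an unknown infant."""
    known_infants = set(infants.object_ids)
    return [p for p in parents.object_ids if mapping.get(p) not in known_infants]


print(unmapped_parents(parents, infants, parent_to_infant))  # []
```

A check of this kind can catch a parent row that refers to an infant not listed in the main collection.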

Semantic Data Dictionary File (SDD)

Name convention: Starts with: "SDD-" and ends with ".xlsx"
Ingestion restriction: Can be ingested any time
Download template:
Download example:

Used to guide the process of ingesting file content into HADatAc databases. A single SDD file can be used to ingest the content of multiple files as long as these files share the same content type. We suggest following these instructions to create SDD files.

Stream Specification File (STR)

Name convention: Starts with: "STR-" and ends with ".xlsx" or ".csv"
Ingestion restriction: Needs to be ingested after associated STD, SDD and DPL files have been ingested
Download template:
Download example:

An STR file identifies all of the following about a given data file:

  • The associated study (a study defined in an STD file);
  • The SDD to be used to ingest the content of the file (from an SDD file);
  • The associated deployment (a deployment defined in a DPL file);
  • The objects from the associated study that the values in the data file refer to;
  • The owner of the file; and
  • Who should have access to the file.
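
The items in the list above can be pictured as one record per data file. The following sketch is an assumption made for illustration: the class `StreamSpec` and its field names are hypothetical and are not the column names of the actual STR template.

```python
from dataclasses import dataclass, fields


@dataclass
class StreamSpec:
    """Hypothetical record holding what an STR file states about one data file."""
    data_file: str      # the data file being described
    study: str          # study defined in an STD file
    sdd: str            # SDD used to ingest the file content
    deployment: str     # deployment defined in a DPL file
    study_objects: str  # objects the values in the data file refer to
    owner_email: str    # owner of the file
    permission: str     # who should have access to the file


def empty_fields(spec):
    """Return the names of fields that are blank, in declaration order."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name).strip()]


spec = StreamSpec(
    data_file="DA-example.csv",
    study="example-study",
    sdd="SDD-example",
    deployment="example-deployment",
    study_objects="example-subjects",
    owner_email="",
    permission="example-group",
)
print(empty_fields(spec))  # ['owner_email']
```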

We suggest following these instructions to create STR files.
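
The ingestion restrictions stated for the five file types amount to a small dependency table. The sketch below encodes them at the level of file types only, which is a simplification: an SSD file needs its corresponding STD file, and an STR file needs its associated STD, SDD and DPL files, not just any file of those types. The names `PREREQUISITES` and `missing_prerequisites` are hypothetical.

```python
# File types that must already be ingested before each type, per the
# "Ingestion restriction" lines on this page. An empty set means "any time".
PREREQUISITES = {
    "DPL": set(),
    "STD": set(),
    "SSD": {"STD"},
    "SDD": set(),
    "STR": {"STD", "SDD", "DPL"},
}


def missing_prerequisites(file_type, ingested_types):
    """Return, sorted, the prerequisite types that have not been ingested yet."""
    return sorted(PREREQUISITES[file_type] - set(ingested_types))


print(missing_prerequisites("STR", ["STD", "DPL"]))  # ['SDD']
print(missing_prerequisites("SSD", ["STD"]))         # []
```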

Error messages:

  • "Error in DataAcquisitionGenerator: The specified owner email [email protected] is not a valid user!": this means that the email in the STR file does not match the email address of any user registered in HADatAc as described in Section 5.2.3.
