5. Metadata Files

Paulo Pinheiro edited this page Mar 10, 2019 · 22 revisions

HADatAc uses the five types of metadata files listed below. For each file type, this page gives its naming convention, a description, and the error messages that may be generated during its ingestion. Error messages generated during the ingestion of a metadata file are written to the file's log, which is displayed by selecting the "log" button next to the file in the File Management option (HADatAc Home > Data File Management).

5.1. Deployment File (DPL)

Naming convention: the name starts with DPL- and ends with .xlsx

Used to include platform, instrument, detector and deployment information into HADatAc. The information in this kind of file tends to be study-independent, in the sense that platforms, instruments and detectors can be used in multiple studies.

5.2. Study File (STD)

Naming convention: the name starts with STD- and ends with .xlsx or .csv

Used to include basic information about one or more studies into HADatAc. Information in STD files includes things like the PI's name and address, and the study's aims, short description and long description.

5.3. Semantic Study Design File (SSD)

Naming convention: the name starts with SSD- and ends with .xlsx

Used to specify and identify object collections related to studies. Subjects and samples are examples of such collections. Inter-relationships among object collections are also specified in SSD files. For example, if the main group of subjects is infants and their parents form another group, an SSD file can specify the objects in both collections and how parents are mapped to infants. Instructions on how to generate SSD files are available here.

5.4. Semantic Data Dictionary File (SDD)

Naming convention: the name starts with SDD- and ends with .xlsx

Used to guide the process of ingesting file content into HADatAc databases. A single SDD file can be used to ingest the content of multiple files as long as these files share the same content type. Instructions on how to generate SDD files are available in Section 4.3.3.

5.5. Object Access Specification (OAS) Files

Naming convention: the name starts with OAS- and ends with .xlsx or .csv
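Taken together, the naming conventions of Sections 5.1 to 5.5 amount to a lookup from prefix to allowed extensions. The following is a minimal sketch of that lookup, not HADatAc's actual implementation: the function name `classify_metadata_file` is hypothetical, the match is assumed to be case-sensitive, and the SDD prefix is assumed to be `SDD-` (as the `SDD-Z` example in this section suggests).

```python
import os

# Prefix -> allowed extensions, transcribed from the conventions on this page.
# Assumption: SDD files start with "SDD-".
NAMING_CONVENTIONS = {
    "DPL": (".xlsx",),
    "STD": (".xlsx", ".csv"),
    "SSD": (".xlsx",),
    "SDD": (".xlsx",),
    "OAS": (".xlsx", ".csv"),
}


def classify_metadata_file(filename):
    """Return the metadata type of filename, or None if no convention matches."""
    base = os.path.basename(filename)
    _, ext = os.path.splitext(base)
    for prefix, extensions in NAMING_CONVENTIONS.items():
        # Both the "PREFIX-" start and the extension must match.
        if base.startswith(prefix + "-") and ext in extensions:
            return prefix
    return None
```

For example, `classify_metadata_file("STD-mystudy.csv")` yields `"STD"`, while `classify_metadata_file("DPL-mystudy.csv")` yields `None`, because DPL files are expected to end with .xlsx.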

OAS files identify a common ‘prefix’ that is assigned to a given collection of data files. For instance, if File1.csv and File2.csv can both use SDD-Z.xls, then we can create an OAS file saying that ‘DA-Z’ is the prefix for SDD-Z.xls. In this case, we could rename File1 to DA-Z-1 and File2 to DA-Z-2, enabling the use of a common SDD to ingest both files.
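The prefix mechanism can be pictured as a lookup from a data file's name to the SDD declared for its prefix. The sketch below is illustrative only: the `OAS_PREFIXES` dictionary and the `find_sdd` function are hypothetical names, and preferring the longest matching prefix is an assumption, not documented HADatAc behavior.

```python
# Hypothetical in-memory view of OAS declarations: prefix -> SDD file name.
OAS_PREFIXES = {"DA-Z": "SDD-Z.xls"}


def find_sdd(data_file_name, oas_prefixes):
    """Return the SDD declared for the prefix of data_file_name, or None."""
    # A file matches a prefix when its name starts with "<prefix>-".
    matches = [p for p in oas_prefixes if data_file_name.startswith(p + "-")]
    if not matches:
        return None
    # Assumption: if several prefixes match, the longest one wins.
    return oas_prefixes[max(matches, key=len)]
```

With this mapping, both `find_sdd("DA-Z-1", OAS_PREFIXES)` and `find_sdd("DA-Z-2", OAS_PREFIXES)` resolve to `"SDD-Z.xls"`, whereas the original name `File1.csv` resolves to nothing.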

OAS files are also used to assign files to studies and to a given owner. Thus, we need one OAS file for each collection of files (including collections of one file) that share the same properties:

  • Type of data content
  • Owner
  • Study that it belongs to
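One way to work out how many OAS files are needed is to group the data files by these three properties; each resulting group corresponds to one OAS file. This is a rough sketch under stated assumptions: the record fields (`content_type`, `owner`, `study`) and all sample values are hypothetical and are not part of any HADatAc file format.

```python
from collections import defaultdict


def group_for_oas(files):
    """Group file names by (content type, owner, study); one OAS file per group."""
    groups = defaultdict(list)
    for f in files:
        key = (f["content_type"], f["owner"], f["study"])
        groups[key].append(f["name"])
    return dict(groups)


# Hypothetical sample: File3 has a different owner, so it needs its own OAS file.
files = [
    {"name": "File1.csv", "content_type": "Z", "owner": "owner-a", "study": "study-1"},
    {"name": "File2.csv", "content_type": "Z", "owner": "owner-a", "study": "study-1"},
    {"name": "File3.csv", "content_type": "Z", "owner": "owner-b", "study": "study-1"},
]
groups = group_for_oas(files)
```

Here `groups` has two entries, so two OAS files would be required: one covering File1.csv and File2.csv, and one covering File3.csv alone.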

Instructions on how to generate OAS files are available in Section 4.3.2.

Error messages:

  • "Error in DataAcquisitionGenerator: The specified owner email <owner email> is not a valid user!": the email address in the ACQ file does not match the email address of any user registered in HADatAc, as described in Section 5.2.3.
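The condition behind this error can be sketched as a membership check of the owner email against the registered users. This is an illustration only, not the code of DataAcquisitionGenerator: the function name `check_owner_email` is hypothetical, exact string comparison is an assumption, and the addresses in the usage note are placeholders.

```python
def check_owner_email(owner_email, registered_emails):
    """Return None if the owner is a registered user, else the error message."""
    # Assumption: emails are compared as exact strings.
    if owner_email in registered_emails:
        return None
    return (
        "Error in DataAcquisitionGenerator: The specified owner email "
        f"{owner_email} is not a valid user!"
    )
```

For instance, with `registered = {"owner@example.org"}`, calling `check_owner_email("owner@example.org", registered)` returns `None`, while any unregistered address produces the message quoted above.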
