
Managing (or ingesting) Data Files in HADatAc

Marcello Bax edited this page Apr 5, 2018 · 5 revisions

Using HADatAc’s menu entry “Manage Data Files”, we usually process the following files in this order:

  1. STD
  2. PID (after STD (if any))
  3. SID (after STD (if any))
  4. MAP (after PID and SID files are loaded)
  5. SDD (any time before ACQ, even before STD)
  6. ACQ (after STD, PID, SID, MAP (if any), SDD)
  7. DA (after ACQ)
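The ordering constraints in the list above can be sketched as a dependency graph, and a topological sort then reproduces a valid load order. This is only an illustration of the rules in the list; the file-type names follow the list, not any HADatAc API:

```python
# Encode the load-order constraints from the list above as a map from
# file type to the file types that must be loaded first (if present).
from graphlib import TopologicalSorter

DEPENDS_ON = {
    "STD": [],
    "PID": ["STD"],
    "SID": ["STD"],
    "MAP": ["PID", "SID"],
    "SDD": [],                                   # any time before ACQ
    "ACQ": ["STD", "PID", "SID", "MAP", "SDD"],
    "DA":  ["ACQ"],
}

# static_order() yields each file type only after all of its
# prerequisites, so DA always comes last.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(order)
```

Any order the sorter produces satisfies the constraints; the list in this page is simply one such order.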

Note: after clicking “Manage Data Files”, and before starting the ingestion process, you might need to click the “Start Auto-Annotator” button at the top of the screen.

1. Load STD file for a given testing study

Verification: Go to “home” > “manage studies”, inspect the list of studies, and look for an entry for the study described in the STD file.

2. Load PID file for same STD, if any

PID files are used to include an object collection of subjects inside a given study.

Verification: Go to “home” > “manage studies”. Inspect the list of studies and look for an entry for the study in the STD file. On the study entry, verify that “#OCs” is greater than “0” (zero). If so, select “objects” in the entry.

From the list of entries, verify that one of them is named “Study Population (...)” and that the number of objects in the object collection corresponds to the number of subjects in the PID file.
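As a quick cross-check of that count, a small script can tally the rows of the PID file. The helper below is hypothetical: it assumes the PID file is a CSV with one header row and one subject per row, which may differ from your actual layout:

```python
# Hypothetical cross-check for the "Study Population (...)" count,
# assuming a CSV PID file with a single header row and one subject
# per data row.
import csv

def count_rows(path: str) -> int:
    """Return the number of data rows (subjects) in a CSV file."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)          # skip the header row
        return sum(1 for _ in reader)
```

The result should match the object count shown for the “Study Population (...)” collection.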

3. Load SID file for same STD, if any

SID files are used to include an object collection of samples inside a given study.

Verification: Go to “home” > “manage studies”. Inspect the list of studies and look for an entry for the study in the STD file. On the study entry, verify that “#OCs” is greater than “0” (zero). If so, select “objects” in the entry. From the list of entries, verify that one of them is named “Sample Collection (...)” and that the number of objects in the object collection corresponds to the number of samples in the SID file.

Verification: Visualize the study in “Study Management” and verify that the object collection “Sample” was created and populated accordingly.

4. Load MAP file for the same PIDs and SIDs

Verification: Go to “home” > “Search Study”. Select the study of interest (note that you may need to press the “update” button on the top-left part of the screen to refresh the list of studies). Select “samples” and verify that samples are connected to subjects, if the MAP file connects SIDs to PIDs.
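The subject–sample connections can also be spot-checked outside the UI. The sketch below assumes a MAP file laid out as a CSV with `sample_id` and `subject_id` columns; that layout is illustrative, not HADatAc-specified:

```python
# Illustrative check that every sample in a MAP file refers to a known
# subject. Column names ("sample_id", "subject_id") are assumptions
# about the file layout, not a HADatAc requirement.
import csv

def unmapped_samples(map_path, subject_ids):
    """Return sample ids whose subject id is not in subject_ids."""
    bad = []
    with open(map_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["subject_id"] not in subject_ids:
                bad.append(row["sample_id"])
    return bad
```

An empty result means every sample maps to a subject that exists in the PID file.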

5. Load SDD files for each kind of data acquisition (e.g., EPI data, LAB data) that is going to be ingested under the current study

Verification: Visualize the study to verify items 9, 10, 11, and 12.

6. Load EPI files (ACQ-EPI file)

Verification: Perform a faceted search to verify item 14.

7. Load LAB file (ACQ-LAB file)

Verification: Perform a faceted search to verify item 16.

8. Load the DA file

Data Preparation and Ingestion

Verification: When processing DA files, confirm that the ingestion is ready by going to the “prepare ingestion” page.

Once files are uploaded, they can either be ingested automatically, when the infrastructure’s knowledge base knows how to process the file, or the system may guide scientists through the process of telling it how to ingest the data.
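That decision can be summarized as a minimal sketch (pseudologic, not HADatAc’s actual implementation), assuming the knowledge base recognizes uploads by a known file-type prefix:

```python
# Minimal sketch of the upload decision described above. The prefix
# convention (e.g. "STD-", "SDD-") is an assumption for illustration.
def handle_upload(filename, known_prefixes):
    """Auto-ingest recognized files; otherwise guide the scientist."""
    if any(filename.startswith(p) for p in known_prefixes):
        return "auto-ingest"
    return "guide-scientist"
```

In other words, recognized files flow straight into the knowledge base, while unrecognized ones trigger the interactive annotation path.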
