DICOM Media Exporter

When preparing medical imaging data for deep learning, researchers often need to extract all imagery from a large dataset of DICOM files. DICOM Media Exporter (DME) makes this quick and easy. DME extracts all image and video data from a collection of DICOM files and saves them to disc. The tool also exports all metadata contained in the DICOM files in json format.

Quick start

All of DME's settings are passed via a yaml configuration file (see config.yml section 👇 for details). This config includes the location of your DICOM files, where you want to save the outputs, and details about the output format. Run DME using

python3 dme.py config.yml

This command will find all DICOM files in a specified directory (optionally recursively) and will export them as media files to a specified output location. If used recursively, the file structure of the input directory will be mirrored in the output directory.

Features and limitations

Tested for Ultrasound Image and Ultrasound Video DICOM files, but may work for others. Please submit an issue if you run into an unsupported DICOM type.
Uses multiprocessing to speed things up ⚡. This feature has not been tested on Windows.
Currently DICOM Media Exporter only exports still images and videos. If your application requires exporting other media types (audio for example), please submit an issue.
DME uses Pillow to encode images and FFmpeg to encode video. DME supports any output format options supported by those libraries by listing key-value pair options in the configuration file.

Installation

DME does not currently provide a package installation method. To install, simply clone this repository

git clone https://github.com/MedAI-Clemson/dicom-media-exporter.git

and install the python dependencies listed in requirements.txt file. Install them (in a virtual environment) using

python -m pip install -r requirements.txt

Note: DME depends on FFmpeg (see FFmpeg section below)

config.yml

num_workers: 4 # number of parallel processes used for the conversion. Use 0 to avoid creating child processes. 
dicom:
  dir: <input directory path> # directory containing DICOM files
  recursive: true # whether to recursively search under `dir`
media:
  dir: <output directory path> # directory to write media files
  overwrite: false # whether to overwrite existing media files with matching filepaths. Skips matching files if false.
metadata: 
  file: <output file path> # JSON-lines file (usually *.jsonl) containing DICOM metadata
  append: false # whether to append if `file` already exists. If false and `file` exists, raises an error to avoid data duplication
config: 
  file: <output file path> # location to save a copy of provided config file for reproducibility
video: 
  extension: ".mp4" # file extension for video files
  save_method_kwargs:
    encoding_format: "h264" # the encoding format passed to ffmpeg
    encoding_args: # any additional encoding options passed to ffmpeg
      pix_fmt: "yuv420p"
      options:
        crf: "17"
image:
  extension: ".png" # file extension for image files. Image encoding is inferred from extension.
  save_method_kwargs:
    pil_writer_params: {} # parameters for the Pillow image save function corresponding to `extension`

Metadata fields

DME saves all DICOM fields to the provided metadata file in JSON Lines format. The resulting file will have one json object per line corresponding to a single DICOM file. These json objects contain key-value pairs representing the DICOM metadata. The key names match the full field name provided in the DICOM file which should correspond to those defined in the DICOM standard.

Note: DME ignores DICOM fields with value representations "SQ", "OB", and "OW". These are excluded due to their large size and because these fields usually store media, which is saved separately by DME.

In addition to the DICOM fields, the exported metadata includes the following additional key-value pairs for linking to the input data and exported media:

dicom_file: the location of the input dicom file relative to the provided root DICOM directory.
media_file: the location of the output media file relative to the provided root media directory.
media_type: the type of output media, either video or image

FFmpeg

DME uses FFmpeg to encode DICOM multi-frame images as video files. Before using DME, FFmpeg must be installed and accessible via the $PATH environment variable.

There are a variety of ways to install FFmpeg:

The official download links.
Your package manager of choice (e.g. sudo apt install ffmpeg on Debian/Ubuntu, brew install ffmpeg on OS X, etc.).
For Palmetto Cluster users, FFmpeg can be loaded using module load ffmpeg/<version>.

Regardless of how FFmpeg is installed, you can check if your environment path is set correctly by running the ffmpeg command from the terminal, in which case the version information should appear, as in the following example (truncated for brevity):

$ ffmpeg
ffmpeg version 4.4.1 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 9.5.0 (Spack GCC)

Note: The actual version information displayed here may vary from one system to another; but if a message such as ffmpeg: command not found appears instead of the version information, FFmpeg is not properly installed.

High priority TODO items

Support use of Python API to export files
Support the export of 3d data formats
Support the export of video/3d as folders with image/slice frames
Test on Windows machine
Use proper logger
Support audio export
Support conversion of database from Orthanc server

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
.gitignore		.gitignore
Readme.md		Readme.md
config.yml		config.yml
dme.py		dme.py
job.sh		job.sh
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DICOM Media Exporter

Quick start

Features and limitations

Installation

config.yml

Metadata fields

FFmpeg

High priority TODO items

About

Releases

Packages

Languages

MedAI-Clemson/dicom-media-exporter

Folders and files

Latest commit

History

Repository files navigation

DICOM Media Exporter

Quick start

Features and limitations

Installation

config.yml

Metadata fields

FFmpeg

High priority TODO items

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages