YAML administration data sources_v1_0

Marcus Bakker edited this page Dec 20, 2021 · 1 revision

In this YAML file you can administer your available data sources and score their quality. More information on data quality can be found here.

Sample file: data-sources-endpoints.yaml

Current version: version 1.1

File content

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `version` | string | yes | Version of this data source administration file. The current version is 1.1. |
| `file_type` | string | yes | Indicates what type of YAML file this is. Possible values: `data-source-administration`, `technique-administration` and `group-administration`. For data source administration the value should be `data-source-administration`. |
| `name` | string | yes | Describes the type of assets the data sources apply to, e.g. endpoints. It is just a name that is used in different places of the output. |
| `platform` | string or list of strings | yes | Indicates the type of platform you are describing the data sources for. Possible values are 'all' (to select all platforms) or one or more of the MITRE ATT&CK platform values: Windows, Linux, PRE, macOS, AWS, GCP, Azure, Azure AD, Office 365, SaaS, Network. |
| `data_sources` | list of data source objects | yes | Contains all the data sources available. See the description of the data source object. |
| `exceptions` | list of ATT&CK technique IDs | no | Contains a list of ATT&CK technique IDs. Adding an ID removes that technique from any derived output. For example, it will be excluded when generating a technique YAML administration file using: `python dettect.py ds -f sample-data/data-sources-endpoints.yaml --yaml` |
| `notes` | string | no | An optional field to include notes on this data source administration file. |
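As an illustration of the top-level keys described above, here is a minimal sketch of a data source administration file. The values are hypothetical examples, not taken from the sample file. The optional `exceptions` key is left out because its exact item layout is not spelled out on this page. The unquoted version number is an assumption about how the tool reads it.

```yaml
# Hypothetical skeleton of a data source administration file.
version: 1.1                           # value taken from the version field description
file_type: data-source-administration
name: endpoints                        # free-form name used in the output
platform: [Windows, Linux]             # a single string such as 'all' is also described as valid
data_sources: []                       # to be filled with data source objects
notes: Example notes on this file.     # optional
```

The `data_sources` list is empty here only to keep the skeleton short. A real file would contain one data source object per available data source.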

Data source object

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `data_source_name` | string | yes | The name of the data source according to MITRE ATT&CK, e.g. Process Creation. |
| `date_registered` | date (yyyy-mm-dd) | yes | Date on which the data source was registered in this YAML file. |
| `date_connected` | date (yyyy-mm-dd) | yes | Date on which the data source was connected to your security data lake. This date is used to draw a graph indicating the progress of connected data sources. |
| `products` | list | yes | A list of one or more products in which the data from the data source is located, e.g. Windows event log. |
| `available_for_data_analytics` | boolean | yes | Indicates whether the data source is available in such a way that it can be used within data analytics. |
| `comment` | string | yes | An option to comment on this data source. If you want a multiline comment in the Excel output, we recommend using the YAML block scalar indicator `\|`. For more info have a look at: https://yaml-multiline.info/. |
| `data_quality` | data quality object | yes | The scores on the five different data quality dimensions. See the description of the data quality object. |
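To make the fields above concrete, the following is a sketch of a single entry in the `data_sources` list, including a multiline comment written with `|`. The dates, scores and comment text are made-up illustrations. Only the data source name and product name come from the examples in the field descriptions.

```yaml
# Hypothetical data source object (one item of the data_sources list).
- data_source_name: Process Creation
  date_registered: 2021-01-15          # yyyy-mm-dd, illustrative date
  date_connected: 2020-11-01           # yyyy-mm-dd, illustrative date
  products: [Windows event log]
  available_for_data_analytics: true
  comment: |
    First line of the comment.
    Second line of the comment.
  data_quality:                        # each score is an integer from 0 to 5
    device_completeness: 3
    data_field_completeness: 4
    timeliness: 5
    consistency: 4
    retention: 2
```

The `|` indicator keeps the line breaks inside the comment, which is why it is the recommended style when the comment should span multiple lines in the Excel output.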

Data quality object

The five data quality dimensions are explained here.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `device_completeness` | int | yes | Score from 0 to 5. Scoring this aspect is explained in a separate section. |
| `data_field_completeness` | int | yes | Score from 0 to 5. Scoring this aspect is explained in a separate section. |
| `timeliness` | int | yes | Score from 0 to 5. Scoring this aspect is explained in a separate section. |
| `consistency` | int | yes | Score from 0 to 5. Scoring this aspect is explained in a separate section. |
| `retention` | int | yes | Score from 0 to 5. Scoring this aspect is explained in a separate section. |