From e2c8845f4976c73465bcbee3518e4851fb9bb9ef Mon Sep 17 00:00:00 2001
From: Ethan Davis
Date: Mon, 10 Feb 2020 15:33:35 -0700
Subject: [PATCH 1/2] Add new integer types in CF 2.2 Data Types (#243)

---
 ch02.adoc | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/ch02.adoc b/ch02.adoc
index 630aec9b..8c69b6be 100644
--- a/ch02.adoc
+++ b/ch02.adoc
@@ -13,14 +13,14 @@ NetCDF files should have the file name extension "**`.nc`**".
 === Data Types
-The netCDF data types **`string`**, **`char`**, **`byte`**, **`short`**,
-**`int`**, **`float`** or **`real`**, and **`double`** are all acceptable.
+The netCDF data types **`string`**, **`char`**, **`byte`**, **`unsigned byte`**, **`short`**, **`unsigned short`**,
+**`int`**, **`unsigned int`**, **`int64`**, **`unsigned int64`**,
+**`float`** or **`real`**, and **`double`** are all acceptable.
 The **`string`** type is only available in files using the netCDF version 4 (netCDF-4) format.
 The **`char`** and **`string`** types are not intended for numeric data.
-One byte numeric data should be stored using the **`byte`** data type.
-All integer types are treated by the netCDF interface as signed.
-It is possible to treat the **`byte`** type as unsigned by using the NUG
+One byte numeric data should be stored using the **`byte`** or **`unsigned byte`** data types.
+It is possible to treat the **`byte`** and **`short`** types as unsigned by using the NUG
 convention of indicating the unsigned range using the **`valid_min`**, **`valid_max`**, or **`valid_range`** attributes.

From ca8721489b9690e672f87ae254567e244132adad Mon Sep 17 00:00:00 2001
From: Ethan Davis
Date: Fri, 28 Feb 2020 12:31:38 -0700
Subject: [PATCH 2/2] Update use of phrase "must be type integer" to "must have an integer type". Add explicit list of integer types in section 2.2.
---
 apph.adoc | 47 +++++++++++++++++++++++++++++++++++++----------
 ch02.adoc |  5 +++++
 ch09.adoc |  9 +++++++--
 3 files changed, 49 insertions(+), 12 deletions(-)

diff --git a/apph.adoc b/apph.adoc
index dafdd83d..1fc0026d 100644
--- a/apph.adoc
+++ b/apph.adoc
@@ -357,7 +357,13 @@ where    rowStart(i) = rowStart(i-1) + row_size(i-1) if i > 0
 ----
-The variable, **`row_size`** , is the count variable containing the length of each time series feature.   It is identified by having an attribute with name `**sample_dimension**` whose value is name of the sample dimension ( **`obs`** in this example). The sample dimension could optionally be the netCDF unlimited dimension. The variable bearing the `**sample_dimension**` attribute must have the instance dimension ( **`station`** in this example) as its single dimension, and must be of type integer.   This variable implicitly partitions into individual instances all variables that have the sample dimension. The auxiliary coordinate variables **`lat`** , **`lon`** , **`alt`** and **`station_name`** are station variables.
+The variable, **`row_size`** , is the count variable containing the length of each time series feature.
+It is identified by having an attribute with name `**sample_dimension**` whose value is name of the sample dimension ( **`obs`** in this example).
+The sample dimension could optionally be the netCDF unlimited dimension.
+The variable bearing the `**sample_dimension**` attribute must have the instance dimension ( **`station`** in this example)
+as its single dimension, and must have an integer type.
+This variable implicitly partitions into individual instances all variables that have the sample dimension.
+The auxiliary coordinate variables **`lat`** , **`lon`** , **`alt`** and **`station_name`** are station variables.
 ====
@@ -418,7 +424,12 @@ When time series with different lengths are written incrementally, the indexed r
 ----
 The humidity(o) and temp(o) data are associated with the coordinate values time(o), lat(i), lon(i), and alt(i), where i = stationIndex(o) is a zero-based index indicating which time series. Thus, time(0), humidity(0) and temp(0) belong to the element of the **`station`** dimension that is indicated by **`stationIndex(0)`** ; time(1), humidity(1) and temp(1) belong to element **`stationIndex(1)`** of the **`station`** dimension, etc.
-The variable, **`stationIndex`** , is identified as the index variable by having an attribute with name of `**instance_dimension**` whose value is the instance dimension ( **`station`** in this example).  The variable bearing the `**instance_dimension**` attribute must have the sample dimension ( **`obs`** in this example) as its single dimension, and must be type integer. This variable implicitly assigns the station to each value of any variable having the sample dimension. The sample dimension need not be the netCDF unlimited dimension, though it commonly is.
+The variable, **`stationIndex`** , is identified as the index variable by having an attribute with name of `**instance_dimension**` whose value is the instance dimension ( **`station`** in this example).
+The variable bearing the `**instance_dimension**` attribute
+must have the sample dimension ( **`obs`** in this example) as its single dimension,
+and must have an integer type.
+This variable implicitly assigns the station to each value of any variable having the sample dimension.
+The sample dimension need not be the netCDF unlimited dimension, though it commonly is.
 ====
@@ -617,7 +628,9 @@ When the number of vertical levels for each profile varies, and one can control
 ----
 The pressure(o), temperature(o), and humidity(o) data is associated with the coordinate values time(i), z(o), lat(i), and lon(i), where i indicates which profile.
All elements for one profile are contiguous along the sample dimension. The sample dimension (obs) may be the unlimited dimension or not. All variables that have the instance dimension (profile) as their single dimension are considered to be information about the profiles.
-The count variable (row_size) contains the number of elements for each profile, and is identified by having an attribute with name "sample_dimension" whose value is the sample dimension being counted. It must have the profile dimension as its single dimension, and must be type integer. The elements are associated with the profile using the same algorithm as in H.2.4.
+The count variable (row_size) contains the number of elements for each profile, and is identified by having an attribute with name "sample_dimension" whose value is the sample dimension being counted.
+It must have the profile dimension as its single dimension, and must have an integer type.
+The elements are associated with the profile using the same algorithm as in H.2.4.
 ====
@@ -682,7 +695,9 @@ When the number of vertical levels for each profile varies, and one cannot write
 attributes:    :featureType = "profile";
 ----
-The pressure(o), temperature(o), and humidity(o) data are associated with the coordinate values time(i), z(o), lat(i), and lon(i), where i indicates which profile. The sample dimension (obs) may be the unlimited dimension or not. The profile index variable (parentIndex) is identified by having an attribute with name of "instance_dimension" whose value is the profile dimension name. It must have the sample dimension as its single dimension, and must be type integer. Each value in the profile index variable is the zero-based profile index that the element belongs to. The elements are associated with the profiles using the same algorithm as in H.2.5.
+The pressure(o), temperature(o), and humidity(o) data are associated with the coordinate values time(i), z(o), lat(i), and lon(i), where i indicates which profile.
The sample dimension (obs) may be the unlimited dimension or not. The profile index variable (parentIndex) is identified by having an attribute with name of "instance_dimension" whose value is the profile dimension name.
+It must have the sample dimension as its single dimension, and must have an integer type.
+Each value in the profile index variable is the zero-based profile index that the element belongs to. The elements are associated with the profiles using the same algorithm as in H.2.5.
 ====
@@ -868,7 +883,9 @@ When the number of elements for each trajectory varies, and one can control the
 ----
 The O3(o) and NO3(o) data are associated with the coordinate values time(o), lat(o), lon(o), and alt(o). All elements for one trajectory are contiguous along the sample dimension. The sample dimension (obs) may be the unlimited dimension or not. All variables that have the instance dimension (trajectory) as their single dimension are considered to be information about that trajectory.
-The count variable (row_size) contains the number of elements for each trajectory, and is identified by having an attribute with name "sample_dimension" whose value is the sample dimension being counted. It must have the trajectory dimension as its single dimension, and must be type integer. The elements are associated with the trajectories using the same algorithm as in H.2.4.
+The count variable (row_size) contains the number of elements for each trajectory, and is identified by having an attribute with name "sample_dimension" whose value is the sample dimension being counted.
+It must have the trajectory dimension as its single dimension, and must have an integer type.
+The elements are associated with the trajectories using the same algorithm as in H.2.4.
 ====
@@ -929,7 +946,9 @@ When the number of elements at each trajectory vary, and the elements cannot be
 ----
 The O3(o) and NO3(o) data are associated with the coordinate values time(o), lat(o), lon(o), and alt(o).
All elements for one trajectory will have the same trajectory index value. The sample dimension (obs) may be the unlimited dimension or not.
-The index variable (trajectory_index) is identified by having an attribute with name of "instance_dimension" whose value is the trajectory dimension name. It must have the sample dimension as its single dimension, and must be type integer. Each value in the trajectory_index variable is the zero-based trajectory index that the element belongs to. The elements are associated with the trajectories using the same algorithm as in H.2.5.
+The index variable (trajectory_index) is identified by having an attribute with name of "instance_dimension" whose value is the trajectory dimension name.
+It must have the sample dimension as its single dimension, and must have an integer type.
+Each value in the trajectory_index variable is the zero-based trajectory index that the element belongs to. The elements are associated with the trajectories using the same algorithm as in H.2.5.
 ====
@@ -1196,9 +1215,13 @@ When the number of profiles and levels for each station varies, one can use a ra
 ----
 The pressure(o), temperature(o), and humidity(o) data for element o of profile p at station i are associated with the coordinate values time(p), z(o), lat(i), and lon(i).
-The index variable (station_index) is identified by having an attribute with name of instance_dimension whose value is the instance dimension name (station in this example). The index variable must have the profile dimension as its sole dimension, and must be type integer. Each value in the index variable is the zero-based station index that the profile belongs to i.e. profile p belongs to station i=station_index(p), as in section H.2.5.
+The index variable (station_index) is identified by having an attribute with name of instance_dimension whose value is the instance dimension name (station in this example).
+The index variable must have the profile dimension as its sole dimension, and must have an integer type.
+Each value in the index variable is the zero-based station index that the profile belongs to i.e. profile p belongs to station i=station_index(p), as in section H.2.5.
-The count variable (row_size) contains the number of elements for each profile, which must be written contiguously. The count variable is identified by having an attribute with name sample_dimension whose value is the sample dimension (obs in this example) being counted. It must have the profile dimension as its sole dimension, and must be type integer. The number of elements in profile p is recorded in row_size(p), as in section H.2.4. The sample dimension need not be the netCDF unlimited dimension,  though it commonly is.
+The count variable (row_size) contains the number of elements for each profile, which must be written contiguously. The count variable is identified by having an attribute with name sample_dimension whose value is the sample dimension (obs in this example) being counted.
+It must have the profile dimension as its sole dimension, and must have an integer type.
+The number of elements in profile p is recorded in row_size(p), as in section H.2.4. The sample dimension need not be the netCDF unlimited dimension,  though it commonly is.
 ====
@@ -1408,7 +1431,11 @@ When the number of profiles and levels for each trajectory varies, one can use a
 ----
 The pressure(o), temperature(o), and humidity(o) data for element o of profile p along trajectory i are associated with the coordinate values time(p), z(o), lat(p), and lon(p).
-The index variable (trajectory_index) is identified by having an attribute with name of instance_dimension whose value is the instance dimension name (trajectory in this example). The index variable must have the profile dimension as its sole dimension, and must be type integer.
Each value in the index variable is the zero-based trajectory index that the profile belongs to i.e. profile p belongs to trajectory i=trajectory_index(p), as in section H.2.5.
+The index variable (trajectory_index) is identified by having an attribute with name of instance_dimension whose value is the instance dimension name (trajectory in this example).
+The index variable must have the profile dimension as its sole dimension, and must have an integer type.
+Each value in the index variable is the zero-based trajectory index that the profile belongs to i.e. profile p belongs to trajectory i=trajectory_index(p), as in section H.2.5.
-The count variable (row_size) contains the number of elements for each profile, which must be written contiguously. The count variable is identified by having an attribute with name sample_dimension whose value is the sample dimension (obs in this example) being counted. It must have the profile dimension as its sole dimension, and must be type integer. The number of elements in profile p is recorded in row_size(p), as in section H.2.4. The sample dimension need not be the netCDF unlimited dimension,  though it commonly is.
+The count variable (row_size) contains the number of elements for each profile, which must be written contiguously. The count variable is identified by having an attribute with name sample_dimension whose value is the sample dimension (obs in this example) being counted.
+It must have the profile dimension as its sole dimension, and must have an integer type.
+The number of elements in profile p is recorded in row_size(p), as in section H.2.4. The sample dimension need not be the netCDF unlimited dimension,  though it commonly is.
 ====
diff --git a/ch02.adoc b/ch02.adoc
index 8c69b6be..f99f3699 100644
--- a/ch02.adoc
+++ b/ch02.adoc
@@ -23,6 +23,11 @@ One byte numeric data should be stored using the **`byte`** or **`unsigned byte`
 It is possible to treat the **`byte`** and **`short`** types as unsigned by using the NUG
 convention of indicating the unsigned range using the **`valid_min`**, **`valid_max`**, or **`valid_range`** attributes.
+In many situations, any integer type may be used.
+When the phrases "an integer type" or "any integer type" are used in this document,
+they should be understood to mean **`byte`**, **`unsigned byte`**,
+**`short`**, **`unsigned short`**,
+**`int`**, **`unsigned int`**, **`int64`**, or **`unsigned int64`**.
 Strings in variables may be represented one of two ways - as atomic strings or as character arrays.
diff --git a/ch09.adoc b/ch09.adoc
index bb5d17c3..32eb8124 100644
--- a/ch09.adoc
+++ b/ch09.adoc
@@ -321,7 +321,8 @@ Table 9.4. The storage of data using the contiguous ragged representation (subsc
-In this representation, the file contains a **count variable** , which must be of type integer and
+In this representation, the file contains a **count variable** ,
+which must have an integer type and
@@ -462,7 +463,11 @@ Table 9.4 The storage of data using the indexed ragged representation (subscript
-In this representation, the file contains an **index variable** , which must be of type integer, and must have the sample dimension as its single dimension. The index variable contains the zero-based index of the feature to which each element belongs. This representation is identifiable by the presence of an attribute, **`instance_dimension`** , on the index variable, which names the dimension of the instance variables. For those indices of the sample dimension, into which data have not yet been written, the index variable should be pre-filled with missing values.
+In this representation, the file contains an **index variable** ,
+which must have an integer type, and must have the sample dimension as its single dimension.
+The index variable contains the zero-based index of the feature to which each element belongs.
+This representation is identifiable by the presence of an attribute, **`instance_dimension`** , on the index variable, which names the dimension of the instance variables.
+For those indices of the sample dimension, into which data have not yet been written, the index variable should be pre-filled with missing values.
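
Illustration only, not part of the patch: a minimal Python sketch of how a reader of such files might apply the rules this series touches. It checks a type name against the eight integer types listed in section 2.2, assigns elements to features from an index variable (indexed ragged representation), and derives row starts from a count variable (contiguous ragged representation), taking rowStart(0) = 0. The names `CF_INTEGER_TYPES`, `has_integer_type`, `group_by_index`, and `row_starts` are hypothetical, and `-1` as the pre-filled missing value is an assumption of this example; real files declare their own missing value.

```python
# Integer types as listed in section 2.2 by this patch series.
CF_INTEGER_TYPES = frozenset({
    "byte", "unsigned byte",
    "short", "unsigned short",
    "int", "unsigned int",
    "int64", "unsigned int64",
})


def has_integer_type(type_name):
    """Return True if type_name is one of the integer types of section 2.2."""
    return type_name in CF_INTEGER_TYPES


def group_by_index(index_values, n_instances, missing=-1):
    """Indexed ragged array: element o belongs to instance index_values[o].

    Returns one list of element positions per instance. Elements whose index
    equals `missing` (assumed pre-fill value, not yet written) are skipped.
    """
    groups = [[] for _ in range(n_instances)]
    for o, i in enumerate(index_values):
        if i == missing:
            continue
        if not 0 <= i < n_instances:
            raise ValueError(f"element {o}: index {i} is outside 0..{n_instances - 1}")
        groups[i].append(o)
    return groups


def row_starts(row_size):
    """Contiguous ragged array: rowStart(i) = rowStart(i-1) + row_size(i-1)."""
    starts = []
    total = 0
    for n in row_size:
        starts.append(total)
        total += n
    return starts


if __name__ == "__main__":
    print(has_integer_type("unsigned short"))       # an integer type per 2.2
    print(has_integer_type("float"))                # not an integer type
    print(group_by_index([0, 1, 0, 2, -1], 3))      # element positions per station
    print(row_starts([3, 1, 2]))                    # first element of each feature
```

Plain lists and type-name strings are used here so the sketch stays independent of any particular netCDF library; with a real file the same checks would be applied to the variable's declared type and values.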