Add per-field metadata. (#49419)

This PR adds per-field metadata that can be set in the mappings and is later
returned by the field capabilities API. This metadata is completely opaque to
Elasticsearch but may be used by tools that index data in Elasticsearch to
communicate metadata about fields with tools that then search this data. A
typical example that has been requested in the past is the ability to attach
a unit to a numeric field.

In order to not bloat the cluster state, Elasticsearch requires that this
metadata be small:
 - keys can't be longer than 20 chars,
 - values can only be numbers or strings of no more than 50 chars - no inner
   arrays or objects,
 - the metadata can't have more than 5 keys in total.
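
The limits above could be checked with something like the following. This is only a minimal sketch, not the validation code from this PR: the class `FieldMetaValidator`, its constants and its error messages are hypothetical, and it assumes that the 50-char limit applies to string values while numbers are accepted as-is.

```java
import java.util.Map;

/** Hypothetical helper; NOT the validation code added by this commit. */
final class FieldMetaValidator {
    // Limits quoted from the commit message.
    static final int MAX_ENTRIES = 5;
    static final int MAX_KEY_LENGTH = 20;
    static final int MAX_VALUE_LENGTH = 50;

    private FieldMetaValidator() {}

    /** Throws IllegalArgumentException if the metadata breaks one of the limits. */
    static void validate(Map<String, ?> meta) {
        if (meta.size() > MAX_ENTRIES) {
            throw new IllegalArgumentException(
                "[meta] can't have more than " + MAX_ENTRIES + " entries, but got " + meta.size());
        }
        for (Map.Entry<String, ?> entry : meta.entrySet()) {
            String key = entry.getKey();
            if (key.length() > MAX_KEY_LENGTH) {
                throw new IllegalArgumentException(
                    "[meta] keys can't be longer than " + MAX_KEY_LENGTH + " chars, but got [" + key + "]");
            }
            Object value = entry.getValue();
            if (value instanceof String) {
                // Assumption: the length limit only applies to string values.
                if (((String) value).length() > MAX_VALUE_LENGTH) {
                    throw new IllegalArgumentException(
                        "[meta] values can't be longer than " + MAX_VALUE_LENGTH + " chars, key [" + key + "]");
                }
            } else if (!(value instanceof Number)) {
                // Rejects null as well as inner arrays (List) and objects (Map).
                throw new IllegalArgumentException(
                    "[meta] values can only be strings or numbers, key [" + key + "]");
            }
        }
    }
}
```

With this sketch, `FieldMetaValidator.validate(Collections.singletonMap("unit", "ms"))` returns normally, while a 21-char key, a sixth entry or a nested list as a value raises an `IllegalArgumentException`.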

Given that metadata is opaque to Elasticsearch, field capabilities don't try to
do anything smart when merging metadata across multiple indices: the union of
all field metadata is returned.

Here is what the meta might look like in mappings:

```json
{
  "properties": {
    "latency": {
      "type": "long",
      "meta": {
        "unit": "ms"
      }
    }
  }
}
```

And then in the field capabilities response:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggregatable": true,
      "meta": {
        "unit": [ "ms" ]
      }
    }
  }
}
```

When there are no conflicts, values are arrays of size 1. When there are
conflicts, Elasticsearch includes all unique values in this array, with no way
to tell which index has which metadata value:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggregatable": true,
      "meta": {
        "unit": [ "ms", "ns" ]
      }
    }
  }
}
```
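
The merge behaviour described above can be illustrated as follows. This is a rough sketch under stated assumptions, not the code from this PR: `FieldMetaMerger` and its methods are hypothetical names, values are treated as strings, and sorting the unique values is a choice made here for deterministic output rather than something the commit message promises.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper; NOT the merging code added by this commit. */
final class FieldMetaMerger {
    private FieldMetaMerger() {}

    /** Unions per-index metadata: each key maps to the distinct values seen across indices. */
    static Map<String, Set<String>> merge(List<Map<String, String>> perIndexMeta) {
        Map<String, Set<String>> merged = new TreeMap<>();
        for (Map<String, String> meta : perIndexMeta) {
            for (Map.Entry<String, String> entry : meta.entrySet()) {
                // Assumption: sorted sets, so that the output order is deterministic.
                merged.computeIfAbsent(entry.getKey(), k -> new TreeSet<>()).add(entry.getValue());
            }
        }
        return merged;
    }

    /** A key is in conflict when indices disagree, i.e. it has more than one distinct value. */
    static boolean hasConflict(Map<String, Set<String>> merged, String key) {
        Set<String> values = merged.get(key);
        return values != null && values.size() > 1;
    }
}
```

Merging `{unit=ms}` and `{unit=ns}` this way gives `unit` the values `[ms, ns]`, so a consumer can detect the conflict from the size of the set, but it can't recover which index contributed which value. An index that has no metadata for the field contributes nothing to the union.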

Closes #33267
jpountz authored Dec 18, 2019
1 parent 77d94ca commit 2d627ba
Showing 32 changed files with 721 additions and 69 deletions.
@@ -1229,11 +1229,11 @@ public void testFieldCaps() throws IOException {
assertEquals(2, ratingResponse.size());

FieldCapabilities expectedKeywordCapabilities = new FieldCapabilities(
"rating", "keyword", true, true, new String[]{"index2"}, null, null);
"rating", "keyword", true, true, new String[]{"index2"}, null, null, Collections.emptyMap());
assertEquals(expectedKeywordCapabilities, ratingResponse.get("keyword"));

FieldCapabilities expectedLongCapabilities = new FieldCapabilities(
"rating", "long", true, true, new String[]{"index1"}, null, null);
"rating", "long", true, true, new String[]{"index1"}, null, null, Collections.emptyMap());
assertEquals(expectedLongCapabilities, ratingResponse.get("long"));

// Check the capabilities for the 'field' field.
@@ -1242,7 +1242,7 @@ public void testFieldCaps() throws IOException {
assertEquals(1, fieldResponse.size());

FieldCapabilities expectedTextCapabilities = new FieldCapabilities(
"field", "text", true, false);
"field", "text", true, false, Collections.emptyMap());
assertEquals(expectedTextCapabilities, fieldResponse.get("text"));
}

17 changes: 10 additions & 7 deletions docs/reference/mapping/params.asciidoc
@@ -8,23 +8,24 @@ parameters that are used by <<mapping-types,field mappings>>:
The following mapping parameters are common to some or all field datatypes:

* <<analyzer,`analyzer`>>
* <<normalizer, `normalizer`>>
* <<mapping-boost,`boost`>>
* <<coerce,`coerce`>>
* <<copy-to,`copy_to`>>
* <<doc-values,`doc_values`>>
* <<dynamic,`dynamic`>>
* <<eager-global-ordinals,`eager_global_ordinals`>>
* <<enabled,`enabled`>>
* <<fielddata,`fielddata`>>
* <<eager-global-ordinals,`eager_global_ordinals`>>
* <<multi-fields,`fields`>>
* <<mapping-date-format,`format`>>
* <<ignore-above,`ignore_above`>>
* <<ignore-malformed,`ignore_malformed`>>
* <<index-options,`index_options`>>
* <<index-phrases,`index_phrases`>>
* <<index-prefixes,`index_prefixes`>>
* <<mapping-index,`index`>>
* <<multi-fields,`fields`>>
* <<mapping-field-meta,`meta`>>
* <<normalizer, `normalizer`>>
* <<norms,`norms`>>
* <<null-value,`null_value`>>
* <<position-increment-gap,`position_increment_gap`>>
@@ -37,8 +38,6 @@ The following mapping parameters are common to some or all field datatypes:

include::params/analyzer.asciidoc[]

include::params/normalizer.asciidoc[]

include::params/boost.asciidoc[]

include::params/coerce.asciidoc[]
@@ -49,10 +48,10 @@ include::params/doc-values.asciidoc[]

include::params/dynamic.asciidoc[]

include::params/enabled.asciidoc[]

include::params/eager-global-ordinals.asciidoc[]

include::params/enabled.asciidoc[]

include::params/fielddata.asciidoc[]

include::params/format.asciidoc[]
@@ -69,8 +68,12 @@ include::params/index-phrases.asciidoc[]

include::params/index-prefixes.asciidoc[]

include::params/meta.asciidoc[]

include::params/multi-fields.asciidoc[]

include::params/normalizer.asciidoc[]

include::params/norms.asciidoc[]

include::params/null-value.asciidoc[]
31 changes: 31 additions & 0 deletions docs/reference/mapping/params/meta.asciidoc
@@ -0,0 +1,31 @@
[[mapping-field-meta]]
=== `meta`

Metadata attached to the field. This metadata is opaque to Elasticsearch; it is
only useful for multiple applications that work on the same indices to share
meta information about fields, such as units.

[source,console]
------------
PUT my_index
{
"mappings": {
"properties": {
"latency": {
"type": "long",
"meta": {
"unit": "ms"
}
}
}
}
}
------------
// TEST

NOTE: Field metadata enforces at most 5 entries, that keys have a length that
is less than or equal to 20, and that values are strings whose length is less
than or equal to 50.

NOTE: Field metadata is updatable by submitting a mapping update. The metadata
of the update will override the metadata of the existing field.
3 changes: 3 additions & 0 deletions docs/reference/mapping/types/boolean.asciidoc
@@ -120,3 +120,6 @@ The following parameters are accepted by `boolean` fields:
the <<mapping-source-field,`_source`>> field. Accepts `true` or `false`
(default).

<<mapping-field-meta,`meta`>>::

Metadata about the field.
4 changes: 4 additions & 0 deletions docs/reference/mapping/types/date.asciidoc
@@ -137,3 +137,7 @@ The following parameters are accepted by `date` fields:
Whether the field value should be stored and retrievable separately from
the <<mapping-source-field,`_source`>> field. Accepts `true` or `false`
(default).

<<mapping-field-meta,`meta`>>::

Metadata about the field.
4 changes: 4 additions & 0 deletions docs/reference/mapping/types/keyword.asciidoc
@@ -115,6 +115,10 @@ The following parameters are accepted by `keyword` fields:
when building a query for this field.
Accepts `true` or `false` (default).

<<mapping-field-meta,`meta`>>::

Metadata about the field.

NOTE: Indexes imported from 2.x do not support `keyword`. Instead they will
attempt to downgrade `keyword` into `string`. This allows you to merge modern
mappings with legacy mappings. Long lived indexes will have to be recreated
4 changes: 4 additions & 0 deletions docs/reference/mapping/types/numeric.asciidoc
@@ -149,6 +149,10 @@ The following parameters are accepted by numeric types:
the <<mapping-source-field,`_source`>> field. Accepts `true` or `false`
(default).

<<mapping-field-meta,`meta`>>::

Metadata about the field.

[[scaled-float-params]]
==== Parameters for `scaled_float`

4 changes: 4 additions & 0 deletions docs/reference/mapping/types/text.asciidoc
@@ -143,3 +143,7 @@ The following parameters are accepted by `text` fields:

Whether term vectors should be stored for an <<mapping-index,`analyzed`>>
field. Defaults to `no`.

<<mapping-field-meta,`meta`>>::

Metadata about the field.
6 changes: 6 additions & 0 deletions docs/reference/search/field-caps.asciidoc
@@ -78,6 +78,12 @@ include::{docdir}/rest-api/common-parms.asciidoc[tag=index-ignore-unavailable]
The list of indices where this field is not aggregatable, or null if all
indices have the same definition for the field.

`meta`::
Merged metadata across all indices as a map of string keys to arrays of values.
A value length of 1 indicates that all indices had the same value for this key,
while a length of 2 or more indicates that not all indices had the same value
for this key.


[[search-field-caps-api-example]]
==== {api-examples-title}
@@ -27,6 +27,7 @@
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.IndexService;
import org.elasticsearch.index.mapper.MapperService.MergeReason;
import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.test.ESSingleNodeTestCase;
import org.elasticsearch.test.InternalSettingsPlugin;
@@ -35,6 +36,7 @@
import java.io.IOException;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

import static org.hamcrest.Matchers.containsString;
@@ -353,4 +355,33 @@ public void testRejectIndexOptions() throws IOException {
MapperParsingException e = expectThrows(MapperParsingException.class, () -> parser.parse("type", new CompressedXContent(mapping)));
assertThat(e.getMessage(), containsString("index_options not allowed in field [foo] of type [scaled_float]"));
}

public void testMeta() throws Exception {
String mapping = Strings.toString(XContentFactory.jsonBuilder().startObject().startObject("_doc")
.startObject("properties").startObject("field").field("type", "scaled_float")
.field("meta", Collections.singletonMap("foo", "bar"))
.field("scaling_factor", 10.0)
.endObject().endObject().endObject().endObject());

DocumentMapper mapper = indexService.mapperService().merge("_doc",
new CompressedXContent(mapping), MergeReason.MAPPING_UPDATE);
assertEquals(mapping, mapper.mappingSource().toString());

String mapping2 = Strings.toString(XContentFactory.jsonBuilder().startObject().startObject("_doc")
.startObject("properties").startObject("field").field("type", "scaled_float")
.field("scaling_factor", 10.0)
.endObject().endObject().endObject().endObject());
mapper = indexService.mapperService().merge("_doc",
new CompressedXContent(mapping2), MergeReason.MAPPING_UPDATE);
assertEquals(mapping2, mapper.mappingSource().toString());

String mapping3 = Strings.toString(XContentFactory.jsonBuilder().startObject().startObject("_doc")
.startObject("properties").startObject("field").field("type", "scaled_float")
.field("meta", Collections.singletonMap("baz", "quux"))
.field("scaling_factor", 10.0)
.endObject().endObject().endObject().endObject());
mapper = indexService.mapperService().merge("_doc",
new CompressedXContent(mapping3), MergeReason.MAPPING_UPDATE);
assertEquals(mapping3, mapper.mappingSource().toString());
}
}
@@ -317,4 +317,3 @@ setup:
- match: {fields.misc.unmapped.searchable: false}
- match: {fields.misc.unmapped.aggregatable: false}
- match: {fields.misc.unmapped.indices: ["test2", "test3"]}

@@ -0,0 +1,65 @@
---
"Merge metadata across multiple indices":

- skip:
version: " - 7.99.99"
reason: Metadata support was added in 7.6

- do:
indices.create:
index: test1
body:
mappings:
properties:
latency:
type: long
meta:
unit: ms
metric_type: gauge

- do:
indices.create:
index: test2
body:
mappings:
properties:
latency:
type: long
meta:
unit: ns
metric_type: gauge

- do:
indices.create:
index: test3

- do:
field_caps:
index: test3
fields: [latency]

- is_false: fields.latency.long.meta.unit

- do:
field_caps:
index: test1
fields: [latency]

- match: {fields.latency.long.meta.unit: ["ms"]}
- match: {fields.latency.long.meta.metric_type: ["gauge"]}

- do:
field_caps:
index: test1,test3
fields: [latency]

- match: {fields.latency.long.meta.unit: ["ms"]}
- match: {fields.latency.long.meta.metric_type: ["gauge"]}

- do:
field_caps:
index: test1,test2,test3
fields: [latency]

- match: {fields.latency.long.meta.unit: ["ms", "ns"]}
- match: {fields.latency.long.meta.metric_type: ["gauge"]}
@@ -108,3 +108,39 @@

- match: { error.type: "illegal_argument_exception" }
- match: { error.reason: "Types cannot be provided in put mapping requests, unless the include_type_name parameter is set to true." }

---
"Update per-field metadata":

- skip:
version: " - 7.99.99"
reason: "Per-field meta was introduced in 7.6"

- do:
indices.create:
index: test_index
body:
mappings:
properties:
foo:
type: keyword
meta:
bar: baz

- do:
indices.put_mapping:
index: test_index
body:
properties:
foo:
type: keyword
meta:
baz: quux

- do:
indices.get_mapping:
index: test_index

- is_false: test_index.mappings.properties.foo.meta.bar
- match: { test_index.mappings.properties.foo.meta.baz: "quux" }
