diff --git a/docs/reference/mapping.asciidoc b/docs/reference/mapping.asciidoc index 239614345d782..5d6245a964104 100644 --- a/docs/reference/mapping.asciidoc +++ b/docs/reference/mapping.asciidoc @@ -75,9 +75,29 @@ reindexing. You can use runtime fields in conjunction with indexed fields to balance resource usage and performance. Your index will be smaller, but with slower search performance. +[discrete] +[[mapping-manage-update]] +== Managing and updating mappings + +Explicit mappings should be defined at index creation for fields you know in advance. +You can still add _new fields_ to mappings at any time, as your data evolves. + +Use the <> to update an existing mapping. + +In most cases, you can't change mappings for fields that are already mapped. +These changes require <>. + +However, you can _update_ mappings under certain conditions: + +* You can add new fields to an existing mapping at any time, explicitly or dynamically. +* You can add new <> for existing fields. +** Documents indexed before the mapping update will not have values for the new multi-fields until they are updated or reindexed. Documents indexed after the mapping change will automatically have values for the new multi-fields. +* Some <> can be updated for existing fields of certain <>. + [discrete] [[mapping-limit-settings]] -== Settings to prevent mapping explosion +== Prevent mapping explosions + Defining too many fields in an index can lead to a mapping explosion, which can cause out of memory errors and difficult situations to recover from. diff --git a/docs/reference/query-dsl/semantic-query.asciidoc b/docs/reference/query-dsl/semantic-query.asciidoc index f3f6aca3fd07a..11e19d6356081 100644 --- a/docs/reference/query-dsl/semantic-query.asciidoc +++ b/docs/reference/query-dsl/semantic-query.asciidoc @@ -40,209 +40,9 @@ The `semantic_text` field to perform the query on. (Required, string) The query text to be searched for on the field. 
-`inner_hits`:: -(Optional, object) -Retrieves the specific passages that match the query. -See <> for more information. -+ -.Properties of `inner_hits` -[%collapsible%open] -==== -`from`:: -(Optional, integer) -The offset from the first matching passage to fetch. -Used to paginate through the passages. -Defaults to `0`. - -`size`:: -(Optional, integer) -The maximum number of matching passages to return. -Defaults to `3`. -==== Refer to <> to learn more about semantic search using `semantic_text` and `semantic` query. -[discrete] -[[semantic-query-passage-ranking]] -==== Passage ranking with the `semantic` query -The `inner_hits` parameter can be used for _passage ranking_, which allows you to determine which passages in the document best match the query. -For example, if you have a document that covers varying topics: - -[source,console] ------------------------------------------------------------- -POST my-index/_doc/lake_tahoe -{ - "inference_field": [ - "Lake Tahoe is the largest alpine lake in North America", - "When hiking in the area, please be on alert for bears" - ] -} ------------------------------------------------------------- -// TEST[skip: Requires inference endpoints] - -You can use passage ranking to find the passage that best matches your query: - -[source,console] ------------------------------------------------------------- -GET my-index/_search -{ - "query": { - "semantic": { - "field": "inference_field", - "query": "mountain lake", - "inner_hits": { } - } - } -} ------------------------------------------------------------- -// TEST[skip: Requires inference endpoints] - -[source,console-result] ------------------------------------------------------------- -{ - "took": 67, - "timed_out": false, - "_shards": { - "total": 1, - "successful": 1, - "skipped": 0, - "failed": 0 - }, - "hits": { - "total": { - "value": 1, - "relation": "eq" - }, - "max_score": 10.844536, - "hits": [ - { - "_index": "my-index", - "_id": "lake_tahoe", - "_score": 
10.844536, - "_source": { - ... - }, - "inner_hits": { <1> - "inference_field": { - "hits": { - "total": { - "value": 2, - "relation": "eq" - }, - "max_score": 10.844536, - "hits": [ - { - "_index": "my-index", - "_id": "lake_tahoe", - "_nested": { - "field": "inference_field.inference.chunks", - "offset": 0 - }, - "_score": 10.844536, - "_source": { - "text": "Lake Tahoe is the largest alpine lake in North America" - } - }, - { - "_index": "my-index", - "_id": "lake_tahoe", - "_nested": { - "field": "inference_field.inference.chunks", - "offset": 1 - }, - "_score": 3.2726858, - "_source": { - "text": "When hiking in the area, please be on alert for bears" - } - } - ] - } - } - } - } - ] - } -} ------------------------------------------------------------- -<1> Ranked passages will be returned using the <>, with `` set to the `semantic_text` field name. - -By default, the top three matching passages will be returned. -You can use the `size` parameter to control the number of passages returned and the `from` parameter to page through the matching passages: - -[source,console] ------------------------------------------------------------- -GET my-index/_search -{ - "query": { - "semantic": { - "field": "inference_field", - "query": "mountain lake", - "inner_hits": { - "from": 1, - "size": 1 - } - } - } -} ------------------------------------------------------------- -// TEST[skip: Requires inference endpoints] - -[source,console-result] ------------------------------------------------------------- -{ - "took": 42, - "timed_out": false, - "_shards": { - "total": 1, - "successful": 1, - "skipped": 0, - "failed": 0 - }, - "hits": { - "total": { - "value": 1, - "relation": "eq" - }, - "max_score": 10.844536, - "hits": [ - { - "_index": "my-index", - "_id": "lake_tahoe", - "_score": 10.844536, - "_source": { - ... 
- }, - "inner_hits": { - "inference_field": { - "hits": { - "total": { - "value": 2, - "relation": "eq" - }, - "max_score": 10.844536, - "hits": [ - { - "_index": "my-index", - "_id": "lake_tahoe", - "_nested": { - "field": "inference_field.inference.chunks", - "offset": 1 - }, - "_score": 3.2726858, - "_source": { - "text": "When hiking in the area, please be on alert for bears" - } - } - ] - } - } - } - } - ] - } -} ------------------------------------------------------------- - [discrete] [[hybrid-search-semantic]] ==== Hybrid search with the `semantic` query diff --git a/docs/reference/quickstart/getting-started.asciidoc b/docs/reference/quickstart/getting-started.asciidoc index e674dda147bcc..a6d233d8b8abc 100644 --- a/docs/reference/quickstart/getting-started.asciidoc +++ b/docs/reference/quickstart/getting-started.asciidoc @@ -1,83 +1,150 @@ [[getting-started]] -== Quick start: Add data using Elasticsearch APIs +== Index and search data using {es} APIs ++++ -Basics: Add data using APIs +Basics: Index and search using APIs ++++ -In this quick start guide, you'll learn how to do the following tasks: +This quick start guide is a hands-on introduction to the fundamental concepts of Elasticsearch: <>. -* Add a small, non-timestamped dataset to {es} using Elasticsearch REST APIs. -* Run basic searches. +You'll learn how to create an index, add data as documents, work with dynamic and explicit mappings, and perform your first basic searches. -[discrete] -[[add-data]] -=== Add data - -You add data to {es} as JSON objects called documents. -{es} stores these -documents in searchable indices. +[TIP] +==== +The code examples in this tutorial are in {kibana-ref}/console-kibana.html[Console] syntax by default. +You can {kibana-ref}/console-kibana.html#import-export-console-requests[convert into other programming languages] in the Console UI. 
+==== [discrete] -[[add-single-document]] -==== Add a single document +[[getting-started-prerequisites]] +=== Prerequisites -Submit the following indexing request to add a single document to the -`books` index. -The request automatically creates the index. +Before you begin, you need to have a running {es} cluster. +The fastest way to get started is with a <>. +Refer to <> for other deployment options. //// [source,console] ---- PUT books +PUT my-explicit-mappings-books ---- // TESTSETUP [source,console] -------------------------------------------------- DELETE books +DELETE my-explicit-mappings-books -------------------------------------------------- // TEARDOWN //// +[discrete] +[[getting-started-index-creation]] +=== Step 1: Create an index + +Create a new index named `books`: + +[source,console] +---- +PUT /books +---- +// TEST[skip: index already setup] + +The following response indicates the index was created successfully. + +.Example response +[%collapsible] +=============== +[source,console-result] +---- +{ + "acknowledged": true, + "shards_acknowledged": true, + "index": "books" +} +---- +// TEST[skip: index already setup] +=============== + +[discrete] +[[getting-started-add-documents]] +=== Step 2: Add data to your index + +[TIP] +==== +This tutorial uses {es} APIs, but there are many other ways to +<>. +==== + +You add data to {es} as JSON objects called documents. +{es} stores these +documents in searchable indices. + +[discrete] +[[getting-started-add-single-document]] +==== Add a single document + +Submit the following indexing request to add a single document to the +`books` index. + +[TIP] +==== +If the index didn't already exist, this request would automatically create it. 
+==== + [source,console] ---- POST books/_doc -{"name": "Snow Crash", "author": "Neal Stephenson", "release_date": "1992-06-01", "page_count": 470} +{ + "name": "Snow Crash", + "author": "Neal Stephenson", + "release_date": "1992-06-01", + "page_count": 470 +} ---- -// TEST[s/_doc/_doc?refresh=wait_for/] +// TEST[continued] -The response includes metadata that {es} generates for the document including a unique `_id` for the document within the index. +The response includes metadata that {es} generates for the document, including a unique `_id` for the document within the index. -.Expand to see example response +.Example response [%collapsible] =============== [source,console-result] ---- { - "_index": "books", - "_id": "O0lG2IsBaSa7VYx_rEia", - "_version": 1, - "result": "created", - "_shards": { - "total": 2, - "successful": 2, - "failed": 0 + "_index": "books", <1> + "_id": "O0lG2IsBaSa7VYx_rEia", <2> + "_version": 1, <3> + "result": "created", <4> + "_shards": { <5> + "total": 2, <6> + "successful": 2, <7> + "failed": 0 <8> }, - "_seq_no": 0, - "_primary_term": 1 + "_seq_no": 0, <9> + "_primary_term": 1 <10> } ---- -// TEST[skip:TODO] +// TEST[s/O0lG2IsBaSa7VYx_rEia/*/] +<1> The `_index` field indicates the index the document was added to. +<2> The `_id` field is the unique identifier for the document. +<3> The `_version` field indicates the version of the document. +<4> The `result` field indicates the result of the indexing operation. +<5> The `_shards` field contains information about the number of <> that the indexing operation was executed on and the number that succeeded. +<6> The `total` field indicates the number of shard copies (primary and replica) that the indexing operation should be executed on. +<7> The `successful` field indicates the number of shard copies on which the indexing operation succeeded. +<8> The `failed` field indicates the number of shard copies on which the indexing operation failed. `0` indicates no failures. 
+<9> The `_seq_no` field holds a monotonically increasing number incremented for each indexing operation on a shard. +<10> The `_primary_term` field is a monotonically increasing number incremented each time a primary shard is assigned to a different node. =============== [discrete] -[[add-multiple-documents]] +[[getting-started-add-multiple-documents]] ==== Add multiple documents -Use the `_bulk` endpoint to add multiple documents in one request. Bulk data -must be newline-delimited JSON (NDJSON). Each line must end in a newline -character (`\n`), including the last line. +Use the <> to add multiple documents in one request. Bulk data +must be formatted as newline-delimited JSON (NDJSON). [source,console] ---- @@ -97,7 +164,7 @@ POST /_bulk You should receive a response indicating there were no errors. -.Expand to see example response +.Example response [%collapsible] =============== [source,console-result] @@ -193,31 +260,218 @@ You should receive a response indicating there were no errors. =============== [discrete] -[[qs-search-data]] -=== Search data +[[getting-started-mappings-and-data-types]] +=== Step 3: Define mappings and data types + +<> define how data is stored and indexed in {es}, like a schema in a relational database. + +[discrete] +[[getting-started-dynamic-mapping]] +==== Use dynamic mapping + +When using dynamic mapping, {es} automatically creates mappings for new fields by default. +The documents we've added so far have used dynamic mapping, because we didn't specify a mapping when creating the index. + +To see how dynamic mapping works, add a new document to the `books` index with a field that doesn't appear in the existing documents. + +[source,console] +---- +POST /books/_doc +{ + "name": "The Great Gatsby", + "author": "F. Scott Fitzgerald", + "release_date": "1925-04-10", + "page_count": 180, + "language": "EN" <1> +} +---- +// TEST[continued] +<1> The new field. + +View the mapping for the `books` index with the <>. 
The new field `language` has been added to the mapping with a `text` data type. + +[source,console] +---- +GET /books/_mapping +---- +// TEST[continued] + +.Example response +[%collapsible] +=============== +[source,console-result] +---- +{ + "books": { + "mappings": { + "properties": { + "author": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "name": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "language": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "page_count": { + "type": "long" + }, + "release_date": { + "type": "date" + } + } + } + } +} +---- +// TEST[continued] +=============== + +[discrete] +[[getting-started-explicit-mapping]] +==== Define explicit mapping -Indexed documents are available for search in near real-time. +Create an index named `my-explicit-mappings-books` with explicit mappings. +Pass each field's properties as a JSON object. This object should contain the <> and any additional <>. + +[source,console] +---- +PUT /my-explicit-mappings-books +{ + "mappings": { + "dynamic": false, <1> + "properties": { <2> + "name": { "type": "text" }, + "author": { "type": "text" }, + "release_date": { "type": "date", "format": "yyyy-MM-dd" }, + "page_count": { "type": "integer" } + } + } +} +---- +// TEST[continued] +<1> Disables dynamic mapping for the index. Documents containing fields not defined in the mapping are still accepted, but those fields are not indexed or searchable (they remain in `_source`). To reject such documents, set `dynamic` to `strict`. +<2> The `properties` object defines the fields and their data types for documents in this index. 
+ +.Example response +[%collapsible] +=============== +[source,console-result] +---- +{ + "acknowledged": true, + "shards_acknowledged": true, + "index": "my-explicit-mappings-books" +} +---- +// TEST[skip:already created in setup] +=============== + +[discrete] +[[getting-started-combined-mapping]] +==== Combine dynamic and explicit mappings + +Explicit mappings are defined at index creation, and documents must conform to these mappings. +You can also use the <>. +When an index has the `dynamic` flag set to `true`, you can add new fields to documents without updating the mapping. + +This allows you to combine explicit and dynamic mappings. +Learn more about <>. + +[discrete] +[[getting-started-search-data]] +=== Step 4: Search your index + +Indexed documents are available for search in near real-time, using the <>. +// TODO: You'll find more detailed quick start guides in TODO [discrete] -[[search-all-documents]] +[[getting-started-search-all-documents]] ==== Search all documents Run the following command to search the `books` index for all documents: + [source,console] ---- GET books/_search ---- // TEST[continued] -The `_source` of each hit contains the original -JSON object submitted during indexing. +.Example response +[%collapsible] +=============== +[source,console-result] +---- +{ + "took": 2, <1> + "timed_out": false, <2> + "_shards": { <3> + "total": 5, + "successful": 5, + "skipped": 0, + "failed": 0 + }, + "hits": { <4> + "total": { <5> + "value": 7, + "relation": "eq" + }, + "max_score": 1, <6> + "hits": [ + { + "_index": "books", <7> + "_id": "CwICQpIBO6vvGGiC_3Ls", <8> + "_score": 1, <9> + "_source": { <10> + "name": "Brave New World", + "author": "Aldous Huxley", + "release_date": "1932-06-01", + "page_count": 268 + } + }, + ... 
(truncated) + ] + } +} +---- +// TEST[continued] +<1> The `took` field indicates the time in milliseconds for {es} to execute the search +<2> The `timed_out` field indicates whether the search timed out +<3> The `_shards` field contains information about the number of <> that the search was executed on and the number that succeeded +<4> The `hits` object contains the search results +<5> The `total` object provides information about the total number of matching documents +<6> The `max_score` field indicates the highest relevance score among all matching documents +<7> The `_index` field indicates the index the document belongs to +<8> The `_id` field is the document's unique identifier +<9> The `_score` field indicates the relevance score of the document +<10> The `_source` field contains the original JSON object submitted during indexing +=============== [discrete] -[[qs-match-query]] +[[getting-started-match-query]] ==== `match` query You can use the <> to search for documents that contain a specific value in a specific field. -This is the standard query for performing full-text search, including fuzzy matching and phrase searches. +This is the standard query for full-text searches. 
Run the following command to search the `books` index for documents containing `brave` in the `name` field: @@ -232,4 +486,65 @@ GET books/_search } } ---- -// TEST[continued] \ No newline at end of file +// TEST[continued] + +.Example response +[%collapsible] +=============== +[source,console-result] +---- +{ + "took": 9, + "timed_out": false, + "_shards": { + "total": 5, + "successful": 5, + "skipped": 0, + "failed": 0 + }, + "hits": { + "total": { + "value": 1, + "relation": "eq" + }, + "max_score": 0.6931471, <1> + "hits": [ + { + "_index": "books", + "_id": "CwICQpIBO6vvGGiC_3Ls", + "_score": 0.6931471, + "_source": { + "name": "Brave New World", + "author": "Aldous Huxley", + "release_date": "1932-06-01", + "page_count": 268 + } + } + ] + } +} +---- +// TEST[continued] +<1> The `max_score` is the score of the highest-scoring document in the results. In this case, there is only one matching document, so the `max_score` is the score of that document. +=============== + +[discrete] +[[getting-started-delete-indices]] +=== Step 5: Delete your indices (optional) + +When following along with examples, you might want to delete an index to start from scratch. +You can delete indices using the <>. + +For example, run the following command to delete the indices created in this tutorial: + +[source,console] +---- +DELETE /books +DELETE /my-explicit-mappings-books +---- +// TEST[skip:handled by setup/teardown] + +[CAUTION] +==== +Deleting an index permanently deletes its documents, shards, and metadata. +==== diff --git a/docs/reference/quickstart/index.asciidoc b/docs/reference/quickstart/index.asciidoc index 6bfed4c198c75..2d9114882254f 100644 --- a/docs/reference/quickstart/index.asciidoc +++ b/docs/reference/quickstart/index.asciidoc @@ -15,7 +15,7 @@ Get started <> , or see our <>. Learn how to add data to {es} and perform basic searches. +* <>. Learn about indices, documents, and mappings, and perform a basic search. 
[discrete] [[quickstart-python-links]] @@ -26,4 +26,4 @@ If you're interested in using {es} with Python, check out Elastic Search Labs: * https://github.com/elastic/elasticsearch-labs[`elasticsearch-labs` repository]: Contains a range of Python https://github.com/elastic/elasticsearch-labs/tree/main/notebooks[notebooks] and https://github.com/elastic/elasticsearch-labs/tree/main/example-apps[example apps]. * https://www.elastic.co/search-labs/tutorials/search-tutorial/welcome[Tutorial]: This walks you through building a complete search solution with {es} from the ground up using Flask. -include::getting-started.asciidoc[] \ No newline at end of file +include::getting-started.asciidoc[]
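
The bulk step in `getting-started.asciidoc` says the request body must be newline-delimited JSON (NDJSON), and the sentence removed by this diff noted that every line, including the last, must end in a newline character. As a rough illustration of that format, here is a minimal Python sketch that builds such a body with the standard library only. The helper name `build_bulk_body` is hypothetical and not part of any Elasticsearch client; the sample documents are taken from the tutorial.

```python
import json


def build_bulk_body(index_name, documents):
    """Build an NDJSON body for the _bulk API (hypothetical helper, for illustration).

    Each document becomes two lines: an `index` action line naming the
    target index, followed by the document source itself.
    """
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # Terminate every line, including the last one, with a newline.
    return "".join(line + "\n" for line in lines)


books = [
    {"name": "Snow Crash", "author": "Neal Stephenson",
     "release_date": "1992-06-01", "page_count": 470},
    {"name": "Brave New World", "author": "Aldous Huxley",
     "release_date": "1932-06-01", "page_count": 268},
]

body = build_bulk_body("books", books)
print(body, end="")
```

The resulting string can be sent as the body of `POST /_bulk` with the `Content-Type: application/x-ndjson` header. Because `json.dumps` without `indent` never emits raw newlines, each action and each document stays on exactly one line.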