Built-in search plugin

The search plugin adds a search bar to the header, allowing users to search your documentation. It's powered by lunr.js, a lightweight full-text search engine for the browser, which eliminates the need for external services and even works when building offline-capable documentation.

Objective

How it works

The plugin scans the generated HTML and builds a search index from all pages and sections by extracting the section titles and contents. It preserves some inline formatting like code blocks and lists, but removes all other formatting, so the search index is as small as possible.
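
To make the extraction step more concrete, here's a minimal sketch in Python. It is not the plugin's actual implementation: it uses the standard library's `html.parser` to split a page into one entry per heading. The `SectionIndexer` class, the `build_entries` function and the entry fields (`location`, `title`, `text`) are assumptions made for illustration, and unlike the plugin, the sketch drops all inline formatting:

```python
from html.parser import HTMLParser


class SectionIndexer(HTMLParser):
    """Collect one index entry per heading (hypothetical helper)."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.entries = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            # A heading starts a new section, addressed by its anchor
            anchor = dict(attrs).get("id")
            location = f"{self.page_url}#{anchor}" if anchor else self.page_url
            self.entries.append({"location": location, "title": "", "text": ""})
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in self.HEADINGS:
            self._in_heading = False

    def handle_data(self, data):
        if not self.entries:
            return  # ignore content before the first heading
        key = "title" if self._in_heading else "text"
        self.entries[-1][key] += data


def build_entries(page_url, html):
    """Return a list of entries with normalized whitespace."""
    indexer = SectionIndexer(page_url)
    indexer.feed(html)
    indexer.close()
    for entry in indexer.entries:
        entry["title"] = " ".join(entry["title"].split())
        entry["text"] = " ".join(entry["text"].split())
    return indexer.entries


html = """
<h1 id="setup">Setup</h1>
<p>Install the <strong>package</strong> first.</p>
<h2 id="config">Configuration</h2>
<p>Edit <code>mkdocs.yml</code>.</p>
"""

for entry in build_entries("guide/", html):
    print(entry)
```

Each heading yields one entry, so a search hit can link directly to the matching section rather than to the top of the page.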

When a user visits your site, the search index is shipped to the browser, indexed with lunr.js and made available for fast and simple querying – no server needed. This ensures that the search index is always up to date with your documentation, yielding accurate results.
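
What "indexed" means here can be illustrated with a toy inverted index. The sketch below is written in Python for readability and is not how lunr.js is implemented: lunr.js additionally stems tokens and ranks results, while this version only matches entries that contain every query term. The `build_index` and `query` functions and the sample entries are hypothetical:

```python
from collections import defaultdict

docs = [
    {"location": "setup/", "text": "Install the plugin"},
    {"location": "setup/#search", "text": "Configure the search plugin"},
]


def build_index(docs):
    """Map every lowercased token to the locations that contain it."""
    index = defaultdict(set)
    for doc in docs:
        for token in doc["text"].lower().split():
            index[token].add(doc["location"])
    return index


def query(index, text):
    """Return the locations that contain all terms of the query."""
    terms = text.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


index = build_index(docs)
print(query(index, "plugin"))         # both entries match
print(query(index, "search plugin"))  # only setup/#search matches
```

Because the lookup is a handful of dictionary and set operations, it can run entirely in the browser without a server round trip.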

When to use it

It's generally recommended to use the plugin, as interactive search functionality is a vital part of any good documentation. Additionally, the plugin integrates perfectly with several of the other built-in plugins that Material for MkDocs offers:

  • :material-connection:   Built-in offline plugin


    The offline plugin adds support for building offline-capable documentation, so you can distribute the [site directory][mkdocs.site_dir] as a .zip file that can be downloaded.


    Your documentation can work without connectivity to the internet

  • :material-file-tree:   Built-in meta plugin


    The meta plugin makes it easy to [boost][meta.search.boost] specific sections in search results or to [exclude][meta.search.exclude] them entirely from being indexed, giving more granular control over search.


    Simpler organization and management of search in different subsections

Configuration

As with all built-in plugins, getting started with the search plugin is straightforward. Just add the following lines to mkdocs.yml, and your users will be able to search your documentation:

``` yaml
plugins:
  - search
```

The search plugin is built into Material for MkDocs and doesn't need to be installed.

General

The following settings are available:


enabled

Use this setting to enable or disable the plugin when building your project. It's normally not necessary to specify this setting, but if you want to disable the plugin, use:

``` yaml
plugins:
  - search:
      enabled: false
```

Search

The following settings are available for search:


lang

Use this setting to specify the language of the search index, enabling stemming support for languages other than English. The default value is automatically computed from the site language, but can be explicitly set to another language or even multiple languages with:

=== "Set language"

    ``` yaml
    plugins:
      - search:
          lang: en
    ```

=== "Add further languages"

    ``` yaml
    plugins:
      - search:
          lang: # (1)!
            - en
            - de
    ```

    1.  Be aware that including support for further languages increases the
        base JavaScript payload by around 20kb and by another 15-30kb per
        language, all before `gzip`.

Language support is provided by lunr languages, a collection of language-specific stemmers and stop words for lunr.js maintained by the Open Source community.


The following languages are currently supported by lunr languages:

  • ar – Arabic
  • da – Danish
  • de – German
  • du – Dutch
  • en – English
  • es – Spanish
  • fi – Finnish
  • fr – French
  • hi – Hindi
  • hu – Hungarian
  • hy – Armenian
  • it – Italian
  • ja – Japanese
  • kn – Kannada
  • ko – Korean
  • no – Norwegian
  • pt – Portuguese
  • ro – Romanian
  • ru – Russian
  • sa – Sanskrit
  • sv – Swedish
  • ta – Tamil
  • te – Telugu
  • th – Thai
  • tr – Turkish
  • vi – Vietnamese
  • zh – Chinese

If lunr languages doesn't provide support for the selected site language, the plugin falls back to another language that yields the best stemming results. If you discover that the search results are not satisfactory, you can contribute to lunr languages by adding support for your language.


separator

Use this setting to specify the separator used to split words when building the search index on the client side. The default value is automatically computed from the site language, but can also be explicitly set to another value with:

``` yaml
plugins:
  - search:
      separator: '[\s\-,:!=\[\]()"/]+|(?!\b)(?=[A-Z][a-z])|\.(?!\d)|&[lg]t;'
```

Separators support positive and negative lookahead assertions, which allows for rather complex expressions that yield precise control over how words are split when building the search index.

Broken into its parts, this separator induces the following behavior:

=== "Special characters"

    ```
    [\s\-,:!=\[\]()"/]+
    ```

    The first part of the expression inserts token boundaries for each
    document before and after whitespace, hyphens, commas, brackets and
    other special characters. If several of those special characters are
    adjacent, they are treated as one.

=== "Case changes"

    ```
    (?!\b)(?=[A-Z][a-z])
    ```

    Many programming languages have naming conventions like `PascalCase` or
    `camelCase`. By adding this subexpression to the separator,
    [words are split at case changes], tokenizing the word `PascalCase`
    into `Pascal` and `Case`.

=== "Version strings"

    ```
    \.(?!\d)
    ```

    When adding `.` to the separator, version strings like `1.2.3` are split
    into `1`, `2` and `3`, which makes them undiscoverable via search. When
    using this subexpression, a small lookahead is introduced which will
    [preserve version strings] and keep them discoverable.

=== "HTML/XML tags"

    ```
    &[lg]t;
    ```

    If your documentation includes HTML/XML code examples, you may want to allow
    users to find [specific tag names]. Unfortunately, the `<` and `>` control
    characters are encoded in code blocks as `&lt;` and `&gt;`. Adding this
    subexpression to the separator allows for just that.
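
The effect of the full separator can also be checked outside the browser. The following sketch applies the expression with Python's `re.split`. It assumes Python 3.7 or later, which splits on empty matches, and it assumes that Python's `re` module treats this particular expression the same way as the JavaScript engine that lunr.js runs on. The `tokenize` helper is hypothetical and only drops empty strings:

```python
import re

# The separator from the configuration above, compiled with Python's re module
SEPARATOR = re.compile(
    r'[\s\-,:!=\[\]()"/]+|(?!\b)(?=[A-Z][a-z])|\.(?!\d)|&[lg]t;'
)


def tokenize(text):
    """Split text at the separator and drop empty tokens."""
    return [token for token in SEPARATOR.split(text) if token]


print(tokenize("foo-bar, baz"))    # ['foo', 'bar', 'baz']
print(tokenize("PascalCase"))      # ['Pascal', 'Case']
print(tokenize("version 1.2.3."))  # ['version', '1.2.3']
print(tokenize("&lt;div&gt;"))     # ['div']
```

Note that the trailing `.` in the third example is still treated as a separator, because it's not followed by a digit.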

pipeline

Use this setting to specify the pipeline functions that are used to filter and expand tokens after tokenizing them with the [separator][config.separator] and before adding them to the search index. The default value is automatically computed from the site language, but can also be explicitly set with:

``` yaml
plugins:
  - search:
      pipeline:
        - stemmer
        - stopWordFilter
        - trimmer
```

The following pipeline functions can be used:

  • stemmer – Stem tokens to their root form, e.g. running to run
  • stopWordFilter – Filter common words, e.g. a, the, etc.
  • trimmer – Trim leading and trailing non-word characters, such as whitespace and punctuation, from tokens
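
How such a pipeline processes tokens can be sketched in a few lines of Python. None of this is lunr.js code: the three functions are simplified stand-ins, the stop word list is a tiny sample, and the stemmer only handles a trailing `ing`, which is far less than a real stemmer does. Returning `None` from a function is the convention assumed here for dropping a token:

```python
import re

STOP_WORDS = {"a", "an", "the", "and", "of"}  # tiny sample, not lunr's list


def stemmer(token):
    """Toy stemmer: strip a trailing 'ing' and undouble the final consonant."""
    if token.endswith("ing") and len(token) > 5:
        stem = token[:-3]
        if len(stem) > 1 and stem[-1] == stem[-2]:
            stem = stem[:-1]
        return stem
    return token


def stop_word_filter(token):
    """Drop common words by returning None."""
    return None if token in STOP_WORDS else token


def trimmer(token):
    """Strip leading and trailing non-word characters, including whitespace."""
    return re.sub(r"^\W+|\W+$", "", token)


def run_pipeline(tokens, pipeline):
    """Pass every token through all functions, dropping emptied tokens."""
    result = []
    for token in tokens:
        for function in pipeline:
            token = function(token)
            if not token:
                break  # token was filtered out
        else:
            result.append(token)
    return result


pipeline = [stemmer, stop_word_filter, trimmer]
print(run_pipeline(["the", "running", "search."], pipeline))
# ['run', 'search']
```

The functions run in the order in which they are listed, so removing one from the list simply skips that step.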

Segmentation

The plugin supports text segmentation of Chinese via jieba, a popular Chinese text segmentation library. Other languages like Japanese and Korean are currently segmented on the client side, but we're considering moving this functionality into the plugin in the future.

The following settings are available for segmentation:


jieba_dict

Use this setting to specify a custom dictionary to be used by jieba for segmenting text, replacing the default dictionary. jieba comes with several dictionaries, which can be used with:

``` yaml
plugins:
  - search:
      jieba_dict: dict.txt
```

The following dictionaries are provided by jieba:

The provided path is resolved from the root directory.


jieba_dict_user

Use this setting to specify an additional user dictionary to be used by jieba for segmenting text, augmenting the default dictionary. User dictionaries are ideal for tuning the segmenter:

``` yaml
plugins:
  - search:
      jieba_dict_user: user_dict.txt
```

The provided path is resolved from the root directory.
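
Why an additional dictionary changes search results can be illustrated without jieba itself. The sketch below is deliberately not jieba's algorithm, which is more involved. It's a toy forward maximum matching segmenter that always takes the longest dictionary word at the current position and falls back to single characters. The `segment` function and both dictionaries are made up for illustration:

```python
def segment(text, dictionary, max_len=4):
    """Greedily split text into the longest words found in the dictionary."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, fall back to a single character
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


base_dict = {"搜索", "文档"}   # stand-in for the default dictionary
user_dict = {"搜索引擎"}       # stand-in for a user dictionary

text = "搜索引擎文档"
print(segment(text, base_dict))              # ['搜索', '引', '擎', '文档']
print(segment(text, base_dict | user_dict))  # ['搜索引擎', '文档']
```

With the extra term, the domain-specific word is indexed as a single token, so a search for it no longer depends on how its parts happen to be split.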

Usage

Metadata

The following properties are available:


boost

Use this property to increase or decrease the relevance of a page in the search results, giving it more or less weight. Use values above 1 to rank up and values below 1 to rank down:

=== ":material-arrow-up-circle: Rank up"

    ``` yaml
    ---
    search:
      boost: 2 # (1)!
    ---

    # Page title
    ...
    ```

    1.  When boosting pages, always start with low values.

=== ":material-arrow-down-circle: Rank down"

    ``` yaml
    ---
    search:
      boost: 0.5
    ---

    # Page title
    ...
    ```
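
The effect of boosting on the order of results can be sketched as follows. This assumes that a boost acts as a simple multiplier on a page's relevance score, which is a simplification and may differ from how lunr.js applies it internally. The `rank` function, the scores and the locations are hypothetical:

```python
def rank(results):
    """Sort results by score multiplied by the page's boost (default 1)."""
    return sorted(
        results,
        key=lambda result: result["score"] * result.get("boost", 1),
        reverse=True,
    )


results = [
    {"location": "setup/", "score": 1.0},
    {"location": "reference/", "score": 0.8, "boost": 2},
    {"location": "changelog/", "score": 1.2, "boost": 0.5},
]

for result in rank(results):
    print(result["location"])
# reference/  (0.8 * 2 = 1.6)
# setup/      (1.0)
# changelog/  (1.2 * 0.5 = 0.6)
```

Even a boost of 2 is enough to move a weaker match past a stronger one, which is why starting with low values is advisable.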

exclude

Use this property to exclude a page from the search results. Note that this will not only remove the page, but also all subsections of the page from the search results:

``` yaml
---
search:
  exclude: true
---

# Page title
...
```
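
That excluding a page also removes its subsections follows naturally if every section is stored as its own index entry under the page's URL. The sketch below assumes such entries with a `location` field of the form `page/#anchor`. The `drop_excluded` function and the sample entries are hypothetical and not taken from the plugin:

```python
def drop_excluded(entries, excluded_pages):
    """Remove all entries that belong to an excluded page."""
    kept = []
    for entry in entries:
        # Strip the anchor so that sections map back to their page
        page = entry["location"].split("#", 1)[0]
        if page not in excluded_pages:
            kept.append(entry)
    return kept


entries = [
    {"location": "setup/", "title": "Setup"},
    {"location": "internal/", "title": "Internal notes"},
    {"location": "internal/#todo", "title": "To do"},
]

remaining = drop_excluded(entries, {"internal/"})
print([entry["location"] for entry in remaining])  # ['setup/']
```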