Commit

refactor: the data management section under the operations section (#…
nicecui authored Sep 26, 2024
1 parent 12b58b7 commit 4f487c8
Showing 40 changed files with 457 additions and 449 deletions.
50 changes: 1 addition & 49 deletions docs/contributor-guide/frontend/table-sharding.md
@@ -4,55 +4,7 @@ The sharding of stored data is essential to any distributed database. This docum

## Partition

-In GreptimeDB, logically, data is sharded in partitions. Because GreptimeDB is using "table" to
-group data and SQL to query them, we borrow the word "partition", which is a concept commonly used
-in OLTP databases.
-
-In GreptimeDB, a table can be horizontally partitioned in multiple ways and it uses the same
-partitioning types (and corresponding syntax) as in MySQL. Currently, GreptimeDB supports "RANGE COLUMNS partitioning".
-
-Each partition includes only a portion of the data from the table, and is
-grouped by some column(s) value range. For example, we can partition a table in GreptimeDB like
-this:
-
-```sql
-CREATE TABLE (...)
-PARTITION ON COLUMNS (<COLUMN LIST>) (
-<RULE LIST>
-);
-```
-
-The syntax mainly consists of two parts:
-- `PARTITION ON COLUMNS` followed by a comma-separated list of column names, which specifies which columns might be used for partitioning. The partition list specified here is only used as an "allow list", and in reality only a portion of the columns specified here will be used for partitioning.
-- `RULE LIST` is a list of multiple partition rules, each of which is a combination of a partition name and a partition condition. The expressions here can use `=`, `!=`, `>`, `>=`, `<`, `<=`, `AND`, `OR`, column name and literals.
-
-Here is a concrete example:
-
-```sql
-CREATE TABLE my_table (
-a INT PRIMARY KEY,
-b STRING,
-ts TIMESTAMP TIME INDEX,
-)
-PARTITION ON COLUMNS (a) (
-a < 10,
-a >= 10 AND a < 20,
-a >= 20,
-);
-```
-
-The above `my_table` has 3 partitions. The first partition contains rows where "a < 10", the second partition contains rows where "10 \<= a < 20", and the third partition contains all rows where "a >= 20".
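
To make the routing described above concrete, here is a minimal sketch. It assumes the `my_table` definition from the example already exists; the row values and timestamps are made up for illustration. Each row should land in the partition whose rule matches its `a` value.

```sql
-- Hypothetical sample rows for the my_table defined in the example above.
INSERT INTO my_table (a, b, ts) VALUES
    (5,  'first',  '2024-09-26 00:00:00'),  -- matches the rule a < 10
    (10, 'second', '2024-09-26 00:00:01'),  -- matches a >= 10 AND a < 20 (lower bound is inclusive)
    (25, 'third',  '2024-09-26 00:00:02');  -- matches the rule a >= 20

-- Inspect the table's partitions through the PARTITIONS table
-- in INFORMATION_SCHEMA.
SELECT * FROM information_schema.partitions WHERE table_name = 'my_table';
```

The boundary value `10` is the one worth checking by hand: it fails `a < 10` and satisfies `a >= 10`, so exactly one rule applies, which is what the non-overlap requirement is meant to guarantee.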

-:::warning Important
-
-1. The ranges of all partitions must not overlap.
-2. The columns used for partitioning must be specified in `ON COLUMNS`
-
-:::
-
-:::tip Note
-Currently expressions are not supported in "PARTITION BY RANGE" syntax, you can only supply column names.
-:::
+For the syntax of creating a partitioned table, please refer to the [Table Sharding](/user-guide/operations/data-management/table-sharding.md) section in the User Guide.

## Region

2 changes: 1 addition & 1 deletion docs/faq-and-others/faq.md
@@ -57,7 +57,7 @@ Please check out our initial version on [GitHub Repo](https://github.com/Greptim

Yes, GreptimeDB is a schemaless database without need for creating tables in advance. The table and columns will be created automatically when writing data with protocol gRPC, InfluxDB, OpentsDB, Prometheus remote write.

-For more information, refer to [this document](/user-guide/table-management.md#create-table).
+For more information, refer to [this document](/user-guide/operations/data-management/basic-table-operations.md#create-table).

### How do you measure the passing rate of PromQL compatibility tests? Is there any testing framework?

4 changes: 2 additions & 2 deletions docs/reference/sql/admin.md
@@ -13,9 +13,9 @@ GreptimeDB provides some administration functions to manage the database and dat

* `flush_table(table_name)` to flush a table's memtables into SST file by table name.
* `flush_region(region_id)` to flush a region's memtables into SST file by region id. Find the region id through [PARTITIONS](./information-schema/partitions.md) table.
-* `compact_table(table_name, [type], [options])` to schedule a compaction task for a table by table name, read [compaction](/user-guide/operations/compaction.md#strict-window-compaction-strategy-swcs-and-manual-compaction) for more details.
+* `compact_table(table_name, [type], [options])` to schedule a compaction task for a table by table name, read [compaction](/user-guide/operations/data-management/compaction.md#strict-window-compaction-strategy-swcs-and-manual-compaction) for more details.
* `compact_region(region_id)` to schedule a compaction task for a region by region id.
-* `migrate_region(region_id, from_peer, to_peer, [timeout])` to migrate regions between datanodes, please read the [Region Migration](/user-guide/operations/region-migration.md).
+* `migrate_region(region_id, from_peer, to_peer, [timeout])` to migrate regions between datanodes, please read the [Region Migration](/user-guide/operations/data-management/region-migration.md).
* `procedure_state(procedure_id)` to query a procedure state by its id.
* `flush_flow(flow_name)` to flush a flow's output into the sink table.
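
As a rough sketch of how the functions in this list are invoked, the statements below assume they are called through the `ADMIN` statement and that a table named `my_table` exists. The table name is hypothetical; only calls that take a table name are shown.

```sql
-- Flush the memtables of a hypothetical table into SST files.
ADMIN flush_table("my_table");

-- Schedule a compaction task for the same table,
-- leaving the optional type and options arguments at their defaults.
ADMIN compact_table("my_table");
```

The region-level variants take a region id instead, which the list above says can be looked up in the PARTITIONS table.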

150 changes: 0 additions & 150 deletions docs/user-guide/cluster.md

This file was deleted.

4 changes: 2 additions & 2 deletions docs/user-guide/concepts/data-model.md
@@ -78,7 +78,7 @@ CREATE TABLE access_logs (
- `remote_addr`, `http_status`, `http_method`, `http_refer` and `user_agent` are tags.
- `request` is a field that enables full-text index by the [`FULLTEXT` column option](/reference/sql/create.md#fulltext-column-option).

-To learn how to indicate `Tag`, `Timestamp`, and `Field` columns, Please refer to [table management](../table-management.md#create-a-table) and [CREATE statement](/reference/sql/create.md).
+To learn how to indicate `Tag`, `Timestamp`, and `Field` columns, Please refer to [table management](/user-guide/operations/data-management/basic-table-operations.md#create-a-table) and [CREATE statement](/reference/sql/create.md).

Of course, you can place metrics and logs in a single table at any time, which is also a key capability provided by GreptimeDB.

@@ -95,4 +95,4 @@ GreptimeDB is designed on top of Table for the following reasons:
The multi-value model is used to model data sources, where a metric can have multiple values represented by fields.
The advantage of the multi-value model is that it can write or read multiple values to the database at once, reducing transfer traffic and simplifying queries. In contrast, the single-value model requires splitting the data into multiple records. Read the [blog](https://greptime.com/blogs/2024-05-09-prometheus) for more detailed benefits of multi-value mode.

-GreptimeDB uses SQL to manage table schema. Please refer to [table management](/user-guide/table-management.md) for more information. However, our definition of schema is not mandatory and leans towards a **schemaless** approach, similar to MongoDB. For more details, see [Automatic Schema Generation](/user-guide/ingest-data/overview.md#automatic-schema-generation).
+GreptimeDB uses SQL to manage table schema. Please refer to [table management](/user-guide/operations/data-management/basic-table-operations.md) for more information. However, our definition of schema is not mandatory and leans towards a **schemaless** approach, similar to MongoDB. For more details, see [Automatic Schema Generation](/user-guide/ingest-data/overview.md#automatic-schema-generation).
2 changes: 1 addition & 1 deletion docs/user-guide/ingest-data/for-iot/sql.md
@@ -32,7 +32,7 @@ The above statement will create a table with the following schema:
```

For more information about the `CREATE TABLE` statement,
-please refer to [table management](/user-guide/table-management.md#create-a-table).
+please refer to [table management](/user-guide/operations/data-management/basic-table-operations.md#create-a-table).

## Insert data

6 changes: 6 additions & 0 deletions docs/user-guide/manage-data/overview.md
@@ -313,3 +313,9 @@ it will take precedence over the database TTL policy.
Otherwise, the database TTL policy will be applied to the table.

For more information about TTL policies, please refer to the [CREATE](/reference/sql/create.md) statement.


+## More data management operations
+
+For more advanced data management operations, such as basic table operations, table sharding and region migration, please refer to the [Data Management](/user-guide/operations/data-management/overview.md) in the administration section.

10 changes: 1 addition & 9 deletions docs/user-guide/operations/admin.md
@@ -8,6 +8,7 @@ This document addresses strategies and practices used in the operation of Grepti
* Database Configuration, please read the [Configuration](/user-guide/deployments/configuration.md) reference.
* [Monitoring metrics](/user-guide/operations/monitoring/export-metrics.md) and [Tracing](/user-guide/operations/monitoring/tracing.md) for GreptimeDB.
* GreptimeDB [Disaster Recovery](/user-guide/operations/disaster-recovery/overview.md).
+* Cluster Failover for GreptimeDB by [Setting Remote WAL](./remote-wal/quick-start.md).

### Runtime information

@@ -33,15 +34,6 @@ ORDER BY datanode_id ASC

The `INFORMATION_SCHEMA` database provides access to system metadata, such as the name of a database or table, the data type of a column, etc. Please read the [reference](/reference/sql/information-schema/overview.md).

-## Data management
-
-* [The Storage Location](/user-guide/concepts/storage-location.md).
-* Cluster Failover for GreptimeDB by [Setting Remote WAL](./remote-wal/quick-start.md).
-* [Flush and Compaction for Table & Region](/reference/sql/admin.md#admin-functions).
-* Partition the table by regions, read the [Table Sharding](/contributor-guide/frontend/table-sharding.md) reference.
-* [Migrate the Region](./region-migration.md) for Load Balance.
-* [Expire Data by Setting TTL](/user-guide/concepts/features-that-you-concern.md#can-i-set-ttl-or-retention-policy-for-different-tables-or-measurements).
-
## Best Practices

* [Performance Tuning Tips](/user-guide/operations/performance-tuning-tips.md)
@@ -1,6 +1,6 @@
-# Table Management
+# Basic Table Operations

-[Data Model](./concepts/data-model.md) should be read before this guide.
+[Data Model](/user-guide/concepts/data-model.md) should be read before this guide.

GreptimeDB provides table management functionalities via SQL. The following guide
uses [MySQL Command-Line Client](https://dev.mysql.com/doc/refman/8.0/en/mysql.html) to demonstrate it.
@@ -319,4 +319,5 @@ The specified time zone in the SQL client session will affect the default timest
If you set the default value of a timestamp column to a string without a time zone,
the client's time zone information will be automatically added.

-For more information about the effect of the client time zone, please refer to the [time zone](./ingest-data/for-iot/sql.md#time-zone) section in the write data document.
+For more information about the effect of the client time zone, please refer to the [time zone](/user-guide/ingest-data/for-iot/sql.md#time-zone) section in the write data document.

File renamed without changes.
10 changes: 10 additions & 0 deletions docs/user-guide/operations/data-management/overview.md
@@ -0,0 +1,10 @@
+# Overview
+
+* [Storage Location](/user-guide/concepts/storage-location.md)
+* [Basic SQL Table Operations](basic-table-operations.md): Learn how to create, alter, and drop tables
+* [Update or Delete Data](/user-guide/manage-data/overview.md)
+* [Expire Data by Setting TTL](/user-guide/manage-data/overview.md#manage-data-retention-with-ttl-policies)
+* [Table Sharding](table-sharding.md): Partition tables by regions
+* [Region Migration](region-migration.md): Migrate regions for load balancing
+* [Region Failover](/user-guide/operations/data-management/region-failover.md)
+* [Compaction](compaction.md)
@@ -1,6 +1,6 @@
# Region Failover

-Region Failover provides the ability to recover regions from region failures without losing data. This is implemented via [Region Migration](/user-guide/operations/region-migration.md).
+Region Failover provides the ability to recover regions from region failures without losing data. This is implemented via [Region Migration](/user-guide/operations/data-management/region-migration.md).

## Enable the Region Failover
