[SPARK-33877][SQL] SQL reference documents for INSERT w/ a column list
Spark v3.1.0 supports a column list in `INSERT` statements (see SPARK-32976 (#29893)). This PR documents that feature in the SQL reference.

### What changes were proposed in this pull request?

Document the optional column list in the `INSERT INTO` and `INSERT OVERWRITE` SQL reference pages.
### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

Yes, documentation only: the SQL reference gains a `column_list` parameter description and examples.
### How was this patch tested?

Passing the GitHub Actions documentation generation job.

![image](https://user-images.githubusercontent.com/8326978/102954876-8994fa00-450f-11eb-81f9-931af6d1f69b.png)
![image](https://user-images.githubusercontent.com/8326978/102954900-99acd980-450f-11eb-9733-115ad37d2319.png)

![image](https://user-images.githubusercontent.com/8326978/102954935-af220380-450f-11eb-9aaa-fdae0725d41e.png)
![image](https://user-images.githubusercontent.com/8326978/102954949-bc3ef280-450f-11eb-8a0d-d7b688efa7bb.png)

Closes #30888 from yaooqinn/SPARK-33877.

Authored-by: Kent Yao
Signed-off-by: Dongjoon Hyun
yaooqinn authored and dongjoon-hyun committed Dec 23, 2020
1 parent ec1560a commit a3dd8da
Showing 2 changed files with 80 additions and 4 deletions.
41 changes: 39 additions & 2 deletions docs/sql-ref-syntax-dml-insert-into.md
@@ -26,7 +26,7 @@ The `INSERT INTO` statement inserts new rows into a table. The inserted rows can
### Syntax

```sql
INSERT INTO [ TABLE ] table_identifier [ partition_spec ]
INSERT INTO [ TABLE ] table_identifier [ partition_spec ] [ ( column_list ) ]
{ VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ] | query }
```

@@ -40,11 +40,20 @@ INSERT INTO [ TABLE ] table_identifier [ partition_spec ]

* **partition_spec**

An optional parameter that specifies a comma separated list of key and value pairs
An optional parameter that specifies a comma-separated list of key and value pairs
for partitions.

**Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )`

* **column_list**

An optional parameter that specifies a comma-separated list of columns belonging to the `table_identifier` table.

**Note:** The current behaviour has some limitations:
- All specified columns must exist in the table and must not be duplicated. The list must include every column of the table except the static partition columns.
- The size of the column list must exactly match the number of values supplied by the `VALUES` clause or query.
- The order of the column list is flexible and determines, by position, which column each value from the `VALUES` clause or query is inserted into.

* **VALUES ( { value `|` NULL } [ , ... ] ) [ , ( ... ) ]**

Specifies the values to be inserted. Either an explicitly specified value or a NULL can be inserted.
@@ -198,6 +207,34 @@ SELECT * FROM students;
+-------------+--------------------------+----------+
```

#### Insert with a column list

```sql
INSERT INTO students (address, name, student_id) VALUES
('Hangzhou, China', 'Kent Yao', 11215016);

SELECT * FROM students WHERE name = 'Kent Yao';
+---------+----------------------+----------+
|     name|               address|student_id|
+---------+----------------------+----------+
|Kent Yao |       Hangzhou, China|  11215016|
+---------+----------------------+----------+
```

#### Insert with both a partition spec and a column list

```sql
INSERT INTO students PARTITION (student_id = 11215017) (address, name) VALUES
('Hangzhou, China', 'Kent Yao Jr.');

SELECT * FROM students WHERE student_id = 11215017;
+------------+----------------------+----------+
|        name|               address|student_id|
+------------+----------------------+----------+
|Kent Yao Jr.|       Hangzhou, China|  11215017|
+------------+----------------------+----------+
```
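
The limitations listed for `column_list` can be read as a small binding procedure. The sketch below is only an illustrative model in Python, not Spark's implementation: the function name `bind_column_list` and its error messages are hypothetical, and it assumes a table with no static partition spec, so the list has to name every column.

```python
def bind_column_list(table_columns, column_list, row):
    """Return `row` reordered into table column order (hypothetical helper)."""
    # Rule 1a: every listed column must exist in the table.
    unknown = [c for c in column_list if c not in table_columns]
    if unknown:
        raise ValueError(f"unknown columns: {unknown}")
    # Rule 1b: no column may be listed twice.
    if len(set(column_list)) != len(column_list):
        raise ValueError("duplicate column in column list")
    # Rule 1c: with no static partition spec, every table column must be listed.
    if set(column_list) != set(table_columns):
        raise ValueError("column list must name every column")
    # Rule 2: the number of values must equal the size of the column list.
    if len(row) != len(column_list):
        raise ValueError("value count does not match column list size")
    # Rule 3: values are paired with listed columns by position.
    by_name = dict(zip(column_list, row))
    return tuple(by_name[c] for c in table_columns)


students = ["name", "address", "student_id"]
stored = bind_column_list(
    students,
    ["address", "name", "student_id"],
    ("Hangzhou, China", "Kent Yao", 11215016),
)
print(stored)  # ('Kent Yao', 'Hangzhou, China', 11215016)
```

The result matches the row returned by the `SELECT` in the column list example: the values were written in `(address, name, student_id)` order but are stored in the table's own column order.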

### Related Statements

* [INSERT OVERWRITE statement](sql-ref-syntax-dml-insert-overwrite-table.html)
43 changes: 41 additions & 2 deletions docs/sql-ref-syntax-dml-insert-overwrite-table.md
@@ -26,7 +26,7 @@ The `INSERT OVERWRITE` statement overwrites the existing data in the table using
### Syntax

```sql
INSERT OVERWRITE [ TABLE ] table_identifier [ partition_spec [ IF NOT EXISTS ] ]
INSERT OVERWRITE [ TABLE ] table_identifier [ partition_spec [ IF NOT EXISTS ] ] [ ( column_list ) ]
{ VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ] | query }
```

@@ -40,11 +40,22 @@ INSERT OVERWRITE [ TABLE ] table_identifier [ partition_spec [ IF NOT EXISTS ] ]

* **partition_spec**

An optional parameter that specifies a comma separated list of key and value pairs
An optional parameter that specifies a comma-separated list of key and value pairs
for partitions.

**Syntax:** `PARTITION ( partition_col_name [ = partition_col_val ] [ , ... ] )`

* **column_list**

An optional parameter that specifies a comma-separated list of columns belonging to the `table_identifier` table.

**Note**

The current behaviour has some limitations:
- All specified columns must exist in the table and must not be duplicated. The list must include every column of the table except the static partition columns.
- The size of the column list must exactly match the number of values supplied by the `VALUES` clause or query.
- The order of the column list is flexible and determines, by position, which column each value from the `VALUES` clause or query is inserted into.

* **VALUES ( { value `|` NULL } [ , ... ] ) [ , ( ... ) ]**

Specifies the values to be inserted. Either an explicitly specified value or a NULL can be inserted.
@@ -169,6 +180,34 @@ SELECT * FROM students;
+-----------+-------------------------+----------+
```

#### Insert with a column list

```sql
INSERT OVERWRITE students (address, name, student_id) VALUES
('Hangzhou, China', 'Kent Yao', 11215016);

SELECT * FROM students WHERE name = 'Kent Yao';
+---------+----------------------+----------+
|     name|               address|student_id|
+---------+----------------------+----------+
|Kent Yao |       Hangzhou, China|  11215016|
+---------+----------------------+----------+
```

#### Insert with both a partition spec and a column list

```sql
INSERT OVERWRITE students PARTITION (student_id = 11215016) (address, name) VALUES
('Hangzhou, China', 'Kent Yao Jr.');

SELECT * FROM students WHERE student_id = 11215016;
+------------+----------------------+----------+
|        name|               address|student_id|
+------------+----------------------+----------+
|Kent Yao Jr.|       Hangzhou, China|  11215016|
+------------+----------------------+----------+
```
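
When a static partition spec is present, the partition column is left out of the column list and its value comes from the `PARTITION` clause instead. The following Python sketch models only that binding step, under the assumption that `student_id` is the partition column of `students`; the helper name `bind_with_static_partition` is hypothetical, and the overwrite of existing data is not modelled.

```python
def bind_with_static_partition(table_columns, partition_spec, column_list, row):
    """Merge static partition values with positionally bound row values."""
    # Static partition columns must not also appear in the column list.
    overlap = set(partition_spec) & set(column_list)
    if overlap:
        raise ValueError(f"static partition columns in column list: {sorted(overlap)}")
    # The list must name every remaining column exactly once.
    expected = [c for c in table_columns if c not in partition_spec]
    if sorted(column_list) != sorted(expected):
        raise ValueError("column list must name every non-partition column once")
    # The number of values must equal the size of the column list.
    if len(row) != len(column_list):
        raise ValueError("value count does not match column list size")
    # Bind listed columns by position, then fill in the static partition values.
    by_name = dict(zip(column_list, row))
    by_name.update(partition_spec)
    return tuple(by_name[c] for c in table_columns)


row = bind_with_static_partition(
    ["name", "address", "student_id"],
    {"student_id": 11215016},
    ["address", "name"],
    ("Hangzhou, China", "Kent Yao Jr."),
)
print(row)  # ('Kent Yao Jr.', 'Hangzhou, China', 11215016)
```

This mirrors the statement above: two values are supplied for `(address, name)`, and `student_id` is taken from the partition spec.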

### Related Statements

* [INSERT INTO statement](sql-ref-syntax-dml-insert-into.html)