[SPARK-48824][SQL] Add Identity Column SQL syntax #47614

Closed
24 changes: 24 additions & 0 deletions common/utils/src/main/resources/error/error-conditions.json
Original file line number Diff line number Diff line change
@@ -1583,6 +1583,30 @@
],
"sqlState" : "42601"
},
"IDENTITY_COLUMNS_DUPLICATED_SEQUENCE_GENERATOR_OPTION" : {
"message" : [
"Duplicated IDENTITY column sequence generator option: <sequenceGeneratorOption>."
],
"sqlState" : "42601"
},
"IDENTITY_COLUMNS_ILLEGAL_STEP" : {
"message" : [
"IDENTITY column step cannot be 0."
],
"sqlState" : "42611"
},
"IDENTITY_COLUMNS_UNSUPPORTED_DATA_TYPE" : {
"message" : [
"DataType <dataType> is not supported for IDENTITY columns."
],
"sqlState" : "428H2"
},
Review comment (Contributor): +1 Always nice to see a thoughtful pick for SQLSTATE :-)

"IDENTITY_COLUMN_WITH_DEFAULT_VALUE" : {
"message" : [
"A column cannot have both a default value and an identity column specification but column <colName> has default value: (<defaultValue>) and identity column specification: (<identityColumnSpec>)."
],
"sqlState" : "42623"
},
"ILLEGAL_DAY_OF_WEEK" : {
"message" : [
"Illegal input for day of week: <string>."
2 changes: 2 additions & 0 deletions docs/sql-ref-ansi-compliance.md
@@ -536,12 +536,14 @@ Below is a list of all the keywords in Spark SQL.
|HOUR|non-reserved|non-reserved|non-reserved|
|HOURS|non-reserved|non-reserved|non-reserved|
|IDENTIFIER|non-reserved|non-reserved|non-reserved|
|IDENTITY|non-reserved|non-reserved|non-reserved|
|IF|non-reserved|non-reserved|not a keyword|
|IGNORE|non-reserved|non-reserved|non-reserved|
|IMMEDIATE|non-reserved|non-reserved|non-reserved|
|IMPORT|non-reserved|non-reserved|non-reserved|
|IN|reserved|non-reserved|reserved|
|INCLUDE|non-reserved|non-reserved|non-reserved|
|INCREMENT|non-reserved|non-reserved|non-reserved|
|INDEX|non-reserved|non-reserved|non-reserved|
|INDEXES|non-reserved|non-reserved|non-reserved|
|INNER|reserved|strict-non-reserved|reserved|
@@ -256,12 +256,14 @@ BINARY_HEX: 'X';
HOUR: 'HOUR';
HOURS: 'HOURS';
IDENTIFIER_KW: 'IDENTIFIER';
IDENTITY: 'IDENTITY';
IF: 'IF';
IGNORE: 'IGNORE';
IMMEDIATE: 'IMMEDIATE';
IMPORT: 'IMPORT';
IN: 'IN';
INCLUDE: 'INCLUDE';
INCREMENT: 'INCREMENT';
INDEX: 'INDEX';
INDEXES: 'INDEXES';
INNER: 'INNER';
@@ -1288,7 +1288,22 @@ colDefinitionOption
;

generationExpression
- : GENERATED ALWAYS AS LEFT_PAREN expression RIGHT_PAREN
+ : GENERATED ALWAYS AS LEFT_PAREN expression RIGHT_PAREN #generatedColumn
| GENERATED (ALWAYS | BY DEFAULT) AS IDENTITY identityColSpec? #identityColumn
;

identityColSpec
: LEFT_PAREN sequenceGeneratorOption* RIGHT_PAREN
;

sequenceGeneratorOption
: START WITH start=sequenceGeneratorStartOrStep
| INCREMENT BY step=sequenceGeneratorStartOrStep
;

sequenceGeneratorStartOrStep
: MINUS? INTEGER_VALUE
| MINUS? BIGINT_LITERAL
;

complexColTypeList
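To illustrate how the `sequenceGeneratorOption*` rule pairs with the `IDENTITY_COLUMNS_DUPLICATED_SEQUENCE_GENERATOR_OPTION` and `IDENTITY_COLUMNS_ILLEGAL_STEP` error conditions, here is a minimal Java sketch of the validation a parser visitor might perform. It is not the PR's implementation: `SeqOption` and `IdentityOptionsCheck` are hypothetical names, `IllegalArgumentException` stands in for Spark's `ParseException`, and the default of 1 for an omitted start or step is an assumption that the diff shown here does not confirm.

```java
import java.util.List;

// Hypothetical stand-in for one parsed option; kind is "START WITH" or "INCREMENT BY".
final class SeqOption {
  final String kind;
  final long value;

  SeqOption(String kind, long value) {
    this.kind = kind;
    this.value = value;
  }
}

final class IdentityOptionsCheck {
  // Returns {start, step}. Options may appear in any order, each at most once.
  static long[] resolve(List<SeqOption> options) {
    Long start = null;
    Long step = null;
    for (SeqOption option : options) {
      switch (option.kind) {
        case "START WITH":
          if (start != null) {
            throw new IllegalArgumentException(
                "IDENTITY_COLUMNS_DUPLICATED_SEQUENCE_GENERATOR_OPTION: START");
          }
          start = option.value;
          break;
        case "INCREMENT BY":
          if (step != null) {
            throw new IllegalArgumentException(
                "IDENTITY_COLUMNS_DUPLICATED_SEQUENCE_GENERATOR_OPTION: STEP");
          }
          if (option.value == 0) {
            throw new IllegalArgumentException("IDENTITY_COLUMNS_ILLEGAL_STEP");
          }
          step = option.value;
          break;
        default:
          throw new IllegalArgumentException("Unknown option: " + option.kind);
      }
    }
    // Assumed defaults (not confirmed by the diff): start = 1, step = 1.
    long resolvedStart = (start == null) ? 1L : start;
    long resolvedStep = (step == null) ? 1L : step;
    return new long[] {resolvedStart, resolvedStep};
  }
}
```

Under these assumptions, `(INCREMENT BY -2 START WITH 10)` resolves to start 10 and step -2, while `(INCREMENT BY 0)` and `(START WITH 1 START WITH 2)` are rejected.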
@@ -1578,11 +1593,13 @@ ansiNonReserved
| HOUR
| HOURS
| IDENTIFIER_KW
| IDENTITY
| IF
| IGNORE
| IMMEDIATE
| IMPORT
| INCLUDE
| INCREMENT
| INDEX
| INDEXES
| INPATH
@@ -1929,12 +1946,14 @@ nonReserved
| HOUR
| HOURS
| IDENTIFIER_KW
| IDENTITY
| IF
| IGNORE
| IMMEDIATE
| IMPORT
| IN
| INCLUDE
| INCREMENT
| INDEX
| INDEXES
| INPATH
@@ -0,0 +1,88 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark.sql.connector.catalog;
import org.apache.spark.annotation.Evolving;

import java.util.Objects;

/**
* Identity column specification.
*/
@Evolving
public class IdentityColumnSpec {
Review comment (Contributor): Let's add @Evolving.

  private final long start;
  private final long step;
  private final boolean allowExplicitInsert;

  /**
   * Creates an identity column specification.
   * @param start the start value to generate the identity values
   * @param step the step value to generate the identity values
   * @param allowExplicitInsert whether the identity column allows explicit insertion of values
   */
  public IdentityColumnSpec(long start, long step, boolean allowExplicitInsert) {
    this.start = start;
    this.step = step;
    this.allowExplicitInsert = allowExplicitInsert;
  }

  /**
   * @return the start value to generate the identity values
   */
  public long getStart() {
    return start;
  }

  /**
   * @return the step value to generate the identity values
   */
  public long getStep() {
    return step;
  }

  /**
   * @return whether the identity column allows explicit insertion of values
   */
  public boolean isAllowExplicitInsert() {
    return allowExplicitInsert;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    IdentityColumnSpec that = (IdentityColumnSpec) o;
    return start == that.start &&
        step == that.step &&
        allowExplicitInsert == that.allowExplicitInsert;
  }

  @Override
  public int hashCode() {
    return Objects.hash(start, step, allowExplicitInsert);
  }

  @Override
  public String toString() {
    return "IdentityColumnSpec{" +
        "start=" + start +
        ", step=" + step +
        ", allowExplicitInsert=" + allowExplicitInsert +
        "}";
  }
}
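As a rough illustration of what `start` and `step` describe, the sketch below computes the k-th value of an idealized, gap-free sequence as `start + k * step`. This is an assumption about the intended semantics: the diff only states that the two fields are used "to generate the identity values", and a real catalog may skip values or assign them out of order. `IdentitySequenceSketch` and `valueAt` are hypothetical names, not part of the PR.

```java
final class IdentitySequenceSketch {
  // k-th value (k = 0, 1, 2, ...) of an idealized sequence with no gaps.
  // The exact-arithmetic helpers throw ArithmeticException on long overflow
  // instead of silently wrapping around.
  static long valueAt(long start, long step, long k) {
    return Math.addExact(start, Math.multiplyExact(step, k));
  }
}
```

For example, with `start = 100` and `step = -5`, the values at `k = 0, 1, 2` are 100, 95 and 90.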
@@ -556,6 +556,25 @@ private[sql] object QueryParsingErrors extends DataTypeErrorsBase {
ctx)
}

def identityColumnUnsupportedDataType(
ctx: IdentityColumnContext,
dataType: String): Throwable = {
new ParseException("IDENTITY_COLUMNS_UNSUPPORTED_DATA_TYPE", Map("dataType" -> dataType), ctx)
}

def identityColumnIllegalStep(ctx: IdentityColSpecContext): Throwable = {
new ParseException("IDENTITY_COLUMNS_ILLEGAL_STEP", Map.empty, ctx)
}

def identityColumnDuplicatedSequenceGeneratorOption(
ctx: IdentityColSpecContext,
sequenceGeneratorOption: String): Throwable = {
new ParseException(
"IDENTITY_COLUMNS_DUPLICATED_SEQUENCE_GENERATOR_OPTION",
Map("sequenceGeneratorOption" -> sequenceGeneratorOption),
ctx)
}

def createViewWithBothIfNotExistsAndReplaceError(ctx: CreateViewContext): Throwable = {
new ParseException(errorClass = "_LEGACY_ERROR_TEMP_0052", ctx)
}
@@ -53,7 +53,7 @@ static Column create(
boolean nullable,
String comment,
String metadataInJSON) {
- return new ColumnImpl(name, dataType, nullable, comment, null, null, metadataInJSON);
+ return new ColumnImpl(name, dataType, nullable, comment, null, null, null, metadataInJSON);
}

static Column create(
@@ -63,7 +63,8 @@ static Column create(
String comment,
ColumnDefaultValue defaultValue,
String metadataInJSON) {
- return new ColumnImpl(name, dataType, nullable, comment, defaultValue, null, metadataInJSON);
+ return new ColumnImpl(name, dataType, nullable, comment, defaultValue,
+ null, null, metadataInJSON);
}

static Column create(
@@ -74,7 +75,18 @@ static Column create(
String generationExpression,
String metadataInJSON) {
return new ColumnImpl(name, dataType, nullable, comment, null,
- generationExpression, metadataInJSON);
+ generationExpression, null, metadataInJSON);
}

static Column create(
String name,
DataType dataType,
boolean nullable,
String comment,
IdentityColumnSpec identityColumnSpec,
String metadataInJSON) {
return new ColumnImpl(name, dataType, nullable, comment, null,
null, identityColumnSpec, metadataInJSON);
}

/**
@@ -113,6 +125,12 @@ static Column create(
@Nullable
String generationExpression();

/**
* Returns the identity column specification of this table column. Null means no identity column.
*/
@Nullable
IdentityColumnSpec identityColumnSpec();
Review comment (Contributor): To double check, only one of defaultValue, generationExpression, and identityColumnSpec can be non-null, right?

Reply (zhipengmao-db, Contributor Author, Sep 13, 2024): Great catch! Yes, only one of defaultValue, generationExpression, and identityColumnSpec can be non-null. We blocked a generation expression from being specified together with an identity column, but we did not block a defaultValue from being specified with an identity column. Will provide a fix in this PR.
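The invariant discussed in this thread (at most one of `defaultValue`, `generationExpression`, and `identityColumnSpec` is non-null) can be sketched as a small standalone check. This is only an illustration, not the fix that landed in the PR: `ColumnAttributeCheck` and its methods are hypothetical, the attributes are typed loosely as `Object` and `String` to keep the block self-contained, and `IllegalArgumentException` stands in for whatever error Spark raises (the diff defines `IDENTITY_COLUMN_WITH_DEFAULT_VALUE` for the default-plus-identity case).

```java
final class ColumnAttributeCheck {
  // Counts how many of the given attribute values are set.
  static int countNonNull(Object... values) {
    int count = 0;
    for (Object value : values) {
      if (value != null) {
        count++;
      }
    }
    return count;
  }

  // Rejects a column that sets more than one of the three mutually exclusive attributes.
  static void validate(
      String colName,
      Object defaultValue,
      String generationExpression,
      Object identityColumnSpec) {
    if (countNonNull(defaultValue, generationExpression, identityColumnSpec) > 1) {
      throw new IllegalArgumentException(
          "Column " + colName + " may define only one of: default value, "
              + "generation expression, identity column specification");
    }
  }
}
```

A column with only an identity specification passes, while a column that combines a default value with an identity specification is rejected.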


/**
* Returns the column metadata in JSON format.
*/
@@ -59,5 +59,23 @@ public enum TableCatalogCapability {
* {@link TableCatalog#createTable}.
* See {@link Column#defaultValue()}.
*/
- SUPPORT_COLUMN_DEFAULT_VALUE
+ SUPPORT_COLUMN_DEFAULT_VALUE,

/**
* Signals that the TableCatalog supports defining identity columns upon table creation in SQL.
* <p>
* Without this capability, any create/replace table statements with an identity column defined
* in the table schema will throw an exception during analysis.
* <p>
* An identity column is defined with syntax:
* {@code colName colType GENERATED ALWAYS AS IDENTITY(identityColumnSpec)}
* or
* {@code colName colType GENERATED BY DEFAULT AS IDENTITY(identityColumnSpec)}
* identityColumnSpec is defined with syntax: {@code [START WITH start | INCREMENT BY step]*}
* <p>
* IdentityColumnSpec is included in the column definition for APIs like
* {@link TableCatalog#createTable}.
* See {@link Column#identityColumnSpec()}.
*/
SUPPORTS_CREATE_TABLE_WITH_IDENTITY_COLUMNS
}
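The Javadoc above says that create/replace table statements with an identity column fail during analysis when the catalog lacks this capability. A minimal sketch of such a gate follows, under stated assumptions: `CatalogCapabilitySketch` is a local stand-in enum that mirrors two of the constant names from the diff, `IdentityCapabilityCheck` is hypothetical, and `UnsupportedOperationException` stands in for Spark's analysis error. The actual check in Spark is not shown in this part of the diff.

```java
import java.util.Set;

// Local stand-in mirroring two constants of TableCatalogCapability from the diff.
enum CatalogCapabilitySketch {
  SUPPORT_COLUMN_DEFAULT_VALUE,
  SUPPORTS_CREATE_TABLE_WITH_IDENTITY_COLUMNS
}

final class IdentityCapabilityCheck {
  // Fails when the schema has an identity column but the catalog does not advertise support.
  static void checkCreateTable(
      Set<CatalogCapabilitySketch> capabilities,
      boolean schemaHasIdentityColumn) {
    boolean supported = capabilities.contains(
        CatalogCapabilitySketch.SUPPORTS_CREATE_TABLE_WITH_IDENTITY_COLUMNS);
    if (schemaHasIdentityColumn && !supported) {
      throw new UnsupportedOperationException(
          "Catalog does not support creating tables with identity columns");
    }
  }
}
```

A catalog that advertises only `SUPPORT_COLUMN_DEFAULT_VALUE` would reject a schema containing an identity column, but would still accept schemas without one.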