[Python Schema] Python schema support custom Avro configurations for Enum type #12642

Merged

@gaoran10 gaoran10 commented Nov 5, 2021

Motivation

Currently, the Python client does not support setting the required, default, and required_default configurations for Enum-typed fields in a Record.

Modifications

Rename the _Enum class to CustomEnum and expose it to users. The _Enum class was internal only; with the new CustomEnum class, users can set the Avro definition configurations required, default, and required_default.

How to use

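from enum import Enum
from pulsar.schema import Record, Integer, CustomEnum

class Color(Enum):
    red = 1
    green = 2
    blue = 3

class NestedObj(Record):
    a = Integer()
    # Optional enum field whose default is written into the Avro schema
    color = CustomEnum(Color, required_default=True, default=Color.blue)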

The resulting schema definition looks like this:

{
	'type': 'record', 
	'name': 'NestedObj', 
	'fields': [
		{'name': 'a', 'type': ['null', 'int']},
		{'name': 'color', 'default': 'blue', 'type': ['null', {'type': 'enum', 'name': 'Color', 'symbols': ['red', 'green', 'blue']}]}
	]
}
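
As a rough illustration of how the three options relate to the `color` entry above, here is a standard-library-only sketch. The helper `enum_field_schema` is hypothetical and is not the Pulsar client's code; it assumes that a non-required field becomes a union with `'null'`, and that the `default` key is emitted only when `required_default` is set, which is what the example in this PR suggests.

```python
from enum import Enum


class Color(Enum):
    red = 1
    green = 2
    blue = 3


def enum_field_schema(name, enum_cls, required=False, default=None,
                      required_default=False):
    """Hypothetical helper: build an Avro field definition for an Enum field."""
    enum_type = {
        'type': 'enum',
        'name': enum_cls.__name__,
        'symbols': [member.name for member in enum_cls],
    }
    # Assumption: optional fields are expressed as a union with 'null'.
    field_type = enum_type if required else ['null', enum_type]
    field = {'name': name, 'type': field_type}
    if required_default:
        # Assumption: the default is written as the symbol name (None if unset).
        field['default'] = default.name if default is not None else None
    return field


color_field = enum_field_schema('color', Color, required_default=True,
                                default=Color.blue)
print(color_field)
```

With the arguments from the usage example, the sketch produces the same `color` field as the schema definition shown above; calling it without `required_default` omits the `default` key.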

The existing way of declaring Enum fields is still supported.

This feature is compatible with the Java client.
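
Avro schemas are exchanged as JSON, while the definition above is printed as a Python dict. The sketch below is not part of the PR; it uses only the standard library to serialize that dict to JSON, the form that could be compared against a schema produced by another client, and checks that the declared default is one of the enum's symbols.

```python
import json

# Schema definition copied from the PR description.
schema = {
    'type': 'record',
    'name': 'NestedObj',
    'fields': [
        {'name': 'a', 'type': ['null', 'int']},
        {'name': 'color', 'default': 'blue',
         'type': ['null', {'type': 'enum', 'name': 'Color',
                           'symbols': ['red', 'green', 'blue']}]},
    ],
}

# Serialize to JSON text.
schema_json = json.dumps(schema, indent=2)
print(schema_json)

# Parse it back and check that the default names a declared symbol.
parsed = json.loads(schema_json)
color = next(f for f in parsed['fields'] if f['name'] == 'color')
enum_def = next(t for t in color['type'] if isinstance(t, dict))
default_is_valid = color['default'] in enum_def['symbols']
```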

Verifying this change

Existing tests were modified to verify this new feature.

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API: (no)
  • The schema: (no)
  • The default values of configurations: (no)
  • The wire protocol: (no)
  • The rest endpoints: (no)
  • The admin cli options: (no)
  • Anything that affects deployment: (no)

Documentation

Check the box below and label this PR (if you have committer privilege).

Need to update docs?

  • doc-required

    (If you need help on updating docs, create a doc issue)

  • no-need-doc

    (Please explain why)

  • doc

    (If this PR contains doc changes)

github-actions bot commented Nov 5, 2021

@gaoran10: Thanks for your contribution. For this PR, do we need to update docs?
(The PR template contains info about doc, which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)

@codelipenghui codelipenghui added this to the 2.10.0 milestone Nov 5, 2021
@codelipenghui codelipenghui merged commit e7389ed into apache:master Nov 7, 2021
codelipenghui pushed a commit that referenced this pull request Nov 7, 2021
…Enum type (#12642)

(cherry picked from commit e7389ed)
@codelipenghui codelipenghui added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Nov 7, 2021
@gaoran10 gaoran10 deleted the gaoran/python-schema-support-custom-enum branch November 7, 2021 14:22
zeo1995 pushed a commit to zeo1995/pulsar that referenced this pull request Nov 7, 2021
* up/master: (55 commits)
  [broker] remove useless method "PersistentTopic#getPersistentTopic" (apache#12655)
  [Python Schema] Python schema support custom Avro configurations for Enum type (apache#12642)
  Allow to configure different implementations for Pulsar functions state store (apache#12646)
  Remove replicator global test from the quarantine group (apache#12648)
  [Java Client] Remove invalid call to Thread.currentThread().interrupt(); (apache#12652)
  k8s runtime: force deletion to avoid hung function worker during connector restart (apache#12504)
  [Broker] Optimize exception information for schemas (apache#12647)
  Close Zk database on unit tests (apache#12649)
  Fix call sync method in an async callback when enabling geo replicator. (apache#12590)
  [pulsar-broker] Add git branch information for PulsarVersion (apache#12541)
  PulsarAdmin: Fix last exit code storage (apache#12581)
  Add @test annotation to test methods (apache#12640)
  Upgrade debezium to 1.7.1 (apache#12644)
  [ML] Avoid passing OpAddEntry across a thread boundary in asyncAddEntry (apache#12606)
  [Functions] Prevent NPE while stopping a non started Pulsar LogAppender (apache#12643)
  Update io-debezium-source.md (apache#12638)
  Add missing cmds on pulsar-admin document page (apache#12634)
  Clean up the metadata of the non-persistent partitioned topics. (apache#12550)
  modify check waitingForPingResponse with volatile (apache#12615)
  [pulsar-admin] Check backlog quota policy for namespace (apache#12512)
  ...
eolivelli pushed a commit to eolivelli/pulsar that referenced this pull request Nov 29, 2021
…Enum type (apache#12642)

codelipenghui pushed a commit that referenced this pull request Dec 20, 2021
…Enum type (#12642)

(cherry picked from commit e7389ed)
@codelipenghui codelipenghui added the cherry-picked/branch-2.9 Archived: 2.9 is end of life label Dec 20, 2021
Labels
cherry-picked/branch-2.8 Archived: 2.8 is end of life cherry-picked/branch-2.9 Archived: 2.9 is end of life doc-not-needed Your PR changes do not impact docs release/2.8.2 release/2.9.2