-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Kernel] [CC Refactor #1] Add ConfigurationProvider, TableIdentifier, TableBuilder APIs #3781
[Kernel] [CC Refactor #1] Add ConfigurationProvider, TableIdentifier, TableBuilder APIs #3781
Conversation
@@ -181,14 +197,6 @@ public CloseableIterator<ColumnarBatch> getChanges( | |||
}); | |||
} | |||
|
|||
protected Path getDataPath() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move protected methods to be below public methods
|
||
package io.delta.kernel; | ||
|
||
/** Identifier for a table. e.g. <catalog> / <schema> / <tableName> */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@since 3.3.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Evolving
|
||
package io.delta.kernel; | ||
|
||
/** Identifier for a table. e.g. <catalog> / <schema> / <tableName> */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[error] /home/runner/work/delta/delta/kernel/kernel-api/src/main/java/io/delta/kernel/TableIdentifier.java:19:1: error: unknown tag: catalog
[error] /** Identifier for a table. e.g. <catalog> / <schema> / <tableName> */
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some qns (not blocking)
/** Returns the name of the table. */ | ||
public String getName() { | ||
return name; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toString (for printing into logs). equals, hashCode (may add if needed later, but I do see this is useful in near future).
* <p>These values are not Delta table properties. Delta table properties are stored in the Delta | ||
* log metadata and apply globally to all readers and writers interacting with the table. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you give an example of where this gets used or what sort of configuration can it contain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First example usage
The implementation of
CommitCoordinatorClient Engine::getCommitCoordinatorClient(
String commitCoordinatorName,
ConfigurationProvider sessionConfig,
Map<String, String> commitCoordinatorConf);
will use it to lookup which builder to use for the given commitCoordinatorName
.
It will lookup a value like io.delta.kernel.commitCoordinatorBuilder.unity-catalog.impl
Second example usage
Then, the actual builder referenced above will lookup various keys / configurations needed to instantiate the client.
* @param key the configuration key | ||
* @return {@code true} if the key exists, else {@code false} | ||
*/ | ||
boolean contains(String key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
contains
is tricky to define for conf. There is always some default conf and what does it mean when call contains
.
Is it better to just have Optional<String> get(String key)
and avoid the multiple APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. Thanks!
* configuration is provided or expected. | ||
*/ | ||
public class EmptyConfigurationProvider implements ConfigurationProvider { | ||
@Override |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can define just one static instance and use it everywhere rather than creating a new instance everytime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that could be even in ConfigurationProvider
(avoids a new file).
import java.util.Optional; | ||
|
||
public class TableBuilderImpl implements TableBuilder { | ||
private final Engine engine; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we avoid this kind of pattern where Engine
is stored in class member variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. We will have to think about how to work around this, then
// Table.java
static TableBuilder builder() { return new TableBuilderImpl(); }
// TableBuilder.java
public Table build(Engine engine, String path)
requireNonNull(engine, "Expected non-null value for 'engine'"); | ||
requireNonNull(path, "Expected non-null value for 'path'"); | ||
|
||
this.engine = engine; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this.engine = engine; | |
this.engine = requireNonNull(engine, "blah blah"); |
* @since 3.3.0 | ||
*/ | ||
@Evolving | ||
public interface ConfigurationProvider { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the lifecycle of this object? Can Kernel maintain a reference to it? expected to release it after some event? can Engine update it while Kernel is using it (i.e return different values at different times for the same key)?
/** The name of the table. */ | ||
private final String name; | ||
|
||
/** The namespace of the table. e.g. <catalog> / <schema> */ | ||
private final String[] namespace; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: The order of constructor arguments & getters is namespace
→ name
. I would recommend using the same order in fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Will do! I'm still working on some more PRs to fully flesh out this design. Will get to this later.
this.namespace = namespace; | ||
this.name = name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
requireNonNull
seems missing. Also, it would be nice to ensure non-empty in my opinion.
requireNonNull(engine, "Expected non-null value for 'engine'"); | ||
requireNonNull(path, "Expected non-null value for 'path'"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing message pattern of requireNonNull
is {variable} is null
. I would recommend using the same pattern for consistency.
private Optional<ConfigurationProvider> configProviderOpt = Optional.empty(); | ||
private Optional<TableIdentifier> tableIdOpt = Optional.empty(); | ||
|
||
/** Note: This is not for a public API. It's for internal testing use only. */ | ||
private Optional<Clock> clockOpt = Optional.empty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need to add opt
suffix in my opinion. It's obvious from the type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hear that. This is certainly one of my personal preferences. I like variables being fully self-describing. The value isn't a Clock
, but distinctly a Clock Optional
. With this current naming convention, if we did a .map
, I could refer to the inner value as clock
, but if I had the outer variable also named clock
, this would get confusing.
* @since 3.3.0 | ||
*/ | ||
@Evolving | ||
public interface ConfigurationProvider { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UC Commit Coordinator Client needs to be able to iterate through all configs
NOTE: I won't be merging any of these PRs until the entire e2e implementation is complete and all PRs are approved
This is part of some stacked PRs:
Which Delta project/connector is this regarding?
Description
New Interfaces/Classes
ConfigurationProvider
interface, that will let kernel get per-session configurations in the futureTableIdentifier
class, that kernel will pass on to Commit Coordinator servicesTableBuilder
interface, that that constructs kernelTable
instances, with optionalconfig
andtableId
paramsCleanup
TableImpl.forPath(engine, tablePath, () => System.currentTimeMillis)
--> the System.currentTimeMilis is the default clock, anyways. No need to point to an internal API for that, too.How was this patch tested?
Existing UTs.
Does this PR introduce any user-facing changes?
Yes. New
ConfigurationProvider
interface, newTableIdentifier
class, newTableBuilder
interface