Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] REST Based Catalog #3424

Closed
wants to merge 10 commits into from

Conversation

kbendick
Copy link
Contributor

There are many users who are coming to the data space for the first time, or otherwise aren't used to used HMS.

For many of them, implementing a REST server and then choosing how to handle things on the server side is likely their preferred method. This would also allow people to plug in their own implementation around a somewhat known interface.

I'm opening this up for comment as I've begun some work on a REST catalog API, as well as an example implementation.

I'm seeking feedback and will also have a proposal eventually after it's been discussed in the community a bit. 🙂


}

private CreateNamespaceRequest(String namespaceName, Map<String, String> properties) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Catalog API allows a namespace that is any number of levels. Should this be Namespace, List<String>, or String[]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to use Namespace to me, but in the JSON, I'd prefer it to be a List<String> or String[] so users can specify as many levels as they'd like.

Will update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will come back and update these things after we discuss the proposed specification 👍

*/
public class CreateNamespaceRequest implements Serializable {

// TODO - Use protected so users can extend this for their own impls. Or an interface.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are user implementations to be supported? I was thinking that we would treat the REST API as a spec so it wouldn't necessarily make sense to allow customization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I actually meant more what you said (about it as a spec). Bad phrasing on my part. I have been working through various versions and different code pathways, and ultimately decided to introduce this initially with CreateNamespaceRequest to open discussion.

I'll update this to be more clear that this designed to be a spec.

*/
public class CreateNamespaceResponse {
private Namespace namespace;
private Map<String, String> properties;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these the properties that the namespace has after create? For example, this may include a "created-at" property that was added by the server automatically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that was my thinking (about server side additions). I was thinking that this would be either:

  1. Properties the server added, such as created-at.
  2. The entire properties of the namespace, including the ones the user provided, to give a full picture.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I meant #2. All properties of the namespace, including the ones the user sent and the ones that were added automatically. We should document this in the class Javadoc.

}

void setNamespaceName(String name) {
this.namespaceName = name;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why include setters and a builder? Should this be immutable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I included setters and getters so that it would be a proper bean for serialization purposes.

I admittedly don't love them. Going through, I see that the Aliyun is using springboot and annotations instead directly in their code.

Open to suggestions / ideas around that. And possibly this ultimately won't wind up needing to be a bean at all (not a huge fan of beans in general though I recognize they have their time and place - though that might not be now).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, looks like this is probably needed because there is no serializer and it needs to adhere to the bean interface.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this depends on how we want to specify the protocol. I prefer documenting the JSON objects that will be sent, which probably means going through and creating serializers/deserializers for each object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I also prefer documenting the JSON objects. I had thought that the getters and setters were still needed for the serializers though.

But I'm relatively green with the Apache HTTP library that I chose for the reference calls, so maybe I can get rid of this.

I'll play around with it, but definitely agreed that I prefer to specify the protocol and to keep it language agnostic (e.g. so people can use another language on the server side in addition to the clarity that documenting the JSON objects brings over the Java classes).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admittedly, as I come from more of a Scala background (partcularly when implementing REST API clients), in some places I'm not sure what needs to be a bean and what doesn't for serialization. I've typically relied on case classes, which can pretty effortlessly be converted to and from JSON with much work, or more modern over the network serialization standards such as protobuf, thrift, or even Avro.

So some things might presently be beans that don't need to be. I'm happy to clean that up as I go along and come to be more familiar with some of these java bean related things =)

}

@Override
protected String defaultWarehouseLocation(TableIdentifier tableIdentifier) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that we want to implement BaseMetastoreCatalog, since it has conventions that we don't need, like this method. This method is a way for the catalog to determine the default location for a table. But shouldn't this be done on the server side for a REST-based catalog? That makes the most sense to me because we want to keep the client as simple as possible and do all of the customization in the server that is easier to change. Deployed client Jars are hard to update!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me. For the REST catalog, especially as a spec, the server should have a bit more control over what's done. I had a version that didn't use BaseMetastoreCatalog (and still do), but decided to forgo it initially given the deviation from the norm.

I'll work on switching it back as I'm fleshing out ideas for the spec.

.withProperties(metadata)
.build();

String path = properties.getOrDefault("create-namespace-path", "databases");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the default here need a starting /?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I don't think it does, but I will add in a check.

Even if it doesn't, it seems a bit weird to have a path that doesn't start with a /. So I'll add some normalization code (or use some from elsewhere) to allow for that.

import java.io.Closeable;

/**
* Eventual Interface for pluggable RestClients.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean to allow customizing the HTTP client implementation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. That's what I mean. People might have a preference on the HTTP client implementation. Particularly if they use HTTP2 or HTTP3 even. So long as it conforms to the spec, it should in theory be fine.

But that was my thinking.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can actually learn a bit from the HTTP client in AWS SDK v2, it supports plugging in any HTTP client implementation. We can borrow the same design pattern here.

* Exception thrown by the REST client and catalog, to wrap a checked exception
* in an unchecked exception.
*/
public class UncheckedRestException extends RuntimeException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't the other rest exception handle checked exceptions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. Good catch. I had a few different branches where I was working out some possible ideas, and somehow I managed to leave both of those in.

I'll remove this one. 👍

};
}

static String getResponseBody(CloseableHttpResponse response) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be private? Also, try not to use get in names. If you can't replace it with something more specific (like find, fetch, create, etc.) then it's likely that it is just a filler word that can be omitted.

icebergException = response.getHeader("X-Iceberg-Exception").getValue();
} catch (Exception e) {
// TODO - Better error message and handling. Will be refactoring this anyway.
LOG.error("Encountered an error when getting the X-Iceberg-Exception header", e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using a header, what about having a way to encode an exception in the JSON payload? That seems to be the approach outlined here: https://www.baeldung.com/rest-api-error-handling-best-practices

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. That's more in line with how I've previously done it as well. The Facebook example is most in line with what I've previously used, where there's a dedicted error payload in the response body and then possibly even further error codes beyond just the normal 4xx, and those error codes map to what happened.

{
    "error": {
        "message": "Missing Bearer header.",
        "type": "OAuthException",
        "code": 40101,  // We document what 401xx error codes correspond to as part of the spec
        "trace_id": "AWswcVwbcqfgrSgjG80MtqJ"
    },
    data: {
         // Successful response payload 
    }
}

I will refactor to use an approach like that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. I'm not sure that we need error codes, but we can always add them if people find them useful.

public static class NamespaceSerializer extends JsonSerializer<Namespace> {
@Override
public void serialize(Namespace namespace, JsonGenerator gen, SerializerProvider serializers) throws IOException {
gen.writeString(namespace.toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should probably be a list of strings to avoid losing the hierarchy when . is in one of the namespace identifiers.

Copy link
Contributor Author

@kbendick kbendick Nov 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah agreed. Let's sort it out in the spec for a bit and then make the necessary updates to the sereializers.

As mentioned in the community sync up, given that Trino needs some kind of JSON representation, I'm also going to meet with Jack to discuss and get his expertise on it too. 🙂


public static class NamespaceDeserializer extends JsonDeserializer<Namespace> {
@Override
public Namespace deserialize(JsonParser p, DeserializationContext ctxt) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update these to use full word names? It's generally more readable to use parser rather than p. Also, I don't think that it is any easier to type ctxt than to type context, and it is definitely less readable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% agreed. After rebasing, I encountered a lot of HiddenField checkstyle errors and probably went a little overboard on changing the names.

public static class SchemaSerializer extends JsonSerializer<Schema> {
@Override
public void serialize(Schema schema, JsonGenerator gen, SerializerProvider serializers)
throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually throw IOException? If not, can we remove it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this does throw IOException. Specifically, SchemaParser.toJson does.

I can wrap it with the UncheckedIOException or something else though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, my mistake. The serialize method itself declares the IOException. It looks like there's a delegate that can be used, so either I'll figure out the existing delegate or wrap with my own to avoid all of the checked exceptions if possible.

public SortOrder deserialize(JsonParser p, DeserializationContext context)
throws IOException {
JsonNode jsonNode = p.getCodec().readTree(p);
Schema schema = (Schema) context.getAttribute("schema");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What writes the schema into the sort order? It looks like these are going to require some context.

Copy link
Contributor Author

@kbendick kbendick Nov 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SortOrderParser.fromJson(Schema schema, JsonNode node) call that returns the final SortOrder is what writes the schema into it.

Though I do feel it could be made much easier to grok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That method constructs a SortOrder for a given schema that must be passed in. It is on the deserialization side, not the serialization side. For the write side, SortOrderParser.toJson produces JSON but doesn't write the schema into it.

Copy link
Contributor Author

@kbendick kbendick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback Ryan. I will use it to iterate on this. As mentioned, I had a few different ideas for the spec that I was working on, so I will clean up and then add in some of those as well.

Comment on lines 49 to 53
String getNamespaceName() {
return namespaceName;
}

void setNamespaceName(String name) {
this.namespaceName = name;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are beans for serialization purposes (typically we avoid the word get in methods as usually there's a more descriptive verb).

There's code gen tools, i.e. lombock, but for something that is to be used as part of an AP / spec, it might be wise to avoid that if possible. Very open to suggestions on that idea.

*/
public class CreateNamespaceRequest implements Serializable {

// TODO - Use protected so users can extend this for their own impls. Or an interface.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I actually meant more what you said (about it as a spec). Bad phrasing on my part. I have been working through various versions and different code pathways, and ultimately decided to introduce this initially with CreateNamespaceRequest to open discussion.

I'll update this to be more clear that this designed to be a spec.

}

void setNamespaceName(String name) {
this.namespaceName = name;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I included setters and getters so that it would be a proper bean for serialization purposes.

I admittedly don't love them. Going through, I see that the Aliyun is using springboot and annotations instead directly in their code.

Open to suggestions / ideas around that. And possibly this ultimately won't wind up needing to be a bean at all (not a huge fan of beans in general though I recognize they have their time and place - though that might not be now).

*/
public class CreateNamespaceResponse {
private Namespace namespace;
private Map<String, String> properties;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that was my thinking (about server side additions). I was thinking that this would be either:

  1. Properties the server added, such as created-at.
  2. The entire properties of the namespace, including the ones the user provided, to give a full picture.

}

@Override
protected String defaultWarehouseLocation(TableIdentifier tableIdentifier) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me. For the REST catalog, especially as a spec, the server should have a bit more control over what's done. I had a version that didn't use BaseMetastoreCatalog (and still do), but decided to forgo it initially given the deviation from the norm.

I'll work on switching it back as I'm fleshing out ideas for the spec.

.withProperties(metadata)
.build();

String path = properties.getOrDefault("create-namespace-path", "databases");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I don't think it does, but I will add in a check.

Even if it doesn't, it seems a bit weird to have a path that doesn't start with a /. So I'll add some normalization code (or use some from elsewhere) to allow for that.

import java.io.Closeable;

/**
* Eventual Interface for pluggable RestClients.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. That's what I mean. People might have a preference on the HTTP client implementation. Particularly if they use HTTP2 or HTTP3 even. So long as it conforms to the spec, it should in theory be fine.

But that was my thinking.

* Exception thrown by the REST client and catalog, to wrap a checked exception
* in an unchecked exception.
*/
public class UncheckedRestException extends RuntimeException {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. Good catch. I had a few different branches where I was working out some possible ideas, and somehow I managed to leave both of those in.

I'll remove this one. 👍

}
}

public static class TableMetadataSerializer extends JsonSerializer<TableMetadata> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need a deserializer as well. Currently that depends on having a metadata file. I have a few ideas around this and will push something during my next round of clean up on this. Possibly placing it in a separate PR as well.

Eventually, I plan on breaking this into smaller PRs to be more easily digested, but this is to provide a larger picture overview.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there's no need to only read from a metadata file.

public <T> T execute(
Method method, String path, Object body, Class<T> responseType,
Consumer<CloseableHttpResponse> errorHandler) {
HttpUriRequestBase request = new HttpUriRequestBase(method.name(), URI.create(String.format("%s/%s", baseUrl,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: Prefer line wrapping as high in the parse tree as possible. Here, wrapping within the String.format within URI.create within new HttpUriRequestBase makes it hard to distinguish what level path is an argument to. Instead, wrapping each arg to a new line or just adding a newline for URI.create(...) is more readable.

try {
request.setEntity(new StringEntity(mapper.writeValueAsString(body)));
} catch (JsonProcessingException e) {
throw new UncheckedRestException(e, "Failed to write request body");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to have an exception that distinguishes problems before sending the request, like this one.


try (CloseableHttpResponse response = httpClient.execute(request)) {
if (response.getCode() != HttpStatus.SC_OK) {
errorHandler.accept(response);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this return after handling the response in case it doesn't throw an exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. I decided to use the Apache HTTP library, as we're in an Apache project. But I'm more familiar with other HTTP libraries. I will return here and then consider more about libraries etc. I'm a big fan of OkHTTP, but I recognize that might not be in-line with the fact that this is an Apache repo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use whatever we want as long as the license is compatible. That said, if the library is already included with Spark and Flink, we generally prefer to use it so we don't need to shade and relocate it. That's why Apache HTTP is most often used.

HttpEntity entity = response.getEntity();
return mapper.readValue(entity.getContent(), responseType);
} catch (IOException e) {
throw new UncheckedIOException(e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What throws the IOException? Can we add context about what failed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I can add context. Currently, I believe it's either mapper.readValue or response.getEntity that throws it.

I'll double check and then add a note. I can also wrap and throw individually if multiple things are throwing it.

return mapper.readValue(entity.getContent(), responseType);
} catch (IOException e) {
throw new UncheckedIOException(e);
} catch (Exception e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something throws Exception?

private Consumer<CloseableHttpResponse> defaultErrorHandler = ErrorHandlers.databaseErrorHandler();

private Builder() {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: no blank line in empty methods.

private Configuration hadoopConf;
private FileIO fileIO;

// TODO - Refactor to use interface, which doesn't currently have POST etc embedded into it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it seems like you already know enough about the methods in HttpClient to fill out the interface.

/**
* Represents a REST request to create a namespace / database.
*/
public class CreateNamespaceRequest implements Serializable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why Serializable? Is this needed for the JSON bean serialization?

@kbendick kbendick force-pushed the rest-interface-list-namespaces branch from afc2862 to bcffe95 Compare November 15, 2021 06:09
@github-actions github-actions bot added the spark label Nov 15, 2021
* Represents a REST request to create a namespace / database.
*
* Possible example POST body (outer format object possibly only used on Response):
* { "data": { "namespace": ["prod", "accounting"], properties: {} }}
Copy link
Contributor Author

@kbendick kbendick Nov 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it's actually as follows, without using an array for Namespace.

{ "namespace": "prod.accounting", "properties": { "key1": "value1" } }

Now that I have the mock server going, I'm going to enter an OpenAPI spec and then run the tests over the spec (as I've done for a create namespace based one).

Will put the spec in this PR as a separate document.

@kbendick kbendick force-pushed the rest-interface-list-namespaces branch from 4d05922 to 89d3cec Compare November 15, 2021 09:31
@kbendick kbendick force-pushed the rest-interface-list-namespaces branch from 1e0d30f to 5b74cb5 Compare November 16, 2021 19:20
* structure, including the possibility of richer metadata on errors, such as tracing telemetry or follw
* up user instruction.
*
* {
Copy link
Contributor

@jackye1995 jackye1995 Nov 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need <pre> HTML tag for these code blocks to show properly in javadoc.


// TODO - Possibly authenticate with the server initially and then have the server return some of this information
Preconditions.checkNotNull(
properties.getOrDefault("baseUrl", null),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be possible to reuse CatalogProperties.URI for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's a good call. Will update.

this.baseUrl = properties.get("baseUrl");
this.fileIO = initializeFileIO(properties);

this.mapper = new ObjectMapper();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is recommended to use object mapper as a singleton. We can reuse JsonUtil.mapper() and add util methods around that to register modules.

And actually why not directly register the serdes in that mapper directly, instead of only registering it when initializing the REST catalog? I think it does not hurt to do that even for people not using REST catalog.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would certainly make it simpler. And I don't see the harm in registering the serdes directly, which would make a lot of things simpler. Good suggestion. 👍

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class RestCatalog extends BaseMetastoreCatalog implements Closeable, SupportsNamespaces, Configurable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's probably better to directly implement it from interface instead of extending the base implementation. All the calls are delegated to the server anyway, there is not much benefit I see in inheriting the base logic. Some things like default location are really just util methods but not formalized idea that should not be inherited for REST catalog implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I am in agreement with that, BaseMetastoreCatalog doesn't provide much value as mentioned.

I stepped away from this PR for a bit to just focus on fleshing out the spec in the other PR. I will refactor to remove BaseMetastoreCatalog from here.

.build();

// TODO - This should come from the server side.
String path = properties.getOrDefault("create-namespace-path", "namespace");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this about?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is left over from a previous idea I was toying with.

I will remove this and others like it. It would be pretty hard to deal with it from the iceberg library if users were rewriting the path, given that it contains path variables and things at times. This is just an artifact I forgot to remove. Good catch.

@nastra nastra self-requested a review November 18, 2021 16:59
nastra added a commit to nastra/iceberg that referenced this pull request Dec 8, 2021
* Use true immutable objects that are type-safe, thread-safe, null-safe
* Get builder classes for free
* Get JSON serialization/deserialization for free (this will also be helpful for the REST-based Catalog apache#3424)

This is relying on https://immutables.github.io/ (Apache License 2.0), which allows generating immutable objects and builders via annotation processing.
* Immutable objects are serialization ready (including JSON and its binary forms)
* Supports lazy, derived and optional attributes
* Immutable objects are constructed once, in a consistent state, and can be safely shared
  * Will fail if mandatory attributes are missing
  * Cannot be sneakily modified when passed to other code
* Immutable objects are naturally thread-safe and can therefore be safely shared among threads
  * No excessive copying
  * No excessive synchronization
* Object definitions are pleasant to write and read
  * No boilerplate setter and getters
  * No ugly IDE-generated hashCode, equals and toString methods that end up being stored in source control.
nastra added a commit to nastra/iceberg that referenced this pull request Dec 8, 2021
* Use true immutable objects that are type-safe, thread-safe, null-safe
* Get builder classes for free
* Get JSON serialization/deserialization for free (this will also be helpful for the REST-based Catalog apache#3424)

This is relying on https://immutables.github.io/ (Apache License 2.0), which allows generating immutable objects and builders via annotation processing.
* Immutable objects are serialization ready (including JSON and its binary forms)
* Supports lazy, derived and optional attributes
* Immutable objects are constructed once, in a consistent state, and can be safely shared
  * Will fail if mandatory attributes are missing
  * Cannot be sneakily modified when passed to other code
* Immutable objects are naturally thread-safe and can therefore be safely shared among threads
  * No excessive copying
  * No excessive synchronization
* Object definitions are pleasant to write and read
  * No boilerplate setter and getters
  * No ugly IDE-generated hashCode, equals and toString methods that end up being stored in source control.
nastra added a commit to nastra/iceberg that referenced this pull request Dec 8, 2021
* Use true immutable objects that are type-safe, thread-safe, null-safe
* Get builder classes for free
* Get JSON serialization/deserialization for free (this will also be helpful for the REST-based Catalog apache#3424)

This is relying on https://immutables.github.io/ (Apache License 2.0), which allows generating immutable objects and builders via annotation processing.
* Immutable objects are serialization ready (including JSON and its binary forms)
* Supports lazy, derived and optional attributes
* Immutable objects are constructed once, in a consistent state, and can be safely shared
  * Will fail if mandatory attributes are missing
  * Cannot be sneakily modified when passed to other code
* Immutable objects are naturally thread-safe and can therefore be safely shared among threads
  * No excessive copying
  * No excessive synchronization
* Object definitions are pleasant to write and read
  * No boilerplate setter and getters
  * No ugly IDE-generated hashCode, equals and toString methods that end up being stored in source control.
nastra added a commit to nastra/iceberg that referenced this pull request Dec 9, 2021
* Use true immutable objects that are type-safe, thread-safe, null-safe
* Get builder classes for free
* Get JSON serialization/deserialization for free (this will also be helpful for the REST-based Catalog apache#3424)

This is relying on https://immutables.github.io/ (Apache License 2.0), which allows generating immutable objects and builders via annotation processing.
* Immutable objects are serialization ready (including JSON and its binary forms)
* Supports lazy, derived and optional attributes
* Immutable objects are constructed once, in a consistent state, and can be safely shared
  * Will fail if mandatory attributes are missing
  * Cannot be sneakily modified when passed to other code
* Immutable objects are naturally thread-safe and can therefore be safely shared among threads
  * No excessive copying
  * No excessive synchronization
* Object definitions are pleasant to write and read
  * No boilerplate setter and getters
  * No ugly IDE-generated hashCode, equals and toString methods that end up being stored in source control.
@kbendick
Copy link
Contributor Author

I'm closing this as:

  1. Its rather old
  2. We had multiple rounds of review to get some of the spec in.
  3. This no longer reflects really what we agreed on.

I'll be opening a new PR (and ideally splitting it off into sub-PRs for easier digestion) ASAP. Was out sick for a week or so after the initial spec was reviewed.

@kbendick kbendick closed this Jan 18, 2022
@kbendick kbendick deleted the rest-interface-list-namespaces branch January 18, 2022 21:42
nastra added a commit to nastra/iceberg that referenced this pull request Feb 2, 2022
* Use true immutable objects that are type-safe, thread-safe, null-safe
* Get builder classes for free
* Get JSON serialization/deserialization for free (this will also be helpful for the REST-based Catalog apache#3424)

This is relying on https://immutables.github.io/ (Apache License 2.0), which allows generating immutable objects and builders via annotation processing.
* Immutable objects are serialization ready (including JSON and its binary forms)
* Supports lazy, derived and optional attributes
* Immutable objects are constructed once, in a consistent state, and can be safely shared
  * Will fail if mandatory attributes are missing
  * Cannot be sneakily modified when passed to other code
* Immutable objects are naturally thread-safe and can therefore be safely shared among threads
  * No excessive copying
  * No excessive synchronization
* Object definitions are pleasant to write and read
  * No boilerplate setter and getters
  * No ugly IDE-generated hashCode, equals and toString methods that end up being stored in source control.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants