Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Conceptually clarify schema vs query-language #924

Merged
merged 2 commits into from
Nov 1, 2022

Conversation

AndrewSisley
Copy link
Contributor

Relevant issue(s)

Resolves #922

Description

Trades performance on a very uncritical code path for conceptual separation. The code is still somewhat coupled together, however it is IMO much clearer than before.

@AndrewSisley AndrewSisley added area/schema Related to the schema system area/parser Related to the parser components refactor This issue specific to or requires *notable* refactoring of existing codebases and components action/no-benchmark Skips the action that runs the benchmark. labels Oct 28, 2022
@AndrewSisley AndrewSisley added this to the DefraDB v0.4 milestone Oct 28, 2022
@AndrewSisley AndrewSisley requested a review from a team October 28, 2022 19:33
@AndrewSisley AndrewSisley self-assigned this Oct 28, 2022
@AndrewSisley AndrewSisley force-pushed the sisley/refactor/I922-schema-rework branch from c101b6a to bd6594a Compare October 28, 2022 19:34
) ([]client.CollectionDescription, []core.SchemaDefinition, error) {
// We should not be using this package at all here, and should not be doing most of what this package does
// Rework: https://github.com/sourcenetwork/defradb/issues/923
schemaManager, err := schema.NewSchemaManager()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new schema manager is required, as each instance caches the types and reusing the gql-query-lang one would cause collisions

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: I look forward do see what you're going to do here :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be quite simple I think, and we'll be able to go straight from ast to collectionDescription without worrying about gql-types or some virtual fields (IIRC it has to skip over stuff like _group and aggregates)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I don't think we want to do a new schema manager for each createDescriptions exec. You should be able to refer to previous types when registering new types. That wouldn't be possible if we have a new manager on each exec.

  2. I don't necessarily see the benefit of going from AST to Description without a gql type intermediary, but thats a problem that can be kicked down the road.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to refer to previous types when registering new types. That wouldn't be possible if we have a new manager on each exec

Hmm, I forgot about that, and we have no tests for that either (and so it is quite possibly already broken, and I remember it having been broken before at one point - we need to cover this). For a test gap like this I'd be extremely tempted to say it is broken already, and we need to sort it out in another ticket anyway due to the scope-creep/likelyhood-of-finding-many-more-bugs.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Oct 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started working on this, and what you are saying is currently syntactically impossible with our current system. A schema cannot reference an existing type, as that type also needs to be amended to reference the new type (in the schema definition):

var bookAuthorGQLSchema = (`
	type book {
		name: String
		author: author
	}

	type author {
		name: String
		published: [book]
	}
`)

See how both definitions contain a reference to each other - you cannot add author, and then add book (without update, which we dont support atm).

Copy link
Member

@jsimnz jsimnz Nov 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realized this after the fact, indeed very true.

I still would've preferred a single instance of the schema, I don't really see the value in creating one for each request. There's no way to reference the collection GQL types in a consistent way.

What benefit does this approach have in your mind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What benefit does this approach have in your mind?

Absence of state, simplicity, ease of ripping out, minimized risk of state-polution, and the fact that persisting it provides no assertable benefits atm

@codecov
Copy link

codecov bot commented Oct 28, 2022

Codecov Report

Merging #924 (3cfc861) into develop (b847465) will decrease coverage by 0.02%.
The diff coverage is 60.52%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #924      +/-   ##
===========================================
- Coverage    60.08%   60.05%   -0.03%     
===========================================
  Files          159      159              
  Lines        17523    17532       +9     
===========================================
+ Hits         10528    10529       +1     
- Misses        6061     6066       +5     
- Partials       934      937       +3     
Impacted Files Coverage Δ
db/schema.go 44.26% <57.14%> (+7.89%) ⬆️
query/graphql/parser.go 88.67% <100.00%> (+3.95%) ⬆️

Descriptions: colDesc,
}, nil
func (p *parser) AddSchema(ctx context.Context, schema string) error {
_, _, err := p.schemaManager.Generator.FromSDL(ctx, schema)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Is this simply cashing the schema for the GQL package?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generating (the query language types - used for validation and introspection) and caching them - this is done everytime the database is initialized (e.g. on restart), as well as when a new schema is provided

) ([]client.CollectionDescription, []core.SchemaDefinition, error) {
// We should not be using this package at all here, and should not be doing most of what this package does
// Rework: https://github.com/sourcenetwork/defradb/issues/923
schemaManager, err := schema.NewSchemaManager()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: I look forward do see what you're going to do here :)

Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jsimnz
Copy link
Member

jsimnz commented Oct 30, 2022

Have some (minor) thoughts, but at all event all day. Should be able to do a review tonight

jsimnz
jsimnz previously requested changes Oct 31, 2022
Copy link
Member

@jsimnz jsimnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, small change was suggested on the SchemaManager stuff

) ([]client.CollectionDescription, []core.SchemaDefinition, error) {
// We should not be using this package at all here, and should not be doing most of what this package does
// Rework: https://github.com/sourcenetwork/defradb/issues/923
schemaManager, err := schema.NewSchemaManager()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I don't think we want to do a new schema manager for each createDescriptions exec. You should be able to refer to previous types when registering new types. That wouldn't be possible if we have a new manager on each exec.

  2. I don't necessarily see the benefit of going from AST to Description without a gql type intermediary, but thats a problem that can be kicked down the road.

return nil, nil, err
}

types, astdoc, err := schemaManager.Generator.FromSDL(ctx, schemaString)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion(non-blocking): Do you want to expose the BuildTypesFromAST func instead of executing the entire FromSDL generation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, I do not want to make the existing generator code more complex by adding a new entry point, for the sake of temporary code. Performance doesn't matter much here, so the code-cleanliness and 'keep it as before and dont worry' is much preferred.

A later commit will remove this function from the Parser type, so please dont comment on that :) Performance is not an issue here either, so the wastage is not really an issue in the short/medium-term
@AndrewSisley AndrewSisley force-pushed the sisley/refactor/I922-schema-rework branch from bd6594a to 3cfc861 Compare November 1, 2022 13:37
@AndrewSisley AndrewSisley dismissed jsimnz’s stale review November 1, 2022 13:53

Questions answered

@AndrewSisley AndrewSisley merged commit a530749 into develop Nov 1, 2022
@AndrewSisley AndrewSisley deleted the sisley/refactor/I922-schema-rework branch November 1, 2022 13:53
shahzadlone pushed a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
…k#924)

* Conceptually split up query-language from schema

A later commit will remove this function from the Parser type, so please dont comment on that :) Performance is not an issue here either, so the wastage is not really an issue in the short/medium-term

* Remove schema definition from parser
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/no-benchmark Skips the action that runs the benchmark. area/parser Related to the parser components area/schema Related to the schema system refactor This issue specific to or requires *notable* refactoring of existing codebases and components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Conceptual separation between schema, query-language and collection is non-obvious
3 participants