Get models from definitions of referred files #273
Conversation
bravado_core/model.py
        processed_files.add(referred_files[0])
        yield referred_files[0]
    else:
        return
This seems to be doing a lot of duplicate work, recalculating referred_files after each yield.
Maybe I'm missing something, but wouldn't it be easier to write the function like this:
import re

def _get_referred_files(swagger_spec):
    for uri in swagger_spec.resolver.store:
        if uri != swagger_spec.origin_url and not re.match(r'http://json-schema.org/draft-\d+/schema', uri):
            yield uri
If, on the other hand, your function needs to deal with the fact that swagger_spec.resolver.store will be different after each yield, then I think you need to refactor your code; using yield doesn't seem to be the right approach. I'd expect the caller to own processed_files and pass it in on every call (since that contains the state). You'd then return from this function once you find the first unprocessed file.
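A minimal sketch of this caller-owned-state approach, assuming a `spec` object with `origin_url` and `resolver.store` attributes as in the snippets above. The name `_get_unprocessed_uri` comes from the later suggestion in this thread; everything else (the regex constant, the fake spec) is illustrative, not bravado-core's actual code:

```python
import re
from types import SimpleNamespace

# The json-schema meta-schema URIs are skipped, as in the generator above.
JSON_SCHEMA_URI = re.compile(r'http://json-schema.org/draft-\d+/schema')

def _get_unprocessed_uri(spec, processed_uris):
    """Return the first URI in the resolver store that has not been
    processed yet, or None once every referred file has been handled."""
    for uri in spec.resolver.store:
        if uri == spec.origin_url or uri in processed_uris:
            continue
        if JSON_SCHEMA_URI.match(uri):
            continue
        return uri
    return None

# Tiny stand-in for a spec whose store holds the origin plus one referred file.
fake_spec = SimpleNamespace(
    origin_url='main.json',
    resolver=SimpleNamespace(store={'main.json': {}, 'aux.json': {}}),
)
first = _get_unprocessed_uri(fake_spec, set())
done = _get_unprocessed_uri(fake_spec, {'aux.json'})
```

Because the function never mutates its arguments, it can be unit-tested with plain fakes like the one above.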
As you pointed out, swagger_spec.resolver.store could change during the iteration.
I've already created a specific example of this in the tests (check test-data/2.0/multi-file-recursive/aux.json#/not_used_remote_reference/properties/random_number, which points to test-data/2.0/multi-file-recursive/aux_2.json#/definitions/random_integer).
In that case the new file aux_2.json gets added only after aux.json is fully processed.
I'll change the method implementation to make it a bit more efficient: scan the whole list of files in the store, and redo the scan if new files have appeared by the end of it.
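The rescanning idea described above could be sketched like this (a hedged illustration, not bravado-core's actual implementation; the generator snapshots the store's keys each pass, and repeats the pass whenever new files were added in the meantime):

```python
from types import SimpleNamespace

def _iter_referred_files(swagger_spec):
    processed = set()
    while True:
        # Snapshot the current keys: the store may grow while the
        # caller resolves references between yields.
        new_files = [
            uri for uri in list(swagger_spec.resolver.store)
            if uri != swagger_spec.origin_url and uri not in processed
        ]
        if not new_files:
            return  # no new files appeared during the last scan
        for uri in new_files:
            processed.add(uri)
            yield uri

# Demo: the store grows while the caller is consuming the generator,
# mirroring the aux.json -> aux_2.json case from the tests.
store = {'main.json': {}, 'aux.json': {}}
spec = SimpleNamespace(origin_url='main.json',
                       resolver=SimpleNamespace(store=store))
seen = []
for uri in _iter_referred_files(spec):
    seen.append(uri)
    if uri == 'aux.json':
        store['aux_2.json'] = {}  # a new file discovered mid-iteration
```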
bravado_core/model.py
NOTE: The generator could change if swagger_spec.resolver.store changes between successive yields.
An example of this is when you're using swagger_spec.resolver to resolve a new reference
that points to a new file/URL.
I'm still not happy with the design of this function. Why do you need to keep this state internally? It makes the function harder to test and debug. Would it be possible to do something like this:
processed_uris = ...
additional_uri = _get_unprocessed_uri(spec, processed_uris)
while additional_uri:
    # Post process each referenced spec to identify models in definitions of linked files
    with spec.resolver.in_scope(additional_uri):
        _call_post_process_spec(
            spec.resolver.store[additional_uri],
        )
    processed_uris.add(additional_uri)
    additional_uri = _get_unprocessed_uri(spec, processed_uris)
...and use a variant of the implementation I suggested earlier. That way your function becomes a pure function, is easy to test and easy to reason about. The state gets passed in, but isn't hard to maintain.
Your implementation seems slightly more efficient now than this suggestion, but I hope it won't matter in practice.
4f1d2b6 to bf3f31f
The goal of this PR is to make bravado-core compliant with its own documentation :)
Model discovery point 2 states:
The current implementation was not fully scanning all the referenced files.
This PR achieves the goal of scanning all the referred files during the model discovery process by keeping track of the already fully scanned files.
Potential issues
I might be too pessimistic/paranoid, but I prefer to make this "public" as early as possible.
As we'll start to fully scan (and follow possible references in) the referenced files, we could end up in an endless process.
A possible example of the mentioned issue is a chain of files, each referencing the next:
file1: {"definitions": {"def": {"$ref": "file2#/definitions/def"}}}
file<X>: {"definitions": {"def": {"$ref": "file<X+1>#/definitions/def"}}}
Do you have any concern about this potential issue?
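A toy model of this concern, with hypothetical file names and a hypothetical `scan_all` helper: a reference *cycle* among a finite set of files terminates as soon as processed files are tracked; only a genuinely unbounded chain (file<X> always introducing a brand-new file<X+1>) would run forever, and no bookkeeping can bound that.

```python
# Two files whose definitions reference each other - a finite cycle.
SPECS = {
    'file1': {'definitions': {'def': {'$ref': 'file2#/definitions/def'}}},
    'file2': {'definitions': {'def': {'$ref': 'file1#/definitions/def'}}},
}

def scan_all(start):
    """Follow cross-file references, skipping files already scanned."""
    processed = set()
    pending = [start]
    while pending:
        name = pending.pop()
        if name in processed:
            continue  # already scanned: this breaks the reference cycle
        processed.add(name)
        ref = SPECS[name]['definitions']['def']['$ref']
        pending.append(ref.split('#')[0])  # follow the cross-file reference
    return processed
```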