Get models from definitions of referred files #273

macisamuele · 2018-05-29T14:48:40Z

The goal of this PR is to make bravado-core compliant with its own documentation :)
Model discovery point 2 states

Search for refs that refer to external definitions with pattern <filename>#/definitions/<model name>

The current implementation did not fully scan all the referenced files.
This PR scans all the referred files during the model discovery process by keeping track of the files that have already been fully scanned.

Potential issues
I might be too pessimistic / paranoid, but I prefer to make this "public" as early as possible.
As we start to fully scan the referenced files (and follow any references they contain), we could end up in an endless process.
A possible example of this issue is:

file1: {"definitions": {"def": {"$ref": "file2#/definitions/def"}}}
fileX: {"definitions": {"def": {"$ref": "file<X+1>#/definitions/def"}}}

Do you have any concern about this potential issue?
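
To make the concern concrete, here is a minimal sketch that is not bravado-core code: it models files as an in-memory dict and follows external `$ref` targets while tracking the files already scanned. The names `FILES`, `referred_files`, and `scan_all` are hypothetical. Under these assumptions, tracking scanned files is enough to terminate when a finite set of files refers to each other in a cycle; the scan would only be endless if every file kept pointing to a file that has not been seen before.

```python
# Hypothetical in-memory "file system": file name -> parsed JSON document.
FILES = {
    "file1": {"definitions": {"def": {"$ref": "file2#/definitions/def"}}},
    "file2": {"definitions": {"def": {"$ref": "file3#/definitions/def"}}},
    # The last file points back to the first one, forming a cycle.
    "file3": {"definitions": {"def": {"$ref": "file1#/definitions/def"}}},
}


def referred_files(document):
    """Yield the file part of every external $ref found in a document."""
    if isinstance(document, dict):
        ref = document.get("$ref")
        if isinstance(ref, str) and not ref.startswith("#"):
            yield ref.split("#", 1)[0]
        for value in document.values():
            yield from referred_files(value)
    elif isinstance(document, list):
        for item in document:
            yield from referred_files(item)


def scan_all(start):
    """Return the files reachable from `start`; each file is scanned once."""
    processed = []
    seen = {start}
    pending = [start]
    while pending:
        current = pending.pop()
        processed.append(current)
        for name in referred_files(FILES[current]):
            if name not in seen:  # skip files that are already queued or scanned
                seen.add(name)
                pending.append(name)
    return processed
```

Calling `scan_all("file1")` visits each of the three files once and then stops, even though `file3` refers back to `file1`.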

coveralls · 2018-05-29T14:54:00Z

Coverage increased (+0.01%) to 98.398% when pulling bf3f31f on macisamuele:maci-get-models-from-definitions into ebe340d on Yelp:master.

sjaensch · 2018-05-29T15:15:22Z

bravado_core/model.py

+            processed_files.add(referred_files[0])
+            yield referred_files[0]
+        else:
+            return


This seems to be doing a lot of duplicate work, recalculating referred_files after each yield... Maybe I'm missing something, but wouldn't it be easier to write the function like this:

def _get_referred_files(swagger_spec):
    for uri in swagger_spec.resolver.store:
        if uri != swagger_spec.origin_url and not re.match(r'http://json-schema.org/draft-\d+/schema', uri):
            yield uri

If, on the other hand, your function needs to deal with the fact that swagger_spec.resolver.store will be different after each yield, then I think you need to refactor your code; using yield doesn't seem to be the right approach. I'd expect the caller to own processed_files and pass it in on every call (since that contains the state). You'd then return from this function as soon as you find the first unprocessed file.

macisamuele · 2018-05-29

As you pointed out, swagger_spec.resolver.store could change during the iteration.
I've already created a specific example of this in the tests (check test-data/2.0/multi-file-recursive/aux.json#/not_used_remote_reference/properties/random_number, which points to test-data/2.0/multi-file-recursive/aux_2.json#/definitions/random_integer).
In such a case the new file aux_2.json gets added only after aux.json is fully processed.

I'll change the method implementation to make it a bit more efficient by scanning the whole list of files in the store and repeating the scan if new files have appeared by the end of it.
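
The rescanning idea described here could look roughly like the sketch below. It is an illustration only, not the code that was merged: `iter_unprocessed_uris` is a hypothetical name, and a plain dict stands in for swagger_spec.resolver.store.

```python
def iter_unprocessed_uris(store, origin_url):
    """Yield each URI in `store` once, rescanning until a full pass finds nothing new.

    `store` is assumed to be a dict-like mapping of URI -> document that the
    consumer may extend between yields (a stand-in for the resolver store).
    """
    processed = {origin_url}
    found_new = True
    while found_new:
        found_new = False
        # Iterate over a snapshot of the keys: the consumer may add entries
        # to `store` while this generator is suspended at a yield.
        for uri in list(store):
            if uri not in processed:
                processed.add(uri)
                found_new = True
                yield uri
```

If processing one yielded file adds a new entry to the store, that entry is picked up by the next pass; the generator ends after the first pass that yields nothing.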

sjaensch · 2018-05-30T08:42:30Z

bravado_core/model.py

+
+    NOTE: The generator could change if during successive yields swagger_spec.resolver.store changed.
+        An example of this is when you're using swagger_spec.resolver to resolve a new reference
+        that points to a new file/URL.


I'm still not happy with the design of this function. Why do you need to keep this state internally? It makes the function harder to test and debug. Would it be possible to do something like this:

processed_uris = ...
additional_uri = _get_unprocessed_uri(spec, processed_uris)
while additional_uri:
    # Post process each referenced specs to identify models in definitions of linked files
    with spec.resolver.in_scope(additional_uri):
        _call_post_process_spec(
            spec.resolver.store[additional_uri],
        )
    processed_uris.add(additional_uri)
    additional_uri = _get_unprocessed_uri(spec, processed_uris)

...and use a variant of the implementation I suggested earlier. That way your function becomes a pure function, is easy to test and easy to reason about. The state gets passed in, but isn't hard to maintain.

Your implementation seems slightly more efficient now than this suggestion, but I hope it won't matter in practice.
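
For illustration, the helper used by the loop above might be sketched as follows. This is an assumption about its shape, not the merged implementation: to keep the example self-contained it takes the store URIs and the origin URL directly instead of a spec object, and the name `get_unprocessed_uri` is hypothetical.

```python
import re

# Assumption: JSON Schema draft meta-schemas found in the resolver store are
# skipped, mirroring the regex from the earlier suggestion.
_JSON_SCHEMA_DRAFT = re.compile(r'http://json-schema.org/draft-\d+/schema')


def get_unprocessed_uri(store_uris, origin_url, processed_uris):
    """Return the first URI that still needs processing, or None if none is left.

    The function keeps no state of its own: the caller owns `processed_uris`
    and passes it in on every call.
    """
    for uri in store_uris:
        if uri == origin_url or uri in processed_uris:
            continue
        if _JSON_SCHEMA_DRAFT.match(uri):
            continue
        return uri
    return None
```

Because the result depends only on the arguments, the helper can be tested with plain lists and sets, without building a resolver.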

macisamuele added 2 commits May 29, 2018 16:39

Add support to detect models from definitions on referred files

877850a

Update flattened specs file and tests

f784077

macisamuele requested a review from sjaensch May 29, 2018 14:48

sjaensch reviewed May 29, 2018

sjaensch approved these changes May 30, 2018

Update logic to extract un-processed URIs

bf3f31f

macisamuele force-pushed the maci-get-models-from-definitions branch from 4f1d2b6 to bf3f31f Compare May 30, 2018 09:23

macisamuele merged commit 17cdec1 into Yelp:master May 30, 2018

macisamuele deleted the maci-get-models-from-definitions branch May 30, 2018 10:07
