[REQ] Pipelining not plugin #6627

Open
mwilby opened this issue Jun 11, 2020 · 0 comments
mwilby commented Jun 11, 2020

As I suspected, the behavior of oneOf and anyOf is not affected by the proposed change in #6601; it is still completely wrong, and in the same way as before. This is not a problem, because there is a lot of rethinking to do before it would be possible to make them work. I always assumed the fact that models using these options would generate model code without blowing up was someone's idea of a joke, but as I have been looking at your code base more seriously over the past few months, I thought I would raise the issue and then use it to explain a feature that would be of great use to us, and that we think would be valuable to the user base as a whole. It's not a problem that needs fixing at the moment, so I am not reporting it as a bug; I am using it to demonstrate an argument.

The following comment is going to be long because, as I explained, I am using it to outline the perspective of a user. Users have different priorities from developers, so some things that are important to users may affect the relative importance of the development objectives.

First, the bug

The generated code is basically the normal language model object, with a new type included that references a model object that does not exist. For example, given

stream_source:
  type: object
  properties:
    stream:
      oneOf:
      - $ref: '#/video'
      - $ref: '#/stream'
      - $ref: '#/camera_model_a'
      - $ref: '#/camera_model_b'

python-flask will yield a Python object with the following imports:

from openapi_server.models.base_model_ import Model
from openapi_server.models.one_ofvideostreamcamera_model_acamera_model_b import OneOfvideostreamcameraModelAcameraModelB  # noqa: E501
from openapi_server import util

Java has these imports:

...
import java.io.IOException;
import org.openapitools.client.model.OneOfvideostreamcameraModelAcameraModelB;

And in both cases the corresponding object is not created, so the code won't compile.

In the case of an interface, at least with python-flask, the generated code is also broken. The generated default controller has the following imports:

from openapi_server.models.one_ofstream_source import OneOfstreamSource  # noqa: E501
from openapi_server import util

Again the model object is not created.

I can say little about the details of the interface spec and server code generation. In our use case the OpenAPI interface specification is a little too simplistic, and when we do use the generated server code, it is at best a suggestion for a few options and at worst useless (this is not a criticism of OpenAPI, which does its job well from our perspective; we just use part of your functionality to do something it wasn't designed to do), so we don't notice any problem on this side of the code. As an aside, the ability to enhance the interface specs would be interesting, and is one of the motivations for this post.

Now the extensions

OK, so what's wrong, beyond the obvious problem of generating model code that doesn't compile? The first point is that the above schema is perfectly valid, BUT it is not necessarily a good description to turn into a code object. What the schema says is that a document containing a stream element should have one value, and that value should be one of the 4 types specified, or else the document is wrong. That does not translate into code in an unambiguous way.

The following is what we actually implement for this object:

stream_source:
  type: object
  properties:
    type:
      type: string
      enum:
      - camera_model_a
      - camera_model_b
      - stream
      - video
    parameters:
      type: object

This translates into code in a much simpler way. It's a long way from perfect, because the enum enumerates strings rather than types, so we have to map the strings onto types by hand. But as a simple hack that gets the job done, it's OK.
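To make the hand-mapping concrete, here is a minimal sketch of the kind of dispatch we write by hand; the class names and the helper are illustrative, not generated code:

```python
# Hedged sketch of the by-hand string-to-type mapping the enum forces
# on us. None of these names come from the generator.

class StreamSourceBase:
    def __init__(self, **parameters):
        self.parameters = parameters

class Video(StreamSourceBase): pass
class Stream(StreamSourceBase): pass
class CameraModelA(StreamSourceBase): pass
class CameraModelB(StreamSourceBase): pass

# the hand-written mapping from enum string to concrete type
STREAM_TYPES = {
    "video": Video,
    "stream": Stream,
    "camera_model_a": CameraModelA,
    "camera_model_b": CameraModelB,
}

def build_stream_source(doc: dict) -> StreamSourceBase:
    """Instantiate the concrete type named by the 'type' field."""
    return STREAM_TYPES[doc["type"]](**doc.get("parameters", {}))
```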

Once you use oneOf or anyOf, you are implicitly dealing with types. This is kind of OK in a dynamically typed language, but in a statically typed language it's not so clear cut. In a language with polymorphism you can return a base object plus a class spec, where the class spec can be used to cast the object, but that still makes using it cumbersome and ugly. The setters are easy if you can overload methods. I guess the thinking behind OneOfModelABMo... is an object that can report the correct type and provide a type-specific return method, but even so you're still going to end up with a lot of switch statements, or their equivalent. And you still need to generate the model object.
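For illustration, here is a hedged guess at what such a type-reporting wrapper could look like, and the switch the consumer still ends up writing; all names here are hypothetical, not what the generator emits:

```python
# Hypothetical sketch of a OneOf... wrapper: it holds one of the
# allowed alternatives and reports the concrete type, but the caller
# still needs a switch (here an if chain) on that type.

class Video: pass
class CameraModelA: pass

class OneOfStream:
    ALLOWED = (Video, CameraModelA)  # the oneOf alternatives

    def __init__(self, value):
        if not isinstance(value, self.ALLOWED):
            raise TypeError(f"not an allowed alternative: {type(value)!r}")
        self.value = value

    @property
    def actual_type(self) -> type:
        return type(self.value)

def describe(wrapped: OneOfStream) -> str:
    # the switch the consumer cannot avoid
    if wrapped.actual_type is Video:
        return "video"
    if wrapped.actual_type is CameraModelA:
        return "camera_model_a"
    return "unknown"
```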

Whatever language you want to build for, and however many buffer layers you try to put in front of the undefined return, you will still need to work from a base description. That description is a list of returned data types. It's hard to see whether that list is in the model passed to the template engine (a point I will come back to), and if so, how to get hold of it.

From our perspective, what the generator is doing is augmenting the original schema with all the elements needed to build actual code. So basically it is transforming one document into another, based on some parameters, which are best thought of as an object added to the specification document, even if they start off as just a bunch of command-line settings. The transformed document is what is passed to the template generator, which then constructs the code.

You are currently discussing, and in some cases starting on, a set of ideas going in this direction: #844, #846, #837 and others. Broadly, for us these ideas go in the right direction, but there are some small user-facing details that might enhance the proposed functionality.

Going back to generating the code, this is what appears (reduced to the relevant data) in the var list for the stream object in the log file. This data is not always complete, so it might have the data we need, but we can't be sure:

{
      "baseName" : "stream",
      "getter" : "getStream",
      "setter" : "setStream",
      "dataType" : "OneOfvideostreamcameraModelAcameraModelB",
      "jsonSchema" : "{\n  \"oneOf\" : [ {\n    \"$ref\" : \"#/components/schemas/video\"\n  }, {\n    \"$ref\" : \"#/components/schemas/stream\"\n  }, {\n    \"$ref\" : \"#/components/schemas/camera_model_a\"\n  }, {\n    \"$ref\" : \"#/components/schemas/camera_model_b\"\n  } ]\n}",
}

One way of moving forward as a language-module developer is to regard this as a document that is sent to the template engine as a specification. Then I could define what I would like to see in this document as the specification for the language extension I have in mind. If I were looking to do something relative to the stream object without creating a new model, I might want to add something like the following to the definition:

{
      "dataType" : "",
      "isOneOf" : false,
      "listOf" : [
            "CameraModelA",
            "CameraModelB"
            "Stream",
            "Video"
      ]
}
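As an illustration of why the explicit list helps, a hypothetical language module could render a real union type directly from it. `render_return_type` and the field names here follow this proposal, not the existing codegen model:

```python
# Hedged sketch: rendering a typed union from the proposed "listOf"
# augmentation instead of referencing a phantom OneOf... class.

def render_return_type(var: dict) -> str:
    """Emit a Python typing-style union if alternatives are listed,
    otherwise fall back to the plain dataType."""
    alternatives = var.get("listOf")
    if alternatives:
        return "Union[" + ", ".join(alternatives) + "]"
    return var["dataType"]

# a var shaped like the proposed augmented definition above
var = {
    "baseName": "stream",
    "dataType": "",
    "listOf": ["CameraModelA", "CameraModelB", "Stream", "Video"],
}
```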

Alternatively, I could suggest adding a model definition for OneOfvideostreamcameraModelAcameraModelB. Either way, I would have to define the structure carefully to ensure I had no logical decisions left to make.

At the moment this is what is done in practice, but at the level of a specific language. However, the above argument is a general code-structure argument, not a language-specific one. So pushing this definition further back, to the separation between the core functionality and the language-specific structures, could drastically simplify the design and maintenance process. At the moment the base functionality mixes language-specific functionality with core elements. If we understand #845 correctly, this separation is precisely what you are trying to do, and all we are suggesting is a tiny tweak which not only makes the coding simpler, but also provides a very powerful route to enhance the functionality for users.

The construction of a document definition provides a specification document shared between the various language-module developers and the core-module developers. Developers for other languages could augment it, modify it, or argue for a different approach, and they could define their proposal precisely through this document. Once a design is decided upon, the core-module developers have a detailed and precise specification to follow, and almost by default a test case to work with. The development of the core module and the language modules can interact through this specification. New features can be proposed and discussed. Bugs can be defined by their appearance and representation in this document.

The net result is that you have separated the language module development and the core module development via the creation of a de facto interface between the two.

Now the core development team can completely redesign the architecture to their heart's content, because they have a defined target for their output and something that can define a very precise set of test functions.

This is not the real essence of what we are proposing, as that is simply a restatement of the current architecture objectives. What we are suggesting is that the specification file be made accessible and act as a real object communicated between different modules. If this file actually existed, i.e. the core engine could stream it out as JSON, YAML, or XML, then you could build a simplified version of a plugin architecture as a pipeline, which could provide a massive extension in functionality. The code has this document internally, so the code for managing it is already there. Taking that code and adding an output and an input stream means you can move the data from one module to another.

Separating the core engine and the template engine by streaming this file between them immediately solves the plugin problem for the template engine (#846), but the separation could actually start further back. A language module could take the output streamed by the core element into its own copy of the data-representation code. Basically, the language modules would become truly separate projects, and they could use the same pipelining architecture to pass their modified document on to the template generator. As it is now just a pipeline, they can choose whichever template engine they want, or even build their own.
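To sketch the idea, assuming a JSON serialization of the intermediate document, each stage could be an ordinary filter; `pipeline_stage`, `language_stage`, and the document shape are illustrative only, not existing code:

```python
# Hedged sketch of the pipelining idea: every stage reads the
# intermediate model document as JSON, transforms it, and writes it
# out again, so stages can be chained with shell pipes.

import json
import sys

def pipeline_stage(transform):
    """Run one stage as a filter: read the document from stdin,
    apply the transform, write the result to stdout."""
    json.dump(transform(json.load(sys.stdin)), sys.stdout)

def language_stage(doc: dict) -> dict:
    """Example transform: expand each oneOf var into an explicit
    list of the referenced type names."""
    for var in doc.get("vars", []):
        schema = json.loads(var.get("jsonSchema", "{}"))
        if "oneOf" in schema:
            var["listOf"] = [alt["$ref"].rsplit("/", 1)[-1]
                             for alt in schema["oneOf"]]
    return doc

# in-memory demonstration with a document shaped like the log excerpt
sample = {"vars": [{
    "baseName": "stream",
    "jsonSchema": json.dumps({"oneOf": [
        {"$ref": "#/components/schemas/video"},
        {"$ref": "#/components/schemas/camera_model_a"},
    ]}),
}]}
augmented = language_stage(sample)
```

Stages built this way would chain with ordinary shell pipes, e.g. a hypothetical `core-engine | python language_stage.py | template-engine`.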

A plugin architecture is only needed if the information flow is non-linear, and in this case we believe it is a tool-chain process, which is intrinsically linear.

Now this is where the advantage comes in: language developers can change the tool-chain process to suit their needs. They can take the generated document and augment it in any way they wish, they can process the changes with logic that defines more complex code generation, or they can build more process steps into the tool chain, and none of it gets mixed in with everybody else's system. Changes that prove useful can be propagated back to the core module, and intermediate modules can be built that are used by one type of language or framework process and not others.

This is a lot simpler to build than a plugin architecture. Finally, because it is all defined by an interchange document format, a library for loading the format can be built in other languages, giving developers a broader set of development options. After all, that is basically what your code is supposed to make possible; the interchange document is just a big model file.
