Next-generation configuration format #846

Closed
aanand opened this issue Jan 15, 2015 · 17 comments

@aanand

aanand commented Jan 15, 2015

There are a whole load of things that people are asking for in fig.yml. I want to pull them together here so that we can discuss configuration at a high level and avoid incremental bloat and viscosity.

(Important note: I haven’t the least bit of interest in discussing filenames or alternatives to YAML in this issue. A separate issue, sure, but don’t shed those bikes in here.)

I also want to discuss what I see as an emerging boundary between portable and non-portable configuration. By way of analogy, let's look at the Dockerfile format, which aims to be fully portable. For example:

  • VOLUME lets you specify paths inside the container that should be volumes, but doesn't let you specify a host path.
  • EXPOSE lets you specify a port to expose, but not what port on the host to map it to.

As a result, a Docker image will work anywhere - it isn't coupled to any features of the host environment, such as filesystem layout or available ports. This enables people to build tools on Docker which can run any image without worrying about how it's configured, an abstraction which is only possible because Dockerfile enforces separation.

With Fig we've created a configuration format for a group of containers - let's call it an app. Most of what you specify in fig.yml (i.e. most container configuration plus links between defined containers) is, to my eyes, portable - but not all of it. We let you specify:

There have also been requests/PRs for support for:

All of this looks highly non-portable to me - if we continue to support such features in fig.yml, it’ll never be suitable as an abstract app definition, and apps will remain coupled to particular characteristics of the systems they’re running on. As we move towards a future of deploying Docker apps on multiple hosts (e.g. with Swarm) or to multiple environments (dev, CI, staging, production), this will become more and more of a pain point.

So I want to talk about how we might redesign Fig’s configuration such that we can support the use cases those features would serve - all of them real problems that real users have faced - and simultaneously achieve an app definition format that is as portable as a Dockerfile.

If we want to do this without sacrificing a significant amount of usability, one approach I’ve thought of is to define two formats:

  • A core app definition file, analogous to Dockerfile, which only allows portable configuration.
  • An auxiliary file which allows host-specific configuration, and can perhaps augment or override bits of the core definition.

The idea is that there’s always a single, version-controlled core definition, whereas there may be zero or more auxiliary configs which may or may not be versioned.

Here’s an example:

# app-definition.yml

web:
  build: ./webapp
  command: python app.py
  ports:
    - 80
  links:
    - db
    - authentication-service

db:
  image: redis:latest

# development-config.yml

web:
  volumes:
    # host volume paths are allowed
    - ./webapp:/code
  ports:
    # host ports are allowed
    - "8000:80"
  links:
    # names of external containers are allowed
    - my-external-authentication-container:authentication-service

Down the line, the auxiliary definition can be extended to allow the user to supply more of the asked-for things: variable parameterisation, initial scaling directives, dependencies on other apps, affinity constraints for Swarm, etc.
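
A minimal sketch of how the core definition and one auxiliary config from the example above might be combined. It assumes the auxiliary file overrides any core entry that refers to the same container port, link alias, or container path. The names `merge_app` and `merge_list` and the merge rules themselves are hypothetical, not something Fig implements; the two files are written out as plain Python dicts.

```python
# Hypothetical overlay of an auxiliary config onto a core app definition.
# The merge rules are assumptions made for illustration only.

def last_part(entry):
    """Key for ports and links: "8000:80" -> "80", 80 -> "80", "ext:alias" -> "alias"."""
    return str(entry).split(":")[-1]


def container_path(entry):
    """Key for volumes: "./webapp:/code" -> "/code", "/data" -> "/data"."""
    parts = entry.split(":")
    return parts[1] if len(parts) > 1 else parts[0]


# List options whose entries are matched by key instead of being replaced wholesale.
KEYED_LISTS = {"ports": last_part, "links": last_part, "volumes": container_path}


def merge_list(core_items, aux_items, key_of):
    """Auxiliary entries replace core entries with the same key; others are appended."""
    merged = {key_of(item): item for item in core_items}
    for item in aux_items:
        merged[key_of(item)] = item
    return list(merged.values())


def merge_app(core, aux):
    """Return a new app definition; neither input is modified."""
    unknown = set(aux) - set(core)
    if unknown:
        # Assumption: an auxiliary config may not introduce new services.
        raise ValueError("auxiliary config names unknown services: %s" % sorted(unknown))
    result = {}
    for name, service in core.items():
        merged = dict(service)
        for option, value in aux.get(name, {}).items():
            if option in KEYED_LISTS:
                merged[option] = merge_list(service.get(option, []), value, KEYED_LISTS[option])
            else:
                merged[option] = value
        result[name] = merged
    return result


CORE = {
    "web": {
        "build": "./webapp",
        "command": "python app.py",
        "ports": [80],
        "links": ["db", "authentication-service"],
    },
    "db": {"image": "redis:latest"},
}

DEVELOPMENT = {
    "web": {
        "volumes": ["./webapp:/code"],
        "ports": ["8000:80"],
        "links": ["my-external-authentication-container:authentication-service"],
    },
}

app = merge_app(CORE, DEVELOPMENT)
print(app["web"])
```

Matching by container port and link alias keeps the core definition as the source of truth for what the app exposes, while the auxiliary config only decides how those things are bound on a given host.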

(Aside: it might be valuable from an "explicit is better than implicit" standpoint to make you name things in the core definition (such as volumes and ports) which the auxiliary config can provide, rather than letting it reach inside and change anything. For example:

# app-definition.yml

web:
  build: ./webapp
  command: python app.py
  volumes:
    - web-code:/code
  ports:
    - http:80
  links:
    - db
    - authentication-service

db:
  image: redis:latest

# development-config.yml

volumes:
  web-code: ./webapp

ports:
  http: 8000

links:
  authentication-service: my-external-authentication-container

But I digress.)
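
The explicit variant in the aside could be resolved roughly as follows. This is only a sketch under assumptions: `resolve_app` and `bind` are hypothetical names, every volume and port in the core definition is assumed to be written as `name:target`, and a link is left untouched when it points at another service in the same definition.

```python
# Hypothetical resolver for the "named slots" variant: the core definition
# names volumes, ports and links, and the auxiliary config supplies values.

def bind(entry, table, kind):
    """Turn "name:target" into "value:target" using the auxiliary table."""
    name, target = str(entry).split(":", 1)
    if name not in table:
        raise KeyError("auxiliary config does not provide %s %r" % (kind, name))
    return "%s:%s" % (table[name], target)


def resolve_app(definition, config):
    volumes = config.get("volumes", {})
    ports = config.get("ports", {})
    links = config.get("links", {})
    resolved = {}
    for name, service in definition.items():
        out = dict(service)
        if "volumes" in service:
            out["volumes"] = [bind(v, volumes, "volume") for v in service["volumes"]]
        if "ports" in service:
            out["ports"] = [bind(p, ports, "port") for p in service["ports"]]
        if "links" in service:
            out["links"] = []
            for alias in service["links"]:
                if alias in links:
                    # Link supplied from outside the app: "container:alias".
                    out["links"].append("%s:%s" % (links[alias], alias))
                elif alias in definition:
                    # Link to a service defined in the same app.
                    out["links"].append(alias)
                else:
                    raise KeyError("nothing provides link %r" % alias)
        resolved[name] = out
    return resolved


DEFINITION = {
    "web": {
        "build": "./webapp",
        "command": "python app.py",
        "volumes": ["web-code:/code"],
        "ports": ["http:80"],
        "links": ["db", "authentication-service"],
    },
    "db": {"image": "redis:latest"},
}

DEVELOPMENT = {
    "volumes": {"web-code": "./webapp"},
    "ports": {"http": 8000},
    "links": {"authentication-service": "my-external-authentication-container"},
}

resolved = resolve_app(DEFINITION, DEVELOPMENT)
print(resolved["web"])
```

Because the core definition has to name everything it needs, a missing binding can be reported as an error up front rather than surfacing when a container starts.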

In conclusion:

  • Firstly I want to know how the community feels about the current design of fig.yml in terms of portability, extensibility and usability, especially as we add more stuff to it.
  • Secondly, ditto for the proposed enforced separation of core/auxiliary configuration.
  • Finally, if it sounds good, we can get into the details of the design.

@thaJeztah
Member

Quick first impressions:

  • Separating "definition" / "blueprint" and "runtime" seems the right approach to me, so +1 on that.
  • #845 (Add support for environment variables with default values in fig.yml) could actually be useful in the "definition" file; if no defaults are set, fig/compose could error out that required runtime parameters are missing.
  • In addition to the previous point, values for the variables could be set from environment variables on the host (thinking of a Heroku-like approach) and/or via the config file.
  • I wonder if ports: should be in the "definition" file at all, given that the Dockerfile/image already exposes ports that can be published.
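
To make the second and third points concrete, here is a rough sketch of parameter substitution with defaults. The `${NAME}` / `${NAME:-default}` syntax, the `substitute` function, and the lookup order (config file first, then host environment, then default) are all assumptions chosen for illustration; they are not taken from #845.

```python
import re

# Hypothetical placeholder syntax: ${NAME} or ${NAME:-default}.
PLACEHOLDER = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)


def substitute(text, config_values, environ):
    """Fill placeholders from the config file, then the environment, then defaults.

    Raises ValueError listing every parameter that has no value and no default.
    """
    missing = []

    def replace(match):
        name = match.group("name")
        if name in config_values:
            return str(config_values[name])
        if name in environ:
            return environ[name]
        if match.group("default") is not None:
            return match.group("default")
        missing.append(name)
        return match.group(0)

    result = PLACEHOLDER.sub(replace, text)
    if missing:
        raise ValueError(
            "missing required parameters: " + ", ".join(sorted(set(missing)))
        )
    return result
```

In real use the `environ` argument would typically be `os.environ`; passing it in explicitly keeps the function easy to test.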

@relwell

relwell commented Jan 15, 2015

I've got some obvious biases, but I believe allowing environment variables at the fig.yml level is important because Fig/Docker serve as a forcing factor to adopt certain 12 factor principles related to configuration. To @thaJeztah's point, this paradigm helps to more effectively deploy containerized code that behaves like productionalized code. Having this feature a level up provides for structured variability according to an existing good practice.

@aanand
Author

aanand commented Jan 16, 2015

@thaJeztah @relwell One way or another, it will be possible to:

Furthermore, it makes sense to enable the definition file to ask for arbitrary parameters to be set, supplied by a configuration file or the surrounding environment, which it can then use; this is reminiscent of the QUESTION and ANSWER verbs proposal for Dockerfile.

I'm not 100% sold on passing environment variables directly into the definition file - it feels a bit too implicit. Perhaps if they have to be listed at the top of the file (similar to Go/Python imports), or in the configuration file.

@jpetazzo

I love the current design of fig.yml. The fact that we can specify runtime information (local volumes, ports...) means that we can have super-duper-simple workflows: "git clone, fig up, point your browser to localhost:XXXX, enjoy."

That being said, when implementing Fig with customers, we quickly saw a need for separate runtime configurations (e.g. "This is fig.yml for dev, this is fig.yml for prod, and oops we have to carefully keep them in sync. Oh and by the way we don't always want to start all the services, so we actually have multiple dev configurations.") Any method that would allow us to define, e.g., subsets of services would get my full support :-)

Naive question: it feels to me that most of the definitions in the "core" section could/should be in Dockerfiles: command to be executed, exposed ports, things that should be volumes... Is it OK to have some redundancy here? (Maybe yes, because that allows some fine-tuning without having to subclass an existing image?)

Also, it would be delightful if we could still have a way to define super simple configs (like we can do today) without having to break things down. But I realize that we might not be able to have our cake and eat it too :-)
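
One way the "subsets of services" idea might work is to start from the services a user asks for and pull in whatever they link to. The sketch below assumes links are written as `service` or `service:alias` and that every link refers to a service in the same file; `select_services` is a hypothetical helper, not an existing Fig feature.

```python
def select_services(services, wanted):
    """Return the wanted services plus everything they link to, transitively."""
    selected = {}
    stack = list(wanted)
    while stack:
        name = stack.pop()
        if name in selected:
            continue
        if name not in services:
            raise KeyError("unknown service %r" % name)
        selected[name] = services[name]
        for link in services[name].get("links", []):
            # "db:database" links to service "db" under the alias "database".
            stack.append(link.split(":")[0])
    return selected


SERVICES = {
    "web": {"build": "./webapp", "links": ["db"]},
    "worker": {"build": "./worker", "links": ["db:database", "queue"]},
    "db": {"image": "redis:latest"},
    "queue": {"image": "rabbitmq"},
}

print(sorted(select_services(SERVICES, ["web"])))
```

A single file could then serve several dev setups, with the subset chosen at run time instead of being copied into separate files that have to be kept in sync.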

@tianon
Contributor

tianon commented Jan 16, 2015

On 16 January 2015 at 09:40, Jérôme Petazzoni wrote:

Naive question: it feels to me that most of the definitions in the "core" section could/should be in Dockerfiles: command to be executed, exposed ports, things that should be volumes... Is it OK to have some redundancy here? (Maybe yes, because that allows some fine-tuning without having to subclass an existing image?)

As an example of where this is useful, I run "rails s" in development,
but use "passenger" (or similar) in production, so my "fig.yml" for
development explicitly runs "rails s" but the default command of the
image is actually "passenger start".

@funkyfuture

I'm not 100% sold on passing environment variables directly into the definition file - it feels a bit too implicit. Perhaps if they have to be listed at the top of the file (similar to Go/Python imports), or in the configuration file.

considering that configuration variables' values would be checked, it makes sense to also assign environment variables to configuration variables at that point, like import $FOO as bar
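
A sketch of what such explicit declarations could look like in practice. The `import $FOO as bar` line format comes from the comment above; the parser, its error handling, and the name `load_imports` are assumptions added for illustration.

```python
import re

# Hypothetical declaration syntax: "import $ENV_NAME as variable_name".
IMPORT_LINE = re.compile(
    r"^import \$(?P<env>[A-Za-z_][A-Za-z0-9_]*) as (?P<name>[A-Za-z_][A-Za-z0-9_]*)$"
)


def load_imports(lines, environ):
    """Map declared environment variables to configuration variables.

    Only variables that are explicitly declared become visible, and a
    declared variable that is not set is reported immediately.
    """
    variables = {}
    for line in lines:
        match = IMPORT_LINE.match(line.strip())
        if match is None:
            raise ValueError("not an import declaration: %r" % line)
        env_name = match.group("env")
        if env_name not in environ:
            raise KeyError("environment variable %s is not set" % env_name)
        variables[match.group("name")] = environ[env_name]
    return variables
```

Anything in the environment that is not declared stays invisible to the definition, which addresses the "too implicit" concern while still allowing the environment to supply values.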

@dnephin

dnephin commented Jan 18, 2015

I'm not 100% sold on passing environment variables directly into the definition file - it feels a bit too implicit.

I don't understand how this would be implicit. The configuration contains the name of the variable used for substitution.

I think environment variable substitution is the best way of handling the different-but-similar config issues called out by @jpetazzo. We've had one case where a team had to write a tool to generate a fig.yml from a template so they could be flexible about this type of thing. Maintaining multiple fig.yml wasn't really an option because the files were large, and there were many possible combinations.
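
For reference, the kind of template tool described above can be approximated in a few lines with Python's standard `string.Template`. This is not the team's actual tool, which is not shown in this thread; the template contents and the variable names `WEB_PORT` and `REDIS_TAG` are made up for the example.

```python
from string import Template

# Hypothetical fig.yml template; ${...} placeholders are filled at render time.
FIG_TEMPLATE = Template("""\
web:
  build: ./webapp
  ports:
    - "${WEB_PORT}:80"
  links:
    - db
db:
  image: redis:${REDIS_TAG}
""")


def render(values):
    """Return fig.yml text; Template.substitute raises KeyError if a value is missing."""
    return FIG_TEMPLATE.substitute(values)


print(render({"WEB_PORT": "8000", "REDIS_TAG": "latest"}))
```

Built-in substitution would make a separate generation step like this unnecessary, which is the argument being made here.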

ditto for the proposed enforced separation of core/auxiliary configuration.

Overall I think requiring this separation adds more complexity and makes fig configuration harder to understand, for what I see as a pretty minor gain. For large configurations, the separation would make it really annoying to determine the final configuration.

I would be in favor of this separation being optional, but I still think supporting environmental differences with environment variables is a better approach than separate configuration files.

I haven't found these non-portable options to be a problem personally. The configuration makes any non-portable dependencies explicit.

Edit: I guess for swarm (and dev-vs-prod setups) multiple configuration files do have advantages over environment variables. I still see fig as really a dev-only tool, which is maybe why I'm not as enthusiastic about requiring the separation.

@dnephin

dnephin commented Jan 18, 2015

I'm not sure I understand how these would be non-portable.

initial_scale seems like it wouldn't add any other dependencies, or behave any differently than running a single instance of a service.

The "Include external config" I see as analogous (in some ways) to the FROM directive in a Dockerfile. Sure you're pointing at some url, but the resource at that url should remain consistent.

@dnephin

dnephin commented Jan 18, 2015

Naive question: it feels to me that most of the definitions in the "core" section could/should be in Dockerfiles ... Is it OK to have some redundancy here?

I think this is a nice feature. command is especially useful. A single image can be re-used for multiple purposes (ex: a codebase might have both a web app and some scripts).

@funkyfuture

i find the thought of a templating syntax like Jinja with minimal scripting support more and more appealing.

for that, two capabilities would be necessary to define canonical design patterns. both could also be used w/o scripting, if 'scripting' were to come in a later generation:

  1. the capability to include another file.
  2. to reference the context's project and service name as alias, so one could e.g. hostname: %project%-%service%
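
The second capability could be sketched like this, assuming the `%project%` and `%service%` placeholders from the example and only expanding them in top-level string values. `expand_context` is a hypothetical name.

```python
def expand_context(services, project):
    """Replace %project% and %service% in each service's string options."""
    expanded = {}
    for name, options in services.items():
        expanded[name] = {}
        for key, value in options.items():
            if isinstance(value, str):
                value = value.replace("%project%", project).replace("%service%", name)
            # Non-string values (lists, numbers) are passed through unchanged.
            expanded[name][key] = value
    return expanded


SERVICES = {
    "web": {"build": "./webapp", "hostname": "%project%-%service%", "ports": [80]},
    "db": {"image": "redis:latest", "hostname": "%project%-%service%"},
}

print(expand_context(SERVICES, "myapp"))
```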

@relwell

relwell commented Jan 23, 2015

It'd be great if we could put some parameters around when these decisions get made. I'd really like to see #845 get merged or updated per the results of this discussion.

@mbdas

mbdas commented Feb 16, 2015

Aanand, I had commented on #235 about having some ordering for containers that are not dependent on each other through links or volumes-from. The 2 conditions mentioned should be portable. One was something like an on-exit keyword, to start a container on successful exit of others, and the other was maybe to start a container on a successful (user-defined) health check of others.
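
A rough sketch of how those two conditions might be evaluated. The option names `on_exit` and `on_healthy`, the status strings, and the `ready_to_start` function are all hypothetical; nothing like this is specified in the thread.

```python
# Hypothetical status values reported for containers that have been started.
EXITED_OK = "exited-ok"
HEALTHY = "healthy"


def ready_to_start(services, status):
    """Return the services that have not been started and whose conditions hold.

    `status` maps already-started service names to their current state.
    """
    ready = []
    for name, options in services.items():
        if name in status:
            continue  # already started
        exits_done = all(status.get(dep) == EXITED_OK for dep in options.get("on_exit", []))
        checks_pass = all(status.get(dep) == HEALTHY for dep in options.get("on_healthy", []))
        if exits_done and checks_pass:
            ready.append(name)
    return ready


SERVICES = {
    "migrate": {"image": "app", "command": "python migrate.py"},
    "db": {"image": "redis:latest"},
    "web": {"image": "app", "on_exit": ["migrate"], "on_healthy": ["db"]},
}

print(ready_to_start(SERVICES, {}))
```

Both conditions refer only to other services in the same definition, which is why they could stay on the portable side of the split.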

@razic

razic commented Apr 8, 2015

@jpetazzo you can now inherit from files with extends

@kevinSuttle

My first (admittedly naive) impression of the docker-compose intro tutorial:

Good gravy that's a lot of duplicated and spread out config. There is a Dockerfile, a docker-compose.yml, a requirements.txt for Python, and the implementation in the app.py file itself. I just saw a blog post advocating for Dockerfile.build files also. This isn't very DRY. It's not super tightly coupled, though, which is good.

It feels like a lot of this should happen in Dockerfiles and the docker command itself. Thoughts?

@dnephin

dnephin commented Jun 4, 2015

I just saw a blog post advocating for Dockerfile.build files

A Dockerfile.build can work for larger projects, or projects that have a lot of dependencies required for building that aren't required for deployment. Python in general rarely has dependencies that are build-only, and this example is pretty small, so I don't think a Dockerfile.build is applicable here.

This isn't very DRY

DRY literally means "Don't repeat yourself". I'm looking at the files you mention, and I don't see any duplication at all. Each has a distinct responsibility within the build. I suppose you could delete the requirements.txt and instead inline the dependencies in the Dockerfile, but most Python tooling expects the requirements.txt, so I think it works better to keep them separate.

@dnephin

dnephin commented Jan 15, 2016

@aanand now that we have the V2 format, I think we've addressed the top concerns. Should we close this issue?

@dnephin

dnephin commented Feb 3, 2016

Since we just released a V2 format, I think any further discussion of config changes should be reconsidered based on the new format. Going to close this issue.

10 participants