Proposal: Combine settings, metadata, static, etc. into a single datasette.yaml File #2093

Open
asg017 opened this issue Jun 29, 2023 · 8 comments

Comments

@asg017
Collaborator

asg017 commented Jun 29, 2023

Very often I get tripped up when trying to configure my Datasette instances. For example: if I want to change the port my app listens on, do I do that with a CLI flag, a --setting flag, inside metadata.json, or an env var? If I want to up the time limit of SQL statements, is that under metadata.json or a setting? Where does my plugin configuration go?

Normally I need to look it up in the Datasette docs, and I quickly find my answer, but the number of places where "config" goes is overwhelming.

  • Flat CLI flags like --port, --host, --cors, etc.
  • --setting, like default_page_size, sql_time_limit_ms etc
  • Inside metadata.json, including plugin configuration

Typically my Datasette deploys are extremely long shell commands, with multiple --setting and other CLI flags.

Proposal: Consolidate all "config" into datasette.toml

I propose that we add a new datasette.toml that combines "settings", "metadata", and other common CLI flags like --port and --cors into a single file. It would be similar to "Cargo.toml" in Rust projects, "package.json" in Node projects, and "pyproject.toml" in Python, etc.

A sample of what it could look like:

# "top level" configuration that are currently CLI flags on `datasette serve`
[config]
port = 8020
host = "0.0.0.0"
cors = true

# replaces multiple `--setting` flags
[settings]
base_url = "/app/datasette/"
default_allow_sql = true
sql_time_limit_ms = 3500

# replaces `metadata.json`.
# The contents of datasette-metadata.json could be defined in this file instead, but supporting separate files is nice (since those are easy to machine-generate)
[metadata]
include="./datasette-metadata.json"

# plugin-specific 
[plugins]
[plugins.datasette-auth-github]
client_id = {env = "DATASETTE_AUTH_GITHUB_CLIENT_ID"}
client_secret = {env = "GITHUB_CLIENT_SECRET"}

[plugins.datasette-cluster-map]

latitude_column = "lat"
longitude_column = "lon"
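
To make the sample more concrete, here is a minimal Python sketch of how such a file could be read. It is an illustration, not Datasette's implementation: `load_config` and `resolve_env` are hypothetical helper names, and treating `{env = "..."}` as an environment-variable lookup is an assumption based on the sample above. On Python older than 3.11 it assumes the third-party `tomli` package is installed.

```python
import os

try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: tomli backport installed on older Pythons


def resolve_env(value, environ=os.environ):
    """Recursively replace {"env": "NAME"} tables with the value of NAME."""
    if isinstance(value, dict):
        if set(value) == {"env"}:
            return environ.get(value["env"])
        return {key: resolve_env(item, environ) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_env(item, environ) for item in value]
    return value


def load_config(text, environ=os.environ):
    """Parse TOML text, then resolve environment-variable references."""
    return resolve_env(tomllib.loads(text), environ)


SAMPLE = """
[config]
port = 8020

[settings]
sql_time_limit_ms = 3500

[plugins.datasette-auth-github]
client_id = {env = "DATASETTE_AUTH_GITHUB_CLIENT_ID"}
"""

# A plain dict stands in for os.environ so the example is reproducible.
config = load_config(SAMPLE, {"DATASETTE_AUTH_GITHUB_CLIENT_ID": "abc123"})
print(config["config"]["port"])
print(config["plugins"]["datasette-auth-github"]["client_id"])
```

Resolving secrets after parsing keeps the file itself free of credentials, which is the point of the `env` indirection in the sample.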

Pros

  • Instead of multiple files and CLI flags, everything could be in one tidy file
  • Editing config in a separate file is easier than editing CLI flags, since you don't have to kill a process + edit a command every time
  • New users will know "just edit my datasette.toml" instead of needing to learn metadata + settings + CLI flags
  • Better dev experience for multiple environments. For example, could have datasette -c datasette-dev.toml for local dev environments (enables SQL, debug plugins, long timeouts, etc.), and a datasette -c datasette-prod.toml for "production" (lower timeouts, fewer plugins, monitoring plugins, etc.)

Cons

  • Yet another config-management system. Now Datasette users will need to know about metadata, settings, CLI flags, and datasette.toml. However with enough documentation + announcements + examples, I think we can get ahead of it.
  • If toml is chosen, we would need to add a toml parser for Python versions < 3.11
  • Multiple sources of config require priority. For example: Would --setting default_allow_sql off override the value inside [settings]? What about --port?

Other Notes

Toml

I chose toml over json because toml supports comments. I chose toml over yaml because Python 3.11 has builtin support for it. I also find toml easier to work with since it doesn't have the odd "gotchas" that YAML has (ex 3.10 resolving to 3.1, Norway NO resolving to false, etc.). It also mimics pyproject.toml which is nice. Happy to change my mind about this, however.

Plugin config will be difficult

Plugin config is currently in metadata.json in two places:

  1. Top level, under "plugins.[plugin-name]". This fits well into datasette.toml as [plugins.plugin-name]
  2. Table level, under "databases.[db-name].tables.[table-name].plugins.[plugin-name]". This doesn't fit that well into datasette.toml, unless it's nested under [metadata]?
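
The two locations can be pictured as a lookup over a nested dictionary. The sketch below is illustrative only: `plugin_config` here is a hypothetical stand-alone function, the `trees` and `street_trees` names are made up, and falling back from table level to top level is an assumed resolution order.

```python
def plugin_config(metadata, plugin_name, database=None, table=None):
    """Return the most specific config for plugin_name, or None if unset."""
    db_level = metadata.get("databases", {}).get(database, {})
    table_level = db_level.get("tables", {}).get(table, {})
    # Check the most specific scope first: table level, then top level.
    for scope in (table_level, metadata):
        plugins = scope.get("plugins", {})
        if plugin_name in plugins:
            return plugins[plugin_name]
    return None


# Hypothetical metadata with both a top-level and a table-level entry.
metadata = {
    "plugins": {"datasette-cluster-map": {"latitude_column": "lat"}},
    "databases": {
        "trees": {
            "tables": {
                "street_trees": {
                    "plugins": {"datasette-cluster-map": {"latitude_column": "y"}}
                }
            }
        }
    },
}

print(plugin_config(metadata, "datasette-cluster-map"))
print(plugin_config(metadata, "datasette-cluster-map", "trees", "street_trees"))
```

The table-level path is four keys deep before the plugin name, which is why it maps awkwardly onto flat `[plugins.plugin-name]` tables.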

Extensions, static, one-off plugins?

We could also include equivalents of --plugins-dir, --static, and --load-extension in datasette.toml, but I'd imagine there are a few security concerns there to think through.

Explicitly list which plugins to use?

I believe Datasette by default will load all installed plugins on startup, but maybe datasette.toml can specify a list of plugins to use? For example, a dev version of datasette.toml can specify datasette-pretty-traces, but the prod version can leave it out.

@simonw
Owner

simonw commented Jun 29, 2023

I'm strongly in favour of combining settings, configuration and plugin configuration.

I'm not keen on mixing in metadata as well - that feels like a different concept to me, and I'm unhappy with how that's already had things like plugin settings leak into it.

I'm not yet sold on TOML - I actually find it less intuitive than YAML, surprisingly. They all have their warts I guess.

Datasette already has the ability to consume JSON or YAML for metadata - maybe it could grow TOML support too? That way users could have a datasette.json or datasette.yaml or datasette.toml file depending on their preference.

In terms of metadata: since that's meant to be driven by a plugin hook anyway, maybe one of the potential sources of metadata is a metadata nested object in that datasette.* configuration file. Or you can have it in a separate metadata.json or bundled into the SQLite database or some other plugin-driven mechanism.

@simonw
Owner

simonw commented Jun 29, 2023

I do like also being able to set options using command line options though - for things like SQL time limits I'd much rather be able to throw on --setting sql_time_limit_ms 10000 than have to save a config file to disk.

So I'd want to support both. Which maybe means also having a way to set plugin options with CLI options. datasette publish kind of has that ability already:

datasette publish heroku my_database.db \
    --name my-heroku-app-demo \
    --install=datasette-auth-github \
    --plugin-secret datasette-auth-github client_id your_client_id \
    --plugin-secret datasette-auth-github client_secret your_client_secret

@asg017
Collaborator Author

asg017 commented Jun 29, 2023

I agree with not liking metadata.json stuff in a datasette.* config file. Editing the description of a table/column in a file like datasette.* seems odd to me.

Though since plugin configuration currently lives in metadata.json, I think it should be removed from there and placed in datasette.*, at least for top-level config like datasette-auth-github's config. Keeping metadata.json strictly for documentation/licensing/column units makes sense to me, but anything plugin related should be in some config file, like datasette.*.

And ya, supporting both datasette.* and CLI flags makes a lot of sense to me. Any --setting flag should override anything in datasette.* for easier debugging, with possibly a warning message so people don't get confused. Same with --port and a port defined in datasette.*
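
A rough Python sketch of that override rule, assuming settings from both sources have already been parsed into plain dictionaries. `merge_settings` is a hypothetical helper, not an existing Datasette function, and the warning text is made up for illustration.

```python
import warnings


def merge_settings(file_settings, cli_settings):
    """Combine settings so that CLI values win over config-file values."""
    merged = dict(file_settings)
    for name, value in cli_settings.items():
        if name in merged and merged[name] != value:
            # Tell the user which source won, so the override is not silent.
            warnings.warn(
                f"--setting {name} {value!r} overrides {merged[name]!r} "
                "from the config file"
            )
        merged[name] = value
    return merged


file_settings = {"sql_time_limit_ms": 3500, "default_allow_sql": True}
cli_settings = {"sql_time_limit_ms": 10000}

print(merge_settings(file_settings, cli_settings))
```

The same shape would work for --port and other top-level flags, as long as the CLI layer can tell "flag not passed" apart from "flag passed with its default value".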

@asg017
Collaborator Author

asg017 commented Jun 29, 2023

Maybe we can have a separate issue for revamping metadata.json? A datasette_metadata table or the sqlite-docs extension seem like two reasonable additions that we can work through. Storing metadata inside a SQLite database makes sense, but I don't think storing datasette.* style config (ex ports, settings, etc.) inside a SQLite DB makes sense, since it's very environment-dependent

@simonw
Owner

simonw commented Jun 30, 2023

I agree, settings in the DB doesn't make sense but metadata does.

On the JSON v YAML v TOML issue I just spotted Caddy has a concept of config adapters which they use to resolve exactly that problem: https://caddyserver.com/docs/config-adapters
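
As a rough illustration of the adapter idea in Python (this is not how Caddy or Datasette implement it): each file extension maps to a function that turns text into the same dictionary structure. `ADAPTERS` and `parse_config` are hypothetical names. Only formats with a standard-library parser are registered here; a YAML adapter could be added the same way using PyYAML's `yaml.safe_load` if that package is installed.

```python
import json
from pathlib import Path

try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    tomllib = None  # TOML adapter is simply not registered on older Pythons

# Each adapter takes the file's text and returns a plain dict.
ADAPTERS = {".json": json.loads}
if tomllib is not None:
    ADAPTERS[".toml"] = tomllib.loads


def parse_config(filename, text):
    """Pick an adapter from the file extension and parse the text with it."""
    suffix = Path(filename).suffix.lower()
    try:
        adapter = ADAPTERS[suffix]
    except KeyError:
        raise ValueError(f"No config adapter registered for {suffix!r}") from None
    return adapter(text)


print(parse_config("datasette.json", '{"settings": {"sql_time_limit_ms": 3500}}'))
```

Because every adapter produces the same structure, the rest of the application never needs to know which format the user picked.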

@terinjokes

terinjokes commented Jul 2, 2023

I'm not keen on requiring metadata to be within the database. I commonly have multiple DBs, from various sources, and having one config file to provide the metadata works out very well. I use Datasette with databases where I'm not the original source, and needing to mutate them to add a metadata table or sqlite-docs makes me uncomfortable.

@asg017
Collaborator Author

asg017 commented Jul 2, 2023

Storing metadata in the database won't be required. I imagine there'll be many different ways to store metadata, including any possible datasette_metadata or sqlite-docs, or the older metadata.json way.

The next question will be how precedence should work - I'd imagine metadata.json > plugins > datasette_metadata > sqlite-docs
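
A small Python sketch of that ordering, assuming each source has already been loaded into a flat dictionary. `resolve_metadata` is a hypothetical helper and the sample values are invented. `collections.ChainMap` returns the value from the first mapping that contains a key, so passing the sources highest priority first gives the precedence described. This is a shallow merge only; nested database/table metadata would need a recursive version.

```python
from collections import ChainMap


def resolve_metadata(*sources):
    """Merge metadata dicts; sources are ordered highest priority first."""
    return dict(ChainMap(*sources))


# Invented values, one dict per metadata source.
metadata_json = {"title": "From metadata.json"}
from_plugins = {"title": "From a plugin", "license": "CC-BY"}
datasette_metadata = {"description": "From datasette_metadata", "license": "MIT"}
sqlite_docs = {"description": "From sqlite-docs", "source": "sqlite-docs"}

resolved = resolve_metadata(
    metadata_json, from_plugins, datasette_metadata, sqlite_docs
)
print(resolved["title"])
print(resolved["license"])
```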

@asg017
Collaborator Author

asg017 commented Aug 22, 2023

OK, here's the game plan for this, which is closely tied to #2143:

  • We will add a new datasette.json/datasette.yaml configuration file to datasette, which combines settings/plugin config/permissions/canned queries into a new file format
  • Metadata will NOT be a part of this file
  • TOML support is not planned, but maybe we can create a separate issue for supporting TOML alongside JSON/YAML
  • The settings.json file will be deprecated, and the --config arg will be brought back.
  • Command line arguments can still be used to overwrite values (ex --setting will overwrite settings in datasette.yaml)

The format of datasette.json will follow what Simon listed here: #2143 (comment)

Here's the current implementation plan:

  1. Add a new --config flag and port over "settings" into a new datasette.json config file, remove settings.json
  2. Add top-level plugin config support to datasette.json
  3. Figure out the database/table structure of the datasette.json config
  4. Port over database/table level plugin config support to datasette.json
  5. Port over permissions/auth settings to datasette.json
  6. Deprecate non-metadata values in metadata.json

simonw pushed a commit that referenced this issue Aug 23, 2023
The first step in defining the new `datasette.json/yaml` configuration mechanism.

Refs #2093, #2143, #493
simonw added a commit that referenced this issue Aug 29, 2023
@asg017 asg017 changed the title Proposal: Combine settings, metadata, static, etc. into a single datasette.toml File Proposal: Combine settings, metadata, static, etc. into a single datasette.yaml File Sep 11, 2023
simonw pushed a commit that referenced this issue Sep 13, 2023
* Checkpoint, moving top-level plugin config to datasette.json
* Support database-level and table-level plugin configuration in datasette.yaml

Refs #2093

3 participants