Proposal: Allow newlines and trailing commas in inline tables #516

Closed
JelteF opened this issue Jan 22, 2018 · 44 comments · Fixed by #904

Comments

@JelteF
Contributor

JelteF commented Jan 22, 2018

Overall I really like TOML and its syntax feels very obvious to me for the most part. The only thing that doesn't is the explicit crippling of inline tables, i.e. inline tables cannot have newlines or trailing commas. I've read the reasoning behind this in the existing issue and PR. However, I don't think that the reason given (discouraging people from using big inline tables instead of sections) outweighs the downsides. That's why I would like to open up a discussion about this.

There are three main downsides I see:

  1. It's unexpected for most people using the language. Most popular languages that have {} style mappings allow newlines in them (JSON, Python, JavaScript, Go). Also, newlines and trailing commas are allowed in lists in the TOML spec, so it is inconsistent in this regard.
  2. To me even small inline tables are much more readable at first glance when split over multiple lines:
# Single line
person = { name = "Tom", geography = { lat = 1.0, lon = 2.0 } }

# Multi line
person = { 
    name = "Tom", 
    geography = { 
        lat = 1.0, 
        lon = 2.0,
    },
}
  3. A deeply nested list of tables is forced to have a lot of repeated keys in the section headers. Compare this version with a list of tables:
[main_app.general_settings.logging]
log-lib = "logrus"

[[main_app.general_settings.logging.handlers]]
  name = "default"
  output = "stdout"
  level = "info"

[[main_app.general_settings.logging.handlers]]
  name = "stderr"
  output = "stderr"
  level = "error"

[[main_app.general_settings.logging.handlers]]
  name = "access"
  output = "/var/log/access.log"
  level = "info"

To the one with inline tables with newlines:

[main_app.general_settings.logging]
log-lib = "logrus"

handlers = [
    {
        name = "default",
        output =  "stdout",
        level = "info",
    }, {
        name = "stderr",
        output =  "stderr",
        level = "error",
    }, {
        name = "access",
        output =  "/var/log/access.log",
        level = "info",
    },
]

Finally, extending current TOML parsers to support this is usually really easy, so that also shouldn't be an argument against it. I changed the https://github.com/pelletier/go-toml implementation to support newlines in tables and I only had to change 5 lines to do it (3 of which I simply had to delete).

@eksortso
Contributor

Maybe I'm off-base, but I'm not yet sold on this proposal.

Regarding the first point, considering that arrays and tables are different things, the perceived inconsistency in syntax is not a problem, is perfectly acceptable, and sets these different things off nicely.

Skipping down to point 3, isn't the following equivalent? It's already legal TOML, it's readable, and it's space-efficient, or so I like to think. You may disagree with me (especially since I swapped two of the keys) but at least take a look:

[main_app.general_settings.logging]
log-lib = "logrus"

handlers = [
    {name = "default", level = "info",  output = "stdout"},
    {name = "stderr",  level = "error", output = "stderr"},
    {name = "access",  level = "info",  output = "/var/log/access.log"},
]

Also consider this. Any more readable?

[person]
name = "Tom"
geography = {lat = 1.0, lon = 2.0}

Inline tables are fully intended to be small tables, with multiple key/value pairs on one line. If the tables in your (quite readable) example were any larger, then double-bracket notation would make much more sense, even with repeated keys, and you'd get the one-line-one-pair that you seem to find aesthetically appealing.

In either case, we don't need to add a pseudo-JSON to get readability, no matter whether it would be simple to implement.

@JelteF
Contributor Author

JelteF commented Jan 24, 2018

@eksortso I can see where you're coming from on the first point. I do disagree though, because IMHO they are very similar: both are inline data structures. I don't know how to make that argument more convincing though. I think my main point there is: both the arrays and the inline tables have pseudo-JSON syntax, but the inline tables are missing some features right now that you would expect coming from JSON (or any other language that has similar syntax).

On the other two examples you make some good points. I took the second example because it was mentioned in the original issue. I see now though that there was discussion on that issue about whether it was even a good example.

The third one is an issue I actually have myself with my configs. I think you made some good points there as well. Especially the aligning of keys in the third one helps quite a lot with readability. I do think you cheated a bit, in a smart way, by moving the keys around. I'll expand on that point and hopefully make my arguments there a bit stronger, but of course you're still allowed to disagree:

Modified point 3

I'll show the same piece of config in different ways below and list some of the advantages and disadvantages of each one.

With double brackets

[main_app.general_settings.logging]
log-lib = "logrus"

[[main_app.general_settings.logging.handlers]]
  name = "default"
  output = "stdout"
  level = "info"

[[main_app.general_settings.logging.handlers]]
  name = "stderr"
  output = "stderr"
  level = "error"

[[main_app.general_settings.logging.handlers]]
  name = "http-access"
  output = "/var/log/access.log"
  level = "info"

[[main_app.general_settings.logging.loggers]]
  name = "default"
  handlers = ["default", "stderr"]
  level = "warning"

[[main_app.general_settings.logging.loggers]]
  name = "http-access"
  handlers = ["http-access"]
  level = "info"

Advantages:

  • Diffs are extremely clear, a line changed means that value changed.
  • Short lines

Disadvantages:

  • main_app.general_settings.logging is repeated many times
  • Hard to see at first glance that there are two distinct arrays, handlers and loggers
  • Quite a lot of vertical space is taken

With inline tables unaligned

[main_app.general_settings.logging]
log-lib = "logrus"

handlers = [
    {name = "default", output = "stdout", level = "info"},
    {name = "stderr", output = "stderr", level = "error"},
    {name = "http-access", output = "/var/log/access.log", level = "info"},
]
loggers = [
    {name = "default", handlers = ["default", "stderr"], level = "warning"}, 
    {name = "http-access", handlers = ["http-access"], level = "info"},
]

Advantages:

  • Very little vertical space is used

Disadvantages:

  • Looks messy, which makes it hard to compare the different tables in a single list.
  • Line based diffs don't show easily what value changed.

With aligned inline tables without newlines, without reordered keys

[main_app.general_settings.logging]
log-lib = "logrus"

handlers = [
    {name = "default",     output = "stdout",              level = "info"},
    {name = "stderr",      output = "stderr",              level = "error"},
    {name = "http-access", output = "/var/log/access.log", level = "info"},
]
loggers = [
    {name = "default",     handlers = ["default", "stderr"], level = "warning"}, 
    {name = "http-access", handlers = ["http-access"],       level = "info"},
]

Advantages:

  • Looks quite pretty
  • Very little vertical space is used

Disadvantages:

  • Quite long lines because of the added white space.
  • Changing the length of a value requires some effort. Apart from changing the value itself, you have to change the spacing in that line or in the other lines.
  • Line based diffs don't show easily what value was changed.
  • Changing one value can even show other lines as changed in diffs because the spacing had to be changed.

With aligned inline tables without newlines, with reordered keys

[main_app.general_settings.logging]
log-lib = "logrus"

handlers = [
    {name = "default",     level = "info",  output = "stdout"},
    {name = "stderr",      level = "error", output = "stderr"},
    {name = "http-access", level = "info",  output = "/var/log/access.log"},
]
loggers = [
    {name = "default",     level = "warning", handlers = ["default", "stderr"]}, 
    {name = "http-access", level = "info",    handlers = ["http-access"]},
]

Advantages:

  • Lines are less long than without reordering
  • Looks quite pretty

Disadvantages:

  • Still quite long lines.
  • You have to reorder the keys, possibly having to choose between a logical order and order in which the whitespace is minimised.
  • Changing the length of a value requires some effort. Apart from changing the value itself, you have to change the spacing in that line or in the other lines.
  • Line based diffs don't show easily what value changed.
  • Changing one value can even show other lines as changed in diffs because the spacing had to be changed.

With newlines

[main_app.general_settings.logging]
log-lib = "logrus"

handlers = [
    {
        name = "default",
        output =  "stdout",
        level = "info",
    }, {
        name = "stderr",
        output =  "stderr",
        level = "error",
    }, {
        name = "http-access",
        output =  "/var/log/access.log",
        level = "info",
    },
]
loggers = [
    {
        name = "default",
        handlers = ["default", "stderr"],
        level = "warning",
    }, {
        name = "http-access",
        handlers = ["http-access"],
        level = "info",
    },
]

Advantages:

  • Diffs are extremely clear, a line changed means that the value changed.
  • Short lines

Disadvantages:

  • Needs to indent twice
  • Quite a bit of vertical space is used.

Conclusion

I think ultimately it's a matter of taste what looks better, and a matter of tradeoffs between repeated keys, vertical space, line length, diff clarity, and logical vs. visually pleasant key ordering. I think my main point with this example is that it would be nice if users could choose what they find more important.

@seeruk

seeruk commented Feb 28, 2018

I'm all for this. Just started to look into TOML properly for the first time as I was planning on using it for the configuration file for a tool I'm writing. I really like TOML overall, but this one thing makes some specific things really nasty. The bit I'm working on is actually sort of like the Docker Compose syntax in some ways.

Take this YAML for example:

version: "3"

services:
    elasticsearch:
        container_name: metrics_elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:5.5.3
        network_mode: host
        environment:
          discovery.type: single-node
          http.cors.enabled: true
          http.cors.allow-origin: "*"
          xpack.security.enabled: false
        ports:
        - 9200:9200
        - 9300:9300
        volumes:
        - elasticsearch-data:/usr/share/elasticsearch/data

    kibana:
        container_name: metrics_kibana
        image: docker.elastic.co/kibana/kibana:5.5.3
        network_mode: host
        environment:
          ELASTICSEARCH_URL: http://localhost:9200
          XPACK_MONITORING_ENABLE: false
        ports:
        - 5601:5601

volumes:
    elasticsearch-data:
        driver: local

And then compare it to the equivalent TOML:

version = "3"

[services]

  [services.elasticsearch]
  container_name = "metrics_elasticsearch"
  image = "docker.elastic.co/elasticsearch/elasticsearch:5.5.3"
  network_mode = "host"
  ports = [
    "9200:9200",
    "9300:9300"
  ]
  volumes = [
    "elasticsearch-data:/usr/share/elasticsearch/data"
  ]

    [services.elasticsearch.environment]
    "discovery.type" = "single-node"
    "http.cors.enabled" = true
    "http.cors.allow-origin" = "*"
    "xpack.security.enabled" = false

  [services.kibana]
  container_name = "metrics_kibana"
  image = "docker.elastic.co/kibana/kibana:5.5.3"
  network_mode = "host"
  ports = [
    "5601:5601"
  ]

    [services.kibana.environment]
    ELASTICSEARCH_URL = "http://localhost:9200"
    XPACK_MONITORING_ENABLE = false

[volumes]

  [volumes.elasticsearch-data]
  driver = "local"

That extra level of nesting just makes TOML that much less nice to use in this case. If the environment could be on the same level as the rest of the service configuration it'd tidy it right up.

version = "3"

[services]

  [services.elasticsearch]
  container_name = "metrics_elasticsearch"
  image = "docker.elastic.co/elasticsearch/elasticsearch:5.5.3"
  network_mode = "host"
  environment = {
    "discovery.type" = "single-node",
    "http.cors.enabled" = true,
    "http.cors.allow-origin" = "*",
    "xpack.security.enabled" = false,
  }
  ports = [
    "9200:9200",
    "9300:9300"
  ]
  volumes = [
    "elasticsearch-data:/usr/share/elasticsearch/data"
  ]

  [services.kibana]
  container_name = "metrics_kibana"
  image = "docker.elastic.co/kibana/kibana:5.5.3"
  network_mode = "host"
  environment = {
    ELASTICSEARCH_URL = "http://localhost:9200",
    XPACK_MONITORING_ENABLE = false,
  }
  ports = [
    "5601:5601"
  ]

[volumes]

  [volumes.elasticsearch-data]
  driver = "local"

@eksortso
Contributor

At the heart of your issue, you have a large subtable that you wish to keep in the middle of your configurations. Not before it, and not after it. Some relief exists with inline tables and key-path assignments. But with a table nested a few layers deep, the keys would grow very long.

I still find multiline tables that look like JSON offputting. But I think I have an idea for a TOML-friendly syntax that could get you what you're wanting. I don't have time to write it down now, but I'll be back later on.

@pradyunsg
Member

pradyunsg commented May 6, 2019

Hi @JelteF!

Thanks for filing this issue. I'm deferring any new syntax proposal as I try to ramp up my effort to get us to TOML 1.0, which will not contain any new syntax changes from TOML 0.5.

This is definitely an idea I want to explore more -- personally, I still haven't finalized how much TOML should be flat (INI-like) vs nested (JSON-like). Both approaches have their trade-offs and we'll know what we want to do for this specific request, once we finalize that overarching idea. However, I'd appreciate if we hold off that discussion until TOML 1.0 is released.

@polarathene

polarathene commented Dec 4, 2019

The earlier example:

[services]
  [services.kibana]
  container_name = "metrics_kibana"
  image = "docker.elastic.co/kibana/kibana:5.5.3"
  network_mode = "host"
    [services.kibana.environment]
    ELASTICSEARCH_URL = "http://localhost:9200"
    XPACK_MONITORING_ENABLE = false

Could instead be:

[services]
  [.kibana]
  container_name = "metrics_kibana"
  image = "docker.elastic.co/kibana/kibana:5.5.3"
  network_mode = "host"
    [.environment]
    ELASTICSEARCH_URL = "http://localhost:9200"
    XPACK_MONITORING_ENABLE = false

The example just takes advantage of the dotted keys notation, in that if the key starts with a dot, it would inherit the parent table keyspace. I went with a dot as it has a related meaning for a relative path as well (./), although another symbol may stand out better? (Or it could instead be prepended to the table syntax: .[environment])

The above deals with the issue of table keys getting progressively longer, where the actual unique table name gets offset to the right (potentially requiring scrolling) and/or lost in the noise of similar table keys, as shown earlier in the thread.

Personally, for nested config the table keys or dotted keys can get quite long/repetitive. It's one area that I think JSON and YAML handle better.

I still find multiline tables that look like JSON offputting. But I think I have an idea for a TOML-friendly syntax that could get you what you're wanting. I don't have time to write it down now, but I'll be back later on.

@eksortso I take it later on never came, or did you raise it in another issue? What do you think about the above?

I did find it odd that inline tables have this special syntax for single lines, unable to break to multi-line with trailing commas like arrays can. Most newcomers to TOML will be familiar with a table/object being defined this way and it'd click, until they realize it breaks should you want to go to multiple lines, yet arrays don't share this restriction.

I personally prefer curly brackets for additional clarification of scope. TOML appears to rely on name-spacing, allowing for a flat format should you pay attention to the keys. Some try to indicate the scope a bit more via the optional indentation as shown earlier, but that feels uncomfortable/detached to me.

I like that ends of lines don't need commas in TOML. Although they're required for arrays (and inline tables), they could be dropped/optional for multi-line variants?:

[services]
  [.kibana]
  container_name = "metrics_kibana"
  image = "docker.elastic.co/kibana/kibana:5.5.3"
  network_mode = "host"
  environment = {
    ELASTICSEARCH_URL = "http://localhost:9200"
    XPACK_MONITORING_ENABLE = false
  }
  ports = [
    "5601:5601"
  ]

  [.kibana_2]
  container_name = "metrics_kibana2"
  image = "docker.elastic.co/kibana/kibana:5.5.3"
  network_mode = "host"
  environment = {
    ELASTICSEARCH_URL = "http://localhost:9201"
    XPACK_MONITORING_ENABLE = false
  }
  ports = [
    "5602:5601"
  ]

This example from the project readme is a good case of verbosity/noise that gave me a double take of trying to make sense of what was going on:

[[fruit]]
  name = "apple"

  [fruit.physical]  # subtable
    color = "red"
    shape = "round"

  [[fruit.variety]]  # nested array of tables
    name = "red delicious"

  [[fruit.variety]]
    name = "granny smith"

[[fruit]]
  name = "banana"

  [[fruit.variety]]
    name = "plantain"

This is probably not much better, and might be asking for too much (straying too far from what TOML currently is?):

[[fruit]]
name = "apple"
physical { # Scoped table
  color = "red"
  shape = "round"
}
variety [ # Scoped array of tables
  name = "red delicious"
  --- # A separator between objects
  name = "granny smith"
]

[[fruit]] # Still useful as a `---` above may not be distinct enough
name = "banana"
variety [
  name = "plantain"
]

Applied to the earlier example for arrays of tables:

[main_app.general_settings.logging]
log-lib = "logrus"

handlers [
    name = "default"
    output =  "stdout"
    level = "info"
    ---
    name = "stderr"
    output =  "stderr"
    level = "error"
    ---
    name = "http-access"
    output =  "/var/log/access.log"
    level = "info"
]
loggers [
    name = "default"
    handlers = ["default", "stderr"]
    level = "warning"
    ---
    name = "http-access"
    handlers = ["http-access"]
    level = "info"
]

The use of --- as a separator between elements avoids unnecessary { } (which are useful for a single instance assigned to a key), as those add noise along with the , that I believe @eksortso found offputting?

Note the lack of assignment =. That would probably lead to some mishaps with array elements, as you'd need , (instead of inferring from \n) on single lines, and objects/tables would need to be wrapped with { }.

@eksortso
Contributor

eksortso commented Dec 5, 2019

@eksortso I take it later on never came, or did you raise it in another issue? What do you think about the above?

Ouch...

Later on came and went. See #525 for discussion, and #551 for the now-closed PR.

I'll be back in a few hours.

@polarathene

Ouch...

Oh, I didn't mean it that way! 😝

Later on came and went. See #525 for discussion, and #551 for the now-closed PR.

Ah, that's unfortunate.. 😞 I liked the multi-line approach you proposed, substituting commas with new lines. HJSON ended up offering a good enough solution for me offering this feature in the meantime.

@eksortso
Contributor

eksortso commented Dec 5, 2019

Ouch...

Oh, I didn't mean it that way! 😝

No worries. But there is a link to #525 up there.

Later on came and went. See #525 for discussion, and #551 for the now-closed PR.

Ah, that's unfortunate.. 😞 I liked the multi-line approach you proposed, substituting commas with new lines. HJSON ended up offering a good enough solution for me offering this feature in the meantime.

Thanks! It's good how HJSON implemented it. I've seen similar patterns in other config formats, whose names I've forgotten.

But keep in mind that HJSON is based on JSON, and TOML was originally inspired by informal INI formats. What that means, philosophically, is that nesting in TOML is possible, but deep nesting is, and ought to be, discouraged. By that philosophy, shallow nesting is ideal for a configuration format, and it also works for simple data exchange uses. Over time, I've come to adopt this philosophy myself. I'm still interested in bringing back a little bit of nesting, a la #551, but unless it gains traction, I won't push for it.

Other proposals have been offered to use [.subtable] syntax for nesting. But it can get confusing if you can't keep track of your absolute path. In fact, your first example suggests that each [.subtable] nests inside its parent, but your second example suggests that each [.subtable] is a subtable of a common parent. Is [.kibana_2] actually [services.kibana_2], or [services.kibana.kibana_2]?
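The ambiguity can be made concrete with a small sketch. Everything in it is hypothetical: `resolve_headers` and its two modes are invented for illustration and do not implement any existing or proposed TOML rule. It expands relative headers either against the header just before them or against the last absolute header.

```python
def resolve_headers(headers, mode):
    """Expand hypothetical relative headers such as ".kibana" into full paths.

    mode="previous": a relative header nests inside the header just before it.
    mode="absolute": a relative header nests inside the last non-relative header.
    """
    resolved = []
    last_absolute = None
    for header in headers:
        if header.startswith("."):
            if mode == "previous":
                base = resolved[-1] if resolved else None
            else:
                base = last_absolute
            if base is None:
                raise ValueError(f"relative header {header!r} has no parent")
            full = base + header
        else:
            full = header
            last_absolute = header
        resolved.append(full)
    return resolved


# Header sequence taken from the two examples above, indentation ignored.
headers = ["services", ".kibana", ".environment", ".kibana_2"]
nested = resolve_headers(headers, "previous")
siblings = resolve_headers(headers, "absolute")
print(nested)
print(siblings)
```

Neither rule yields both services.kibana.environment and services.kibana_2 at the same time, which is what the indentation in the examples implies, so a reader (or parser) would have to rely on something other than the header text.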

But there's another problem that [.subtable] syntax doesn't solve, which was my impetus for #525: it can't be used to put subtable definitions in the middle of other tables. That was relieved with the introduction of dotted keys in key/value pairs. Again, it works best for shallow nesting.

Regarding commas in arrays and in inline tables, I do feel like the rules for placing those commas ought to be strict, to prevent confusion. It's already decided that arrays require commas between elements, and that a trailing comma is fine. For inline tables, commas must separate the key/value pairs, since they're on the same line. If #551 were reintroduced, newlines could be used to separate the key/value pairs in multi-line inline tables, same as they are used for regular tables. But commas would not be allowed between lines.

I'm intrigued by some of your other proposals, particularly the --- separator in arrays of tables (which could be used with regular table-array notation, actually). But I'll suggest that you simplify their presentations, and open a new issue for each to present them. Long posts like these are often hard to follow, so less really would be more.

@brunoborges

brunoborges commented Oct 19, 2020

Perhaps worth connecting this proposal with #744 as well, for use of placeholders/shortcuts to outer tables names.

Example:

[servers]
mask = "255.255.255.0"

[*.server1] # subtable 2
ip = "192.168.0.1"

[*.server2] # subtable 3
ip = "192.168.0.2"

The above is the same as the explicit/verbose keys servers.server1 and servers.server2 in tables 2 and 3.

@oovm

oovm commented Oct 21, 2020

I agree that line breaks should be supported.

Yes, there exists a form that makes the final result look good and easy to read.

But the problem is that conversion tools and serde tools can't produce it.

A conversion tool can only emit one long line that cannot be read.

If line breaks are allowed, these tools can adjust the indentation to make the results look better.
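To sketch what such a tool could do, here is a hypothetical Python formatter. The name `format_value` and its parameters are made up for this example, it assumes every key is a valid bare key, and its output uses the multi-line syntax proposed in this issue, so TOML 1.0 parsers would reject it.

```python
import json


def format_value(value, indent=0, step=4):
    """Render a Python value in the multi-line inline-table style proposed here."""
    inner = " " * (indent + step)
    closing = " " * indent
    if isinstance(value, dict):
        if not value:
            return "{}"
        # Assumption: keys are bare keys (letters, digits, "-", "_"), so no quoting.
        lines = [
            f"{inner}{key} = {format_value(item, indent + step, step)},"
            for key, item in value.items()
        ]
        return "{\n" + "\n".join(lines) + "\n" + closing + "}"
    if isinstance(value, list):
        if not value:
            return "[]"
        lines = [f"{inner}{format_value(item, indent + step, step)}," for item in value]
        return "[\n" + "\n".join(lines) + "\n" + closing + "]"
    if isinstance(value, bool):  # checked before numbers because bool subclasses int
        return "true" if value else "false"
    if isinstance(value, str):
        # Assumption: JSON string escaping is close enough to a TOML basic string here.
        return json.dumps(value, ensure_ascii=False)
    return repr(value)  # ints and floats


logger = {"name": "default", "handlers": ["default", "stderr"], "level": "warning"}
print("logger = " + format_value(logger))
```

Each value lands on its own line with a trailing comma, so a line-based diff touches only the value that changed.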

@dmbaturin

For me as a user, the fact that newlines aren't allowed inside inline tables was extremely surprising.

For me as an implementer, that's a special case in the parser that I wish I could get rid of.

I'm all for allowing it.

@eksortso
Contributor

For me as a user, the fact that newlines aren't allowed inside inline tables was extremely surprising.

I could say in response that the mere existence of inline tables is surprising, because the INI tradition only allows values on a single line, and even then it's just one key/value pair per line. Multiple lines are the exception, not the rule. And there are two other, more versatile, ways to define a table over multiple lines.

Probably ought to go to #781 and join the discussion there.

@dmbaturin

@eksortso It's a surprise within the TOML specification. If you see a file with an array with line breaks, it's quite reasonable to assume that all composite values can have line breaks in them, except it's not the case. Same goes for trailing commas.

@marzer
Contributor

marzer commented Jul 21, 2021

If you see a file with an array with line breaks, it's quite reasonable to assume that all composite values can have line breaks in them

This is a good point. I'd support this change for that reason alone; it's a very weird inconsistency in the language.

@eksortso
Contributor

eksortso commented Jul 22, 2021

Everyone has forgotten that inline tables were intended to allow brief, terse injections of small tables into a configuration. They were never intended to replace table headers and sections, and they were never intended to extend beyond a single line.

How consistent must we be? Consistent enough to nullify all intentional design choices? This is still a bad idea.

I mean, it wouldn't be hard to implement. Our work is halfway done for us already, because we can reuse the ABNF code for splitting arrays across multiple lines. This would also let us include end-of-line comments. More consistent all around.

servers = {
    alpha = {  # primary server
        ip = "10.0.0.1",
        role = "frontend",},
    beta = {  # secondary server
        ip = "10.0.0.2",
        role = "backend",},
}

And while we're at it, let's allow commas between key/value pairs outside of inline tables, so we can have more than one key/value pair on a single line. This is also a bad idea, but it's consistency, and that's what we want.

[owner]
name = "Tom Preston-Werner", dob = 1979-05-27T07:32:00-08:00,

Other benefits may come from this. If all headers were replaced with inline tables, then we could define top-level key/value pairs at the bottom of the document, or in the middle, because why not?

servers = {
    alpha = {ip = "10.0.0.1", role = "frontend",},  # primary server
    beta = {ip = "10.0.0.2", role = "backend",},  # secondary server
},

title = "TOML Example",

# This was a TOML document

Consistency over design, consistency over functionality, consistency over readability, consistency over everything else. Where does it end? When TOML becomes a superset of JSON?

Never mind the bitterness. Tell me what you think of these different ideas. Maybe you can put my fears to rest.

@eksortso
Contributor

But if we're going to smash this piñata to bits, let's stuff it with some more sweet treats. Once again, I propose we allow newlines to separate key/value pairs as well as commas, just like we can do outside of inline tables. That will make things even more consistent. And we can still have a comma before or after the newline if we wanted.

[database]
enabled = true
ports = [ 8000, 8001, 8002 ]
data = [ ["delta", "phi"], [3.14] ]
temp_targets = {
    cpu = 79.5
    case = 72.0
}

@awused

awused commented Jul 22, 2021

Everyone has forgotten that inline tables were intended to allow brief, terse injections of small tables into a configuration.

I don't think anyone has forgotten, they just disagree with the intentions behind the design. The design is not sacrosanct and it should not be treated as such.

How consistent must we be? Consistent enough to nullify all intentional design choices? This is still a bad idea.

This entire bug is debating over a specific intentional design decision, and the answer seems to be "at least a tiny bit more consistency than we have now."

If TOML's primary goal were to make pretty configs, the current design already does poorly when tasked with common config structures. Those examples are at least concise and consistent, even if they're intentionally ugly. TOML doesn't currently force end users to write good looking configs, and if it did it would have to be with parsers rejecting configs that don't follow some strictly mandated style.

then we could define top-level key/value pairs at the bottom of the document, or in the middle, because why not?

The inability to back out of a regular table to the global scope is also a surprising pain point that has come up repeatedly, dictating the order of configuration options to applications. Just because a key/value is top level doesn't mean it's important, it can be much less important than the tables that would appear before it in other languages.

I think the primary reason #551 failed to garner interest was because it would result in unexpected and surprising parsing errors for end users (as opposed to developers writing parsers). At least that was my problem with it. They will not realize or appreciate that there are two types of tables using {} and that each table must entirely conform to one style. The current design also surprises and confuses end users, as evidenced by this very bug.

@marzer
Contributor

marzer commented Jul 22, 2021

@eksortso

Consistency over design, consistency over functionality, consistency over readability, consistency over everything else.

Literally nobody is saying that, but you know that. The inconsistency is dumb in this one particular context because it already causes regular, significant confusion for users.

I'd also argue that "consistency over design" is conceptually nonsensical; good design is always internally consistent. TOML has two collection value types (as in, two ways of specifying a collection on the right hand side of a KVP assignment); both are comma-delimited, but only one allows newlines. This is internally inconsistent, enough that users are regularly caught out by it.

@mcarans

mcarans commented Mar 6, 2022

@pradyunsg I am delighted to hear that this is moving forward. Taking the wording for arrays from the spec and using it for inline tables, the spec for inline tables will be: "Inline tables can span multiple lines. A terminating comma (also called a trailing comma) is permitted after the last value of the inline table. Any number of newlines and comments may precede values, commas, and the closing brace. Indentation between inline table values and commas is treated as whitespace and ignored."

My understanding then is that both the representations below will be valid. Please correct me if I am wrong.

[tool.pydoc-markdown.renderer]
type = "mkdocs"
mkdocs_config = {
  site_name = "HDX Python Scraper"
}

pages = [
  { title = "Home"},
  {
    title = "API Documentation",
    children = [
      {
        title = "Source Readers",
        contents = [
          "hdx.scraper.readers.*"
        ]
      },
      {
        title = "Outputs",
        contents = [
          "hdx.scraper.jsonoutput.*",
          "hdx.scraper.googlesheets.*",
          "hdx.scraper.exceloutput.*"
        ]
      }
    ]
  }
]

[tool.pydoc-markdown.renderer]
type = "mkdocs"
mkdocs_config = {
  site_name = "HDX Python Scraper",
}

pages = [
  { title = "Home", },
  {
    title = "API Documentation",
    children = [
      {
        title = "Source Readers",
        contents = [
          "hdx.scraper.readers.*",
        ]
      },
      {
        title = "Outputs",
        contents = [
          "hdx.scraper.jsonoutput.*",
          "hdx.scraper.googlesheets.*",
          "hdx.scraper.exceloutput.*",
        ],
      },
    ],
  },
]

@jstm88

jstm88 commented Mar 18, 2022

I just came across this in my project (my first using TOML) and I think my example illustrates why the current solutions just don't "feel" clean, even though they aren't really problematic per se.

The starting point in my project was:

# Form 1
[layer.base]
name = 'Base Layer'
buttons = [
	'open-test-layer',
	'',
	'',
	'',
	'reset',
	'exit'
]

However, the buttons array is sparse and I didn't want to have to include empty entries. This is what I tried next, which seemed like a logical way to move from an array to a dict, and looks clean, but isn't currently allowed:

# Form 2
[layer.base]
name = 'Base Layer'
buttons = {
    1 = 'open-test-layer'
    5 = 'reset'
    6 = 'exit'
}

The next version of course works, but with more than 1 or 2 buttons this would begin to completely fall flat in terms of readability:

# Form 3
[layer.base]
name = 'Base Layer'
buttons = { 1 = 'open-test-layer', 5 = 'reset', 6 = 'exit' }

And finally, what I've settled on (for now) as the best available option:

# Form 4 - okay
[layer.base]
name = 'Base Layer'
[layer.base.buttons]
1 = 'open-test-layer'
5 = 'reset'
6 = 'exit'

This is not bad, but I really don't like having to repeat the parent key in the [layer.base.buttons] header. For longer keys (and with multiple sub-tables) this could get quite tiring.

I think my issues come down to two things:

  1. The need to re-specify the parent key for all sub-tables - I know I've seen some proposals that would allow this to be replaced with something like [.buttons] which would be quite nice. I also know there's some pushback saying that it makes it harder to read, but I disagree and I think almost any feature can be misused. :)
  2. Arrays (see Form 1) can span multiple lines, but dicts/tables cannot (Form 2). In almost all programming languages definitions for both arrays and dicts can span lines and I think this disconnect is why it feels like multi-line tables are missing, even though there's technically already an alternative.

And to add one more thing: the ending commas on arrays really seem like they could be optional - I don't believe it would introduce any ambiguity by not requiring them, but maybe others have some more well-researched thoughts on this.

Regardless, I'm really liking TOML. It's a breath of fresh air after the feature-creep abomination that YAML has turned into. 🤣

@ChristianSi
Contributor

@jstm88 Using only the existing syntax, your Form 4 can also be nicely written using dotted keys:

[layer.base]
name = 'Base Layer'
buttons.1 = 'open-test-layer'
buttons.5 = 'reset'
buttons.6 = 'exit'

(It would look even better if the subtable were named "button" instead of "buttons".)

@lizelive

lizelive commented May 4, 2022

I created a few proposals for JSON-like nesting in TOML. I prefer #898, but see also #900.

@eksortso
Contributor

I have an idea for a single universal separation format. It incorporates the idea of newlines and trailing commas in inline tables, and much more. In a sense, it's a bound on the other extreme of this debate. Take a look at #903.

@arp242
Contributor

arp242 commented May 16, 2022

I touched a bit on this in #903 (comment), but my main concern with this is generating quality error messages.

For example:

tbl = {
    a = 1,
    b = 2,
    c = 3
k  = 4
k2 = 5

tbl = {
    a = 1,
    b = 2,
    c = 3,
k  = 4
k2 = 5

Assuming we allow both newlines and trailing commas, in the first example we can generate a good error message: after c = 3 there is no comma and now we see another key/value pair, so we can display:

Error: missing , or } after 3 in:

    c = 3
         ^

The second example is trickier; we left off the } but where do we intend the table to end? This is ambiguous; the error message here will be:

Error: missing , or } after 4 in:

    k  = 4
          ^

Which is still okay-ish, I guess, but not great either.


The difficulty here is that key = value is used both in inline tables and top-level k/v pairs. I like this feature, but it can make things a bit trickier as the same syntax is used in two different contexts.

None of this is a show-stopper as far as I'm concerned, but I'm a huge fan of accurate error messages that say "here exactly is your error", rather than "here is where I encountered a parsing error, but your actual error is a few lines up". Currently, almost every TOML error can be reported as the first type.
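
The two cases above can be reproduced with a small sketch in Python. This is not how any real TOML parser works: it is a deliberately simplified, line-based checker that assumes one `key = integer` pair per line, and the names PAIR and check_inline_table are hypothetical. It only illustrates why the reported position is exact in the first example and merely the point of detection in the second.

```python
import re

# Simplification: one `key = integer` pair per line, optional trailing comma.
PAIR = re.compile(r"^\s*([A-Za-z0-9_-]+)\s*=\s*(\d+)\s*(,?)\s*$")


def check_inline_table(lines):
    """Check a multi-line inline table; lines[0] holds the opening brace.

    Returns None if a closing brace is found, otherwise an error message.
    """
    pending = None  # (text, value) of the last pair that had no comma
    for line in lines[1:]:
        if line.strip() == "}":
            return None
        match = PAIR.match(line)
        if match is None:
            return f"Error: unexpected input: {line.strip()!r}"
        if pending is not None:
            # A new pair follows a pair that had neither `,` nor `}`.
            text, value = pending
            caret = " " * len(text) + "^"
            return (
                f"Error: missing , or }} after {value} in:\n\n"
                f"    {text}\n    {caret}"
            )
        if match.group(3) == "":
            pending = (line.strip(), match.group(2))
    return "Error: unexpected end of input, missing } to close the inline table"


if __name__ == "__main__":
    first = ["tbl = {", "    a = 1,", "    b = 2,", "    c = 3", "k  = 4", "k2 = 5"]
    second = ["tbl = {", "    a = 1,", "    b = 2,", "    c = 3,", "k  = 4", "k2 = 5"]
    print(check_inline_table(first))
    print()
    print(check_inline_table(second))
```

In the first input the checker points at `c = 3`, which is where the mistake is. In the second it points at `k  = 4`, because the trailing comma after `c = 3` makes `k  = 4` look like a legitimate member of the table, and the problem only becomes visible one line later.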

@eksortso
Contributor

@pradyunsg A while back, you observed that this is "just a matter of changing the ws, comment and comma handling for inline tables to be consistent with arrays," and you would file a PR. I'd like to expedite this. Do you have a PR started? Would you mind if I took a crack at it?

I'm leaving #903 open for further discussion, but it's becoming apparent that this change needs to be made. We'll retain the need for commas as separators inside inline tables even if those tables span multiple lines, and we will allow a trailing comma. From the perspective of #903, this change could be seen as a precursor. But it's necessary now.

arp242 added a commit to arp242/toml that referenced this issue May 18, 2022
arp242 added a commit to arp242/toml that referenced this issue Jun 2, 2023
This backs out the unicode bare keys from toml-lang#891.

This does *not* mean we can't include it in a future 1.2 (or 1.3, or
whatever); just that right now there doesn't seem to be a clear
consensus regarding normalisation and which characters to include.
It's already the most discussed single issue in the history of TOML.

I kind of hate doing this as it seems a step backwards; in principle I
think we *should* have this so I'm not against the idea of the feature
as such, but things seem to be at a bit of a stalemate right now, and
this will allow TOML to move forward on other issues.

It hasn't come up *that* often; the issue (toml-lang#687) wasn't filed until
2019, and has only 11 upvotes. Other than that, the issue was raised
only once before in 2015 as far as I can find (toml-lang#337). I also can't
really find anyone asking for it in any of the HN threads on TOML.

All of this means we can push forward releasing TOML 1.1, giving people
access to the much more frequently requested relaxing of inline tables
(toml-lang#516, with 122 upvotes, and has come up on HN as well) and some other
more minor things (e.g. `\e` has 12 upvotes in toml-lang#715).

Basically, a lot more people are waiting for this, and all things
considered this seems a better path forward for now, unless someone
comes up with a proposal which addresses all issues (I tried and thus
far failed).

I proposed this over here a few months ago, and the response didn't seem
too hostile to the idea:
toml-lang#966 (comment)