why not just use a map? #997

Closed
bdytx5 opened this issue Oct 7, 2023 · 15 comments

@bdytx5

bdytx5 commented Oct 7, 2023

[hardware_filters]
cpu_ram = ">20"
disk_space = ">20"
gpu_name = "RTX_3060"

First of all, TOML seems to be perfect, except for this syntax. Why not just do this instead:

hardware_filters = {
cpu_ram = ">20"
disk_space = ">20"
gpu_name = "RTX_3060"
}

@eksortso
Contributor

eksortso commented Oct 7, 2023

Well, you will be able to do that, once we get TOML v1.1.0 out the door. (EDIT: Whoops, no you can't. But see the section below.) But you're asking for a deeper explanation, I take it. (This would be a great topic to add to a FAQ on the toml.io website, by the way!)

TOML was originally inspired by the many, many different INI formats that existed. They all offered different features, and none of them were ever standardized. TOML is a single standard that follows in the INI tradition. The syntax encourages simplicity, and as a configuration language, it allows for greater depth, but no more than is needed.

All this INI stuff happened decades ago, before XML and later JSON were used for data serialization. Those formats famously embrace depth. And in JSON's case, { literally! } But depth for configuration can be overkill, and it's why we held off allowing multiline table values for so long.


I'm personally very big on allowing newlines instead of commas to separate key-value pairs. It's more TOML-like, I would say. I actually proposed this way back on #525, and even made a PR for it. It was overly strict by current standards and didn't go anywhere though. Maybe it's time to reintroduce the idea and loosen up a bit.

Let me propose this: Let's allow newlines as well as commas as delimiters in inline tables. I haven't got much time, but I can find time to make a minimal PR to make this happen before v1.1.0 goes out.

@bdytx5
Author

bdytx5 commented Oct 8, 2023

Ah, very interesting. I'm new to TOML and I guess I didn't know you could even use curly brackets at all! Yeah, I guess if you can, that changes things, but I do agree the newlines seem reasonable. I suppose the biggest thing that stands out as less intuitive is the table [] syntax.

Basically, I'm working on a project, and I need a config file that is readable but also somewhat intuitive for users to change without messing up the formatting and preventing the code from parsing it. JSON is unbreakable, yet it's not very readable or visually appealing, has no comments, etc. So then I go to YAML and realize YAML is super readable, but the format for adding things is less obvious (for the typical Python programmer, imo). So then I go to TOML, and it seems to solve both issues; the only drawback is the less common [] syntax. Just brainstorming what seems most intuitive, but something like

[owner]
name = "Tom Preston-Werner"
dob = 1979-05-27T07:32:00-08:00

with alternative syntax

owner = {
name = "Tom Preston-Werner"
dob = 1979-05-27T07:32:00-08:00
}

which now, after referring to the docs, I see the dot syntax!!

I guess after thinking about this, there's so much that goes into picking a syntax, and it's so reliant on past experiences. So what seems best for a Python programmer may not make sense for a C++ programmer...

anyway, thanks for the response!!

@eksortso
Contributor

eksortso commented Oct 9, 2023

> I guess after thinking about this, there's so much that goes into picking a syntax, and it's so reliant on past experiences. So what seems best for a Python programmer may not make sense for a C++ programmer...

It's why having so many different voices contributing to the standard makes it better. Great ideas can come from anywhere, even about things thought long-settled.

I'd hope both Python and C++ developers can find value in TOML. We try to be as language-neutral as possible. But the needs of different users vary a lot, even among non-programmers, and the users are necessarily our priority.

> anyway, thanks for the response!!

Thank you for your question! All those minute decisions that make up TOML get baked into the standard, and we always need input to help us make the right decisions and to get us where we're going, however long it takes.

I'll open another issue to reintroduce the newline delimiter idea within braces. Follow along if you'd like.

@ChristianSi
Contributor

ChristianSi commented Oct 28, 2023

I think such new syntax should be reserved for TOML 1.2 or later. For now we should focus on getting 1.1 shipped.

@eksortso
Contributor

Agreed. Though I get a feeling this will come in high demand once v1.1.0 is out.

@mcexit

mcexit commented Nov 15, 2023

Will #904 still make it into v1.1.0? If so, it would be great to have the v1.1.0 spec expedited even if the only change is that feature alone.

As for merely allowing newlines to separate key/value pairs without commas, this makes sense but I believe it should also be consistent with arrays.

So if you allow:

hardware_filters = {
cpu_ram = ">20"
disk_space = ">20"
gpu_name = "RTX_3060"
}

It should also support:

hardware_filters = [
'cpu_ram'
'disk_space'
'gpu_name'
]

I don't think the lack of commas should affect parser performance much at all, but I'm not an expert in configuration parsing. In all honesty you could support spaces as well, albeit a bit ugly for my tastes:

hardware_filters = [ 'cpu_ram' 'disk_space' 'gpu_name' ]
hardware_filters = { cpu_ram = ">20" disk_space = ">20" gpu_name = "RTX_3060" }

I don't see them being much different when parsing, and although commas in this scenario make more sense visually... I'd much rather see TOML allow user preference instead of enforcing style when there is no significant performance penalty.

@eksortso
Contributor

Unless #904 gets reverted or overwritten, it'll be part of TOML v1.1.0. Nobody's talking about reverting it; it's undeniably popular!

If I had more time (and I'm cutting into my other matters' time just responding), I'd certainly push harder to allow newlines as separators in inline tables.

I personally do not want simple whitespace to act as a separator, though. I don't know how feasible that would be, frankly. Style in TOML needs to be more obvious than flexible, because if it gets too flexible, it gets too complicated. So new style options will only be added if their benefits outweigh their limitations.

Others have pointed out potential points of confusion if we allow newlines as separators in arrays. I'm more open to that particular idea (and in fact suggested it at one point), but now isn't the time to promote it. That idea will likely have to wait until after v1.1.0 when we get more feedback from users.

@levicki

This comment was marked as abuse.

@eksortso
Contributor

eksortso commented Mar 12, 2024

> If you allow new line as a separator, is the following array valid or not?
>
> Bytes = [
>     0x00, 0x01, 0x02
>     0x03, 0x04, 0x05
> ]

It would be valid, but it could be confusing to folks who see the tabular layout but no square brackets, then inadvertently think it's a 2x3 array. This was brought up in the past; I'm very aware of it.

We are not discussing arrays or how to separate array values. We are discussing inline tables, and how to separate key/value pairs. And we could have separate syntaxes for each of them. Which, actually, we currently do.

> Since new line character is white-space, is this now also valid array?
> [...]

We do make a distinction between horizontal whitespace (spaces, tabs) and newlines. That's never been an issue.

> TOML seems well defined for me — I really don't want it devolved into a mess which can't be parsed without employing complicated algorithms and which you can't tell visually whether it is well-formed or not. Whoever is deciding what goes into next specification should keep in mind that TOML is supposed to be human-readable and unambiguous, not just another machine-only format.

We're certainly aware of that. It's for human beings to read, as well as to write. But that doesn't mean that they can get all the nuances without looking at multiple examples or at the spec.

The driving idea is that we want it to be simple and obvious enough for an uninformed user to pick up for common usage, and to still be simple and obvious for them once they learn about other syntax options. We don't want a million ways of doing the same thing; the few choices that have happened, happened organically. (For instance, on a recent issue, I mentioned dotted keys as a way to simplify their configuration. Dotted keys were originally introduced as an alternative to subtables, whether standard or inline. They still define subtables, but they're conceptually easier to handle, and they fit in well with how subtables are named.)

For this proposal, the notion that inline tables that span multiple lines will require commas between the key/value pairs is evident as it stands, but may become a hindrance for future writers. Allowing newlines between key/value pairs, so that they resemble the standard table syntax, is worth our consideration. Maybe not right now, but after TOML v1.1.0 is released.

@levicki

This comment was marked as abuse.

@eksortso
Contributor

Please stop.

@levicki Nothing's forcing you to stop using table sections. Nothing ever will.

@levicki

This comment was marked as abuse.

@mcexit

mcexit commented Mar 12, 2024

@levicki I don't understand why you're complaining. First, nothing forces parsers to implement anything. If you maintain a parser that wants to stay on an older version of TOML or use your own custom implementation, then do so.

Second, if people use a different format... maybe there is a reason. Inline tables are more readable overall, IMO, and express depth in a way that is natural for many users, especially when you have several nested objects. TOML adopting this while maintaining backward compatibility keeps it relevant, and makes things easier for many users (myself included).

I honestly don't see TOML becoming more like JSONC or JSON5 as a bad thing. Each language has adapted to become more convenient and readable for users. If you want an authoritarian language resistant to modernization & convenience then there is always YAML. TOML adapting is a good thing. I only wish it could have happened sooner.

@levicki

This comment was marked as abuse.

@pradyunsg
Member

I think I'm going to close this out for now -- most of the needs for this are covered by #904 and we can revisit #551 at a later date if needed.
