Proposal: YAML-like structure syntax #4952

Open
coffeescriptbot opened this issue Feb 19, 2018 · 30 comments

Comments

@coffeescriptbot
Collaborator

From @YamiOdymel on 2016-12-10 10:16

After reading the three issues linked below, I still don't see why we can't use a YAML-like syntax. I also feel that {, } and [, ] are the most un-CoffeeScript parts of the syntax (they don't rely on indentation).

So instead of having (CoffeeScript)

kids = sister:
  name   : 'Ida'
  age    : 9
  parents: [
    {
      name    : 'Caris'
      relation: 'xxx'
    }
    {
      name    : 'Mike'
      relation: 'xxx'
    }
  ]

could we have a YAML-like syntax like this?

kids = sister:
  name   : 'Ida'
  age    : 9
  parents:
    - name    : 'Caris'
      relation: 'xxx'
    - name    : 'Mike'
      relation: 'xxx'

and both compile to

var kids = 
{
  sister:
  {
    name   : "Ida",
    age    : 9,
    parents:
    [
        { name: 'Caris', relation: 'xxx' },
        { name: 'Mike' , relation: 'xxx' }
    ]
  }
};

More Examples

Types

CoffeeScript

types = [
  -1
  '2'
  0.3
  ->
    alert 'foo, bar'
  helloFunc
]

YAML-like (the CoffeeScript one might be better than this)

types =
  - -1
  - "2"
  - 0.3
  - ->
    alert "foo, bar"
  - helloFunc

JavaScript

var types =
[
  -1,
  "2",
  0.3,
  function()
  {
    alert("foo, bar")
  },
  helloFunc
]

Large structure

CoffeeScript

objs = [ {
  username: 'YamiOdymel'
  nickname: 'yamiodymel'
  avatar  :
    small: 'http://www.example.com/example.png-small'
    medium: 'http://www.example.com/example.png-medium'
    large: 'http://www.example.com/example.png-large'
  hobbies:
    most: [
      'tech'
      'sport'
      'eco'
    ]
    medium: [
      'animals'
      'music'
    ]
    lowest: [ 'science' ]
} ]

YAML-like

objs =
  - username: 'YamiOdymel'
    nickname: 'yamiodymel'
    avatar:
      small : 'http://www.example.com/example.png-small'
      medium: 'http://www.example.com/example.png-medium'
      large : 'http://www.example.com/example.png-large'
    hobbies:
      most:
        - 'tech'
        - 'sport'
        - 'eco'
      medium:
        - 'animals'
        - 'music'
      lowest:
        - 'science'

JavaScript

objs =
[
  {
    username: 'YamiOdymel',
    nickname: 'yamiodymel',
    avatar:
    {
      small: 'http://www.example.com/example.png-small',
      medium: 'http://www.example.com/example.png-medium',
      large: 'http://www.example.com/example.png-large'
    },
    hobbies:
    {
      most  : ['tech'   , 'sport', 'eco'],
      medium: ['animals', 'music'],
      lowest: ['science']
    }
  }
]

Other ideas

From Terser array syntax

array = []
  {}
    key: value
    bla: da
  {}
    key: value2
    bla: doom
foo :[]
  'hooray'
  'no closing character'

Better support for YAML-like syntax
Enhancement: possible solution to the Array-Literal-Without-Brackets problem
Terser array syntax

@coffeescriptbot
Collaborator Author

From @edemaine on 2016-12-10 12:10

I have on several occasions written CS like

arrayOfObjects = [
  name: 'me' 
  email: '[email protected]'
,
  name: 'you'
  email: '[email protected]'
]

which can get a little hard to read, and sometimes it's hard to find the right outdent level to put the comma (e.g. if this is wrapped in a function call). Having optional - syntax for making array items clearer would be nice for these cases, if it's disambiguatable in the parser.

(This is a YAML and TypeScript feature, but there's still a lot of other stuff in YAML not supported here, e.g. | to write literals. So I wouldn't necessarily call this YAML support, but it's still a nice feature.)

@coffeescriptbot
Collaborator Author

From @jituanlin on 2017-03-23 10:35

Nice feature! I support it.

@aminland

This would be a huge improvement.

One of the main reasons I use CoffeeScript over ES6 is readability. A YAML-like extension to the object notation would be amazing.

@vendethiel
Collaborator

vendethiel commented Feb 25, 2018

This is one of the oldest ones... See #645 for the "YAML" side as well as #2259, and also #1872, #3018... Even #2642, #1190, #1579 are related.
I'm OK with leaving this one open, but... I think we've seen enough of it.

@aminland

aminland commented Feb 25, 2018

If people keep making the request, maybe it's not such a bad idea? Adding new capabilities and syntax sugars is one of the best ways to help with ensuring the community grows and thrives...

With respect to solving the recurring issue of denoting arrays of objects, perhaps we can take an existing quirk (possibly a bug) in coffeescript and extend it to our advantage:
At present, if you write x = [a:'a' ; b:'b'] on a single line, you get x = [{ a: 'a' }, { b: 'b' }]

But if it's not on a single line:

x = [ 
  a:'a' ; 
  b:'b'
] 

currently compiles to x = [{ a: 'a', b: 'b' }];

So perhaps an easy way to address this recurring request would be to allow the quirk that causes the semicolon hack on a single line to work in the multiline case?

@zdenko
Collaborator

zdenko commented Mar 2, 2018

Just adding my 2¢. See also #4600 and #4618.
I don't think YAML-like syntax will improve readability.

Indentation matters.

kids:
  sister:
    name: 'Ida'
    age: 9
  parents:
    - name: 'Caris'
      relation: 'xxx'
     - name:  'Mike'
       relation: 'xxx'

# bad indentation of a sequence entry at line 7, column 6: - name : 'Mike' ^

More keystrokes. YAML (9 x -) vs CS (6 = 3 x [ + 3 x ]) in given example.

objs:
  hobbies:
    most:
      - 'tech'
      - 'sport'
      - 'eco'
    medium:
      - 'animals'
      - 'music'
      - 'cooking'
    lowest:
      - 'science'
      - 'math'
      - 'reading'
# Note that indentation of the array items is not strict.
objs =
  hobbies:
    most: [
        'tech'
          'sport'
      'eco'
    ]
    medium: [
      'animals'
         'music'
        'cooking'
    ]
    lowest: [ 
       'science'
         'math'
        'reading' 
    ]

Spread properties in YAML-like syntax?

child = {parents..., environment..., school...}

YAML or YAML-like syntax has its advantages, but not that many, IMHO.
Besides, if a list is getting longer and thus less readable, it's usually a sign that the approach to the problem should change: split the data into smaller chunks, use generators, import/export modules, or keep an external YAML file and parse it when needed (e.g. a cached version, or on the fly)...

@vendethiel
Collaborator

vendethiel commented Mar 2, 2018

Note that indentation of the array items is not strict.

that's definitely not a feature.

@zdenko
Collaborator

zdenko commented Mar 2, 2018

@vendethiel probably not, but current rules in grammar allow this.
I'm not familiar with the history, but I presumed there is probably a reason behind this.
Also, my intention was not to emphasize this as a feature, just wanted to point out the difference.

(I think you copied the whole line from the comment, which now became a title because of #, and it looks like one of us is shouting 😄)

@lunakurame

I really like this proposal. I use the unindented comma for now, but having YAML-like arrays would be nice. But don't forget that CoffeeScript allows tab characters, while YAML doesn't (in most places). I don't know how CoffeeScript's parser works, so I don't know if it's an issue, but I wanted to point it out just in case. Even if you don't use tabs, I think tabs explain the meaning of indentation quite nicely in this case.

I looked at the YAML 1.2 spec. From what I understood, the - character is part of the indentation in YAML; the spec allows this:

-
  name: Mark McGwire
  hr:   65
  avg:  0.278
-
  name: Sammy Sosa
  hr:   63
  avg:  0.288

But placing - in the same line as name makes - a part of indentation. It's really just taking advantage of tab stops (which is quite ironic since YAML doesn't use tabs) like in the Horstmann style.

Semantically a map containing a sequence (which in CoffeeScript is an object containing an array) looks like this:

object:
  -
    array item
  -
    array item

Which can be (and usually is) converted to this:

object:
  - array item
  - array item

But YAML also allows omitting that indentation level for sequences, so this is also valid and pretty common:

object:
- array item
- array item

So if we want this in CoffeeScript, it has to work with tabs, and the only reasonable way I see is to use tab stops like this (___| is a 4-character-wide tab; it shrinks after a leading - because of tab stops, so the keys look aligned):

kids = sister:
    name: 'Ida'
    age: 9
    parents:
        -   name: 'Caris'
            relation: 'xxx'
        -   name: 'Mike'
            relation: 'xxx'

kids = sister:
___|name: 'Ida'
___|age: 9
___|parents:
___|___|-__|name: 'Caris'
___|___|___|relation: 'xxx'
___|___|-__|name: 'Mike'
___|___|___|relation: 'xxx'

And if we allow omitting the indentation level for arrays, like YAML does, this would be legit too:

kids = sister:
    name: 'Ida'
    age: 9
    parents:
    -   name: 'Caris'
        relation: 'xxx'
    -   name: 'Mike'
        relation: 'xxx'

kids = sister:
___|name: 'Ida'
___|age: 9
___|parents:
___|-__|name: 'Caris'
___|___|relation: 'xxx'
___|-__|name: 'Mike'
___|___|relation: 'xxx'

That would work for any tab width greater than 1, it's semantic, and it doesn't mix spaces and tabs. Unfortunately it won't look good if your tab is just 1 character wide (does anyone use those?), but it will still parse properly, since tab width doesn't change the actual file content. Not using tab stops is not an option here: using spaces to align the object keys would still abuse tab stops, create a real whitespace mess, and could cause parsing problems, since even GitHub's syntax coloring breaks on this example (___| is a tab, . is a space):

kids = sister:
___|name: 'Ida'
___|age: 9
___|parents:
___|___|-.name: 'Caris'
___|___|..relation: 'xxx'
___|___|-.name: 'Mike'
___|___|..relation:
___|___|.._|- test:
___|___|.._|.._|- 4

So I think using tab stops like in Horstmann style is the way to go. And of course it won't affect you if you use spaces for indentation anyway.
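The alignment argument above can be checked with a small sketch. `expandTabs` is a hypothetical helper written for this illustration (it is not part of CoffeeScript or YAML); it assumes tab stops sit at every multiple of the tab width, which is how most editors render tabs.

```javascript
// Hypothetical helper: render a line the way an editor would,
// replacing each tab with spaces up to the next tab stop.
function expandTabs(line, tabWidth = 4) {
  let out = '';
  for (const ch of line) {
    if (ch === '\t') {
      // Pad to the next multiple of tabWidth (always at least one space).
      const pad = tabWidth - (out.length % tabWidth);
      out += ' '.repeat(pad);
    } else {
      out += ch;
    }
  }
  return out;
}

// "tab tab - tab key" versus "tab tab tab key", as in the example above.
const itemLine = expandTabs("\t\t-\tname: 'Caris'");
const nextLine = expandTabs("\t\t\trelation: 'xxx'");

console.log(itemLine);
console.log(nextLine);
// Column at which each key starts:
console.log(itemLine.indexOf('name'), nextLine.indexOf('relation'));
```

With a tab width of 4 (or 2) both keys start in the same column; with a tab width of 1 they no longer line up, which matches the caveat above.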

Bonus: we could also allow these, which are not allowed in YAML:

kids =
  - name: 'Ida', age: 9
  - name: 'Sai', age: 15
  - name: 'Yami', age: 23

YAML requires either using curly brackets for those objects, or placing every key in a separate line.

@lunakurame

More keystrokes. YAML (9 x -) vs CS (6 = 3 x [ + 3 x ]) in given example.

But YAML also doesn't have useless lines which contain only the ] character, so the whole code uses less vertical space. And yeah, indentation matters in YAML, but it allows inconsistent indentation too, as long as every - of an array is aligned:

objs:
  hobbies:
    most:
      - 'tech'
      - 'sport'
      - 'eco'
    medium:
           - 'animals'
           - 'music'
           - 'cooking'
    lowest:
        - 'science'
        -       'math'
        -    'reading'

@vendethiel
Collaborator

I'm not familiar with the history, but I presumed there is probably a reason behind this.

A Long Time Ago ™️ , any amount of dedent counted as a dedent. See #3305 for an actual explanation.

@zdenko
Collaborator

zdenko commented Mar 2, 2018

indentation matters in YAML, but it allows inconsistent indentation too

I stand corrected.

@GeoffreyBooth
Collaborator

Wow, what an active thread. Can we maybe narrow the focus a bit? The desire for YAML or YAML-like syntax, per the OP, is to have some more natural way to express arrays of objects. If YAML syntax specifically won’t work, for the various reasons discussed above, is there another alternative that’s better than current syntax?

@aminland

aminland commented Mar 2, 2018

What's the thought on taking advantage of the semicolon bug I mentioned earlier?

@edemaine
Contributor

edemaine commented Mar 2, 2018

@aminland I find the semicolon behavior nonobvious and hard to read. It seems pretty sketchy to rely on that behavior, and it severely overloads what a semicolon should mean (concatenating lines, or doing one thing and then another).

But I like the original idea of this issue, which is to allow specifying arrays via aligned -s as an alternative to wrapping in brackets, to more cleanly handle objects nested within lists.

Here's a real-life example from some Meteor code of mine, where I'm constructing a MongoDB query:

query = (username) ->
  $or: [
    published: $ne: false
    deleted: $ne: true
    private: $ne: true
  ,
    "authors.#{escape username}": $exists: true
  ,
    title: ///@#{username}///
  ,
    body: ///@#{username}///
  ]

I would much prefer to read/write this code:

query = (username) ->
  $or:
    - published: $ne: false
      deleted: $ne: true
      private: $ne: true
    - "authors.#{escape username}": $exists: true
    - title: ///@#{username}///
    - body: ///@#{username}///

@aminland

aminland commented Mar 7, 2018

I agree it's an ugly character, but it does make sense semantically. If I'm in the context of any block, a semicolon signifies the end of each item in that block.

I only really suggested it since it's already a bug and fixing said bug would likely break some people's code...

@edemaine
Contributor

edemaine commented Mar 7, 2018

@aminland Do you have something against the YAML hyphen notation? I'm pretty sure the example above would be pretty unreadable with your proposed semicolon hack.

@aminland

aminland commented Mar 8, 2018

Just that it would be very difficult to get right within coffeescript.

e.g. the following is currently valid syntax and means something completely different:

a = 
  - b
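To spell out the conflict in JavaScript terms: going by the comment above, the snippet is read today as a unary minus applied to `b`, while the proposed list syntax would read it as a one-element array. The variable names below are made up for illustration only.

```javascript
const b = 5;

// Reading described above as the current one: unary minus applied to b.
const currentMeaning = -b;

// Reading under the proposed "- item" list syntax: a one-element array.
const proposedMeaning = [b];

console.log(currentMeaning, proposedMeaning);
```

Any implementation of the hyphen syntax would have to pick one of these readings for existing code, which is the backward-compatibility concern being raised.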

@wesvetter

While I agree that expressing arrays of objects can be kind of clunky, I'm not sure it's enough of a pain to warrant adding to CoffeeScript.

One solution I've used to make configuration objects a little more readable is just to use a dict and strip off the values.

So this:

kids = sister:
  name   : 'Ida'
  age    : 9
  parents: [
    {
      name    : 'Caris'
      relation: 'Mother'
    }
    {
      name    : 'Mike'
      relation: 'Step-Father'
    }
    {
      name    : 'Jane'
      relation: 'Step-Mother'
    }
    {
      name    : 'Tom'
      relation: 'Father'
    }
  ]

becomes:

kids = sister:
  name   : 'Ida'
  age    : 9
  parents: _.values
    mom1:
      name    : 'Caris'
      relation: 'Mother'
    dad1:
      name    : 'Mike'
      relation: 'Step-Father'
    mom2:
      name    : 'Jane'
      relation: 'Step-Mother'
    dad2:
      name    : 'Tom'
      relation: 'Father'

(The key names don't matter.)
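For readers without Underscore or Lodash at hand, the same trick can be sketched in plain JavaScript with the built-in `Object.values`. This is only an illustration of the idea; the data is shortened from the example above.

```javascript
// Keys exist only to separate the entries; Object.values discards them.
const parents = Object.values({
  mom1: { name: 'Caris', relation: 'Mother' },
  dad1: { name: 'Mike', relation: 'Step-Father' },
});

console.log(parents);
```

The result is an ordinary array of the two objects, in the order the keys were written.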

@lunakurame

That _.values is an interesting idea, but it has some flaws:

  • if you add multiple elements with the same key, only the last one will be in the final array
  • doesn't work with CSON since it's not pure data
  • takes the same vertical space as this, which doesn't have the rest of those problems and has less clutter (no keys):
     kids = sister:
       name   : 'Ida'
       age    : 9
       parents: [
           name    : 'Caris'
           relation: 'Mother'
         ,
           name    : 'Mike'
           relation: 'Step-Father'
         ,
           name    : 'Jane'
           relation: 'Step-Mother'
         ,
           name    : 'Tom'
           relation: 'Father'
       ]
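The first flaw in the list above (duplicate keys) is easy to reproduce in plain JavaScript, again using the built-in `Object.values` as a stand-in for `_.values`. The keys and data here are illustrative.

```javascript
// The key "mom" is used twice by mistake; the later value silently
// overwrites the earlier one, so one parent goes missing.
const parents = Object.values({
  mom: { name: 'Caris', relation: 'Mother' },
  dad: { name: 'Mike', relation: 'Step-Father' },
  mom: { name: 'Jane', relation: 'Step-Mother' },
});

console.log(parents.length); // 2, not 3
console.log(parents.map((p) => p.name));
```

Nothing fails at runtime, which is what makes this kind of mistake easy to miss in a long configuration object.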

@rdeforest
Contributor

I thought of this issue when I started reading https://m.signalvnoise.com/on-writing-software-well-aee3780767a6.

If anyone can link to some production code which demonstrates the problem, that will move the discussion forward dramatically.

Meanwhile, here's my argument against this feature: if I wanted to embed constant structures such as the examples above in my code, here's what I think I'd do:

{load: y} = (require './my-yaml').loadSync

query = (username) -> y """
  $or:
    - published: $ne: false
      deleted: $ne: true
      private: $ne: true
    - "authors.#{escape username}": $exists: true
    - title: ///@#{username}///
    - body: ///@#{username}///
"""

@edemaine
Contributor

edemaine commented Apr 4, 2018

https://github.com/edemaine/coauthor/blob/master/lib/messages.coffee is production code with the example above (and lots of others). It's not terrible as is, but would be nicer with YAML support. But I don't know how to solve @aminland's point about ambiguity with unary minus...

@rdeforest I don't think your code properly processes the /// as CoffeeScript regexes... needs some more #{...} escapes.

@rdeforest
Contributor

I meant for the regexps to be processed by "./my-yaml". The input to y() would be something like

  $or:
    - published: $ne: false
      deleted: $ne: true
      private: $ne: true
    - "authors.rdeforest": $exists: true
    - title: ///@rdeforest///
    - body: ///@rdeforest///

But I'm also the kind of weirdo who writes this kind of thing:

publishedPublic   = ->
  published: $ne: false
  deleted:   $ne: true
  private:   $ne: true

writtenBy         = (name) -> "authors.#{escape name}": $exists: true

fieldContainsUser = (field) -> (s) -> [field]: ///@#{s}///
titleContainsUser = fieldContainsUser 'title'
bodyContainsUser  = fieldContainsUser 'body'

query = (username) ->
  $or: [ publishedPublic
         writtenBy
         titleContainsUser
         bodyContainsUser
       ] .map (predicate) -> predicate username
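For comparison, here is roughly how the same predicate-builder style might look in plain JavaScript. This is a sketch, not a translation of the production code: the `escape` call on the author key is omitted because it is a helper from the original project, and `escapeRegExp` is an added assumption so the username is matched literally.

```javascript
// Assumption: escape regex metacharacters so the username matches literally.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

const publishedPublic = () => ({
  published: { $ne: false },
  deleted: { $ne: true },
  private: { $ne: true },
});

const writtenBy = (name) => ({ [`authors.${name}`]: { $exists: true } });

// Builds a predicate that matches "@username" inside the given field.
const fieldContainsUser = (field) => (s) => ({
  [field]: new RegExp(`@${escapeRegExp(s)}`),
});
const titleContainsUser = fieldContainsUser('title');
const bodyContainsUser = fieldContainsUser('body');

// Each builder receives the username; extra arguments are simply ignored.
const query = (username) => ({
  $or: [publishedPublic, writtenBy, titleContainsUser, bodyContainsUser]
    .map((predicate) => predicate(username)),
});

console.log(query('alice'));
```

The array of objects disappears from the source because each element is produced by a named function, at the cost of a few extra definitions.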

@phil294

phil294 commented Mar 16, 2019

What still sets CoffeeScript apart from modern ES2018 JavaScript is its readability. Since YAML has become so popular, I think the proposed syntax would be greatly appreciated by new users. This thread has been silent for 12 months; any news?


As a workaround, I currently pipe all my .coffee files through a custom converter before passing them to the actual CoffeeScript compiler (and to the syntax highlighting plugin), so I can use YAML syntax in CoffeeScript. It is not thoroughly tested, but it does the job in my codebase:

https://github.com/phil294/MEVN-base/blob/059e21d58db49e0244361e5ab667f3138584b05a/web/build/custom-loaders/coffee-loader.coffee#L20

Which will transform something like

x:
	-	a: 1
	-	b: 2
	-	c:
			-	'one'
			-	'two'

into

x: [
	a: 1
,
	b: 2
,
	c: [
		'one'
	,
		'two'
	]
]

Or, if you prefer, as a non-readable Javascript script:

  • tab indentation:
let theCode = '...';
let match; // declared explicitly; assigning to an undeclared variable throws in strict mode
while (match = theCode.match(/([\w\W]*^(\t+)\w+):\n((?:\2\t-\t.+\n(?:\2\t\t\t.+\n)*(?:\2\t\t.+\n(?:\2\t\t\t.+\n)*)*)+)([\w\W]*)/m)) {
  const [_, before, indent, arraybody, after] = match;
  const arraybody_transformed = arraybody.replace(/^\t/gm, '').replace(/^(\t*)-/, '$1').replace(/^(\t*)-/gm, '$1,\n$1');
  theCode = `${before}: [\n${arraybody_transformed}${indent}]\n${after}`;
}
  • space indentation (configurable amount):
const indentSize = 4;
let theCode = '...';
let match; // declared explicitly; assigning to an undeclared variable throws in strict mode
while (match = theCode.match(RegExp(`([\\w\\W]*^( +)\\w+):\\n((?:\\2 {${indentSize}}- {${indentSize - 1}}.+\\n(?:\\2 {${indentSize * 3}}.+\\n)*(?:\\2 {${indentSize * 2}}.+\\n(?:\\2 {${indentSize * 3}}.+\\n)*)*)+)([\\w\\W]*)`, "m"))) {
  const [_, before, indent, arraybody, after] = match;
  const arraybody_transformed = arraybody.replace(RegExp(`^ {${indentSize}}`, "gm"), '').replace(RegExp(`^( *)- {${indentSize - 1}}`), `$1${' '.repeat(indentSize)}`).replace(RegExp(`^( *)- {${indentSize - 1}}`, "gm"), `$1,\n$1${' '.repeat(indentSize)}`);
  theCode = `${before}: [\n${arraybody_transformed}${indent}]\n${after}`;
}

Edit: there is a newer version that also works without the x: part in the example. If someone needs it in JS form like above, please leave a note.

A proper fork of the CoffeeScript lexer would be better, but that would have taken me about 500 times longer.

So maybe this helps anyone who is similarly impatient.

Needless to say, I would love to see the proposed syntax implemented.

@lorefnon

Until this feature becomes a part of the language, it is possible to get something very similar through yaml-to-js.macro

@fcostarodrigo

Another interesting alternative is the approach in LiveScript.

Implicit lists created with an indented block. They need at least two items for it to work.

When implicitly listing, you can use an asterisk * to disambiguate implicit structures such as implicit objects and implicit lists. The asterisk does not denote an item of the list, but merely sets aside an implicit structure so that it is not muddled with the other ones being listed.

@vendethiel
Collaborator

The LiveScript approach was deliberately not used.

@fcostarodrigo

fcostarodrigo commented Jun 9, 2020

About the problem:

a = 
  - b

This wouldn't be a problem if we used * instead of -. There is no unary * operator.

I read the previous issues, and I think the only objection raised to creating arrays with * is that it is too specific: it only helps with arrays of objects. #645 (comment)

But I still think this would be a good addition to the language.

Examples:

matrix =
  * [1, 2, 3]
  * [4, 5, 6]
  * [7, 8, 9]

users =
  * name: 'John'
    age: 18
  * name: 'Mary'
    age: 21

@dyoder

dyoder commented Aug 3, 2020

Strongly in favor of introducing either YAML-style or asterisk syntax. For two reasons:

  1. It's entirely consistent with the spirit of the language, which otherwise takes full advantage of whitespace to eliminate syntactical debris that otherwise adds nothing semantically.

  2. Perhaps more important, we have a pretty compelling scenario where this adds value. Namely, function composition.

One of the nice things about doing composition in CoffeeScript is that you can write things like this:

compose [
  foo
  bar
  baz
]

We do a lot of this. And it turns out to be quite nice, most of the time. But it also turns out that sometimes it makes sense to have nested composition. Now, of course, we could use variadics:

compose foo,
  bar
  baz

but there are good reasons to avoid variadics, especially in functional programming, what with all the currying and passing around lists of functions. But if we use arrays, of course, we need to worry about closing brackets and our code at times becomes quite Lisp-like, with three or four closing brackets at the end.

As a result, we're spending increasing amounts of time just dealing with bracket matching. Which again, feels very much against the whole spirit of the language.

CoffeeScript shines in the context of functional programming, except for the brackets. I think that's a neat role for CoffeeScript to play in the JavaScript ecosystem and I'd like to encourage it. Of course, we may be the only people doing this. But our hope is that more people will move to this style, once they see how compelling it can be. But it's not quite as compelling right now because of those darned brackets.
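To make the bracket pile-up concrete, here is a minimal JavaScript sketch of array-based composition. The `compose` below is hypothetical (the library used by the commenter is not shown in the thread); it assumes left-to-right application and treats a nested array as a nested composition.

```javascript
// Hypothetical compose: applies functions left to right.
// A nested array is treated as a nested composition.
const compose = (fns) => (x) =>
  fns.reduce(
    (acc, fn) => (Array.isArray(fn) ? compose(fn)(acc) : fn(acc)),
    x
  );

const inc = (n) => n + 1;
const double = (n) => n * 2;
const square = (n) => n * n;

// Nesting adds a closing bracket per level, which is the complaint above.
const pipeline = compose([
  inc,
  [
    double,
    [
      square,
      inc,
    ],
  ],
]);

console.log(pipeline(1)); // inc -> 2, double -> 4, square -> 16, inc -> 17
```

With an indentation-based list syntax, the three closing brackets at the end would disappear, which is the readability gain being argued for.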

@ghost

ghost commented Jul 28, 2021

I'd really like this feature. JSON has always looked rather ugly to me and is pretty easy to break, whereas YAML isn't as smooth and versatile as the proposed syntax. I'd also like to be able to use the new array syntax in CSON files.
