Implement string block literals. #11975

Closed
wants to merge 3 commits into from

Tronic commented Aug 18, 2019

let str = ":
    formatted
        text
         is
      possible
echo str

echo(
  r":
    def foo():
        """Generated Python function without any escaping"""
        print("".join(["foo", "bar"]))
        print("Hello\nWorld!")
)

echo ":
  <pre>\
  lines split only in source code, \
  no newlines in output\
  </pre>\

Discussion in nim-lang/RFCs#161

Tronic (Author) commented Aug 18, 2019

The raw string in the code example is needed only for the \n; everything else works with normal strings as well. In particular, ", "" or """ cause no trouble. Any indentation within the string block becomes part of the string. The indentation of the block itself (two spaces beyond the line containing ":) is not included.
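For readers who want to try the indentation rule in released Nim, here is a minimal sketch that emulates it with an ordinary triple-quoted string. The helper `stripMargin` and its `margin` parameter are hypothetical names invented for this illustration and are not part of the PR; the sketch only assumes the rule as described above (two columns of block indentation are dropped, anything deeper is content).

```nim
import std/strutils

func stripMargin(s: string, margin: Natural): string =
  ## Hypothetical helper, not part of the PR: removes up to `margin`
  ## leading spaces from every line and keeps deeper indentation as content.
  var parts: seq[string]
  for line in s.splitLines:
    var n = 0
    while n < margin and n < line.len and line[n] == ' ':
      inc n
    parts.add line.substr(n)
  parts.join("\n")

# The newline right after the opening """ is not part of the string.
let raw = """
    formatted
        text
         is
      possible
"""

# Assumption: the block itself is indented by two spaces, as in the PR text.
let str = stripMargin(raw, 2)
echo str
```

Under that assumption the first line keeps two of its four leading spaces, which appears to match the behaviour described for the `":` block.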

timotheecour (Member) commented Aug 21, 2019

not giving +1 or -1 but note:

triple quotes can represent any string (that does not contain a triple quote strictly inside; at the boundaries is actually ok); this would need to be the case here as well.

  • Right now you can't represent a string that doesn't end with a newline; how about stripping the last newline so that any string can be represented?
    e.g.:
str = (":
  formatted
)

should produce "formatted", not "formatted\n"

  • The special case for \ is a misfeature (it prevents writing strings that contain it verbatim), unless it is opt-in via some way of indicating that escaping is allowed.

Tronic (Author) commented Aug 22, 2019

@timotheecour In raw/proc ": literals there are no escapes; the \-endline, like the other \-escapes, may be used only in normal mode. In r": there is no way to represent a missing final newline, nor any control characters that cannot appear in source code (most importantly \r). The normal mode ": can represent any character and omit the final newline, at the cost of some \-escapes.

I concur that the missing final newline is an important enough special case to deserve some thought. With that feature the r": block would be strictly more powerful than r""" literals (i.e. it would do everything that r""" does, plus triple quotes and correct indentation without hacks), so I am definitely tempted to do as you suggest; doing so would silence the last sensible criticism of this literal.

I believe that the normal case for a multi-line literal is to output multiple lines, each terminated by a newline (as per standard text file formatting). Thus, omitting the final newline cannot be the default, as adding it manually would then be ugly and easy to forget.

In normal mode (which may also be combined with fmt by fmt ": instead of the raw-string fmt":, as per normal Nim syntax), a single backslash at the end is a minimal but visible indication that newline is not to be included. As such it is a perfectly acceptable solution whenever the normal mode may be used. This also makes it practical to split some long lines but still keep newlines for the rest.
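A rough emulation of the backslash-at-end-of-line behaviour in released Nim might look as follows. The function name `dropEscapedNewlines` is hypothetical, and the sketch is a simplification: it does a plain textual replacement, whereas a real lexer would process all escapes in a single pass (so a doubled backslash before a line break would behave differently).

```nim
import std/strutils

func dropEscapedNewlines(s: string): string =
  ## Hypothetical helper: deletes every backslash that is immediately
  ## followed by a line feed, together with that line feed.
  s.replace("\\\n", "")

# Triple-quoted strings do not interpret escapes, so the backslashes
# below reach the helper unchanged.
let html = dropEscapedNewlines("""
<pre>\
lines split only in source code, \
no newlines in output\
</pre>\
""")

echo html
```

Lines that do not end in a backslash keep their newline, which is what makes it practical to split a few long lines while leaving the rest untouched.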

The question remains whether this can be solved in raw mode in a sensible manner and, if so, also extended to normal mode (there should be no surprising differences between the modes). A related issue irking me is CR-LF newlines, seen in most Internet protocols, which usually have to be mixed with LF-terminated text or binary data (HTTP, Redis), as adding \r or \r\n escapes in literals is quite ugly.

In essence, if one could mark a source line as adding to the string either no newline, CR-LF, or (the default) LF, all these problems would be solved.

The first question is whether this should be a library function like strutils unindent. I believe that making them work at compile time, which would be ideal, cannot be done without some compiler magic. This solution would quickly lead to endlCRLF, endlNoFinal and endlNone helper functions, to cater for the most typical cases. The limitation of this approach is that all newlines would be handled the same way, with no control over individual lines, and that care must be taken to apply it before fmt, which may insert data that should not be processed.
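To make the library-function idea concrete, here is a minimal runtime sketch of the three helpers. Only the names endlCRLF, endlNoFinal and endlNone come from the comment above; the bodies and exact semantics are assumptions made for illustration, and no claim is made here about compile-time evaluation.

```nim
import std/strutils

func endlCRLF(s: string): string =
  ## Assumed semantics: every LF becomes CR-LF.
  ## Input that already contains CR-LF would end up with a doubled CR.
  s.replace("\n", "\r\n")

func endlNoFinal(s: string): string =
  ## Assumed semantics: drop a single trailing LF, if there is one.
  result = s
  result.removeSuffix("\n")

func endlNone(s: string): string =
  ## Assumed semantics: drop every LF, joining all lines.
  s.replace("\n", "")

let request = endlCRLF("""
GET / HTTP/1.1
Host: example.com

""")

echo request.len
```

The limitation mentioned above is visible here: endlCRLF rewrites every newline in the literal, so an LF-terminated body appended inside the same literal would be converted too.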

Adding similar modes to the start tag (e.g. after the colon) would be easy and has the benefit that it can apply exclusively to source code line breaks, leaving any escaped newlines untouched and not affecting the use of fmt.

Another option that comes to mind is supplying out-of-band control codes in the left margin of the string block, like so:

  let str = r":
    normal line with lf
   :terminated by cr-lf
    c:\windows\
   .no newline after this

This is the most powerful solution and it still keeps everything in the content verbatim, yet it may well be seen as ugly and its meaning is certainly not intuitively clear. It is debatable whether mode changes should stick, i.e. whether c:\windows\ would be terminated by CR-LF or just LF, or whether control lines, which avoid the single-character limitation, would be better.
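One possible reading of the margin codes can be sketched as a small runtime interpreter. Everything in it is an assumption: the function name `applyMarginCodes` is hypothetical, the first column of each line is taken as the margin, a space means LF, `:` means CR-LF, `.` means no newline, and modes do not stick (each code affects only its own line).

```nim
import std/strutils

func applyMarginCodes(src: string): string =
  ## Hypothetical interpreter for the margin codes; not part of the PR.
  if src.len == 0:
    return ""
  var text = src
  text.removeSuffix("\n")        # avoid a phantom empty last line
  for line in text.splitLines:
    if line.len == 0:
      result.add "\n"            # blank source line: plain LF
      continue
    let content = line.substr(1) # everything after the margin column
    case line[0]
    of ' ': result.add content & "\n"
    of ':': result.add content & "\r\n"
    of '.': result.add content
    else:
      raise newException(ValueError, "unknown margin code: " & line[0])

let str = applyMarginCodes("""
 normal line with lf
:terminated by cr-lf
 c:\windows\
.no newline after this
""")

echo str
```

With non-sticky modes, c:\windows\ ends in a plain LF; a sticky variant would instead carry the CR-LF mode forward from the preceding line.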

Whether adding any such special syntax to address a borderline nuisance is justified is an open question I cannot answer.

Perhaps you can think of something else?

stale bot commented Aug 24, 2020

This pull request has been automatically marked as stale because it has not had recent activity. If you think it is still a valid PR, please rebase it on the latest devel; otherwise it will be closed. Thank you for your contributions.

stale bot added the stale label on Aug 24, 2020
stale bot removed the stale label on Sep 7, 2020

Tronic (Author) commented Sep 7, 2020

Rebased on top of current nim-lang/devel branch.
