[Req] Public api for compile-time codegen? #2205

Closed
ig-sinicyn opened this issue Apr 23, 2015 · 9 comments
Labels
Area-Compilers, Area-Language Design, Concept-API, Feature Request, New Language Feature - Replace/Original
@ig-sinicyn

Hi!

The metaprogramming feature has been discussed multiple times already (e.g. #98 and #2136), but all of the discussions so far have ended with a "too big to fit" resolution.

At the same time there are a lot of real-world codegen tasks, such as

  • resgen.exe (.resx files).
  • SettingsSingleFileGenerator (.settings files).
  • code behind generation for various designers (ASP.NET, WPF, WF, Winforms...).
  • code contracts code rewriter.
  • etc etc etc

Each of these uses its own code generation approach, but all of them work fine: nothing fails and no one gets hurt. So, let's start with the assumption that adding one more way to apply some build-time magic will not break anything.

Now, there are a lot of issues that are not related to the C# syntax and are too specific to be supported directly by the C# compiler. To name a few: #1677 (support for INPC), dependency properties, #105 (easy Equals() implementation) and so on. It seems that allowing custom code rewriters to be plugged in is the only solution that can be safely implemented on the C# side.

All of these issues fit neatly within the following restrictions:

  • No cross-dependencies. Code rewriters can usually be applied in arbitrary order, and none of them will introduce places to be rewritten by the others. So, compilation should not take much more time than it does now (at least I hope so :) )
  • The minimal scope of rewriting is a type member. No need to tamper with the C# syntax; all transformations can be described by attribute annotations only.
  • No complex codegen. Most custom rewriters will replace an empty method body with a hardcoded implementation or will add a new type, property or field, nothing more. No await-style state machine translations, no chained rewriters, no cases where applying a rewriter to a particular method leads to rewriting all of its callees.
  • No new concepts. Code rewriting is meant to deal with code that is too dumb to be written by hand. There will be no macro-style "write your own for if you do not like the standard one" approach, no template-based magic to imitate algebraic types, no need for a metalanguage and so on. Just write the code that automates writing some more code for you, that's all.
  • Rewriting can be done via the Roslyn API. Actually, you can apply rewriters using Roslyn right now, but there is no official way to embed the rewriters into the build pipeline. There are some starting points on SO, but these hacks are not officially supported and are meant to be used at your own risk.

In summary:

  1. Most of the "metaprogramming, please!" requests can be covered with PostSharp-like code rewriters.
  2. The biggest part of the work is done already. The Roslyn API is here; there's just no easy way to plug in custom rewriters.
  3. Adding support for custom rewriters should not be a fundamental problem. Roslyn supports custom code analyzers already, so why not allow rewriters too?
  4. There will be no versioning issues. Each assembly could use its own set of code rewriters, and replacing one implementation with another will not break dependent code (at least as long as the rewriters do not mess with the assembly's public API).

So, what have I missed? :)

@VSadov added the Concept-API and Feature Request labels Apr 23, 2015
@AdamSpeight2008
Contributor

I think it should be easier to express the code directly, rather than building the syntax tree node by node.
The compiler itself could be used to help (see #174).

@ig-sinicyn
Author

@AdamSpeight2008
Well, I'm not sure that adding macros is the right way to go.

As for me, it is too big and controversial a feature to fit easily into C#.
Also, the main benefit of macros (no need to construct the code tree manually) is perfectly covered by

var stubCode = SyntaxFactory.ParseExpression(@"
public StubType StubProperty
{
  get { return stubField; }
  set
  {
    stubField = value;
    OnPropertyChanged(StubPropertyName);
  }
}");

All you have to do is replace the stubs with actual nodes. And if you're too lazy, it can be done using string interpolation:

// Literal braces must be doubled inside an interpolated string.
var stubCode = SyntaxFactory.ParseExpression($@"
public {typeName} {propertyName}
{{
  get {{ return {fieldName}; }}
  set
  {{
    {fieldName} = value;
    OnPropertyChanged({propertyName});
  }}
}}");

That's all.
If there are more real benefits to the macro approach, I'd be glad to hear them.

Now, here is what's wrong with macros:

  1. It's viral. Adding macros will significantly change the way code is written, as LINQ, await and generics did.
    So, there will be no chance to fail, throw the mistakes out and redo it all over again.
    It's going to be an extremely hard task, and 'extremely' is an understatement here.
    Look at Nemerle, a 'macros-all-around' language. Despite a very good, clean and versatile design, in Nemerle 2.0 they had to redo the entire macro system from scratch. Now it is based on a PEG grammar, and it turns out that the N2.0 core is evolving into a DSL generator on steroids. Too big a change for a general-purpose language, I guess.
    Sadly, there's not much public information about Nemerle in English. As far as I know, the core of the Nemerle community is Russian-speaking developers, and the Nemerle 2.0 design is at an early stage for now. So I'm afraid there is no detailed description in English.
  2. It's viral-2. With macros/templates it's too easy to introduce new template types and language constructs that have to be used in dependent assemblies too. So, welcome to version hell, collector's edition. Yes, I know, theoretically no one will do that. In practice, they will :(
  3. It's complex. Using rewriters guarantees that the minimal scope is a type member. With macros you can tune anything, starting with a single expression. This means circular dependencies, the order of application matters, surprisingly odd side effects like this and so on.
  4. It's too easy to use. Look at the Roslyn feature request discussions. The specification for the "seems-to-be-simple" string interpolation feature was significantly changed at least twice. Here's the v5 edition of the nameof operator. The "simpler" params IEnumerable and digit separators were not implemented at all.
    Yes, designing a new language feature is hard. Despite all efforts there will always be some corner cases where the proposed design does not fit. Now imagine that you had to support code filled with every badly designed feature you can think of. What, "no one will do that" again?
    Well, why then spend time on a feature the majority will not be able to use the right way?
  5. It's costly. I'm talking about the cost of adding and supporting macros, not about using a macro that was written by someone else. You would need to invent a totally new set of use cases, there would be significant changes in the Roslyn infrastructure, and tooling support would take a lot of effort, too. Compared with code rewriters, it's like "plain switch" vs "full-featured pattern matching".

As a summary: code rewriters offer an almost ideal balance between developer qualification, scenarios supported and cost of support. As for me, there's no place for macro-driven metaprogramming in C#. However, I would love to hear if I'm wrong.

@ilmax

ilmax commented Apr 24, 2015

I would like to upvote this proposal; it seems like a natural addition to Roslyn and could allow a broader set of features without changing the language itself. There are tons of possible use cases: perf improvement of serialization and AOP, just to name a few.

@AdamSpeight2008
Contributor

@ig-sinicyn wrote:
Also, the main benefit from macros - no need to construct code tree manually - is perfectly covered by

var stubCode = SyntaxFactory.ParseExpression(@"
public StubType StubProperty
{
  get { return stubField; }
  set
  {
    stubField = value;
    OnPropertyChanged(StubPropertyName);
  }
}")

All you have to do is replace stubs with actual nodes. And if you're too lazy, it can be done using string interpolation,

You're right, you can do it using a string and then converting it at runtime. Therein lies an issue: you don't get compile-time type-checking of that section of code. My proposal / idea is to have compile-time checked templates (like C++).

@ig-sinicyn
Author

@AdamSpeight2008
Emm... I'm not talking about runtime:)

SyntaxFactory.ParseExpression() is a common shortcut to get a correct syntax tree without doing something like this. Once you've got a valid AST, it's relatively easy to use it to replace the code you need to rewrite. Here's a Roslyn code fix sample using the same approach.

So, the code above is meant to be executed at compile time, as an additional compilation step. No runtime magic at all.
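
To make the idea concrete, here is a minimal sketch of such a rewriter, built only on the existing Roslyn syntax API (it needs the Microsoft.CodeAnalysis.CSharp package). It is an illustration under assumptions, not an official extension point: the [Notify] attribute, the NotifyRewriter class, the "_name" field convention and the OnPropertyChanged call are all made up for the example. Since a property is a member declaration rather than an expression, the sketch parses a small stub class with CSharpSyntaxTree.ParseText and takes its members.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical rewriter: expands auto-properties marked with a made-up
// [Notify] attribute into a backing field plus a notifying property.
public sealed class NotifyRewriter : CSharpSyntaxRewriter
{
    // Rebuilds the member list of each top-level class (nested classes are
    // not visited in this sketch).
    public override SyntaxNode VisitClassDeclaration(ClassDeclarationSyntax node)
    {
        var members = new List<MemberDeclarationSyntax>();
        foreach (var member in node.Members)
        {
            if (member is PropertyDeclarationSyntax property && IsMarked(property))
                members.AddRange(Expand(property));
            else
                members.Add(member);
        }
        return node.WithMembers(SyntaxFactory.List(members));
    }

    private static bool IsMarked(PropertyDeclarationSyntax property) =>
        property.AttributeLists
            .SelectMany(list => list.Attributes)
            .Any(attribute => attribute.Name.ToString() == "Notify");

    private static IEnumerable<MemberDeclarationSyntax> Expand(PropertyDeclarationSyntax property)
    {
        string type = property.Type.ToString();
        string name = property.Identifier.Text;
        string field = "_" + char.ToLowerInvariant(name[0]) + name.Substring(1);

        // Literal braces are doubled inside an interpolated string.
        string source = $@"class Stub
{{
    private {type} {field};
    public {type} {name}
    {{
        get {{ return {field}; }}
        set {{ {field} = value; OnPropertyChanged(nameof({name})); }}
    }}
}}";
        // Parse the stub class and hand back its members (field + property).
        var stub = CSharpSyntaxTree.ParseText(source).GetRoot()
            .DescendantNodes().OfType<ClassDeclarationSyntax>().Single();
        return stub.Members;
    }
}

public static class Program
{
    public static void Main()
    {
        var tree = CSharpSyntaxTree.ParseText(
            "class Person { [Notify] public string Name { get; set; } }");
        SyntaxNode rewritten = new NotifyRewriter().Visit(tree.GetRoot());
        Console.WriteLine(rewritten.NormalizeWhitespace().ToFullString());
    }
}
```

The missing piece is exactly what this issue asks for: a supported way to run something like NotifyRewriter inside the build, between parsing and emit.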

If you want to prove that a rewriter/macro is working correctly, you have to write tests anyway. There are always some errors that cannot be caught by the compiler.

@AdamSpeight2008
Contributor

@ig-sinicyn The example from the first link would look something like this using a template function.

template syntaxnode FormWithTicker( string ns , string fn ) 
{
  namespace %{ns}%
  {
    public class %{fn}%() : System.Windows.Forms.Form
    {
      public System.Windows.Forms.Timer Ticker
      {
        get; set;
      }

      [STAThread]
      public void Main()
      {

      }
    }
  }
}

The compiler would / should check at compile-time that the structure of the code is correct.

@ig-sinicyn
Author

Well, from my point of view it is useless to check the template itself.

As an example, template code can derive from a type not referenced by the template library. It could be derived from another template. It could use mixin-style template combinations. Or use members of a type passed as a template argument. And so on and so on.

So, the only way to check a template is to apply it to real code. And at that point there will be no difference between macros, code templates and a classical code rewriter.

Also, did you notice that your template could easily be replaced with a usual base class?
Having multiple ways to do something usually means that there's something wrong with the language design. One more point for code rewriters :)

It seems that the discussion is beginning to look like a "macro vs AOP" holy war, so I'll stop for now.
Thanks, and let's wait for comments from someone on the Roslyn team :)

@russpowers

I've written a working implementation of Roslyn compiler plugins similar to this, along with attribute macros as described. Take a look at #3248.

@gafter
Member

gafter commented Nov 20, 2015

In progress in #5561.


6 participants