An attempt to replace my old regex-based comment formatting found in CodeMaid.
The original regular-expression-based solution is very inflexible and does not separate responsibilities well. It's getting increasingly difficult to add newly requested features and fix reported bugs. It uses the system XML parser under the hood, which copes poorly with broken XML.
Last but not least, rebuilding from scratch is always more fun.
This version attempts to bring the following changes:
- Use the Roslyn compiler to locate comments.
- Parse comment text into a tree representation rather than pattern scanning with regular expressions.
- Allow more configuration of output style.
- Use the `System.Span<T>` struct to optimize performance (is this even compatible with a Visual Studio plugin?).
- Make a separate CLI version that can be run outside of Visual Studio.
The project is still in a very early development stage. Everything is likely to change around many times before getting anywhere near production ready.
Not thought out yet. Should use Roslyn, since it already knows which parts of a file are comments.
- The `TokenReader` turns the input string into a stream of `Token`s. Tokens denote special characters and words.
- The `Lexer` turns the stream of `Token`s into building blocks. So far, it recognizes `Xml` and `Text`. It should be able to deal gracefully with malformed XML tags.
- The `Parser` turns the building blocks from the `Lexer` into a tree representation of the comment.
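To make the first two stages more concrete, here is a rough sketch of how a token reader and lexer could fit together. It is written in Python purely for brevity (the project itself is C#), and every name in it (`Token`, `Block`, `read_tokens`, `lex`, the token rule) is a hypothetical stand-in, not the project's actual API. The only behaviour it assumes from the description above is that tokens are words and special characters, that the lexer emits `Xml` and `Text` blocks, and that a malformed tag falls back to plain text instead of raising an error.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    kind: str  # "word", "space", or "special" (hypothetical categories)
    text: str


@dataclass
class Block:
    kind: str  # "xml" or "text"
    text: str


# Assumed token rule: whitespace runs, single special characters, or words.
# Every input character matches exactly one alternative, so nothing is dropped.
_TOKEN_RE = re.compile(r'(?P<space>\s+)|(?P<special>[<>/="])|(?P<word>[^\s<>/="]+)')


def read_tokens(source: str) -> list[Token]:
    """Split the comment text into a flat list of tokens."""
    return [Token(m.lastgroup, m.group()) for m in _TOKEN_RE.finditer(source)]


def lex(tokens: list[Token]) -> list[Block]:
    """Group tokens into xml and text blocks; malformed tags stay text."""
    blocks: list[Block] = []
    text_run: list[str] = []

    def flush_text() -> None:
        if text_run:
            blocks.append(Block("text", "".join(text_run)))
            text_run.clear()

    i = 0
    while i < len(tokens):
        if tokens[i].text == "<":
            # Look ahead for the closing '>' without crossing another '<'.
            j = i + 1
            while j < len(tokens) and tokens[j].text not in ("<", ">"):
                j += 1
            looks_like_tag = (
                j < len(tokens)
                and tokens[j].text == ">"
                and j > i + 1
                and tokens[i + 1].kind != "space"
            )
            if looks_like_tag:
                flush_text()
                blocks.append(Block("xml", "".join(t.text for t in tokens[i:j + 1])))
                i = j + 1
                continue
        # Anything else, including a stray '<', is kept as plain text.
        text_run.append(tokens[i].text)
        i += 1
    flush_text()
    return blocks


blocks = lex(read_tokens("Returns <c>true</c> if a < b."))
print([(b.kind, b.text) for b in blocks])
```

In this sketch the stray `<` in `a < b` ends up inside a text block, and joining the text of all blocks gives back the input string unchanged.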
At this point, we have a tree representation of the comment without any loss of information, i.e. the original comment can be 100% reconstructed from the tree.
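One way to get this lossless property is to have every node keep the raw text it was built from. The sketch below illustrates the idea, again in Python for brevity and with hypothetical names (`TextNode`, `ElementNode`, `parse`, `to_source`); it is an assumption about how such a tree could look, not the project's actual design. Blocks are passed in as plain `(kind, text)` pairs so the example stands on its own.

```python
import re
from dataclasses import dataclass, field


@dataclass
class TextNode:
    text: str

    def to_source(self) -> str:
        return self.text


@dataclass
class ElementNode:
    open_tag: str                 # raw text, e.g. '<param name="x">'
    close_tag: str = ""           # raw text; empty if the tag was never closed
    children: list = field(default_factory=list)

    def to_source(self) -> str:
        inner = "".join(child.to_source() for child in self.children)
        return self.open_tag + inner + self.close_tag


def tag_name(tag: str) -> str:
    """Extract the element name from a raw opening or closing tag."""
    m = re.match(r"</?\s*([\w.:-]+)", tag)
    return m.group(1) if m else ""


def parse(blocks) -> ElementNode:
    """Build a tree from (kind, text) blocks without discarding any text."""
    root = ElementNode(open_tag="")  # synthetic root with no tag text
    stack = [root]
    for kind, text in blocks:
        if kind == "text":
            stack[-1].children.append(TextNode(text))
        elif text.startswith("</"):
            if len(stack) > 1 and tag_name(stack[-1].open_tag) == tag_name(text):
                stack.pop().close_tag = text
            else:
                # Stray closing tag: keep it verbatim as text.
                stack[-1].children.append(TextNode(text))
        elif text.endswith("/>"):
            # Self-closing tag: a leaf element.
            stack[-1].children.append(ElementNode(open_tag=text))
        else:
            node = ElementNode(open_tag=text)
            stack[-1].children.append(node)
            stack.append(node)
    return root


blocks = [
    ("xml", "<summary>"), ("text", "Gets the "),
    ("xml", "<c>"), ("text", "value"), ("xml", "</c>"),
    ("text", "."), ("xml", "</summary>"), ("xml", "</oops>"),
]
tree = parse(blocks)
assert tree.to_source() == "".join(text for _, text in blocks)
```

Because unclosed elements simply keep an empty `close_tag` and stray closing tags are stored as text, the round trip holds for malformed input as well. Any formatting step would then work on the tree and produce new output, while the original stays recoverable.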
Not thought out yet.
Not thought out yet.
There's work to do! Maybe this turns into something, maybe it doesn't. Feel free to comment, clone, copy, or all of the above.