Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: TypeScript compile errors with mutation switching #2438

Closed
nicojs opened this issue Aug 28, 2020 · 17 comments · Fixed by #2446
Closed

Discussion: TypeScript compile errors with mutation switching #2438

nicojs opened this issue Aug 28, 2020 · 17 comments · Fixed by #2446
Labels
help wanted Extra attention is needed
Milestone

Comments

@nicojs
Copy link
Member

nicojs commented Aug 28, 2020

See #1514 .

In a mutation switching world, we will create (typescript/flow) type errors. This is by design; it would be almost impossible to prevent it (we would need a team and community at least as large as https://github.com/Microsoft/TypeScript).

I don't think flow will be an issue because flow code is transpiled without type checking (using a babel plugin). However, TypeScript is a different beast all together..

There are a couple of approaches I've tried:

Attempt 1: Transpile only

Running the typescript transpiler with configuration similar to ts-node's --transpile-only.

However, after a little investigation, I learned that transpileOnly only works with --isolatedModules. This means that only a subset of typescript will be supported (although a best practice IMHO). We can't rely on this for all our users.

The same goes for skipping type checking by using babel; it won't support all use cases.

Attempt 2: Skip Typechecking

With a little fudging in the TypeScript compiler api you can get a long way. It would allow us to basically transpile all typescript projects while skipping type checks. See microsoft/TypeScript#29651 (comment) for an example.

However, this approach falls short when you consider more typescript use cases. Calling the TypeScript compiler directly is just one of the ways typescript compilation might take place. A non-extensive list of ways to compile typescript:

  • ts-node
  • angular-cli
  • vue-cli
  • react-scripts
  • karma-typescript
  • webpack with ts-loader
  • webpack with awseome-ts-loader
  • jest-ts
  • rollup with typescript plugin
  • Probably 100s more of tools/plugins

Solution: // @ts-nocheck

The solution I ended up with is to use the // @ts-nocheck directive on top of each file (js or ts, since js files can also result in type errors). It works from TypeScript 3.7 onward and will work with all use cases (as long as you're using a fairly recent TypeScript version).

Note that errors can also be produced by test files, even though they're not mutated. Mutating production code can change the return type, for example function foo() { return 'bar'; } is mutated to function foo() { activeMutant === 1? {} : return 'bar'; }, which changes it's return type from string to string | undefined, which probably results in a type error somewhere in your test files

Problem 1: Still type checking...

However, this presented me with another issue. Even with @ts-nocheck, a file can still provide type errors, this is because other directives can result in errors, for example @ts-expect-error, but @ts-check has the same result.

That's why I decided to remove all comments from JS and TS files. We're using the strip-comments package for that, which uses regular expressions to identify the comments.

You can configure this with 2 settings:

  "sandbox": {
    "stripComments": "**/*+(.js|.ts|.cjs|.mjs)?(x)",
    "fileHeaders": {
      "**/*+(.js|.ts|.cjs|.mjs)?(x)": "/* eslint-disable */\n// @ts-nocheck\n"
    }
  }

(see readme for specifics)

Problem 2: significant comments

I assumed stripping comments wouldn't cause a problem, because comments should be comments. However, this is JavaScript we're talking about. This is a list of comments that are significant and we should actually keep if possible:

Personally, I feel like removing all comments from all files is a shotgun approach. Simply removing only @ts-xxx comments would be better, but how would we do that? I don't want to be parsing all input files (seriously, it is already a pain to parse only the files to be mutated) and strip-comments doesn't support comment filtering.

Problem 3: strip-comments has known issues

We've recently found an issue with the strip-comments package where it leaves a file in an invalid state. If this would happen to someone than the only thing he/she can do is to disable the removal of comments altogether (with "sandbox.stripComments": false) or only for that file specifically.

Problem 4: incompatible scenario with jest.

Even though users can customize to fit their use case with stripComments and fileHeaders, we still don't support all scenarios. For example, running @stryker-mutator/jest-runner together with the jest-ts jest plugin might result in a situation where you don't want to strip comments (because of @jest-environment), but need to strip comments because you're using @ts-expect-error somewhere in your tests.

Any input would be very helpful 😇

@hugo-vrijswijk @brodybits @Lakitna

@nicojs nicojs added the help wanted Extra attention is needed label Aug 28, 2020
@nicojs nicojs added this to the 4.0 milestone Aug 28, 2020
@brodycj

This comment has been minimized.

@brodycj
Copy link

brodycj commented Aug 28, 2020

updated:

I suspect that we could avoid the TypeScript compile errors by injecting @ts-ignore for each mutation. We may have to be smart to get @ts-ignore on the line right before the mutation.

Alternative from my previous comment is to consider a compile error the same as a killed mutation, but I am thinking it would be nicer if Stryker would help identify these untested mutations as well.


Adding this reference to help people like myself who are still not super experts with the TypeScript compiler: https://www.typescriptlang.org/docs/handbook/release-notes/typescript-2-6.html#suppress-errors-in-ts-files-using--ts-ignore-comments

@nicojs
Copy link
Member Author

nicojs commented Aug 28, 2020

Thanks for the response 🤗

What if we would just consider it a killed mutation if it fails with a TypeScript compile error?

That's basically how it used to work (a Compile error to be precise). However, this cannot be done anymore with mutation switching, since all mutants are placed in the code at once. That would mark all mutants as a compile error 🤷‍♂️.

Note, the @stryker-mutator/typescript-checker can be enabled to validate each mutant for compile errors, but that's after mutation switching takes place

Or could we inject @ts-ignore when applying each mutation?

That would not solve "Problem 1", tests can still fail to compile.
EDIT: also type errors can still happen in the mutated file as well. @ts-ignore only ignores errors in the direct line below. If a mutant creates an error on a different line, it will not be ignored.

Seriously, @ts-nocheck is your friend. It works almost in all scenario's (except for the problems I've mensioned of course 😢)

@nicojs
Copy link
Member Author

nicojs commented Aug 29, 2020

At this point, I'm contemplating implementing a simple parser that can distinguish between code and comments and use that to strip out only the @ts-xxx comments.

I don't see any other way.

@brodycj
Copy link

brodycj commented Aug 30, 2020

Why not just use Babel, TypeScript, or @typescript-eslint/typescript-estree to parse the code and look for the comments to strip out?

@nicojs
Copy link
Member Author

nicojs commented Aug 30, 2020

Ok, I've spent a couple of hours trying to write a custom parser to remove the @ts-xxx directives in comments, but I eventually gave up. There is no easy way of detecting code vs comments without actually parsing (most) nodes in the AST.

A couple of examples to illustrate why:
// @ts-check <-- this is a comment that needs to be removed
foo.bar();

foo.bar("foo \"/* @ts-expect-error */"); // <--No remove

const regex = / \/*@ts-check */; // <-- No remove, regex literal

const a = b / c; /*@ts-check */ // < -- yes remove, first `/` didn't start a regex literal

const b = <HelloJsx>
  // @ts-check {/* <-- no remove, jsx element */}
</HelloJsx>

I think we can still try this approach, but we should just rely on the typescript parser itself.

The idea I now have is that it would be disabled by default, and needed to be enabled using the sandbox.removeTSDirectives glob expression.

Parser code using the TypeScript compiler api
function removeTSDirectives2(file: File): string {
  const directives: TSDirective[] = [];

  const sourceFile = ts.createSourceFile(file.name, file.textContent, ts.ScriptTarget.Latest);

  const leadingCommentRanges = new Set<number>();
  const trailingCommentRanges = new Set<number>();

  function collectDirectives(node: ts.Node) {
    if (!leadingCommentRanges.has(node.pos)) {
      leadingCommentRanges.add(node.pos);
      ts.getLeadingCommentRanges(file.textContent, node.pos)?.forEach(({ pos, end }) => tryParseTSDirective(pos, end));
    }
    if (!trailingCommentRanges.has(node.end)) {
      trailingCommentRanges.add(node.end);
      ts.getTrailingCommentRanges(file.textContent, node.end)?.forEach(({ pos, end }) => tryParseTSDirective(pos, end));
    }
    ts.forEachChild(node, collectDirectives);
  }

  sourceFile.forEachChild(collectDirectives);

  let currentIndex = 0;
  let textWithoutTSDirectives = '';
  for (const directive of directives) {
    textWithoutTSDirectives += file.textContent.substring(currentIndex, directive.range[0]);
    currentIndex = directive.range[1];
  }
  textWithoutTSDirectives += file.textContent.substr(currentIndex);
  return textWithoutTSDirectives;

  function tryParseTSDirective(commentStartPos: number, commentEndPos: number): void {
    const commentDirectiveRegEx = /^(..\s*)@(ts-[a-z-]+).*$/;
    const commentText = file.textContent.substring(commentStartPos, commentEndPos);
    const match = commentDirectiveRegEx.exec(commentText);
    if (match) {
      const directiveStartPos = commentStartPos + match[1].length;
      directives.push({
        range: [directiveStartPos, directiveStartPos + match[2].length + 1],
      });
    }
    return undefined;
  }
}
Custom parser starter code (for reference)
enum CommentKind {
  SingleLine,
  MultiLine,
}

export function removeTSDirectives(text: string): string {
  const directives = findTSComments(text);
  let currentIndex = 0;
  let textWithoutTSDirectives = '';
  for (const directive of directives) {
    textWithoutTSDirectives += text.substring(currentIndex, directive.range[0]);
    currentIndex = directive.range[1];
  }
  textWithoutTSDirectives += text.substr(currentIndex);
  return textWithoutTSDirectives;
}

interface TSDirective {
  range: Range;
}

function findTSComments(text: string): TSDirective[] {
  let pos = 0;
  const directives: TSDirective[] = [];
  while (pos < text.length) {
    switch (currentCharCode()) {
      case CharacterCodes.slash:
        const nextChar = text.charCodeAt(pos + 1);
        if (nextChar === CharacterCodes.slash || nextChar === CharacterCodes.asterisk) {
          scanComment();
        } else {
          scanRegex();
        }
        break;
      case CharacterCodes.doubleQuote:
      case CharacterCodes.singleQuote:
        scanString();
        break;
      case CharacterCodes.backtick:
        scanTemplate();
        break;
      default:
        pos++;
    }
  }
  return directives;

  function currentCharCode() {
    return text.charCodeAt(pos);
  }

  function scanTemplate(): number {
    pos++;
    while (pos < text.length) {
      if (currentCharCode() === CharacterCodes.backtick) {
        pos++;
        break;
      }

      // '${'
      if (currentCharCode() === CharacterCodes.$ && pos + 1 < text.length && text.charCodeAt(pos + 1) === CharacterCodes.openBrace) {
        scanTemplateSpan();
        break;
      }

      pos++;
    }
    return pos;
  }

  function scanTemplateSpan(): void {
    pos += 2;
    while (pos < text.length) {
      const ch = currentCharCode();
      if (ch === CharacterCodes.closeBracket) {
        pos++;
        break;
      }
      if (ch === CharacterCodes.doubleQuote || ch === CharacterCodes.singleQuote) {
        scanString();
      } else if (ch === CharacterCodes.backtick) {
        scanTemplate();
      } else if (ch === CharacterCodes.slash) {
        scanRegex();
      } else {
        pos++;
      }
    }
  }

  function scanRegex(): void {
    pos++; // skip `/`
    let inCharacterClass = false;
    while (pos < text.length) {
      const ch = currentCharCode();
      if (isLineBreak(ch)) {
        // Premature ending, syntax error.
        pos++;
        break;
      } else if (ch === CharacterCodes.slash && !inCharacterClass) {
        // A slash within a character class is permissible,
        // but in general it signals the end of the regexp literal.
        pos++;
        break;
      } else if (ch === CharacterCodes.backslash) {
        // Escape next character
        pos++;
      } else if (ch === CharacterCodes.openBracket) {
        // Start a character class, for example `[a-Z]`
        inCharacterClass = true;
      } else if (ch === CharacterCodes.closeBracket) {
        // Close character class
        inCharacterClass = false;
      }
      pos++;
    }
  }

  function scanString(): void {
    const quote = currentCharCode();
    pos++;
    while (pos < text.length) {
      const ch = currentCharCode();
      if (ch === quote) {
        pos++;
        break;
      }
      if (ch === CharacterCodes.backslash) {
        pos = pos + 2;
      }
      if (isLineBreak(ch)) {
        pos++;
        break;
      }
      pos++;
    }
  }

  function scanComment(): void {
    const startPos = pos;
    const kind = text.charCodeAt(pos + 1) === CharacterCodes.slash ? CommentKind.SingleLine : CommentKind.MultiLine;
    pos += 2;
    switch (kind) {
      case CommentKind.MultiLine:
        while (pos < text.length) {
          if (currentCharCode() === CharacterCodes.asterisk && text.charCodeAt(pos + 1) === CharacterCodes.slash) {
            pos += 2;
            break;
          }
          pos++;
        }
        return;
      case CommentKind.SingleLine:
        while (pos < text.length) {
          if (isLineBreak(currentCharCode())) {
            break;
          }
          pos++;
        }
    }
    // Eureka, we've got a comment. Now find that @ts-xxx directive
    tryParseTSDirective(startPos, pos);
  }

  function tryParseTSDirective(commentStartPos: number, commentEndPos: number): void {
    const commentDirectiveRegEx = /^(..\s*)@(ts-[a-z-]+).*$/;
    const commentText = text.substring(commentStartPos, commentEndPos);
    const match = commentDirectiveRegEx.exec(commentText);
    if (match) {
      const directiveStartPos = commentStartPos + match[1].length;
      directives.push({
        range: [directiveStartPos, directiveStartPos + match[2].length + 1],
      });
    }
    return undefined;
  }
}

@nicojs
Copy link
Member Author

nicojs commented Aug 30, 2020

Why not just use Babel, TypeScript, or @typescript-eslint/typescript-estree to parse the code and look for the comments to strip out?

Yeah, I guess there is no other way. Will implement something this week. It should replace the sandbox.stripComments option.

@brodycj
Copy link

brodycj commented Aug 30, 2020

I am thinking there should be a pretty simple way to remove the unwanted comments using Babel. Babel supports a comments: false option, and it should be very easy to use @babel/traverse to remove the comments (all comments) like this (not tested):

using @babel/traverse to remove all comments
import { traverse } from '@babel/traverse'
import types from '@babel/types'

traverse(ast, {
  enter(path) {
    types.removeComments(path.node)
  }
})

ref:

While this may not be as refined as using tsc to only strip comments matching the pattern, I would favor using Babel over using multiple parsers in the Stryker Mutator.

I can think of a couple possible approaches:

  • Babel approach 1: use Babel twice: first for preprocessing and second in instrumentor for mutating
  • Babel approach 2: do the parsing on each of the files, in a separate step in packages/core/src/process, save the parsed ASTs in the context, and use the parsed AST to both remove the comments and apply the mutators

I think Babel approach 2 would be more elegant but would have a couple of major caveats:

  • The AST type, which would be HtmlAst | JSAst | TSAst, would affect the Stryker context;
  • More memory would be needed to store the AST for each source file. It should be possible to drop the AST context for each source from memory upon computation of the mutations, but this may not be so trivial.

I hope to give this a try sometime, no promise when due to some other priorities.

@nicojs
Copy link
Member Author

nicojs commented Aug 30, 2020

@brodybits great minds think alike 😎. I reached the same conclusion.

I'm opting for approach 1 for now. I'm adding the functionality to the instrumenter package. That way, we can keep any direct dependency on a specific parser out of Stryker core. It is called directly from the preprocessor.

I think the total performance impact might not be too bad because I'm thinking only to parse the file if a @ts-xxx is found somewhere in the code. Most files won't contain TS code, so those files are never parsed 🚅.

I also think of naming the option sandbox.disableTypeChecking. It will combine the removal of // @ts-xxx directives with the inserting of the // @ts-nocheck comment. I think this will allow for much better user experience.

An additional advantage is that this approach will also work for vue files that contain type checking (using the vue cli and some webpack magic) because the instrumenter can also parse HTML and vue files.

I think '**/*+(.js|.ts|.html|.vue)?(x)' is a valid default for most use cases.

I hope this solution is a good compromise. It is unfortunate, we have to implement it, but this will allow for the seamless use of Stryker in most projects.

@brodycj
Copy link

brodycj commented Aug 30, 2020

@nicojs I think that makes the most sense in general. More simple and probably less messy than what I had proposed.

I would also favor that the CLI adds the sandbox option when generating the config, and tells the user what is going on.

'**/*+(.js|.ts|.html|.vue)?(x)'

How about this: **/*.{js,ts,html,vue}?(x)

It should have the same meaning, but simplified a little bit.

@nicojs
Copy link
Member Author

nicojs commented Aug 30, 2020

Follow the work in progress here: #2446

I've tested it on stryker core and, even without the regex I proposed, I don't see a negative performance impact. And more importantly, it works like a charm! I'm really happy with this result. 😀

I think that makes the most sense in general

Glad you agree. 🙏 And thanks for the help figuring it out.

How about this: **/*.{js,ts,html,vue}?(x)

Sure. Maybe it's even better to use **/*.{js,jsx,ts,tsx,html,vue}, since htmlx and vuex don't make sense and listing them all might be even more clear.

@brodycj
Copy link

brodycj commented Aug 30, 2020

I think the total performance impact might not be too bad because I'm thinking only to parse the file if a @ts-xxx is found somewhere in the code. Most files won't contain TS code, so those files are never parsed 🚅.

I am not sure if I understand that comment. It looks to me like the changes in PR #2446 would happen in all files that match the pattern. Am I missing something here?

@nicojs
Copy link
Member Author

nicojs commented Aug 31, 2020

Yeah, that's why it's a draft PR 🙈

Hang on. Will ask you to review when it is done.

@Lakitna
Copy link
Contributor

Lakitna commented Aug 31, 2020

Just to get my head around the details:

Currently, you are first instrumenting for Stryker, and then instrumenting for Typescript. This is why you need to work with the @ts-xxx stuff.

The solution you have been discussing, while I was having a quiet weekend :), will: 1. Instrument for Stryker; 2. Preprocess for Typescript (@ts-xxx comment wrangling); 3. Instrument for Typescript

I was thinking about instrumenting the other way around. First, you instrument for Typescript and then for Stryker. This would remove the need for comment fiddling altogether. I guess you don't do this because of mutations inserted by the TS instrumentation? Source maps might be able to help you a bit here, but I'm not sure.

@nicojs
Copy link
Member Author

nicojs commented Aug 31, 2020

Almost correct @Lakitna :)

This is the process:

https://github.com/stryker-mutator/stryker/blob/9c846bf618af487815074e3b49d674cf1786cdcd/packages/core/src/process/2-MutantInstrumenterExecutor.ts#L30-L38

  1. Instrument the code (works for JS, TS files as well as HTML/Vue files (script tags are instrumented)).
  2. "Preprocess" the files. Disable type checking is one of the preprocessors

We might open up the "preprocess" API in the future, but for now, we're keeping it private. Public API's are a hassle to maintain.

To disable type checking, we will need to parse the file, but performance impact is kept to a minimum because it will only parse when "@ts-" can be found anywhere in the file 😎

nicojs added a commit that referenced this issue Sep 2, 2020
Disable type checks by inserting `// @ts-nocheck` and removing all other `// @ts-xxx` directives. 

It can be configured with:

```js
{
  "disableTypeChecks": "{test,src,lib}/**/*.{js,ts,jsx,tsx,html,vue}" // configure a pattern or `false` here
  "warnings": {
    "preprocessorErrors": true // enable/disable error logging here (default: true)
  }
}
```

You can disable it completely by setting `disableTypeChecks` to `false`. The default is `"{test,src,lib}/**/*.{js,ts,jsx,tsx,html,vue}"`.

The disabling of type checks is done in the `DisableTypeChecksPreprocessor` class. It calls the function in the `instrumenter` package to do the actual disabling. The reason for this is that files need to be parsed to distinguish between code and comments (`@ts-xxx` directives are only removed when they appear inside comments). It also works with HTML and vue files, so it also supports use cases where type checking is done inside vue files (with the vue-cli). 

Whenever parsing a file results in an error, a warning is logged and mutation testing will still proceed. The warning can be disabled by setting `warnings.preprocessorErrors` to `false`.

This setting replaces both `sandbox.fileHeaders` and `sandbox.stripComments`, since those resulted in a lot of problems (see #2438 ).
@nicojs
Copy link
Member Author

nicojs commented Sep 2, 2020

THIS ISSUE IS FINALLY CLOSED AND I'M VERY HAPPY WITH THAT!!!!

😎

@Lakitna
Copy link
Contributor

Lakitna commented Sep 2, 2020

@nicojs right now:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants