Template node has incorrect range if the template contains emoji or other multi-byte characters 💩 #45
Comments
Failing test PR here: #53
The spans/offsets provided are byte offsets, not character offsets, hence the discrepancy. That's not obviously incorrect: it's easier to grab a slice of a file with byte ranges than with character ranges, and there may be good reasons for swc to make that choice (what do source maps use?). It should also be possible to convert the byte ranges into character ranges in the consumer if that is desirable.
Yeah, at a minimum we'll need to provide a byteToCharRange function, or docs or something, I think.
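A minimal sketch of what such a conversion could look like on the consumer side, assuming the ranges are UTF-8 byte offsets as described above. `byteToCharRange` here is a hypothetical helper, not something content-tag exports, and it assumes both offsets fall on character boundaries. It returns UTF-16 code unit offsets, which is what `String.prototype.slice` expects.

```javascript
// Hypothetical helper: NOT part of content-tag's API.
// Converts a { start, end } range of UTF-8 byte offsets into UTF-16
// code unit offsets usable with String.prototype.slice.
function byteToCharRange(source, byteRange) {
  const bytes = Buffer.from(source, 'utf8');
  // Decode the bytes before each offset and count the resulting code units.
  const start = bytes.subarray(0, byteRange.start).toString('utf8').length;
  const end = bytes.subarray(0, byteRange.end).toString('utf8').length;
  return { start, end };
}

const source = '<template>안녕하세요 세계</template>';
// Byte offsets of the template content, written by hand for illustration:
// '<template>' is 10 bytes and the content is 22 bytes.
const byteRange = { start: 10, end: 32 };

// Using the byte offsets directly overshoots into the closing tag.
console.log(source.slice(byteRange.start, byteRange.end)); // prints "안녕하세요 세계</template>"

const charRange = byteToCharRange(source, byteRange);
console.log(source.slice(charRange.start, charRange.end)); // prints "안녕하세요 세계"
```

Decoding the prefix twice is linear in the file size per range, which is fine for a handful of templates; a consumer converting many ranges might prefer a single pass over the source.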
I think the actual problem may be related to class vs. template-only, rather than multi-byte strings.
A demo of the problem: https://runkit.com/nullvoxpopuli/content-tag-byte-vs-char-offsets
it matters what's before and after the https://runkit.com/nullvoxpopuli/content-tag-byte-vs-char-offsets
I learned about
oh, but this is probably behavior I'm seeing because
Aahhhh ha! wasn't so bad: Using code
const { Preprocessor } = require("content-tag");
const p = new Preprocessor();
let before = `
import Component from '@glimmer/component';
import { on } from '@ember/modifier';
import { getSnippetElement, toClipboard, withExtraStyles } from './copy-utils';
import Menu from './menu';
/**
* This component is injected via the markdown rendering
*/
export default class CopyMenu extends Component {
copyAsText = (event: Event) => {
let code = getSnippetElement(event);
navigator.clipboard.writeText(code.innerText);
};
copyAsImage = async (event: Event) => {
let code = getSnippetElement(event);
await withExtraStyles(code, () => toClipboard(code));
};
`;
let open = `<template>`;
let content = `안녕하세요 세계`
let close = `</template>`;
let after = `
}
`;
let contentLength = content.length;
let openLength = open.length;
let closeLength = close.length;
// Parse the input and print the first template's range, sliced three ways.
function runAndPrint(input) {
  let output = p.parse(input);
  let r = output[0];
  let range = JSON.stringify(r.range);
  // String#slice indexes by UTF-16 code unit.
  let sliced = input.slice(r.range.start, r.range.end);
  let rLength = r.range.end - r.range.start;
  // Array.from splits the string by code point.
  let asArray = Array.from(input);
  let arraySliced = asArray.slice(r.range.start, r.range.end).join('');
  // Slicing the UTF-8 bytes treats the range as byte offsets.
  let buffer = Buffer.from(input, 'utf8');
  let bufferSliced = buffer.subarray(r.range.start, r.range.end).toString('utf8');
  console.log(`
results: ${output.length}
range: ${range}
range length: ${rLength}
slice: [${sliced}]
sliced length: ${sliced.length}
array: [${arraySliced}]
array length: ${arraySliced.length}
buffer: [${bufferSliced}]
buffer length: ${bufferSliced.length}
`);
}
// Template inside a class body.
runAndPrint(`${before}${open}${content}${close}${after}`);
// Template-only input.
runAndPrint(`${open}${content}${close}`);
version: content-tag 1.1.2
In investigating the root cause of ember-tooling/prettier-plugin-ember-template-tag#191 I discovered that content-tag is returning incorrect ranges when templates include multi-byte characters, such as emoji.
Reproduction:
For a 4-byte character:
Note that the range has gobbled up the following character(s).
Similarly, for a two-byte character:
Interestingly, it gobbles fewer characters this time.
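One possible explanation for the difference, assuming (as suggested in the comments) that the ranges are UTF-8 byte offsets being applied to a JavaScript string: each character pushes the end of the range out by its UTF-8 byte length minus its UTF-16 length. The sketch below only illustrates that arithmetic; `byteOvershoot` is a hypothetical name and nothing here calls content-tag.

```javascript
// Hypothetical helper for illustration; not part of content-tag.
// Returns how many extra UTF-16 code units a byte-based range would cover
// when it is used to slice a JavaScript string containing `text`.
function byteOvershoot(text) {
  return Buffer.byteLength(text, 'utf8') - text.length;
}

console.log(byteOvershoot('\u{1F4A9}')); // 💩: 4 bytes, 2 code units -> 2
console.log(byteOvershoot('\u00e9'));    // é: 2 bytes, 1 code unit  -> 1
console.log(byteOvershoot('a'));         // ASCII: 1 byte, 1 code unit -> 0
```

Under that assumption, a 4-byte emoji makes the range swallow two extra characters while a 2-byte character makes it swallow one.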
In the expression position, the issue is less noticeable, but still there: