Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow handing in pre-parsed AST #79

Closed
pke opened this issue Jun 11, 2020 · 13 comments
Closed

Allow handing in pre-parsed AST #79

pke opened this issue Jun 11, 2020 · 13 comments
Labels
6.x.x enhancement New feature or request

Comments

@pke
Copy link

pke commented Jun 11, 2020

For my component I'd like to extract all the text links from the markdown to build an action sheet out of the links.

So I would like to pre-parse the markdown using markdown-it for all links and then hand in that AST to the component instead of a markdown-it prop.

I want to prevent thet parsing of the markdown twice, once in my component and second in yours.

Do you think that would be a possible addition?

@iamacup
Copy link
Owner

iamacup commented Jun 12, 2020

So we could do two things I think

  1. Make a prop that accepts the output of markdownit.parse - if this exists we just use this instead of running it internally
  2. Make a prop that is the AST, which would replace the output of AstRenderer.js if present

The problem with 2 is, AstRenderer.js is a very changed file - so there is potential for lots of breaking in the future.

What do you think?

@pke
Copy link
Author

pke commented Jun 12, 2020

The output of markdownit.parse is almost what I am using now, but I prefer a less nested AST (preferably flat even) so I am using your exported tokensToAST.

I am currently using this code and it works great but has a performance hit due to double parsing:

const links = useMemo(() => {
  const markdown = MarkdownIt({
    html: false,
    typographer: false,
  })
  const ast = tokensToAST(stringToTokens(children, markdown))
  const collectLinks = (result: FinePrintLink[], node: ASTNode): FinePrintLink[] => {
    if (node.type === "link") {
      result.push({
        uri: node.attributes.href,
        text: node.children?.map(({ content }) => content).join(" "),
      })
    }
    return node.children.reduce(collectLinks, result)
  }
  return ast.reduce(collectLinks, [])
}, [children])

I would not introduce a new prop, but rather define children additionally as union with ASTNode[] or even the return value of markdownit.parse

So my call to Markdown could look like this:

const ast = tokensToAST(stringToTokens(children, markdown))
<Markdown onLinkPress={onPress} style={style}>{ast}</Markdown>

Could also be possible

const tokens = stringToTokens(children, markdown)
// same as
const tokens = MarkdownIt().parse(children)
<Markdown onLinkPress={onPress} style={style}>{tokens}</Markdown>

What do you think?

@iamacup
Copy link
Owner

iamacup commented Jul 4, 2020

Hey

Sorry about the delay here - can you take a look at https://github.com/iamacup/react-native-markdown-display/releases/tag/6.2.x and let me know if this is what you were thinking?

Supports this syntax:

const markdownItInstance = MarkdownIt({typographer: true});

const copy = `
# H1

Paragraph
`;

const ast = tokensToAST(stringToTokens(copy, markdownItInstance))
            <Markdown
              debugPrintTree
            >
              {ast}
            </Markdown>

@iamacup iamacup added the enhancement New feature or request label Jul 4, 2020
@iamacup
Copy link
Owner

iamacup commented Jul 4, 2020

released this in 6.1.5 - re-open if we need to change how it works :) Readme is updated to reflect the above

@iamacup iamacup closed this as completed Jul 4, 2020
@pke
Copy link
Author

pke commented Jul 4, 2020

Thanks, I'll test it on Monday and report back here!

@pke
Copy link
Author

pke commented Jul 5, 2020

@iamacup works like a charm! Thanks for fulfilling this feature request!

@helsonxiao
Copy link

helsonxiao commented Jul 20, 2020

Two problems in 6.1.6:

  1. Passing in AST nodes is slower than plain markdown string, it's obvious in FlatList scenario. (even the AST nodes is not modified).
  2. Image node can't be parsed in AST nodes way.

p.s.
Currently I'm using React.memo for FlatList item cache, so problem 1 is not that serious, but it's really slow.

@pke
Copy link
Author

pke commented Jul 20, 2020

@helsonxiao thanks for reporting back.

to 1st: Could you provide a source snippet how you use the AST within the FlatList? The markdown component also uses just memoization internally so maybe your AST is re-calculated over and over again because of unstable dependencies?

to 2nd: You mean images are not displayed that way? That shouldn't make any difference. Maybe a test case could be made for that.

@iamacup iamacup reopened this Jul 20, 2020
@iamacup
Copy link
Owner

iamacup commented Jul 20, 2020

just opening to keep track of this in case it turns out to be library related - thanks for handling @pke !!

@helsonxiao
Copy link

helsonxiao commented Jul 22, 2020

@helsonxiao thanks for reporting back.

to 1st: Could you provide a source snippet how you use the AST within the FlatList? The markdown component also uses just memoization internally so maybe your AST is re-calculated over and over again because of unstable dependencies?

to 2nd: You mean images are not displayed that way? That shouldn't make any difference. Maybe a test case could be made for that.

  1. I haven't measured it with Performance API but I do feel the difference. I also tried to cache the AST processor or return AST nodes directly. Note: My FlatList Item use react-native-syntax-highlighter for code display, but it's still slower than plain string mode after I removed the highlighter. If it's still confused for you, I will check your codes. Anyway, React.memo is necessary so it's not a big deal.

  2. Try this input. It works well in plain string mode.

const input = "![背包四讲.png](https://media-test.jiuzhang.com/media/markdown/images/7/16/72644746-c710-11ea-8f89-0242ac1c0003.jpg)\n\n```\n[[java]]\nasfafagagag233\n[[golang]]\nasfgsagag\n```\n\n![背包四讲.png](https://media-test.jiuzhang.com/media/markdown/images/7/16/72644746-c710-11ea-8f89-0242ac1c0003.jpg)"

image

// turn on astPreprocessor then images disappear
return (
  <Markdown /* astPreprocessor={astPreprocessor} */ rules={rules}>
    {input}
  </Markdown>
);

// here I just return the ast, nothing changed.
// I will use this method to pick up important nodes.
const astPreprocessor = (ast: ASTNode[]) => {
  return ast;
};

const Markdown: React.FC<MarkdownProps> = props => {
  const _props = {
    rules,
    style: MarkdownStyles,
    ...props,
  };

  if (props.astPreprocessor) {
    const ast = tokensToAST(stringToTokens(props.children, markdownItInstance));

    return <_Markdown {..._props}>{props.astPreprocessor(ast)}</_Markdown>;
  }

  return <_Markdown {..._props}>{props.children}</_Markdown>;
};

image

@pke
Copy link
Author

pke commented Jul 22, 2020

About the speed difference:
At which point are you using memoizsation? The result coming back from astPreprocessor should be memoized in your component based on the children.

const {
  children,
  astPreprocessor,
} = props
const content= useMemo(() => {
  astPreprocessor ? astPreprocessor(tokenToAST(stringToTokens(children, markdownItInstance)) : children
}, [children, astPreprocessor])

return <_Markdown markdownIt={markdownItInstance} {...props}>{content}</_Markdown>

Make sure you are also re-using the markdownItInstance you globally defined, or the Markdown component will create its own.

About the images I'd have to check this out in a free minute, but without knowing your styles for images and what the preprocessor actually does with the AST, its hard to predict.
So your preprocessor divides the markdown based on code fences?

@helsonxiao
Copy link

I think the markdown input is not entirely settled. stringToTokens only does the job partly. There are still two more steps from this file. https://github.com/iamacup/react-native-markdown-display/blob/master/src/lib/parser.js#L19

I copied them into my project then it works fine 😄. It could be nicer if you export them or refine the stringToTokens function. The speed confusion is also resolved. Thanks!

@fiznool
Copy link

fiznool commented Jul 31, 2020

+1 to cleaning up the necessary imports for a custom parser function. At present both cleanupTokens and groupTextTokens need to be imported from the src/lib/util folder.

However I wouldn't be in favour of consolidating the current functions into stringToTokens, as for my use case I need to do some preprocessing before groupTextTokens is called:

import {
  MarkdownIt,
  stringToTokens,
  tokensToAST,
} from 'react-native-markdown-display';
import { cleanupTokens } from 'react-native-markdown-display/src/lib/util/cleanupTokens';
import groupTextTokens from 'react-native-markdown-display/src/lib/util/groupTextTokens';

const markdownIt = MarkdownIt({
  typographer: true,
});

export default function toAST(markdownContent) {
  // Process into an AST, to support the changing of links into blocklinks.
  let tokens = stringToTokens(markdownContent, markdownIt);
  tokens = cleanupTokens(tokens);
  // It's important to blockify the links before grouping the text tokens,
  // as this will exclude the links from being grouped, which would otherwise cause issues on Android.
  tokens = blockifyLinks(tokens);
  tokens = groupTextTokens(tokens);

  return tokensToAST(tokens);
}

// We need links to be treated as 'blocklinks' so that styling works correctly.
function blockifyLinks(tokens) {
  tokens.forEach((token) => {
    if (token.type === 'link') {
      token.block = true;
    }
  });
  return tokens;
}

@iamacup iamacup closed this as completed Dec 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.x.x enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants