🔄 Support recursive includes in myst and tex #1082

Merged
merged 8 commits into from
Apr 10, 2024
5 changes: 5 additions & 0 deletions .changeset/afraid-dots-look.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
'myst-to-typst': patch
---

Add newlines after typst include/embed
5 changes: 5 additions & 0 deletions .changeset/brown-kings-yawn.md
@@ -0,0 +1,5 @@
---
'myst-cli': patch
---

Support notebooks in include directive
6 changes: 6 additions & 0 deletions .changeset/dirty-steaks-wave.md
@@ -0,0 +1,6 @@
---
'myst-transforms': patch
'myst-cli': patch
---

Support tex includes directly in myst processing
7 changes: 7 additions & 0 deletions .changeset/funny-llamas-joke.md
@@ -0,0 +1,7 @@
---
'myst-transforms': patch
'myst-cli': patch
'mystmd': patch
---

Handle circular includes with nice errors and no infinite loops
6 changes: 6 additions & 0 deletions .changeset/sour-dancers-float.md
@@ -0,0 +1,6 @@
---
'myst-transforms': patch
'mystmd': patch
---

Revive basic recursive include
16 changes: 7 additions & 9 deletions docs/code.md
@@ -86,11 +86,13 @@ project:
```
````

## Including Files
(docs:literalinclude)=

## Including Code Files

If your code is in a separate file you can use the {myst:directive}`literalinclude` directive (or the {myst:directive}`include` directive with the {myst:directive}`include.literal` flag).
This directive is helpful for showing code snippets without duplicating your content.

For parsing the file, see the documentation in [](#docs:include).
For example, a `literalinclude` of a snippet of the `myst.yml` such as:

````markdown
@@ -109,16 +111,12 @@ creates a snippet that has matching line numbers, and starts at a line including
:lineno-match:
```

:::{note} Auto Reload
If you are working with the auto-reload (e.g. `myst start`), currently you will need to save the file with the {myst:directive}`literalinclude` directive for the contents to update.
:::{important} Paths are Relative
The {myst:directive}`argument <include.arg>` of a `{literalinclude}` directive is the file path, which is relative to the file from which it was referenced.
:::

The argument of an include directive is the file path ({myst:directive}`docs <include.arg>`), which is relative to the file from which it was referenced.
By default the file will be parsed using MyST, you can also set the file to be {myst:directive}`include.literal`, which will show as a code-block; this is the same as using the {myst:directive}`literalinclude` directive.

If in {myst:directive}`include.literal` mode, the directive also accepts all of the options from the `code-block` (e.g. {myst:directive}`include.linenos`).
In {myst:directive}`include.literal` mode, the include directive also accepts all of the options from the `code-block` (e.g. {myst:directive}`include.linenos`).
To select a portion of the file to be shown, use the {myst:directive}`include.start-at`/{myst:directive}`include.start-after` selectors with the {myst:directive}`include.end-before`/{myst:directive}`include.end-at` selectors, which match a snippet of the included text.

Alternatively, you can explicitly select the lines (e.g. `1,3,5-10,20-`) or the {myst:directive}`include.start-line`/{myst:directive}`include.end-line` (which is zero based for compatibility with Sphinx).

The include directive is based on [RST](https://docutils.sourceforge.io/docs/ref/rst/directives.html#including-an-external-document-fragment) and [Sphinx](https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html#directive-literalinclude).
61 changes: 53 additions & 8 deletions docs/embed.md
@@ -1,12 +1,11 @@
---
title: Embedding Content
title: Embedding & Including Content
description: You can embed labeled content (paragraphs, figures, notebook outputs, etc) across pages, allowing you to avoid re-writing the same information twice.
---

You can embed **labeled content** (paragraphs, figures, notebook outputs, etc) across pages, allowing you to avoid re-writing the same information twice.
You can [embed](#docs:embed) **labeled content** (paragraphs, figures, notebook outputs, etc) across pages, allowing you to avoid re-writing the same information twice. You can also [include files](#docs:include) that are not in your project.

::::{seealso} Creating a Label

Embedding content requires a label to be present.
To attach labels to blocks of content, see [](./cross-references.md).
To attach labels to Jupyter Notebook content, see [](./reuse-jupyter-outputs.md)
@@ -26,6 +25,8 @@ Here's a cool figure.
:::
::::

(docs:embed)=

## The `{embed}` directive

The {myst:directive}`embed` directive can be used like so:
@@ -51,7 +52,7 @@ For example, the following references the admonitions list in [](admonitions.md)

```

## The `![](#embed)` short-hand
### The `![](#embed)` short-hand

The embedding markdown shorthand lets you quickly embed content using the Markdown image syntax (see more about [images](./figures.md)).
It can be used like so:
@@ -62,7 +63,7 @@ It can be used like so:

![](#myLabel)

## Embed images into figures
### Embed images into figures

If you have a labeled image in your documentation, you can embed it as a Figure so that it contains figure metadata (like a caption, or adding alt-text).
To do so, use a **label attached to an image** instead of a filepath.
@@ -97,17 +98,61 @@ The new label can be referred to in this context, i.e. `[@sunset-figure]`: [@sun
This figure has been included from a Jupyter Notebook and can be referred to in cross-references through a different label. See [](./reuse-jupyter-outputs.md) for more information.
```

## Embed notebook content and outputs
### Embed notebook content and outputs

You can embed notebook content (for example, images generated by running a cell).
For instructions on how to embed notebook content, see [](./reuse-jupyter-outputs.md).

(docs:include)=

## The `{include}` directive

If a portion of your content is in a separate file that is **not already included in your project** you can use the {myst:directive}`include` directive to parse and include that content.
This directive is helpful for including content snippets, such as a table or an equation, that you want to keep in a different file on disk but present as if it were one document. In addition to Markdown, MyST will also parse `.ipynb`, `.tex`, and `.html`.

:::{prf:example} Equation Bank
:label: eg:equation-bank
It is common practice to keep complex equations out of the main document so that they can be shared between slides, papers, and different renderings of a document. The `equations` folder in our documentation has a `curl.tex` file:

```{literalinclude} equations/curl.tex
:filename: equations/curl.tex
```

We can `include` that content in this document using:

````markdown
```{include} equations/curl.tex

```
````

which includes the content with the LaTeX parser:

```{include} equations/curl.tex

```

You will notice that there is no visual difference in the content; it is as if the content were included directly inline in the source document.
:::

:::{warning} Relative Paths in Markdown vs LaTeX
The {myst:directive}`argument <include.arg>` of the include directive is the file path and is **relative** to the file from which it was referenced. When working in markdown, recursive includes follow that same pattern. In LaTeX, however, recursive includes are relative to the original source file. This difference is only apparent when you have nested imports and a non-flat folder structure. MyST will also give you helpful warnings if it cannot find the file you are referencing.
:::

By default the file will be parsed using MyST. You can also set the file to be {myst:directive}`include.literal`, which will show as a code-block; this is the same as using the {myst:directive}`literalinclude` directive, which is documented in [](#docs:literalinclude).

:::{note} Auto Reload & Circular Dependencies
:class: dropdown
If you are working with the auto-reload (e.g. `myst start`), the file dependencies are auto-reloaded.
Circular dependencies are not allowed and MyST will issue a warning and not render the recursion.
:::

## `{embed}` vs. `{include}`

The {myst:directive}`include` directive is very similar to {myst:directive}`embed`, with a few key differences.

`{include}`
: parses source files (e.g. text files on your filesystem) and inserts them into the document structure as if you had written the content in your target source file.
`{include}` and `{literalinclude}`
: Parse source files (e.g. text files on your filesystem) and insert them into the document structure as if you had written the content in your target source file. These files are not listed in your project table of contents, and generally only contain snippets.

`{embed}`
: Pulls any labelled MyST content or outputs already parsed in your project.
5 changes: 5 additions & 0 deletions docs/equations/curl.tex
@@ -0,0 +1,5 @@
\begin{equation}
(~\nabla \times \vec{e}~) \cdot \mathbf{\hat{n}} \
\overset{\underset{\mathrm{def}}{}}{=}
\lim_{s \to 0}\left( \frac{1}{|s|}\oint_{c} \vec{e} \cdot d\mathbf{r}\right)
\end{equation}
10 changes: 0 additions & 10 deletions packages/myst-cli/src/process/file.ts
@@ -4,7 +4,6 @@ import { createHash } from 'node:crypto';
import { tic } from 'myst-cli-utils';
import { TexParser } from 'tex-to-myst';
import { VFile } from 'vfile';
import type { GenericParent } from 'myst-common';
import { RuleId, toText } from 'myst-common';
import { validatePageFrontmatter } from 'myst-frontmatter';
import { SourceFileKind } from 'myst-spec-ext';
@@ -18,8 +17,6 @@ import { addWarningForFile } from '../utils/addWarningForFile.js';
import { loadCitations } from './citations.js';
import { parseMyst } from './myst.js';
import { processNotebook } from './notebook.js';
import { includeDirectiveTransform } from 'myst-transforms';
import { makeFileLoader } from '../transforms/include.js';
import { selectors } from '../store/index.js';

function checkCache(cache: ISessionWithCache, content: string, file: string) {
@@ -116,13 +113,6 @@ export async function loadFile(
const { sha256, useCache } = checkCache(cache, content, file);
if (useCache) break;
const tex = new TexParser(content, vfile);
await includeDirectiveTransform(tex.ast as GenericParent, vfile, {
Review comment from @fwkoch (Collaborator, Author), Apr 10, 2024:

Live reload was not working well for tex include content. The fastProcessFile function was getting triggered correctly for dependent files, but this transform was never run on the file with \input{...} (useCache was still true).

Now, when this function is called in processMdast it handles tex as well, so live reload behaviour matches md processing, regardless of useCache for file loading. (And calling the function here is unnecessary.)

loadFile: makeFileLoader(session, vfile, file),
parseContent: (filename, input) => {
const subTex = new TexParser(input, vfile);
return subTex.ast.children ?? [];
},
});
const frontmatter = validatePageFrontmatter(
{
title: toText(tex.data.frontmatter.title as any),
90 changes: 68 additions & 22 deletions packages/myst-cli/src/transforms/include.ts
@@ -7,24 +7,69 @@ import type { VFile } from 'vfile';
import { parseMyst } from '../process/myst.js';
import type { ISession } from '../session/types.js';
import { watch } from '../store/reducers.js';
import { TexParser } from 'tex-to-myst';
import { processNotebook } from '../process/notebook.js';

export const makeFileLoader =
(session: ISession, vfile: VFile, baseFile: string) => (filename: string) => {
const dir = path.dirname(baseFile);
const fullFile = path.join(dir, filename);
/**
* Return resolveFile function
*
* If `sourceFile` is format .tex, `relativeFile` will be resolved relative to the
* original baseFile; otherwise, it will be resolved relative to `sourceFile`.
*
* The returned function will resolve the file as described above, and return it if
* it exists or log an error and return undefined otherwise.
*/
export const makeFileResolver =
(baseFile: string) => (relativeFile: string, sourceFile: string, vfile: VFile) => {
const base = sourceFile.toLowerCase().endsWith('.tex') ? baseFile : sourceFile;
const fullFile = path.resolve(path.dirname(base), relativeFile);
if (!fs.existsSync(fullFile)) {
fileError(vfile, `Include Directive: Could not find "${fullFile}" in "${baseFile}"`, {
ruleId: RuleId.includeContentLoads,
});
fileError(
vfile,
`Include Directive: Could not find "${relativeFile}" relative to "${base}"`,
{
ruleId: RuleId.includeContentLoads,
},
);
return;
}
session.store.dispatch(
watch.actions.addLocalDependency({
path: baseFile,
dependency: fullFile,
}),
);
return fs.readFileSync(fullFile).toString();
return fullFile;
};

/**
* Return loadFile function
*
* Loaded file is added to original baseFile's dependencies.
*/
export const makeFileLoader = (session: ISession, baseFile: string) => (fullFile: string) => {
session.store.dispatch(
watch.actions.addLocalDependency({
Review comment (Collaborator, Author):

These include dependencies do not show up in the page json "dependency" field, but they are respected for live site reloading.

I think that's ok - the json "dependencies" are used for loading data from other pages; in the case of include, this data is actually moved to the page itself.

🤔 Maybe if we update include to allow including partial snippets from md files we may need to load the rest of the data from that page...? (Similar to embed pulling in a single figure.) But for now, it's unnecessary.

Reply (Collaborator):

Agree, the included files are a local convenience, not for how the final documents work.

path: baseFile,
dependency: fullFile,
}),
);
return fs.readFileSync(fullFile).toString();
};

/**
* Return parseContent function
*
* Handles html and tex files separately; all other files are treated as MyST md.
*/
export const makeContentParser =
(session: ISession) => async (filename: string, content: string, vfile: VFile) => {
if (filename.toLowerCase().endsWith('.html')) {
return [{ type: 'html', value: content }];
}
if (filename.toLowerCase().endsWith('.tex')) {
const subTex = new TexParser(content, vfile);
return subTex.ast.children ?? [];
}
if (filename.toLowerCase().endsWith('.ipynb')) {
const mdast = await processNotebook(session, filename, content);
return mdast.children;
}
return parseMyst(session, content, filename).children;
};

export async function includeFilesTransform(
@@ -33,12 +78,13 @@ export async function includeFilesTransform(
tree: GenericParent,
vfile: VFile,
) {
const parseContent = (filename: string, content: string) => {
if (filename.toLowerCase().endsWith('.html')) {
return [{ type: 'html', value: content }];
}
return parseMyst(session, content, filename).children;
};
const loadFile = makeFileLoader(session, vfile, baseFile);
await includeDirectiveTransform(tree, vfile, { loadFile, parseContent });
const parseContent = makeContentParser(session);
const loadFile = makeFileLoader(session, baseFile);
const resolveFile = makeFileResolver(baseFile);
await includeDirectiveTransform(tree, vfile, {
resolveFile,
loadFile,
parseContent,
sourceFile: baseFile,
});
}
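
The resolution rule documented on `makeFileResolver` above can be sketched in isolation. This is a minimal illustration, not the code in this PR: `resolveInclude` is a hypothetical helper, the file paths are invented, POSIX paths are used for determinism, and the `fs.existsSync` check and error reporting are left out.

```typescript
import * as path from 'node:path';

// Hypothetical helper (not part of myst-cli): mirrors only the base-selection rule.
// For a .tex source, includes resolve against the original base file;
// for anything else, they resolve against the file containing the include.
function resolveInclude(baseFile: string, sourceFile: string, relativeFile: string): string {
  const base = sourceFile.toLowerCase().endsWith('.tex') ? baseFile : sourceFile;
  return path.posix.resolve(path.posix.dirname(base), relativeFile);
}

// Nested Markdown include: relative to the including file's folder.
const fromMd = resolveInclude('/proj/index.md', '/proj/parts/intro.md', 'eq.md');
// Nested LaTeX include: relative to the original base file's folder.
const fromTex = resolveInclude('/proj/main.tex', '/proj/parts/intro.tex', 'eq.tex');

console.log(fromMd); // /proj/parts/eq.md
console.log(fromTex); // /proj/eq.tex
```

With a flat folder structure both branches give the same result, which is why the difference only shows up for nested includes.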
4 changes: 2 additions & 2 deletions packages/myst-to-typst/src/index.ts
@@ -304,10 +304,10 @@ const handlers: Record<string, Handler> = {
state.write(`)`);
},
embed(node, state) {
state.renderChildren(node);
state.renderChildren(node, 2);
},
include(node, state) {
state.renderChildren(node);
state.renderChildren(node, 2);
},
footnoteReference(node, state) {
if (!node.identifier) return;
33 changes: 28 additions & 5 deletions packages/myst-transforms/src/include.ts
@@ -1,5 +1,5 @@
import type { GenericNode, GenericParent } from 'myst-common';
import { fileWarn, RuleId } from 'myst-common';
import { fileError, fileWarn, RuleId } from 'myst-common';
import type { Code, Container, Include } from 'myst-spec-ext';
import { selectAll } from 'unist-util-select';
import type { Caption } from 'myst-spec';
@@ -8,7 +8,14 @@ import type { VFile } from 'vfile';

export type Options = {
loadFile: (filename: string) => Promise<string | undefined> | string | undefined;
parseContent: (filename: string, content: string) => Promise<GenericNode[]> | GenericNode[];
resolveFile: (includeFile: string, sourceFile: string, vfile: VFile) => string | undefined;
parseContent: (
filename: string,
content: string,
vfile: VFile,
) => Promise<GenericNode[]> | GenericNode[];
sourceFile: string;
stack?: string[];
};

/**
@@ -20,11 +27,22 @@ export type Options = {
export async function includeDirectiveTransform(tree: GenericParent, vfile: VFile, opts: Options) {
const includeNodes = selectAll('include', tree) as Include[];
if (includeNodes.length === 0) return;
if (!opts?.stack) opts.stack = [opts.sourceFile];
await Promise.all(
includeNodes.map(async (node) => {
// If the transform has already run, don't run it again!
if (node.children && node.children.length > 0) return;
const rawContent = await opts.loadFile(node.file);
const fullFile = opts.resolveFile(node.file, opts.sourceFile, vfile);
if (!fullFile) return;
// If we encounter the same include file twice in a single stack, return
if (opts.stack?.includes(fullFile)) {
fileError(vfile, `Include Directive: "${fullFile}" depends on itself`, {
ruleId: RuleId.includeContentLoads,
note: [...opts.stack, fullFile].join(' > '),
});
return;
}
const rawContent = await opts.loadFile(fullFile);
if (rawContent == null) return;
const { content, startingLineNumber } = filterIncludedContent(vfile, node.filter, rawContent);
let children: GenericNode[];
@@ -78,11 +96,16 @@ export async function includeDirectiveTransform(tree: GenericParent, vfile: VFil
children = [container];
}
} else {
children = await opts.parseContent(node.file, content);
children = await opts.parseContent(fullFile, content, vfile);
}
node.children = children as any;
if (!node.children?.length) return;
// Recurse!
// await includeDirectiveTransform(node as GenericParent, vfile, opts);
await includeDirectiveTransform(node as GenericParent, vfile, {
...opts,
stack: [...(opts.stack ?? []), fullFile],
sourceFile: fullFile,
});
}),
);
}
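
The stack-based circular include check in the transform above can be illustrated with a small standalone sketch. Everything here is an assumption made for illustration: `IncludeGraph` and `findCircularIncludes` are hypothetical names, and an in-memory map of file names stands in for loading and parsing real files.

```typescript
// Hypothetical stand-in for the file system: each file maps to the files it includes.
type IncludeGraph = Record<string, string[]>;

// Walk the includes depth-first, carrying the chain of files that led here.
// If a file is already in its own chain, record the chain and stop descending,
// so the recursion always terminates.
function findCircularIncludes(
  graph: IncludeGraph,
  file: string,
  stack: string[] = [file],
  errors: string[] = [],
): string[] {
  for (const child of graph[file] ?? []) {
    if (stack.includes(child)) {
      errors.push([...stack, child].join(' > '));
      continue;
    }
    // Each branch gets its own copy of the stack, as in the spread in the transform.
    findCircularIncludes(graph, child, [...stack, child], errors);
  }
  return errors;
}

const circular: IncludeGraph = {
  'a.md': ['b.md'],
  'b.md': ['c.md', 'd.md'],
  'c.md': ['a.md'],
  'd.md': [],
};
const circularErrors = findCircularIncludes(circular, 'a.md');
console.log(circularErrors); // [ 'a.md > b.md > c.md > a.md' ]

// The same file reached through two different branches is not a cycle.
const diamond: IncludeGraph = {
  'a.md': ['b.md', 'c.md'],
  'b.md': ['d.md'],
  'c.md': ['d.md'],
  'd.md': [],
};
const diamondErrors = findCircularIncludes(diamond, 'a.md');
console.log(diamondErrors); // []
```

Copying the stack per branch, rather than sharing one visited set, is what allows a snippet to be included more than once on a page while still rejecting a file that depends on itself.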