Add --rebase-relative-paths option.
- Add manual entry for `--rebase-relative-paths`.
- Add option `--rebase-relative-paths`, which rewrites
  relative image and link paths by prepending the (relative)
  directory of the containing file.
- Enable `rebase-relative-paths` in defaults files.
- Add `readerRebaseRelativePaths` to ReaderOptions record
  [API change].
- Make Markdown reader sensitive to `readerRebaseRelativePaths`.
- Add tests for #3752.

Closes #3752.
jgm committed May 26, 2021
1 parent 6804f47 commit 0138fe3
Showing 13 changed files with 109 additions and 7 deletions.
25 changes: 25 additions & 0 deletions MANUAL.txt
@@ -526,6 +526,31 @@ header when requesting a document from a URL:
where X = NUMBER - 1.* Specify the base level for headings
(defaults to 1).

`--rebase-relative-paths`

: Rewrite relative paths for Link and Image elements, depending
on the path of the file containing the link or image link.
For each link or image, pandoc will compute the directory of
the containing file, relative to the working directory, and
prepend the resulting path to the link or image path.

The use of this option is best understood by example.
Suppose you have a subdirectory for each chapter of a
book, `chap1`, `chap2`, `chap3`. Each contains a file
`text.md` and a number of images used in the chapter. You
would like to have `![image](spider.jpg)` in `chap1/text.md`
refer to `chap1/spider.jpg` and `![image](spider.jpg)` in
`chap2/text.md` refer to `chap2/spider.jpg`. To do this,
use

pandoc chap*/*.md --rebase-relative-paths

Without this option, you would have to use
`![image](chap1/spider.jpg)` in `chap1/text.md` and
`![image](chap2/spider.jpg)` in `chap2/text.md`. Links with
relative paths will be rewritten in the same way as images.
*This option currently only affects Markdown input.*

`--strip-empty-paragraphs`

: *Deprecated. Use the `+empty_paragraphs` extension instead.*
4 changes: 4 additions & 0 deletions pandoc.cabal
@@ -214,6 +214,10 @@ extra-source-files:
test/command/C.txt
test/command/D.txt
test/command/01.csv
test/command/chap1/spider.png
test/command/chap2/spider.png
test/command/chap1/text.md
test/command/chap2/text.md
test/command/defaults1.yaml
test/command/defaults2.yaml
test/command/defaults3.yaml
1 change: 1 addition & 0 deletions src/Text/Pandoc/App.hs
@@ -214,6 +214,7 @@ convertWithOpts opts = do
, readerIndentedCodeClasses = optIndentedCodeClasses opts
, readerDefaultImageExtension =
optDefaultImageExtension opts
, readerRebaseRelativePaths = optRebaseRelativePaths opts
, readerTrackChanges = optTrackChanges opts
, readerAbbreviations = abbrevs
, readerExtensions = readerExts
5 changes: 5 additions & 0 deletions src/Text/Pandoc/App/CommandLineOptions.hs
@@ -277,6 +277,11 @@ options =
"section|chapter|part")
"" -- "Use top-level division type in LaTeX, ConTeXt, DocBook"

, Option "" ["rebase-relative-paths"]
(NoArg
(\opt -> return opt { optRebaseRelativePaths = True }))
"" -- "Rebase relative paths to directory of containing file"

, Option "" ["extract-media"]
(ReqArg
(\arg opt ->
5 changes: 5 additions & 0 deletions src/Text/Pandoc/App/Opt.hs
@@ -144,6 +144,8 @@ data Opt = Opt
, optAscii :: Bool -- ^ Prefer ascii output
, optDefaultImageExtension :: Text -- ^ Default image extension
, optExtractMedia :: Maybe FilePath -- ^ Path to extract embedded media
, optRebaseRelativePaths :: Bool -- ^ Rebase relative link/image paths
-- to directory of containing file
, optTrackChanges :: TrackChanges -- ^ Accept or reject MS Word track-changes.
, optFileScope :: Bool -- ^ Parse input files before combining
, optTitlePrefix :: Maybe Text -- ^ Prefix for title
@@ -529,6 +531,8 @@ doOpt (k',v) = do
"extract-media" ->
parseYAML v >>= \x ->
return (\o -> o{ optExtractMedia = unpack <$> x })
"rebase-relative-paths" ->
parseYAML v >>= \x -> return (\o -> o{ optRebaseRelativePaths = x })
"track-changes" ->
parseYAML v >>= \x -> return (\o -> o{ optTrackChanges = x })
"file-scope" ->
@@ -657,6 +661,7 @@ defaultOpts = Opt
, optAscii = False
, optDefaultImageExtension = ""
, optExtractMedia = Nothing
, optRebaseRelativePaths = False
, optTrackChanges = AcceptChanges
, optFileScope = False
, optTitlePrefix = Nothing
2 changes: 2 additions & 0 deletions src/Text/Pandoc/Lua/Marshaling/ReaderOptions.hs
@@ -46,6 +46,7 @@ instance Pushable ReaderOptions where
(indentedCodeClasses :: [Text.Text])
(abbreviations :: Set.Set Text.Text)
(defaultImageExtension :: Text.Text)
(rebaseRelativePaths :: Bool)
(trackChanges :: TrackChanges)
(stripComments :: Bool)
= ro
@@ -57,6 +58,7 @@ instance Pushable ReaderOptions where
LuaUtil.addField "indented_code_classes" indentedCodeClasses
LuaUtil.addField "abbreviations" abbreviations
LuaUtil.addField "default_image_extension" defaultImageExtension
LuaUtil.addField "rebase_relative_paths" rebaseRelativePaths
LuaUtil.addField "track_changes" trackChanges
LuaUtil.addField "strip_comments" stripComments

3 changes: 3 additions & 0 deletions src/Text/Pandoc/Options.hs
@@ -63,6 +63,8 @@ data ReaderOptions = ReaderOptions{
-- indented code blocks
, readerAbbreviations :: Set.Set Text -- ^ Strings to treat as abbreviations
, readerDefaultImageExtension :: Text -- ^ Default extension for images
, readerRebaseRelativePaths :: Bool -- ^ If True, prepend relative
-- directory of containing file to paths in links and images
, readerTrackChanges :: TrackChanges -- ^ Track changes setting for docx
, readerStripComments :: Bool -- ^ Strip HTML comments instead of parsing as raw HTML
-- (only implemented in commonmark)
@@ -80,6 +82,7 @@ instance Default ReaderOptions
, readerIndentedCodeClasses = []
, readerAbbreviations = defaultAbbrevs
, readerDefaultImageExtension = ""
, readerRebaseRelativePaths = False
, readerTrackChanges = AcceptChanges
, readerStripComments = False
}
37 changes: 30 additions & 7 deletions src/Text/Pandoc/Readers/Markdown.hs
@@ -29,7 +29,8 @@ import qualified Data.Set as Set
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.ByteString.Lazy as BL
import System.FilePath (addExtension, takeExtension)
import System.FilePath (addExtension, takeExtension, isAbsolute, takeDirectory,
(</>))
import Text.HTML.TagSoup hiding (Row)
import Text.Pandoc.Builder (Blocks, Inlines)
import qualified Text.Pandoc.Builder as B
@@ -1836,9 +1837,12 @@ regLink :: PandocMonad m
-> MarkdownParser m (F Inlines)
regLink constructor lab = try $ do
(src, tit) <- source
rebase <- getOption readerRebaseRelativePaths
pos <- getPosition
let src' = if rebase then rebasePath pos src else src
attr <- option nullAttr $
guardEnabled Ext_link_attributes >> attributes
return $ constructor attr src tit <$> lab
return $ constructor attr src' tit <$> lab

-- a link like [this][ref] or [this][] or [this]
referenceLink :: PandocMonad m
@@ -1854,6 +1858,8 @@ referenceLink constructor (lab, raw) = do
return (mempty, "")))
<|>
try ((guardDisabled Ext_spaced_reference_links <|> spnl) >> reference)
rebase <- getOption readerRebaseRelativePaths
pos <- getPosition
when (raw' == "") $ guardEnabled Ext_shortcut_reference_links
let labIsRef = raw' == "" || raw' == "[]"
let key = toKey $ if labIsRef then raw else raw'
@@ -1878,7 +1884,9 @@ referenceLink constructor (lab, raw) = do
Just ((src, tit), _) -> constructor nullAttr src tit <$> lab
Nothing -> makeFallback
else makeFallback
Just ((src,tit), attr) -> constructor attr src tit <$> lab
Just ((src,tit), attr) ->
let src' = if rebase then rebasePath pos src else src
in constructor attr src' tit <$> lab

dropBrackets :: Text -> Text
dropBrackets = dropRB . dropLB
@@ -1911,15 +1919,30 @@ autoLink = try $ do
return $ return $ B.linkWith attr (src <> escapeURI extra) ""
(B.str $ orig <> extra)

-- | Rebase a relative path, by adding the (relative) directory
-- of the containing source position. Absolute links and URLs
-- are untouched.
rebasePath :: SourcePos -> Text -> Text
rebasePath pos path = do
  let fp = sourceName pos
   in if isAbsolute (T.unpack path) || isURI path
         then path
         else
           case takeDirectory fp of
             ""  -> path
             "." -> path
             d   -> T.pack (d </> T.unpack path)

image :: PandocMonad m => MarkdownParser m (F Inlines)
image = try $ do
char '!'
(lab,raw) <- reference
defaultExt <- getOption readerDefaultImageExtension
let constructor attr' src = case takeExtension (T.unpack src) of
"" -> B.imageWith attr' (T.pack $ addExtension (T.unpack src)
$ T.unpack defaultExt)
_ -> B.imageWith attr' src
let constructor attr' src =
case takeExtension (T.unpack src) of
"" -> B.imageWith attr' (T.pack $ addExtension (T.unpack src)
$ T.unpack defaultExt)
_ -> B.imageWith attr' src
regLink constructor lab <|> referenceLink constructor (lab,raw)

note :: PandocMonad m => MarkdownParser m (F Inlines)
20 changes: 20 additions & 0 deletions test/command/3752.md
@@ -0,0 +1,20 @@
```
% pandoc command/chap1/text.md command/chap2/text.md --rebase-relative-paths --verbose -t docx > /dev/null
^D
[INFO] Loaded command/chap1/spider.png from ./command/chap1/spider.png
[INFO] Loaded command/chap1/../../lalune.jpg from ./command/chap1/../../lalune.jpg
[INFO] Loaded command/chap2/spider.png from ./command/chap2/spider.png
```

```
% pandoc command/chap1/text.md command/chap2/text.md --rebase-relative-paths -t html
^D
<h1 id="chapter-one">Chapter one</h1>
<p>A spider: <img src="command/chap1/spider.png" alt="spider" /></p>
<p>The moon: <img src="command/chap1/../../lalune.jpg" alt="moon" /></p>
<p>Link to <a href="command/chap1/spider.png">spider picture</a>.</p>
<p>URL left alone: <a href="https://pandoc.org/MANUAL.html">manual</a>.</p>
<p>Absolute path left alone: <a href="/foo/bar/baz.png">absolute</a>.</p>
<h1 id="chapter-two">Chapter two</h1>
<p>A spider: <img src="command/chap2/spider.png" alt="spider" /></p>
```
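
The paths in the expected output above can be reproduced with a small standalone sketch of the rebasing rule. This is not pandoc's code: `rebase` and `looksLikeURI` are hypothetical names, `String` is used in place of `Text`, and the `"://"` check is only an assumed stand-in for pandoc's own URI test. It relies solely on `System.FilePath` from the `filepath` package.

```haskell
import Data.List (isInfixOf)
import System.FilePath (isAbsolute, takeDirectory, (</>))

-- Hypothetical stand-in for pandoc's URI check (an assumption for
-- illustration): anything containing "://" is treated as a URL.
looksLikeURI :: String -> Bool
looksLikeURI = ("://" `isInfixOf`)

-- Prepend the directory of the containing file to a relative path.
-- Absolute paths and URLs are returned unchanged.
rebase :: FilePath -> String -> String
rebase sourceFile path
  | isAbsolute path || looksLikeURI path = path
  | otherwise =
      case takeDirectory sourceFile of
        ""  -> path
        "." -> path
        d   -> d </> path

-- Print the rebased form of each path used in chap1/text.md.
main :: IO ()
main = mapM_ (putStrLn . rebase "command/chap1/text.md")
  [ "spider.png"
  , "../../lalune.jpg"
  , "https://pandoc.org/MANUAL.html"
  , "/foo/bar/baz.png"
  ]
```

Note that `</>` only joins path segments and does not normalize them, which is why `../../lalune.jpg` comes out as `command/chap1/../../lalune.jpg` rather than a collapsed path. The separators shown assume a POSIX-style system.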
Binary file added test/command/chap1/spider.png
11 changes: 11 additions & 0 deletions test/command/chap1/text.md
@@ -0,0 +1,11 @@
# Chapter one

A spider: ![spider](spider.png)

The moon: ![moon](../../lalune.jpg)

Link to [spider picture](spider.png).

URL left alone: [manual](https://pandoc.org/MANUAL.html).

Absolute path left alone: [absolute](/foo/bar/baz.png).
Binary file added test/command/chap2/spider.png
3 changes: 3 additions & 0 deletions test/command/chap2/text.md
@@ -0,0 +1,3 @@
# Chapter two

A spider: ![spider](spider.png)
