add new feature: pagebreak for every format (#1626)
Showing 13 changed files with 326 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,3 +9,5 @@ | |
^tests/testthat/site/.*_files/ | ||
^\.github$ | ||
^pkgdown$ | ||
^doc$ | ||
^Meta$ |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,3 +2,6 @@ | |
.Rhistory | ||
.RData | ||
.DS_Store | ||
inst/doc | ||
doc | ||
Meta |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,100 @@ | ||
--[[ | ||
pagebreak – convert raw LaTeX page breaks to other formats | ||
Copyright © 2017-2019 Benct Philip Jonsson, Albert Krewinkel | ||
Permission to use, copy, modify, and/or distribute this software for any | ||
purpose with or without fee is hereby granted, provided that the above | ||
copyright notice and this permission notice appear in all copies. | ||
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES | ||
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF | ||
MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR | ||
ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES | ||
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN | ||
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF | ||
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. | ||
]] | ||
local stringify_orig = (require 'pandoc.utils').stringify | ||
|
||
local function stringify(x) | ||
return type(x) == 'string' and x or stringify_orig(x) | ||
end | ||
|
||
--- configs – these are populated in the Meta filter. | ||
local pagebreak = { | ||
epub = '<p style="page-break-after: always;"> </p>', | ||
html = '<div style="page-break-after: always;"></div>', | ||
latex = '\\newpage{}', | ||
ooxml = '<w:p><w:r><w:br w:type="page"/></w:r></w:p>', | ||
odt = '<text:p text:style-name="Pagebreak"/>' | ||
} | ||
|
||
local function pagebreaks_from_config (meta) | ||
local html_class = | ||
(meta.newpage_html_class and stringify(meta.newpage_html_class)) | ||
or os.getenv 'PANDOC_NEWPAGE_HTML_CLASS' | ||
if html_class and html_class ~= '' then | ||
pagebreak.html = string.format('<div class="%s"></div>', html_class) | ||
end | ||
|
||
local odt_style = | ||
(meta.newpage_odt_style and stringify(meta.newpage_odt_style)) | ||
or os.getenv 'PANDOC_NEWPAGE_ODT_STYLE' | ||
if odt_style and odt_style ~= '' then | ||
pagebreak.odt = string.format('<text:p text:style-name="%s"/>', odt_style) | ||
end | ||
end | ||
|
||
--- Return a block element causing a page break in the given format. | ||
local function newpage(format) | ||
if format == 'docx' then | ||
return pandoc.RawBlock('openxml', pagebreak.ooxml) | ||
elseif format:match 'latex' then | ||
return pandoc.RawBlock('tex', pagebreak.latex) | ||
elseif format:match 'odt' then | ||
return pandoc.RawBlock('opendocument', pagebreak.odt) | ||
elseif format:match 'html.*' then | ||
return pandoc.RawBlock('html', pagebreak.html) | ||
elseif format:match 'epub' then | ||
return pandoc.RawBlock('html', pagebreak.epub) | ||
else | ||
-- fall back to insert a form feed character | ||
return pandoc.Para{pandoc.Str '\f'} | ||
end | ||
end | ||
|
||
local function is_newpage_command(command) | ||
return command:match '^\\newpage%{?%}?$' | ||
or command:match '^\\pagebreak%{?%}?$' | ||
end | ||
|
||
-- Filter function called on each RawBlock element. | ||
function RawBlock (el) | ||
-- Don't do anything if the output is TeX | ||
if FORMAT:match 'tex$' then | ||
return nil | ||
end | ||
-- check that the block is TeX or LaTeX and contains only | ||
-- \newpage or \pagebreak. | ||
if el.format:match 'tex' and is_newpage_command(el.text) then | ||
-- use format-specific pagebreak marker. FORMAT is set by pandoc to | ||
-- the targeted output format. | ||
return newpage(FORMAT) | ||
end | ||
-- otherwise, leave the block unchanged | ||
return nil | ||
end | ||
|
||
-- Turn paragraphs which contain nothing but a form feed | ||
-- character into page breaks. | ||
function Para (el) | ||
if #el.content == 1 and el.content[1].text == '\f' then | ||
return newpage(FORMAT) | ||
end | ||
end | ||
|
||
return { | ||
{Meta = pagebreaks_from_config}, | ||
{RawBlock = RawBlock, Para = Para} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
.generate_md_and_convert <- function(content, output_format) { | ||
input_file <- tempfile(fileext = ".Rmd") | ||
output_file <- tempfile() | ||
on.exit(unlink(c(input_file, output_file)), add = TRUE) | ||
xfun::write_utf8(content, input_file) | ||
res <- rmarkdown::render(input_file, output_format = output_format, output_file = output_file, quiet = TRUE) | ||
xfun::read_utf8(res) | ||
} | ||
|
||
# Lua filters exist only since pandoc 2.0 | ||
skip_if_not(rmarkdown::pandoc_available("2.0")) | ||
|
||
test_that("pagebreak lua filter works", { | ||
rmd <- "# HEADER 1\n\\newpage\n# HEADER 2\n\\pagebreak\n# HEADER 3" | ||
res <- .generate_md_and_convert(rmd, "html_document") | ||
expect_match(res[grep("HEADER 1", res)+1], "<div style=\"page-break-after: always;\"></div>") | ||
expect_match(res[grep("HEADER 2", res)+1], "<div style=\"page-break-after: always;\"></div>") | ||
# add a class instead of inline style | ||
rmd2 <- paste0("---\nnewpage_html_class: page-break\n---\n", rmd) | ||
res <- .generate_md_and_convert(rmd2, "html_document") | ||
expect_match(res[grep("HEADER 1", res)+1], "<div class=\"page-break\"></div>") | ||
expect_match(res[grep("HEADER 2", res)+1], "<div class=\"page-break\"></div>") | ||
# For tex document this is unchanged | ||
res <- .generate_md_and_convert(rmd, "latex_document") | ||
expect_match(res[grep("HEADER 1", res)+2], "\\newpage", fixed = TRUE) | ||
expect_match(res[grep("HEADER 2", res)+2], "\\pagebreak", fixed = TRUE) | ||
}) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
*.html | ||
*.R |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,159 @@ | ||
--- | ||
title: "Add a pagebreak in an R Markdown document" | ||
output: rmarkdown::html_vignette | ||
vignette: > | ||
%\VignetteIndexEntry{pagebreak} | ||
%\VignetteEngine{knitr::rmarkdown} | ||
%\VignetteEncoding{UTF-8} | ||
--- | ||
|
||
```{r, include = FALSE} | ||
knitr::opts_chunk$set( | ||
collapse = TRUE, | ||
comment = "#>" | ||
) | ||
``` | ||
|
||
## Usage | ||
|
||
Adding a pagebreak to a document has always been possible using output-specific syntax in an R Markdown file, but the drawback was that such syntax does not work across several output formats. | ||
|
||
With rmarkdown >= 1.15 and pandoc >= 2.0 (for example, the version that comes with RStudio >= 1.2), you can add a `\newpage` or `\pagebreak` command on its own line to insert a pagebreak in any of these formats: `pdf_document()`, `html_document()`, `word_document()` and `odt_document()`. | ||
|
||
```md | ||
# Header 1 | ||
|
||
Some text | ||
|
||
\newpage | ||
|
||
# Header 2 on a new page | ||
|
||
Some other text | ||
|
||
\pagebreak | ||
|
||
# Header 3 on a third page | ||
|
||
``` | ||
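
As a rough illustration of what "on its own line" means, the R sketch below transcribes the two patterns used by the Lua filter in this commit (`^\\newpage%{?%}?$` and its `\pagebreak` counterpart) into a single R regular expression. The function name `is_pagebreak_command()` is hypothetical and is not part of rmarkdown; the real check happens inside pandoc, on raw TeX blocks.

```r
# Hypothetical helper, not part of rmarkdown: an R transcription of the
# Lua patterns the filter uses to recognise page break commands.
is_pagebreak_command <- function(x) {
  # a backslash, one of the two command names, then an optional "{" and "}"
  grepl("^\\\\(newpage|pagebreak)\\{?\\}?$", x)
}

candidates <- c("\\newpage", "\\pagebreak{}", "Some text \\newpage", "\\newpage[2]")
is_pagebreak_command(candidates)
```

Only the first two candidates are recognised: the command has to be the whole content of the block, with nothing before or after it.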
|
||
rmarkdown will convert those commands into the correct syntax for the output format using a [Lua filter](#lua-filter) during pandoc conversion. | ||
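
To make the conversion more concrete, here is an illustrative R transcription of the format lookup done by the filter's `newpage()` function. The name `pagebreak_markup()` is hypothetical, and the strings are the filter's default values; this is a sketch for reading along, not the code that rmarkdown runs.

```r
# Hypothetical function, not part of rmarkdown: mirrors the order of the
# checks in the Lua filter and returns the default markup for each format.
pagebreak_markup <- function(format) {
  if (format == "docx") {
    '<w:p><w:r><w:br w:type="page"/></w:r></w:p>'
  } else if (grepl("latex", format)) {
    "\\newpage{}"
  } else if (grepl("odt", format)) {
    '<text:p text:style-name="Pagebreak"/>'
  } else if (grepl("html", format)) {
    '<div style="page-break-after: always;"></div>'
  } else if (grepl("epub", format)) {
    '<p style="page-break-after: always;"> </p>'
  } else {
    "\f" # fallback used by the filter: a form feed character
  }
}

pagebreak_markup("html5")
pagebreak_markup("docx")
```

Note that when the output format itself is TeX, the filter leaves the original command untouched instead of replacing it.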
|
||
### Using with PDF / LaTeX documents {#pdf} | ||
|
||
Because these commands are already LaTeX syntax, they work as expected in a tex output document, and thus in PDF. Adding a pagebreak this way was already possible with rmarkdown when the output is `pdf_document()` or `latex_document()`, whatever the version of pandoc. | ||
|
||
### Using with HTML documents {#html} | ||
|
||
A `\newpage` or `\pagebreak` command in an R Markdown document with HTML output is converted by default to the following HTML code, which uses an inline style with the CSS property [`page-break-after`](https://developer.mozilla.org/en-US/docs/Web/CSS/page-break-after): | ||
|
||
```html | ||
<div style="page-break-after: always;"></div> | ||
``` | ||
|
||
This will always insert a pagebreak after this div. | ||
|
||
For more flexibility, you can use an HTML class and some custom CSS instead of an inline style. To set the class, add a metadata field `newpage_html_class` in your yaml header. | ||
|
||
Then you can control the behavior using custom CSS, as in this example: | ||
|
||
````md | ||
--- | ||
output: | ||
html_document: default | ||
newpage_html_class: page-break | ||
--- | ||
|
||
```{css, echo = FALSE}`r ''` | ||
/* display the pagebreak only when printing the html page */ | ||
@media all { | ||
.page-break { display: none; } | ||
} | ||
@media print { | ||
.page-break { display: block; break-after: page; } | ||
} | ||
``` | ||
|
||
# Header 1 | ||
|
||
Some text | ||
|
||
\newpage | ||
|
||
# Header 2 on a new page | ||
|
||
Some other text | ||
```` | ||
|
||
`\newpage` will be converted here to | ||
|
||
```html | ||
<div class="page-break"></div> | ||
``` | ||
|
||
and the style will be applied to this class from the CSS included in the chunk. | ||
|
||
This customisation can also be achieved by setting the environment variable `PANDOC_NEWPAGE_HTML_CLASS` in the R session that will render the document (or in a `.Renviron` file, for example). | ||
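
A minimal sketch of the environment variable approach, using only base R. The class name `page-break` is simply the one from the example above; any class defined in your CSS would do.

```r
# Set the class for the current R session only.
Sys.setenv(PANDOC_NEWPAGE_HTML_CLASS = "page-break")

# Check the value that the filter will read during conversion.
Sys.getenv("PANDOC_NEWPAGE_HTML_CLASS")

# To keep the setting across sessions, a line like this could go in a
# .Renviron file instead:
# PANDOC_NEWPAGE_HTML_CLASS=page-break
```

In the filter shown in this commit, the `newpage_html_class` metadata field is checked first, so the environment variable only applies when that field is absent from the yaml header.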
|
||
Note that this example uses the [`break-after`](https://developer.mozilla.org/en-US/docs/Web/CSS/break-after) property instead of `page-break-after`, since the former is now recommended as its replacement. The latter is kept around for [compatibility with browsers](https://caniuse.com/#search=page-break). | ||
|
||
### Using with Word documents {#word} | ||
|
||
A `\newpage` or `\pagebreak` command in an R Markdown document with Word output is converted to a Word pagebreak. Done manually, this would mean adding the following to your document: | ||
|
||
````md | ||
```{=openxml} | ||
<w:p><w:r><w:br w:type="page"/></w:r></w:p> | ||
``` | ||
```` | ||
|
||
For example, using the pagebreak feature, this puts the first header on the second page of the Word document: | ||
|
||
````md | ||
--- | ||
title: My main title | ||
output: word_document | ||
--- | ||
|
||
\newpage | ||
|
||
# First Header | ||
```` | ||
|
||
### Using with ODT documents {#odt} | ||
|
||
To use the pagebreak feature with `odt_document()`, you need to provide a reference document that includes a paragraph style named, by default, _Pagebreak_. This paragraph style should have no extra space before or after it and should have a pagebreak after it. (See the [LibreOffice documentation](https://help.libreoffice.org/Writer/Text_Flow) on how to create a style.) | ||
|
||
The name of the paragraph style can be customized using the `newpage_odt_style` metadata field in the yaml header or the `PANDOC_NEWPAGE_ODT_STYLE` environment variable (as for [HTML documents](#html)). | ||
|
||
As in the previous example, this produces a two-page document, with the first header on the second page. | ||
|
||
````md | ||
--- | ||
title: My main title | ||
output: | ||
odt_document: | ||
reference_odt: reference.odt | ||
--- | ||
|
||
\newpage | ||
|
||
# First Header | ||
```` | ||
|
||
|
||
## About lua filters {#lua-filter} | ||
|
||
Since pandoc 2.0, Lua filters can be used to add extra functionality to pandoc's document conversion. Making a pagebreak command in markdown compatible with several output formats is one such use. You can find more information about Lua filters in [pandoc's documentation](https://pandoc.org/lua-filters.html) and some examples in [a collection of lua filters for pandoc](https://github.com/pandoc/lua-filters). These examples, and any other Lua filter, can be used directly in your R Markdown document by adding [a pandoc argument](https://bookdown.org/yihui/rmarkdown/html-document.html#pandoc-arguments) in the yaml header: | ||
|
||
```yaml | ||
--- | ||
output: | ||
html_document: | ||
pandoc_args: ["--lua-filter=filter.lua"] | ||
--- | ||
``` | ||
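
The same argument can also be passed from R when rendering. The sketch below assumes that the rmarkdown package and pandoc >= 2.0 are installed; the filter it writes (upper-casing every word) and the temporary files are purely hypothetical, chosen so that the effect is easy to spot in the output.

```r
# Hypothetical example: write a tiny Lua filter and a tiny document to
# temporary files, then render with --lua-filter passed via pandoc_args.
if (rmarkdown::pandoc_available("2.0")) {
  filter <- tempfile(fileext = ".lua")
  writeLines(c(
    "-- upper-case every Str element",
    "function Str(el)",
    "  return pandoc.Str(string.upper(el.text))",
    "end"
  ), filter)

  input <- tempfile(fileext = ".Rmd")
  writeLines(c("---", "title: Demo", "---", "", "hello world"), input)

  out <- rmarkdown::render(
    input,
    output_format = rmarkdown::html_document(
      pandoc_args = paste0("--lua-filter=", filter)
    ),
    quiet = TRUE
  )
  html <- readLines(out, warn = FALSE)
  unlink(c(filter, input, out))
}
```

Passing the argument in the yaml header or through `pandoc_args` in R should be equivalent; the R form is convenient when the filter path is computed at run time.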
|
||
The package [rmdfiltr](https://github.com/crsh/rmdfiltr) provides a collection of Lua filters and helper functions to use them. | ||
|
||
Before pandoc 2.0, it was already possible to [use filters](https://pandoc.org/filters.html) with pandoc, through external programs that modify the AST. `pandoc-citeproc`, which deals with citations, is one example. The package [pandocfilters](https://cran.r-project.org/web/packages/pandocfilters/) is useful for creating filters using R. |