MD/LaTeX -> PDF: bibliography running off page when penalties=10000 #3255
This seems more of a LaTeX issue than a pandoc issue, unless you have some positive suggestion about how the latex pandoc emits should be different.
|
The penalties were your addition, not default pandoc output. You say
Can you be more precise about the "poor formatting" when clubpenalty and widowpenalty aren't used? Have you tried setting these penalties to lower values? |
Without clubpenalty and widowpenalty, orphans and widows appear. See the bottom line (left) and the first line (right): ideally both lines would sit together on one page or the other, since a single stranded line goes against convention.

With widowpenalty and clubpenalty at 5000, the page skipping happens again. At 3000 and 4000, though, the output is identical to not setting them at all.

On the middle page (11), the text runs far past the bottom margin. It does eventually continue on the next page, but some text is lost (i.e. not visible on either page 10 or 12). The bibliography should start on page 10. |
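For readers following along, the settings compared in this exchange amount to a preamble fragment along the following lines. The values 10000 and 3000 come from the thread. The remark about the default of 150 is general TeX background and was not stated by any participant.

```latex
% Preamble fragment (sketch).
% TeX treats a penalty of 10000 as infinite: a page break after the first
% line of a paragraph (club) or before its last line (widow) is forbidden.
\clubpenalty=10000
\widowpenalty=10000

% Lower values discourage such breaks instead of forbidding them.
% Plain TeX and the standard LaTeX classes start from 150 for both.
% \clubpenalty=3000
% \widowpenalty=3000
```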
OK. This seems to be a LaTeX issue. I don't know what the best solution is, but there's no reason to have an issue here unless you can suggest some specific way in which pandoc's latex output should be different. |
This seems similar to jgm/pandoc-citeproc#264, but I have also seen such effects myself in the past. I'm not sure what the exact cause of this is – but I think a case could be made that pandoc should output a latex list rather than ordinary paragraphs. latex's native Both to avoid formatting issues as described in the OP and to obtain a hanging indent format, I have been using a filter for years now that transforms latex reference list entries such as
to
which, in combination with a suitable definition of the references environment in a latex preamble:
gives a nice hanging indent format without ever running into any of the issues described in the OP. I feel it would be a good idea if pandoc tried to implement this natively, both to avoid underfull pages and to enable the hanging indent format required by a vast number of styles. Even better, pandoc could start outputting references as two distinct elements, as required by In practical terms for latex, this would mean to output a reference item in form of a command with two arguments, say
in combination of course with a suitable (re)definition of the references environment. |
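A minimal sketch of what a hanging-indent `references` environment could look like is given below. It is not the commenter's actual definition, which did not survive in this copy of the thread. The 1.5em indent and the use of a bare `list` environment are assumptions chosen for illustration; the two entries are taken from the example AST quoted later in the thread.

```latex
\documentclass{article}

% Hypothetical definition: each \item is one bibliography entry, set with a
% hanging indent (first line flush left, continuation lines indented).
\newenvironment{references}{%
  \begin{list}{}{%
    \setlength{\leftmargin}{1.5em}%   depth of the hanging indent
    \setlength{\itemindent}{-1.5em}%  pull the first line back to the margin
    \setlength{\labelwidth}{0pt}%     no visible label
    \setlength{\labelsep}{0pt}%
    \setlength{\itemsep}{0pt}%
    \setlength{\parsep}{0pt}%
  }%
}{\end{list}}

\begin{document}

\begin{references}
\item Doe, John. 2005. \emph{First Book}. Cambridge: Cambridge University Press.
\item ---------. 2006. ``Article.'' \emph{Journal of Generic Studies} 6: 33--34.
\end{references}

\end{document}
```

Because every entry becomes a list item, page breaks can fall between items in the usual way. Whether that alone avoids the overfull page reported in the original post is the commenter's claim and has not been verified here.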
Currently pandoc-citeproc gives you a structure like this:

Div ("refs",["references"],[])
[Div ("ref-item1",[],[])
[Para [Str "Doe",Str ",",Space,Str "John",Str ".",Space,Str "2005",Str ".",Space,Emph [Str "First",Space,Str "Book"],Str ".",Space,Str "Cambridge",Str ":",Space,Str "Cambridge",Space,Str "University",Space,Str "Press",Str "."]]
,Div ("ref-item2",[],[])
[Para [Str "\8212\8212\8212",Str ".",Space,Str "2006",Str ".",Space,Str "\8220",Str "Article",Str ".",Str "\8221",Space,Emph [Str "Journal",Space,Str "of",Space,Str "Generic",Space,Str "Studies"],Space,Str "6",Str ":",Space,Str "33\8211\&34",Str "."]]]]

One approach (A) would be to make an ad hoc modification to the LaTeX writer, so that a structure matching this is rendered as you suggest above (though with the hypertargets that pandoc would normally support). Such a change would be relatively easy and would only affect pandoc, not pandoc-citeproc. A definition of the references environment would need to be inserted into the default latex (and beamer) template, and preferably made conditional on the actual presence of a bibliography.

A more radical approach (B) would be to have pandoc-citeproc emit a list, something like (schematically)
This would have the drawback of causing bibliographies to be rendered as bullet lists in all formats, unless something special was done to style them differently. In HTML you can just use CSS, but this could be a serious problem in many formats, e.g. docx or plain text formats like Markdown. I think it would be better not to do this, unless (C) we added a generic (neither bulleted nor ordered) list type to pandoc-types. But (C) would be a huge amount of work, requiring changes to all writers and readers.

A fourth possible change (D) would be to have pandoc-citeproc emit something more complex than a Div containing a single Para for each entry. For example, as you suggested, we could have something like
This would require more extensive changes to pandoc-citeproc than B, and changes to all the pandoc writers, which would have to be taught something intelligent to do with these constructions. |
I tend to favour "D" as the cleanest approach. Just note that
However, for
Spans could work, too (since all of this takes place within one paragraph), and since strictly speaking only the first field needs to be tagged for special treatment, the first example could also look like this:
or like this:
Unfortunately, there's a further complication: To fully implement the CSL specs, pandoc-citeproc would also have to cater for CSL's "display" attributes (http://docs.citationstyles.org/en/stable/specification.html#display). Example "B" from the CSL specs, with "notes" added, would have to be represented by something like this:
What's more, since @rmzelle reported (in jgm/pandoc-citeproc#85) that he thinks the CSL specs do not preclude the bibliography-specific "whitespace" options (hanging-indent, second-field-align) from co-occurring with the "display" attributes, constructs like the following seem to be allowed, too:
Now, while I tend to think that there might be arguments for using Actually, I think that within Hence I feel pandoc should permit (To be continued, I'll have to think about some of the details more carefully ...) |
Unambitious first step:
Later we can think about changing pandoc-citeproc's output, etc. |
Isn't #2704 related? Does its fix also fix this issue? |
It would be nice if we had an actual test case so we could see if #2704 helps with this. |
@jgm commented on 2 March 2017, 15:47 CET:
I think it does fix this issue as well. I have constructed a test case with all of my references. Adding the |
…aph. Closes #2704 (formatting problems in beamer citations). See http://tex.stackexchange.com/questions/22852/function-and-usage-of-leavevmode
Great, I'll close this then. |
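The commit referenced above points to a discussion of `\leavevmode`. As a rough illustration of the idea, the sketch below contrasts an anchor emitted before a paragraph has started with one preceded by `\leavevmode`. Whether this matches pandoc's exact LaTeX output before and after the fix is an assumption; the anchor names are hypothetical and the entry text is taken from the example AST earlier in the thread.

```latex
\documentclass{article}
\usepackage{hyperref}
\begin{document}

% Anchor emitted while TeX is still in vertical mode (between paragraphs):
\hypertarget{ref-item1}{}%
Doe, John. 2005. \emph{First Book}. Cambridge: Cambridge University Press.

% \leavevmode starts the paragraph first, so the anchor becomes part of
% the paragraph it labels:
\leavevmode\hypertarget{ref-item1-fixed}{}%
Doe, John. 2005. \emph{First Book}. Cambridge: Cambridge University Press.

\end{document}
```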
Version 1.16.0.2
Problem: when \clubpenalty=10000 and/or \widowpenalty=10000 is set in the preamble, the bibliography starts one page later than it should (i.e. it leaves a blank page above; in the attached image, it starts on p.13 but should start on p.12) and tries to fit on one page, running past the bottom margin.

Commenting out the penalties results in the bibliography displaying all the entries and respecting page margins, but with poor formatting; the rest of the document is also poorly formatted as a result.

Headers are Markdown-style (#); LaTeX formatting commands are also used (\textsc{} etc.). I have not set "sloppy" or anything similar.

Expected: a bibliography that starts on p.12 and goes on, formatted without widows or orphans.
Command (bash variables because I use a script that reads the input files from a list):
preamble.tex: