Juris-M missing multi-lingual fields #482

Closed
duncdrum opened this issue Apr 21, 2016 · 24 comments

Comments

@duncdrum

duncdrum commented Apr 21, 2016

OS X 10.11.4, FF 45.0.2, Juris-M 4.0.29.8m67, BBT 1.6.48

The following item in Juris-M (for Firefox):
screenshot 2016-04-21 15 45 37

is exported as

@online{leishuku,
  title = {中國類書庫},
  url = {http://server.wenzibase.com},
  shorttitle = {leishuku},
  timestamp = {2016-04-21T13:06:09Z},
  langid = {pinyin},
  titleaddon = {中國類書庫},
  type = {Full-text databse},
  author = {{愛如生}},
  urldate = {2016-04-20},
  year = {n.d.},
  file = {Google-Ergebnis für http\://crossasia.org/uploads/tx_sbbtyponewsletter/rte/RTEmagicC_erudition.png.png:/Users/HALmob/Library/Application Support/Firefox/Profiles/6wgnt11i.default/zotero/storage/ZF99V7IR/imgres.html:}
}

None of the multi-lingual fields are included. Is this the expected behavior? I use the JM Chicago style (but other styles have the same effect). This is the result of citing the item:

Ài rú shēng, 愛如生 Erudition. “Zhōngguó lèishū kù”, 中國類書庫 (Database of Chinese Encyclopeadias). Full-text databse. 中國類書庫 Zhōngguó lèishū kù, n.d. http://server.wenzibase.com.

@retorquere retorquere added the bug label Apr 21, 2016
@retorquere
Owner

There is no expected behavior yet, as this release is the first that works at all with Juris-M :)

I have no access to that item, but if you right-click it in Zotero and select "Send Better BibTeX Error", I will get a copy. If you click through that dialog, you'll get an ID, please post that here so I know which is yours. If you could attach the resulting BibTeX (or BibLaTeX, please specify which) here, I can get on that.

@duncdrum
Author

duncdrum commented Apr 21, 2016

@retorquere thanks for looking into this. No time like the first time.
ErrorID: NV6HSWWG

The contents of the leishuku.bib file are in the OP; you can download it from here as well.

Because of UTF-8 I only work with (Better) BibLaTeX (and biber as the backend); I have no idea about BibTeX.

Some initial impressions of working with Juris-M and BBT:

  1. Autogenerated cite keys are a pain: regardless of whether the option to use ASCII for biblatex is on or off, the initial keys are always __XXXX, where "XXXX" is the year, and non-Latin characters aren't processed.
    screenshot 2016-04-21 20 34 56
  2. It would be much better if BBT tried to generate a key using the transliteration or translation information instead.
  3. From what I understand so far, biblatex can take titleaddon and nameaddon information to store transliteration / translation info. However, names only work for single-author records, which kind of voids the whole thing.
  4. The MWE from the above is special in that the author name field has both transliteration AND translation (it's a company). It is much more common for names to have only the original and a transliteration. (Titles, on the other hand, often have both.)
  5. These SE threads describe common solutions for multi-lingual bibliographies in LaTeX, including CJK references.

tl;dr: for BBT to play nice with Juris-M, transliteration and translation information should somehow be present in the exported biblatex files, even if users have to edit the bib files to change the field names.
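Point 2 above (building the key from a transliteration instead of the original script) could look roughly like the following. This is only a sketch in Python, not how BBT actually generates keys: the function names `ascii_fold` and `citekey_from_variants` and the `name_year` key shape are made up for illustration. It tries each romanised variant first and falls back to the `__XXXX` form described in point 1 when nothing folds to ASCII.

```python
import re
import unicodedata


def ascii_fold(text: str) -> str:
    """Strip diacritics (e.g. pinyin tone marks) and drop anything non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.encode("ascii", "ignore").decode("ascii")


def citekey_from_variants(original: str, variants: list[str], year: str) -> str:
    """Hypothetical helper: prefer a variant that survives ASCII folding."""
    # Keep only digits so that a year like "n.d." cannot leak dots into the key.
    year_digits = re.sub(r"\D", "", year)
    for candidate in [*variants, original]:
        slug = re.sub(r"[^a-z0-9]+", "_", ascii_fold(candidate).lower()).strip("_")
        if slug:
            return f"{slug}_{year_digits}" if year_digits else slug
    # Nothing could be romanised: same shape as the "__XXXX" keys mentioned above.
    return f"__{year_digits}"


print(citekey_from_variants("楊伯峻", ["Yang Bojun"], "2000"))
print(citekey_from_variants("楊伯峻", [], "2000"))
```

The ordering of `variants` decides which romanisation wins, so a real implementation would presumably need a user preference for that.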

@retorquere
Owner

retorquere commented Apr 21, 2016

Is leishuku.bib what you want it to export, or what it does export? I'm looking for an entry that shows how you want it exported.

On the other points:

  1. The "as ASCII" setting only affects the fields, not the citekey. If you select "as ASCII" it will translate Unicode characters to their LaTeX equivalent commands. What you want is "Force citation key to ASCII", which you currently have disabled. This option uses the Zotero transliteration; it usually does OK with Latin-like languages, but I don't think it does so well with Chinese (is that Chinese?). If you know of any projects that transliterate such characters well, I'll be happy to look into those.
  2. BBT does do that by default for most patterns, although there are some patterns that yield untransliterated keys, which is when the "Force" option is useful.
  3. Given that, what would you suggest?
  4. I'm fine with that as long as I know how you want it
  5. You wildly overestimate how well I understand BibLaTeX 😁 I don't know what I should take away from those threads.

@retorquere
Owner

(the reason for the first question above my list is that leishuku.bib doesn't seem to provide a solution for points 3 and 4)

@duncdrum
Author

duncdrum commented Apr 22, 2016

No, leishuku.bib is what is currently exported, and yes, it's a Chinese reference. Maybe we should open a new issue for the citation keys, since that is a separate thing.
So after having another look at the biblatex docs, there are three options to work around the fact that Juris-M is capable of things that biblatex just isn't. I'll use the examples from the Stack Exchange threads.

  • Monkey patch (aka do not use any fancy biblatex stuff) and let the users hack their document preambles; e.g.:
author = {Li, 李无未, Wuwei} %note the very counterintuitive order and use of commas
\newbibmacro*{name:cjk}[3]{%
  \usebibmacro{name:delim}{#2#3#1}%
  \usebibmacro{name:hook}{#2#3#1}%
  \mkbibnamelast{#1}%
  \ifblank{#2}{}{\bibnamedelimd\mkbibnamefirst{#2}}%
  \ifblank{#3}{}{\bibnamedelimd\mkbibnameaffix{#3}}}
  • add titleaddon and nameaddon to the default better biblatex export. This will occasionally lead to bad .bib files due to a bug in biblatex.
Author = {{Li Wuwei}}, % author as institution, not individual; this would be pulled from the Juris-M transliteration field
nameaddon = {李无未}, % this would be the original author in juris-m
  • add a new export format, biblatexml, which requires biber (most Juris-M users probably use it already), but it is the only thing that can handle everything that Juris-M is throwing at it.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="biblatexml.rng"
            type="application/xml"
            schematypens="http://relaxng.org/ns/structure/1.0"?>
<bltx:entries xmlns:bltx="http://biblatex-biber.sourceforge.net/biblatexml">
  <bltx:entry id="key1" entrytype="book">
    <bltx:names type="author" morenames="1" useprefix="true">
      <bltx:namepart xml:lang="zh" type="family">李</bltx:namepart>
      <bltx:namepart xml:lang="zh" type="given">无未</bltx:namepart>
      <bltx:namepart xml:lang="zh-alalc97" type="family">Lǐ</bltx:namepart>
      <bltx:namepart xml:lang="zh-alalc97" type="given">Wúwèi</bltx:namepart>
    </bltx:names>
  </bltx:entry>
</bltx:entries>

I also found this version, so I'll need to do some testing:

<?xml version="1.0" encoding="UTF-8"?>
<bib:entries xmlns:bib="http://biblatex-biber.sourceforge.net/biblatexml">
  <bib:entry id="key1" entrytype="collection">
    <bib:editor>
      <bib:person gender="sm">李无未</bib:person>
    </bib:editor>
    <bib:editor mode="romanised">
      <bib:person>
        <bib:first>
          <bib:namepart initial="Ww"> Wúwèi </bib:namepart>
        </bib:first>
        <bib:last>Lǐ</bib:last>
      </bib:person>
    </bib:editor>
  </bib:entry>
</bib:entries>
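For what it's worth, producing XML of this general shape programmatically is straightforward; the hard part is knowing what biber actually accepts. The sketch below uses Python's standard `xml.etree.ElementTree` and simply mirrors the `bltx:` element and attribute names from the first example in this comment. That layout is an assumption taken from this thread, not a verified biblatexml schema, and `build_entry` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

BLTX_NS = "http://biblatex-biber.sourceforge.net/biblatexml"
XML_NS = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("bltx", BLTX_NS)


def bltx(tag: str) -> str:
    """Qualify a tag name with the bltx namespace."""
    return f"{{{BLTX_NS}}}{tag}"


def build_entry(key: str, entrytype: str, author_variants: dict) -> ET.Element:
    """author_variants maps a language tag to a (family, given) pair."""
    root = ET.Element(bltx("entries"))
    entry = ET.SubElement(root, bltx("entry"), {"id": key, "entrytype": entrytype})
    names = ET.SubElement(entry, bltx("names"), {"type": "author"})
    for lang, (family, given) in author_variants.items():
        for part_type, text in (("family", family), ("given", given)):
            # One namepart per language variant and name part, as in the example above.
            part = ET.SubElement(
                names,
                bltx("namepart"),
                {f"{{{XML_NS}}}lang": lang, "type": part_type},
            )
            part.text = text
    return root


root = build_entry(
    "key1",
    "book",
    {"zh": ("李", "无未"), "zh-alalc97": ("Lǐ", "Wúwèi")},
)
print(ET.tostring(root, encoding="unicode"))
```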

I'm quite swamped atm, but I'll try to upload working .bib examples for biblatex and biblatexml. I don't think monkey-patching is the way to go here. Let me know what you think.

@retorquere
Owner

Option 2 should be easy. Don't really fancy option 1. Option 3 is obviously desirable, but it will be more work (and the next few weeks I'll be swamped), and for the life of me I can't find documentation on what biblatexml is supposed to look like.

@duncdrum
Author

Yes, I contacted the biblatex and biber devs and am waiting to hear back. It seems biblatexml is in pre-documentation beta. There also seems to be a multi-script branch of biblatex that is already public, but I can't find documentation for it either. I'll just export and process some items and see what the logs have to say.

@retorquere
Owner

Holy ... biblatexml is a schizo format. Parts XML, parts LaTeX. Why didn't they just settle on CSL-JSON?!

@duncdrum
Author

duncdrum commented May 4, 2016

mark: leishuku:1

OK, after a bunch of testing: biblatex just can't do it. There is a discussion thread for biblatex here and one from July 2015 on the Zotero forums.

  • Option one could only work with access to zotero language prefs, where users already configure how transcriptions and translations should be handled in citations. Monkey-patching this for every possible combination of languages seems unfeasible. So users would need to be able to set this to fit their previous Tex documents and workflow.
  • Option two is on hold and not working with experimental or official branches of biblatex.
  • That leaves option three, but without better documentation for biblatexml I also don't see a way to implement this.

The only thing that seems both safe to use and unlikely to mess things up is titleaddon. Names, publishers, places etc. are beyond reach. Currently titleaddon uses the same data as title, which doesn't make much sense [see OP]. Instead BBT could do the following:

  1. If there is only one title: use the current behaviour, but drop the superfluous titleaddon.
  2. If the title has a variant from the same language family, or has only one title variant: put that variant in titleaddon.
  3. If there is more than one variant, or there are variants in other languages: use the user[a-z] fields for those.

So for the example in the OP (error ID: PMNPC378):

@online{leishuku,
  title = {中國類書庫}, % this is "zh" in jurism
  titleaddon={Zhōngguó lèishū kù}, % this is "zh-alalc97" in jurism
  %userd={Datenbank der chinesischen Encyclopädien}, this would be further variants
  usere={Database of Chinese Encyclopeadias}, % this is "en" title variant in jurism
  url = {http://server.wenzibase.com},
  shorttitle = {leishuku},
  timestamp = {2016-04-21T13:06:09Z},
  langid = {pinyin}, %babel for "zh"
  type = {Full-text databse},
  author = {{愛如生}},
  urldate = {2016-04-20},
  year = {n.d.},
  file = {Google-Ergebnis für http\://crossasia.org/uploads/tx_sbbtyponewsletter/rte/RTEmagicC_erudition.png.png:/Users/HALmob/Library/Application Support/Firefox/Profiles/6wgnt11i.default/zotero/storage/ZF99V7IR/imgres.html:}
}

This would at least save users from manually copying the title information, and play nice with current releases of biblatex and biber.

@retorquere
Owner

Holy poo, that is a royal mess.

The author of Juris-M suggested a while ago everyone should just give up on Bib(La)TeX and wrap citeproc instead -- I'm beginning to believe that's actually the right approach. Fortunately, the aux/bbl/bcf process/format is exceedingly well documented (ahem) so that should happen any day now (right).

@retorquere retorquere added enhancement and removed bug labels May 5, 2016
@duncdrum
Author

duncdrum commented May 5, 2016

Yes and after looking into this again a few years after my last foray into bibtex I agree with Frank. What do you make of my suggestion about titles? Any idea why BBT currently repeats the title and puts it into titleaddon on export?

@retorquere
Owner

Oh yeah, I have an idea why -- I didn't know what I was doing when I implemented that. The titles suggestion sounds sensible, and fits easily in the current implementation. Certainly a hell of a lot easier than biblatexml.

@retorquere
Owner

So for the comment I've marked leishuku:1:

  1. How would I decide what goes into titleaddon, and what goes into usere? The reference doesn't provide a preference
  2. Why usere instead of usera?
  3. The data for userd isn't in the reference

@duncdrum
Author

duncdrum commented May 6, 2016

Based on swarm intelligence, usere seems to be the most common, no clue why. It is the last of the pack mentioned in the biblatex documentation, but I have no clue whether that's part of the reason.

Yes, I just put the German into the commented section for demonstration purposes.

Since neither titleaddon nor user[a-z] have defined uses, I'm trying to be consistent with example cases I found in the wild.

User[a-z] is a last resort thing. So if there are only two title fields, use title and titleaddon.
If there are three or more, use titleaddon for language variants of the main title's language (= primary language, langid), based on their ISO language tags (e.g. the title is "zh" and titleaddon is "zh-alalc97"), and user[a-z] for the rest.
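The rule above can be sketched roughly as follows. This is a Python illustration of the proposal only, not BBT's actual implementation; `assign_title_fields`, `primary_subtag` and the default order of the user fields (starting with usere, as in the example earlier in this thread) are assumptions. Also note that, as far as I know, biblatex itself only defines usera through userf, so the sketch stops there.

```python
def primary_subtag(lang_tag: str) -> str:
    """Return the primary language subtag, e.g. 'zh' for 'zh-alalc97'."""
    return lang_tag.split("-")[0].lower()


def assign_title_fields(
    title: str,
    title_lang: str,
    variants: dict,
    user_fields=("usere", "userd", "userc", "userb", "usera", "userf"),
) -> dict:
    """Map a main title plus its language variants onto biblatex fields.

    variants maps a language tag to the title text in that language or script.
    """
    fields = {"title": title}
    items = list(variants.items())
    if not items:
        return fields  # only one title: no titleaddon at all
    if len(items) == 1:
        fields["titleaddon"] = items[0][1]  # two title fields: title + titleaddon
        return fields

    main = primary_subtag(title_lang)
    same_language = [text for lang, text in items if primary_subtag(lang) == main]
    other = [text for lang, text in items if primary_subtag(lang) != main]
    if same_language:
        # A variant of the main title's own language (e.g. a romanisation) wins titleaddon.
        fields["titleaddon"] = same_language[0]
        rest = same_language[1:] + other
    else:
        rest = other
    # Remaining variants go to the user fields; extras beyond the list are dropped.
    for field, text in zip(user_fields, rest):
        fields[field] = text
    return fields


print(assign_title_fields(
    "中國類書庫",
    "zh",
    {"zh-alalc97": "Zhōngguó lèishū kù", "en": "Database of Chinese Encyclopeadias"},
))
```

For the leishuku item this gives the romanised title in titleaddon and the English one in usere; an item with only an English variant would get it in titleaddon.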

@duncdrum
Author

duncdrum commented May 6, 2016

I did, but with leishuku it gave an error (ID VHCCSTTK). I've tried the reference from the citekey example and that worked fine for titleaddon:

@collection{__2000-4,
  location = {{北京}},
  edition = {Revised Edition},
  title = {春秋左傳注},
  isbn = {7-101-00262-5},
  volumes = {4},
  timestamp = {2016-05-05T10:12:00Z},
  langid = {pinyin},
  titleaddon = {Chunqiu Zuozhuan Zhu},
  publisher = {{中华书局}},
  editor = {{楊伯峻}},
  date = {2000},
  keywords = {Chun qiu,Confucius,Zuoqiu; Ming,Zuo zhuan,左丘明,左傳,春秋},
  file = {Yang Bojun 楊伯峻 - 1981 - Chunqiu Zuozhuan Zhu 春秋左傳注.pdf:/Users/HALmob/Library/Application Support/Firefox/Profiles/6wgnt11i.default/zotero/storage/JMGHFDD2/Yang Bojun 楊伯峻 - 1981 - Chunqiu Zuozhuan Zhu 春秋左傳注.pdf:application/pdf},
  origdate = {1981}
}

Titleaddon also works in this example (JGV925XZ), which has no transcription, just Chinese and English.

@thesis{__2003-5,
  title = {明清時期出版與文化─以「才子佳人」小說為中心},
  url = {http://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dnclcdr&s=id=%22091NCNU0493006%22.&searchmode=basic},
  pagetotal = {214},
  timestamp = {2016-05-06T19:43:11Z},
  titleaddon = {Publishing and Culture in Ming-Qing period : The Scholar-Beauty Novels as an Example},
  institution = {{國立暨南國際大學}},
  type = {Ph.{{D}}. {{Dissertation}}},
  author = {{顏采容}},
  date = {2003},
  file = {Yan Cairong 顏采容 Ming-Qing Publishing Culture 明清時期出版與文化─以「才子佳人」小說為中心 (200X).pdf:/Users/HALmob/Library/Application Support/Firefox/Profiles/6wgnt11i.default/zotero/storage/K9JJ2XHZ/Yan Cairong 顏采容 Ming-Qing Publishing Culture 明清時期出版與文化─以「才子佳人」小說為中心 (200X).pdf:application/pdf}
}

It only seems to struggle with items that have both a transcription and a translation in addition to the main title.

@retorquere
Owner

The reference in VHCCSTTK doesn't have any multi fields as I hadn't merged #483 yet. Could you submit again with https://github.com/retorquere/zotero-better-bibtex/releases/download/builds/zotero-better-bibtex-1.6.50-circle-2373.xpi ?

@retorquere
Owner

retorquere commented May 8, 2016

New version at https://github.com/retorquere/zotero-better-bibtex/releases/download/builds/zotero-better-bibtex-1.6.51-circle-2379.xpi -- .51 is out, and it would update over 2373 had you already installed it.

@duncdrum
Author

duncdrum commented May 9, 2016

The new 2379 error ID for leishuku is DHXCG57F; it is still an empty export. I also noticed that there is a . in the auto-generated citekey with this version, which was absent before.
38EJAWMJ, on the other hand, works fine:

@thesis{yan_cairong__2003,
  title = {明清時期出版與文化─以「才子佳人」小說為中心},
  url = {http://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dnclcdr&s=id=%22091NCNU0493006%22.&searchmode=basic},
  pagetotal = {214},
  timestamp = {2016-05-06T19:43:11Z},
  titleaddon = {Publishing and Culture in Ming-Qing period : The Scholar-Beauty Novels as an Example},
  institution = {{國立暨南國際大學}},
  type = {Ph.{{D}}. {{Dissertation}}},
  author = {{顏采容}},
  date = {2003},
  file = {Yan Cairong 顏采容 Ming-Qing Publishing Culture 明清時期出版與文化─以「才子佳人」小說為中心 (200X).pdf:/Users/halalpha/Library/Application Support/Firefox/Profiles/sklfgs3h.default/zotero/storage/K9JJ2XHZ/Yan Cairong 顏采容 Ming-Qing Publishing Culture 明清時期出版與文化─以「才子佳人」小說為中心 (200X).pdf:application/pdf}
}

@retorquere
Owner

I don't see a . in the citekey? But yeah the [zotero] pattern isn't really great. The only reason it's the default is to help people over.

I've found the problem triggered by the leishuku example, try the updated https://github.com/retorquere/zotero-better-bibtex/releases/download/builds/zotero-better-bibtex-1.6.51-circle-2380.xpi

There's a separate problem that makes my tests fail: multi-lingual references cause Juris-M to error out on import, which makes tests currently impossible. I could work around it, but I don't know what the side effects of that would be, so I've lodged a new issue at Juris-M.

@duncdrum
Author

duncdrum commented May 9, 2016

Very nice: with 2380 there is no more error and the output is as expected. Also, the . seems to have been a 2379 artefact, from year = {n.d.}

@online{leishuku,
  title = {中國類書庫},
  url = {http://server.wenzibase.com},
  shorttitle = {leishuku},
  timestamp = {2016-04-21T13:06:09Z},
  langid = {pinyin},
  titleaddon = {Zhōngguó lèishū kù},
  usere = {Database of Chinese Encyclopeadias},
  type = {Full-text databse},
  author = {{愛如生}},
  urldate = {2016-04-20},
  year = {n.d.},
  file = {Google-Ergebnis für http\://crossasia.org/uploads/tx_sbbtyponewsletter/rte/RTEmagicC_erudition.png.png:/Users/halalpha/Library/Application Support/Firefox/Profiles/sklfgs3h.default/zotero/storage/ZF99V7IR/imgres.html:}
}

@retorquere
Owner

I can't explain right now how 2379 would be different from 2380 when it comes to generating the citekey when the date is n.d., but if you're happy with the results of 2380, that's good enough for me.

I'm waiting for feedback on Juris-M/zotero#20 before I merge this into master, as I want to have tests in place, and I can't until that issue is either fixed in Juris-M, or I get feedback that my proposed workaround is safe to use.

@retorquere
Owner

For confirmation, the latest build passes all tests, including the newly added tests for this issue; the biblatex they export to can be found here. If you could confirm that biblatex looks good, I can merge and release, unless you have more test cases you want me to tackle.

@duncdrum
Author

I've checked about 10 different items with different types and multi-lingual fields, and all worked well. Citekeys are solid, and titleaddon and usere probably show more consistency than in biblatex itself. All good from my end, thanks again for the effort.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 18, 2021