Add duplicate title, subject, and description fields for text in multiple languages #4633

Open
amberleahey opened this issue Apr 30, 2018 · 4 comments

amberleahey commented Apr 30, 2018

Having fields like title and description offer alternative language versions would also be helpful for discovery.

From Julian: From what I can tell so far, the DataCite 3.1 schema lets you specify the language of Title, Subject, and Description with a lang attribute (4.1 adds the xml:lang attribute to Rights): https://schema.datacite.org/meta/kernel-4.1/doc/DataCite-MetadataKernel_v4.1.pdf. The schema says it accepts only IETF BCP 47 and ISO 639-1 language codes. But I don't think Dataverse knows the ISO language codes for the languages it displays in the Citation block (I vaguely remember a comment about this in a GitHub issue or maybe a Google Group post, but I can't find it). The Consorcio Madroño Dataverse does this with the DataCite metadata they publish for each dataset: https://edatos.consorciomadrono.es/api/datasets/export?exporter=oai_datacite&persistentId=doi%3A10.21950/O53TLR
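To make the idea concrete, here is a minimal sketch of what per-language titles could look like when serialized with an `xml:lang` attribute. It is only an illustration, not Dataverse's exporter: the `build_titles` helper, the sample titles, and the bare `titles`/`title` element names (without the DataCite namespace) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Standard namespace behind the reserved "xml:" prefix (used for xml:lang).
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_titles(titles):
    """Build a <titles> element from a {language_code: title} mapping.

    Hypothetical helper for illustration; language codes are assumed to be
    BCP 47 / ISO 639-1 values such as "en" or "fr".
    """
    root = ET.Element("titles")
    for lang, text in titles.items():
        title = ET.SubElement(root, "title")
        title.set(f"{{{XML_NS}}}lang", lang)  # serialized as xml:lang="..."
        title.text = text
    return root


titles_xml = ET.tostring(
    build_titles({"en": "Example dataset", "fr": "Exemple de jeu de donnees"}),
    encoding="unicode",
)
print(titles_xml)
```

The same pattern would apply to Subject and Description: one element per language version, each tagged with its language code.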

And most or all of the DDI elements that Dataverse uses can include a lang attribute (http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/field_level_documentation_files/schemas/xml_xsd/attributes/lang.html). Looks like it accepts any value for now.

See related ticket #4632 for adding a record language qualifier.

pdurbin (Member) commented Jun 28, 2018

@amberleahey I'm not sure if you caught this during the talk by @pengchengluo at the Dataverse Community Meeting the other week (slides at https://schd.ws/hosted_files/dataversecommunitymeeting20/eb/Slides%20-%20Support%20University%20Students%E2%80%99%20Data%20Driven%20Research%20in%20a%20National%20Contest%20with%20PKU%20Open%20Research%20Data%20Platform--v0.3.pdf ), but a while back he implemented the ability to enter metadata in multiple languages. Here are English and Chinese screenshots from http://opendata.pku.edu.cn/dataset.xhtml?persistentId=doi:10.18170/DVN/CX1SM6 for example:

[Screenshots: English and Chinese versions of the dataset page]

pengchengluo (Contributor) commented Jun 29, 2018

Hi @pdurbin and @amberleahey, do you know about CERIF? It is a European standard for Current Research Information Systems (CRIS) and is supported by several commercial and open source CRIS, for example Elsevier Pure.

CERIF supports multiple languages. In its database model, some attributes of an entity are stored in their own table with a cfLangCode field. For example, the Project entity has a cfProjTitle attribute, and the cfProjTitle table has a cfLangCode field.
[Image from the CERIF 1.6 documentation: https://www.eurocris.org/Uploads/Web%20pages/CERIF-1.6/documentation/MImage.html]
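A rough sketch of that pattern, using an in-memory SQLite database: one row per project and language code. Only the `cfProjTitle` table name and the `cfLangCode` field come from the description above; the `proj_id` and `title` columns, the `title_for` helper, the sample rows, and the fallback-to-English rule are assumptions for illustration and are not taken from the CERIF specification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE cfProjTitle (
        proj_id    TEXT NOT NULL,   -- hypothetical column name
        cfLangCode TEXT NOT NULL,   -- language code of this title row
        title      TEXT NOT NULL,   -- hypothetical column name
        PRIMARY KEY (proj_id, cfLangCode)
    )
    """
)
conn.executemany(
    "INSERT INTO cfProjTitle VALUES (?, ?, ?)",
    [("proj-1", "en", "Example project"), ("proj-1", "zh", "示例项目")],
)


def title_for(proj_id, lang, fallback="en"):
    """Return the title in the requested language, else in the fallback language."""
    for code in (lang, fallback):
        row = conn.execute(
            "SELECT title FROM cfProjTitle WHERE proj_id = ? AND cfLangCode = ?",
            (proj_id, code),
        ).fetchone()
        if row:
            return row[0]
    return None  # no title stored for this project at all


print(title_for("proj-1", "zh"))
print(title_for("proj-1", "fr"))  # no French row, so the English title is used
```

Keeping the language code in the key means a new language is just a new row, with no schema change per language.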

In our Peking University implementation, we added an additional field to the dataverse to store Chinese metadata and added additional metadata blocks for the dataset. We didn't support Chinese and English metadata for data files. So we also hope that Harvard Dataverse itself can support multilingual metadata for dataverses, datasets, and data files coherently, although it seems this would require a lot of changes, and multilingual information retrieval is also a challenge.

pdurbin (Member) commented Jun 29, 2018

@pengchengluo no, I wasn't aware of that standard. @scolapasta and @JayanthyChengan check this out. ^^

pdurbin added the labels "Type: Feature" (a feature request) and "User Role: Depositor" (creates datasets, uploads data, etc.) on Oct 8, 2023

DS-INRAE moved this to ⚠️ Needed/Important in Recherche Data Gouv on Jul 10, 2024

cmbz commented Sep 16, 2024

2024/09/16: Reviewed, will investigate further.
