Feature Request/Idea: Support for .eln file format #9363

Closed
NicolasCARPi opened this issue Feb 3, 2023 · 8 comments

@NicolasCARPi
Contributor

Hello everyone,

I am Nicolas CARPI, lead dev of eLabFTW, an Electronic Lab Notebook. I am also one of the founders of The ELN Consortium, where we brought several ELN applications together to improve interoperability between them.

We achieved this by designing a file format called .eln: see ELN file format repo.

Fear not, we didn't re-invent the wheel, it's basically a RO-Crate (Research Object crate) inside a zip.

This format allows import/export of research data such as experimental results, protocols, sample descriptions, experiment templates, etc.

Basically, there is a .json file at the root that describes the content, plus some data such as author name, affiliation, ORCID, etc. And the nice thing is that all the fields are standardized.

I'm opening this issue to:

  • Let you know about this format
  • Ask whether you would eventually be interested in adding support for it, to make it easier to deposit raw data along with how the data was produced, meaning the lab notebook entries corresponding to the raw data

You can look here to get a precise idea of what it looks like.

I believe adding support for this format would not be too difficult because opening a zip and reading json are very basic things in any language.
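As a rough illustration of the "open a zip, read the json" idea, here is a minimal Java sketch (Java because Dataverse is a Java codebase). It assumes the descriptor uses the RO-Crate file name `ro-crate-metadata.json` and may sit either at the archive root or inside a top-level folder. The class and method names are hypothetical and are not Dataverse APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, for illustration only.
class ElnMetadataSketch {
    // File name the RO-Crate specification uses for its metadata descriptor.
    static final String METADATA_NAME = "ro-crate-metadata.json";

    /** Returns the raw JSON text of the first ro-crate-metadata.json entry, if any. */
    static Optional<String> readCrateMetadata(InputStream elnStream) {
        try (ZipInputStream zip = new ZipInputStream(elnStream, StandardCharsets.UTF_8)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Assumption: accept the file at the root or inside any folder.
                if (!entry.isDirectory()
                        && (name.equals(METADATA_NAME) || name.endsWith("/" + METADATA_NAME))) {
                    // readAllBytes() stops at the end of the current zip entry.
                    return Optional.of(new String(zip.readAllBytes(), StandardCharsets.UTF_8));
                }
            }
            return Optional.empty();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a tiny in-memory archive so the sketch can be tried without a real export. */
    static byte[] buildSampleEln(String json) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes, StandardCharsets.UTF_8)) {
            zip.putNextEntry(new ZipEntry("export/" + METADATA_NAME));
            zip.write(json.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] eln = buildSampleEln("{\"@graph\": []}");
        System.out.println(
                readCrateMetadata(new ByteArrayInputStream(eln)).orElse("no metadata found"));
    }
}
```

The sketch stops at returning the raw JSON text. Turning it into fields would need a JSON library, which is left out here to keep the example dependency-free.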

ELN users often ask about data repositories, so there is definitely strong demand from researchers to be able to upload their procedures/experiments to a data repository without having to enter everything manually again.

Please let me know what you think about this proposal.

Best,
~Nico

Overview of the Feature Request

Support upload of .eln files.

What kind of user is the feature intended for?

Depositor.

What inspired the request?

The wish to increase interoperability between ELNs and data repositories.

What existing behavior do you want changed?

I want depositors to be able to upload a .eln file that will be correctly understood by Dataverse, and populate fields based on what it finds in the .eln file.

Any brand new behavior do you want to add to Dataverse?

Ability to parse .eln files, which are json files inside a zip.

Any related open or closed issues to this feature request?

Not that I know of.

@pdurbin
Member

pdurbin commented Feb 3, 2023

@NicolasCARPi hello! Thanks for your interest in Dataverse!

As you may know, any file can be uploaded to Dataverse but currently we don't detect the .eln extension:

Screen Shot 2023-02-03 at 10 22 20 AM

That's easy to fix by adding ".eln" and its contentType to a couple text files. Here's an example of .geojson being added:

Would you (or someone you know) be interested in making a similar pull request to add .eln?
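For readers unfamiliar with this kind of lookup, detection by file extension can be sketched roughly as follows. This is not the actual Dataverse code: the class name, the inline mapping text, and the method are hypothetical, and the `.eln` content type shown is the one that comes up later in this thread.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Optional;
import java.util.Properties;

// Hypothetical helper, for illustration only.
class ExtensionLookupSketch {
    // Stand-in for a properties-style text file mapping extensions to content types.
    static final String MAPPINGS = String.join("\n",
            "geojson=application/geo+json",
            "eln=application/vnd.eln+zip");

    /** Looks up a content type from the file extension, ignoring case. */
    static Optional<String> contentTypeFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty(); // no extension to look up
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        Properties mappings = new Properties();
        try {
            mappings.load(new StringReader(MAPPINGS));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Optional.ofNullable(mappings.getProperty(extension));
    }

    public static void main(String[] args) {
        System.out.println(contentTypeFor("experiment.eln").orElse("unknown"));
    }
}
```

An unknown extension simply yields an empty result, which corresponds to the "not detected" state described above.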

Once the contentType is known, the next step is to add it to our list of files that get unzipped, such as gzipped FITS files and Shapefiles (and vanilla zip files, of course).

Here's how that code looks:

// A few special cases:
// if this is a gzipped FITS file, we'll uncompress it, and ingest it as
// a regular FITS file:
if (finalType.equals("application/fits-gzipped")) {

https://github.com/IQSS/dataverse/blob/v5.12.1/src/main/java/edu/harvard/iq/dataverse/util/FileUtil.java#L506

It's cool that under the covers .eln is RO-Crate, which we've been asked about:

I like that there's a json file with author name/orcid, etc., like you said. Parsing those files and populating metadata automatically would be another step, of course. We do this for FITS and we're talking about doing it for NetCDF/HDF5. Metadata is stored at the dataset level, so you might want to think about what should be done if multiple .eln files are uploaded with divergent metadata.
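To make the "divergent metadata" concern concrete, here is a small hypothetical Java sketch. It assumes each uploaded .eln file has already been reduced to a flat map of field names to values (an assumption, since real RO-Crate metadata is richer) and reports the fields on which the files disagree. It does not decide what to do about the conflict, which is the open design question.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only.
class MetadataConflictSketch {
    /** Returns, for each field the files disagree on, every distinct value seen. */
    static Map<String, Set<String>> conflictingFields(List<Map<String, String>> perFileMetadata) {
        Map<String, Set<String>> seen = new LinkedHashMap<>();
        for (Map<String, String> metadata : perFileMetadata) {
            metadata.forEach((field, value) ->
                    seen.computeIfAbsent(field, key -> new LinkedHashSet<>()).add(value));
        }
        // Keep only the fields with two or more distinct values.
        seen.values().removeIf(values -> values.size() < 2);
        return seen;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> conflicts = conflictingFields(List.of(
                Map.of("author", "A. Researcher", "title", "Run 1"),
                Map.of("author", "A. Researcher", "title", "Run 2")));
        System.out.println(conflicts.keySet());
    }
}
```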

Anyway, again, I'd say the first step is to even recognize .eln files. Then unzip them. Then think about what to do with metadata. Yes, researchers shouldn't have to enter it over and over again!

@NicolasCARPi
Contributor Author

Thank you @pdurbin, it's quite helpful to guide me in your codebase like you did 👍

Please have a look at: #9366

@pdurbin
Member

pdurbin commented Feb 10, 2023

@NicolasCARPi merged! Thanks! Looking good!

Screen Shot 2023-02-10 at 7 28 46 AM

Another thought. Since this is just a zip file under the covers, you could probably contribute a slight variation on the Zip previewer/downloader at https://github.com/gdcc/dataverse-previewers by changing the final line to this:

  "contentType":"application/vnd.eln+zip"

(We do this for images too. The same image previewer handles PNG, JPG, and GIF but it's technically a different external tool with its own database id once it's loaded up.)

To see how the zip previewer/downloader works, please check out https://edmond.mpdl.mpg.de/file.xhtml?fileId=199941&version=11.0 or the screenshots at IQSS/dataverse.harvard.edu#196

@NicolasCARPi
Contributor Author

Yep, it would indeed work fine to re-use the zip preview. I'll look into it.

@pdurbin
Member

pdurbin commented Feb 10, 2023

@NicolasCARPi cool. @haarli added it. Perhaps they can help? Worth asking or at least mentioning here. 😄

By the way, you're in luck. We're about to cut a release (5.13) and your PR is in. Timing is everything. 😄

Have a nice weekend!

@haarli
Contributor

haarli commented Feb 13, 2023

@NicolasCARPi Sure, let us know if you have any questions about the Zip Previewer. It might work without any modifications, just by registering the Zip Previewer as a new previewer with the modified content type, as @pdurbin described above. Might be worth a try.

@pdurbin
Member

pdurbin commented Feb 13, 2023

@haarli it works without modifications! @NicolasCARPi just made a PR. Check it out:

@pdurbin
Member

pdurbin commented Oct 8, 2023

@NicolasCARPi we have some initial support for .eln so I'm closing this. Please feel free to open follow up issues.

Also, thanks for hosting the ELN Consortium call the other day! It was fun!

pdurbin closed this as completed on Oct 8, 2023