Additional compression filters #51

Open
kubaraczkowski opened this issue May 15, 2023 · 11 comments

Comments

@kubaraczkowski

Hi,
First of all - this project is an AWESOME idea, thanks for implementing it!

We're currently using h5wasm, since it's part of myHDF5 (another great idea): https://myhdf5.hdfgroup.org/.
However, we were wondering if it's possible to add filter libraries to the system.
In particular, we select the compression filter to fit our data. We've seen good results with zstandard, especially compared to gzip; however, myHDF5 doesn't come with any additional filters.

On desktop (HDFView) we can easily install additional plugins, but since I suppose the wasm version is 'sandboxed', it might not be possible to add filters as easily. Is it possible to create a custom version with zstandard compiled in and use that binary as the base for a customized, self-hosted myHDF5 viewer?

Thanks again for the great tool!

@bmaranville
Member

Do you have any example HDF5 files with zstandard compression applied that you can attach here?

@bmaranville
Member

I wouldn't call it easy per se, but it is possible:

Building h5wasm to use filter plugins: Example

There are modifications required to every part of the build, mostly to enable dynamic linking using the MAIN_MODULE and SIDE_MODULE flags to the emscripten compiler.

Here is an example for building a zstandard plugin for h5wasm.

Build a version of libhdf5-wasm as a shared lib

git clone https://github.com/usnistgov/libhdf5-wasm

Make this change:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 1a96d3d..48babf4 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -19,7 +19,7 @@ FetchContent_Populate(hdf5)
 # set the project name
 project(libhdf5-wasm-build)
 
-option(BUILD_SHARED_LIBS "Build shared libs" OFF)
+option(BUILD_SHARED_LIBS "Build shared libs" ON)
 option(HDF5_BUILD_EXAMPLES "Build Examples" OFF)
 option(HDF5_BUILD_TOOLS "Build Tools" OFF)
 option(HDF5_BUILD_UTILS "Build Utils" OFF)

...and then build with

make libhdf5-1_12_1-wasm.tar.gz

Build a version of h5wasm as MAIN_MODULE

Make this change (substitute your local path to libhdf5 lib):

diff --git a/CMakeLists.txt b/CMakeLists.txt
index ce8c6cb..1221ccd 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -8,9 +8,9 @@ project(H5WASM
 
 FetchContent_Declare(
   libhdf5-wasm
-#  URL file:///home/brian/dev/libhdf5-wasm/libhdf5-wasm.tar.gz
-   URL https://github.com/usnistgov/libhdf5-wasm/releases/download/v0.1.1/libhdf5-1_12_1-wasm.tar.gz
-   URL_HASH SHA256=e9bb11d89c4f26fa79b9cf1dab6159640c7b184ebf00dc97b098cd4f6de49bfe
+   URL file:///home/bbm/dev/libhdf5-wasm/libhdf5-1_12_1-wasm.tar.gz
+#   URL https://github.com/usnistgov/libhdf5-wasm/releases/download/v0.1.1/libhdf5-1_12_1-wasm.tar.gz
+#   URL_HASH SHA256=e9bb11d89c4f26fa79b9cf1dab6159640c7b184ebf00dc97b098cd4f6de49bfe
 )
 FetchContent_MakeAvailable(libhdf5-wasm)
 
@@ -22,6 +22,7 @@ set_target_properties(hdf5_util PROPERTIES
     LINK_FLAGS "-O3 --bind  \
     -lidbfs.js \
     -lworkerfs.js \
+    -s MAIN_MODULE=1 \
     -s ALLOW_TABLE_GROWTH=1 \
     -s ALLOW_MEMORY_GROWTH=1 \
     -s WASM_BIGINT \
@@ -34,6 +35,7 @@ set_target_properties(hdf5_util PROPERTIES
     -s EXPORTED_FUNCTIONS=\"['_H5Fopen', '_H5Fclose', '_H5Fcreate', '_malloc', '_free']\""
     RUNTIME_OUTPUT_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/dist/esm
     RUNTIME_OUTPUT_NAME hdf5_util
+    POSITION_INDEPENDENT_CODE ON
 )
 
 add_executable(hdf5_util_node src/hdf5_util.cc)

Then rebuild h5wasm with:

make clean
make
npm run build

Build libzstd

git clone https://github.com/kig/zstd-emscripten
cd zstd-emscripten
mkdir build && cd build && CFLAGS=-fPIC emcmake cmake ../cmake
emmake make -j4
make DESTDIR=../dist/ install

Build HDF5Plugin-Zstandard

git clone https://github.com/aparamon/HDF5Plugin-Zstandard

Overwrite CMakeLists.txt with this (changing the paths to match your local libhdf5-wasm and your local libzstd):

cmake_minimum_required(VERSION 3.14)
include(FetchContent)

project(zstd_hdf5)

FetchContent_Declare(
  libhdf5-wasm
   URL file:///home/bbm/dev/libhdf5-wasm/libhdf5-1_12_1-wasm.tar.gz
)
FetchContent_MakeAvailable(libhdf5-wasm)

# options
set(PLUGIN_INSTALL_PATH "/usr/local/hdf5/lib/plugin" CACHE PATH
      "Where to install the dynamic HDF5-plugin")

# sources
set(SOURCES zstd_h5plugin.c)
set(PLUGIN_SOURCES zstd_h5plugin.c)

include_directories("/home/bbm/dev/zstd-emscripten/dist/usr/local/include")

# HDF5 plugin as static library
add_library(zstd_h5_plugin ${PLUGIN_SOURCES})
set_target_properties(zstd_h5_plugin PROPERTIES
    OUTPUT_NAME H5Zzstd
    POSITION_INDEPENDENT_CODE ON
)
target_link_libraries(zstd_h5_plugin hdf5-wasm)
install(TARGETS zstd_h5_plugin DESTINATION ${PLUGIN_INSTALL_PATH} COMPONENT HDF5_FILTER_DEV)

Compile:

mkdir build && cd build && emcmake cmake ..
emmake make
emcc -s SIDE_MODULE=1 -s LLD_REPORT_UNDEFINED -s EXPORT_ALL=1 \
  libH5Zzstd.a /home/bbm/dev/zstd-emscripten/dist/usr/local/lib/libzstd.a \
  -o libH5Zzstd.so

Use this new plugin on a web page:

// Using the hdf5_hl.js you compiled above as MAIN_MODULE:
import h5wasm from "./dist/esm/hdf5_hl.js";

const { FS } = await h5wasm.ready;

const plugin_path = "/usr/local/hdf5/lib/plugin";

FS.mkdirTree(plugin_path);

let plugin_response = await fetch("./h5z/libH5Zzstd.so");
let plugin_ab = await plugin_response.arrayBuffer();
FS.writeFile(`${plugin_path}/libH5Zzstd.so`, new Uint8Array(plugin_ab));

let response = await fetch("./test_zstd.h5");
let ab = await response.arrayBuffer();
FS.writeFile("test_zstd.h5", new Uint8Array(ab));

const f = new h5wasm.File("test_zstd.h5", "r");
console.log(f.get("zstd_data").value)
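The fetch-and-write steps above can be wrapped in a small reusable helper. This is only a sketch: `installPlugin` is a hypothetical name, not part of h5wasm, and it assumes `FS` is the Emscripten filesystem object obtained from `h5wasm.ready` as shown above. The `fetchFn` parameter is an assumption added so the function can be exercised without a network.

```javascript
// Hypothetical helper (not part of h5wasm): download a compiled filter plugin
// and write it into the directory where HDF5 looks for plugins.
// FS is assumed to be the Emscripten FS object from `await h5wasm.ready`.
async function installPlugin(
  FS,
  url,
  pluginDir = "/usr/local/hdf5/lib/plugin",
  fetchFn = globalThis.fetch
) {
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Could not fetch plugin ${url}: HTTP ${response.status}`);
  }
  const bytes = new Uint8Array(await response.arrayBuffer());

  // Keep the file name from the URL, e.g. "libH5Zzstd.so".
  const fileName = url.split("/").pop();
  const target = `${pluginDir}/${fileName}`;

  FS.mkdirTree(pluginDir);
  FS.writeFile(target, bytes);
  return target;
}
```

With the names from the example above, the call would look like `await installPlugin(FS, "./h5z/libH5Zzstd.so")`, made before opening the file that uses the filter.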

@bmaranville
Member

bmaranville commented May 16, 2023

I'm not sure what the downside would be to compiling h5wasm as MAIN_MODULE all the time... I found that if I use MAIN_MODULE=2 and I export stderr, memset and memcpy the plugin loads just fine, and the wasm code does not grow significantly from the way it was built before. Then people making plugins could use the default build of h5wasm and not have to build it themselves.

(Though I don't know if the exports above are sufficient for all plugins)
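One way to probe whether a given set of exports is enough for a particular plugin is to list what the compiled side module imports and compare that with the names the main module provides. The sketch below uses the standard `WebAssembly.Module.imports()` API. The helper name `missingFunctionImports` and the idea of passing the available names as a `Set` are assumptions for illustration, not part of h5wasm or Emscripten. It only inspects function imports from the `env` module, and how the main module's exported names are collected depends on the build.

```javascript
// Hypothetical helper: report which "env" function imports of a side module
// are not present in a set of names the main module is known to export.
// pluginBytes: Uint8Array or ArrayBuffer holding the compiled plugin (.so/.wasm).
// availableNames: Set of symbol names (an assumption: gathered by the caller).
function missingFunctionImports(pluginBytes, availableNames) {
  // Compiles synchronously; fine for small modules and for Node scripts.
  const module = new WebAssembly.Module(pluginBytes);
  return WebAssembly.Module.imports(module)
    .filter((imp) => imp.module === "env" && imp.kind === "function")
    .map((imp) => imp.name)
    .filter((name) => !availableNames.has(name));
}
```

An empty result does not prove the plugin will load (memory, table and global imports are ignored here), but a non-empty one points at symbols that would need to be exported.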

@kubaraczkowski
Author

Wow, to be honest this doesn't look too bad at all! No code changes and the CMake differences are small as well.
Indeed, why wouldn't libhdf5 be a 'main module' by default? Probably simply because it doesn't have to be?

@bmaranville
Member

Yes, because it's an extra flag that we didn't need (until now). I'm preparing a new release with MAIN_MODULE enabled...

@bmaranville
Member

Ok, I just released

  • h5wasm v0.5.0 (with MAIN_MODULE=2)
  • libhdf5-wasm v0.3.0_3.1.28 (SHARED lib, can be used to build SIDE_MODULE and MAIN_MODULE)

So you don't need a special build of h5wasm or libhdf5-wasm anymore to build your plugins, you can use the released versions.

@bmaranville
Member

If you want to help out I'd be happy to join an effort on a central repository for h5wasm plugins, probably based on https://github.com/HDFGroup/hdf5_plugins in the same spirit as https://github.com/silx-kit/hdf5plugin

@kubaraczkowski
Author

Wow, let me just say it - you're awesome @bmaranville, and it seems wasm rocks as well!
I'm going to try it with myHDF5 asap, or try to get them to implement it immediately!

@axelboc
Collaborator

axelboc commented May 22, 2023

Totally on board with this! I've opened an issue in the myHDF5 repo: https://gitlab.esrf.fr/ui/myhdf5/-/issues/3 -- since our knowledge of WASM and Emscripten is very limited, contributions are very welcome! 😁

@bmaranville
Member

Work has started at https://github.com/h5wasm/h5wasm-plugins (early alpha).

If you want to give the plugins there a try and let me know if they work for you, I'd appreciate it!
There are half a dozen or so plugins built so far (including zstd).
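For anyone trying the plugins, it can help to work out which ones a file actually needs from the filter IDs on its datasets. The sketch below is an illustration only: `pluginsNeeded` and `REGISTERED_FILTERS` are hypothetical names, the short names are labels rather than plugin file names, and how the filter IDs are read from a dataset is left to the caller. The numeric IDs are the ones listed in the HDF Group's registry of third-party filters (Zstandard is 32015); IDs 1 to 6 are predefined by the HDF5 library itself.

```javascript
// A few registered third-party HDF5 filter IDs (labels are illustrative).
const REGISTERED_FILTERS = new Map([
  [307, "bzip2"],
  [32000, "lzf"],
  [32001, "blosc"],
  [32004, "lz4"],
  [32008, "bitshuffle"],
  [32013, "zfp"],
  [32015, "zstd"],
]);

// Hypothetical helper: given the filter IDs applied to a dataset, return the
// ones that are not predefined by HDF5 (IDs 1-6: deflate, shuffle, fletcher32,
// szip, nbit, scaleoffset) and therefore require a dynamically loaded plugin.
function pluginsNeeded(filterIds) {
  return filterIds
    .filter((id) => id > 6)
    .map((id) => ({ id, name: REGISTERED_FILTERS.get(id) ?? "unknown" }));
}
```

For a dataset written with shuffle plus Zstandard, `pluginsNeeded([2, 32015])` would report only the Zstandard entry.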

@axelboc
Collaborator

axelboc commented Oct 25, 2023

Awesome stuff @bmaranville! 🏆 I'll play around with it in H5Web today.

FYI, I've just moved over the repo of myHDF5 to GitHub to facilitate external contributions: https://github.com/silx-kit/myhdf5 🎉


3 participants