Lazy loading in backend environment? (Deno) #69

Open
aLemonFox opened this issue Mar 4, 2024 · 10 comments

@aLemonFox

aLemonFox commented Mar 4, 2024

I have a backend server where I want to lazy load a dataset to limit my bandwidth usage. Based on #4 I tried to implement it like so:

try {
  const Modules = await h5wasm.ready;
  const { FS } = Modules;

  FS.createLazyFile("/", "current.h5", signedUrl, true, false);
  const file = new h5wasm.File("current.h5");

  console.log(file);
} catch (err) {
  console.error(err);
}

My file is stored in an S3-compatible storage bucket that supports range requests, but this doesn't seem to work, as I get:

TypeError: Cannot read properties of null (reading 'length')
    at FSNode.get [as usedBytes] (file:///.../AppData/Local/deno/npm/registry.npmjs.org/h5wasm/0.7.1/dist/esm/hdf5_util.js:8:3748767)
    at Object.getattr (file:///.../AppData/Local/deno/npm/registry.npmjs.org/h5wasm/0.7.1/dist/esm/hdf5_util.js:8:3707616)
    at stat (file:///.../AppData/Local/deno/npm/registry.npmjs.org/h5wasm/0.7.1/dist/esm/hdf5_util.js:8:3732696)
    at Object.doStat (file:///.../AppData/Local/deno/npm/registry.npmjs.org/h5wasm/0.7.1/dist/esm/hdf5_util.js:8:3750096)
    at ___syscall_fstat64 (file:///.../AppData/Local/deno/npm/registry.npmjs.org/h5wasm/0.7.1/dist/esm/hdf5_util.js:8:3752899)
    at <anonymous> (wasm://wasm/00a81936:1:557751)
    at <anonymous> (wasm://wasm/00a81936:1:273169)
    at <anonymous> (wasm://wasm/00a81936:1:2316618)
    at <anonymous> (wasm://wasm/00a81936:1:252235)
    at <anonymous> (wasm://wasm/00a81936:1:295992)

I think my environment is not set up properly, as it is not web-based. Is anything else needed to configure lazy URL-based access in Deno (and Node)?
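
Before debugging the lazy file itself, it can help to confirm that the bucket really answers range requests for the signed URL. The following is a minimal sketch, not part of h5wasm: `rangeHeader` and `supportsRangeRequests` are hypothetical helper names, and `fetchFn` is an injectable parameter added only so the check can be exercised without a network.

```javascript
// Build an HTTP Range header value for `length` bytes starting at offset `start`.
function rangeHeader(start, length) {
  if (!Number.isInteger(start) || !Number.isInteger(length) || start < 0 || length <= 0) {
    throw new RangeError("start must be >= 0 and length must be > 0");
  }
  // Range end offsets are inclusive, hence the -1.
  return `bytes=${start}-${start + length - 1}`;
}

// Request a small slice and report whether the server honored the range.
// `fetchFn` defaults to the global fetch (available in Deno and recent Node).
async function supportsRangeRequests(url, fetchFn = fetch) {
  const response = await fetchFn(url, { headers: { Range: rangeHeader(0, 8) } });
  // Avoid downloading the full body if the server ignored the Range header.
  await response.body?.cancel();
  // 206 Partial Content: range honored. 200 OK: the whole file is being sent.
  return response.status === 206;
}
```

Calling `await supportsRangeRequests(signedUrl)` should resolve to `true` if the storage backend returns 206 Partial Content for that URL.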

@bmaranville
Member

I'm not completely sure. The createLazyFile function seems to have been written specifically for the browser context, as it uses new XMLHttpRequest... in the code (see https://github.com/emscripten-core/emscripten/blob/53f661cb11ba849403c060b97208f88775484d98/src/library_fs.js#L1678C1-L1678C42)

You might be able to use this shim library to get it to work: https://www.npmjs.com/package/xmlhttprequest
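
To see which of these primitives a given runtime exposes before wiring in a shim, a small probe like the one below may be useful. It is only a sketch: `detectHttpSupport` is a hypothetical helper, not part of h5wasm or Emscripten, and the note about `importScripts` reflects my understanding of how Emscripten builds detect a worker context, which should be checked against the source linked above.

```javascript
// Report which HTTP-related primitives a global scope exposes.
// `scope` defaults to the current global object; pass a plain object to test.
function detectHttpSupport(scope = globalThis) {
  return {
    // createLazyFile relies on XMLHttpRequest being defined.
    xhr: typeof scope.XMLHttpRequest === "function",
    // fetch is async-only, so it cannot back a synchronous lazy read.
    fetch: typeof scope.fetch === "function",
    // Assumption: Emscripten treats the presence of importScripts as
    // "running inside a worker"; verify this for your Emscripten version.
    importScripts: typeof scope.importScripts === "function",
  };
}
```

Running `console.log(detectHttpSupport())` in both the main module and a worker shows whether the two contexts differ in the runtime being used.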

@aLemonFox
Author

That could work. Do you have an idea of how to patch it in? I don't think the FS comes bundled with this lib, right?

@bmaranville
Member

Deno apparently has a solution for XHR, and I was able to get past your first error:

> let xhr = await import("https://deno.land/x/xhr@0.3.1/mod.ts")
undefined
> try {
  const Modules = await h5wasm.ready;
  const { FS } = Modules;

  FS.createLazyFile("/", "current.h5", signedUrl, true, false);
  const file = new h5wasm.File("current.h5");

  console.log(file);
} catch (err) {
  console.error(err);
}
Cannot do synchronous binary XHRs outside webworkers in modern browsers. Use --embed-file or --preload-file in emcc
undefined

More investigation required... I don't know the current status of web workers in Deno.

@aLemonFox
Author

Hm, yeah, it gets through until your error. Deno has built-in support for web workers, but using them does not seem to make a difference. I am trying to see if I can get something to work as well.

// main.ts
const worker = new Worker(
  new URL("./worker.ts", import.meta.url).href,
  {
    type: "module",
  },
);
worker.postMessage({ example: 'hello world' });

// worker.ts
self.onmessage = async (e) => {
  const Modules = await h5wasm.ready;
  const { FS } = Modules;

  FS.createLazyFile("/", "current.h5", signedUrl, true, false);
  // ^ results in the same error
  const file = new h5wasm.File("current.h5");
  self.close();
};

@bmaranville
Member

It looks like Deno doesn't support synchronous fetch/xhr even in a web worker. I think the sync flag is required for the createLazyFile implementation in Emscripten, and it's the test for that flag that is failing and throwing the current error. I don't see any way to do a synchronous fetch in Deno, and I don't see any way to do an async file read in HDF5 (without writing a new Virtual File Driver), so I'm not sure if there's an easy path forward. If you find something, please let me know!
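
If lazy loading is not achievable, one fallback is to download the whole file asynchronously and write it into Emscripten's in-memory filesystem before opening it. This gives up the bandwidth savings the issue is about, so it is a workaround rather than a fix. A minimal sketch follows; `loadRemoteFile` is a hypothetical helper, and `fetchFn` is an injectable parameter added only so the function can be tested offline. `FS.writeFile` is the standard Emscripten filesystem call.

```javascript
// Download an entire remote file and store it in the Emscripten FS.
// `FS` is expected to be the object obtained from `(await h5wasm.ready).FS`.
async function loadRemoteFile(FS, url, name, fetchFn = fetch) {
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Download failed: HTTP ${response.status}`);
  }
  // The full file is held in memory; this is not suitable for very large datasets.
  const bytes = new Uint8Array(await response.arrayBuffer());
  FS.writeFile(name, bytes);
  return bytes.length; // number of bytes written
}

// Intended usage (assumes h5wasm is already imported and signedUrl is defined):
//   const { FS } = await h5wasm.ready;
//   await loadRemoteFile(FS, signedUrl, "current.h5");
//   const file = new h5wasm.File("current.h5", "r");
```

Because the download happens through async fetch before HDF5 touches the file, this path does not depend on synchronous XHR or on running inside a worker.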

@aLemonFox
Author

aLemonFox commented Mar 16, 2024

I haven't found a way to make it work using your lib, but I made a workaround using gdal-async. It supports writing COG GeoTIFF files (https://www.cogeo.org/), which also allow on-demand loading of the dataset.

Then using geoblaze and georaster I am able to slice the needed values of my dataset on demand.

Never mind, it has the same issue :/

@aLemonFox
Author

aLemonFox commented Mar 16, 2024

Well, there seems to be an interesting difference in the way dependencies are handled in Deno. For example, importing georaster via npm:georaster versus https://esm.sh/georaster gives different outcomes for worker-based requests: the esm.sh import does not work, while the npm: import works fine.

By the way, I don't know if this makes sense for the lib, but also publishing on JSR might simplify the build flow for different runtimes. I am not sure, since I have never used any lib from JSR, but it seems cool.

@bmaranville
Member

Thanks for the tip... I'll check out JSR.

@bmaranville
Member

Is it possible the npm: import is using a different implementation of a web worker, instead of the one distributed with Deno?

@aLemonFox
Author

Yeah, that seems to be it, but I can't figure out how to change it. I've worked on another solution that uses some Node.js serverless functions to handle this as a service.

2 participants