Implement Presto Feature Extractor UDF #15
@kvantricht Is this related to the worldcereal-classification repository? Could you complete the details for this task?
Not really, because this is something we'll mainly use for inference. It can (and should) be started in parallel with the extraction workflow. We need to come up with a first UDF recipe that fetches the required inputs, preprocesses them according to our needs, and then runs Presto to compute the embeddings.
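A minimal sketch of the fetch, preprocess, embed recipe described above, to make the three stages concrete. Everything here is an assumption for illustration: the input group names, the `NODATA` value, the mean-fill strategy, and the `mean_encoder` stand-in are hypothetical and are not the actual Presto model or the worldcereal preprocessing.

```python
from typing import Callable, Dict, List

Pixel = Dict[str, List[float]]  # input name -> time series for one pixel

# Hypothetical input groups and nodata marker; the real ones are not given in this thread.
REQUIRED_BANDS = ("optical", "radar", "meteo")
NODATA = 65535.0


def fetch_inputs(source: Pixel) -> Pixel:
    """Stage 1: collect the required inputs and fail loudly if one is absent."""
    missing = [b for b in REQUIRED_BANDS if b not in source]
    if missing:
        raise KeyError(f"missing required inputs: {missing}")
    return {b: list(source[b]) for b in REQUIRED_BANDS}


def preprocess(pixel: Pixel) -> Pixel:
    """Stage 2: replace nodata with the mean of the valid values (placeholder strategy)."""
    out: Pixel = {}
    for band, series in pixel.items():
        valid = [v for v in series if v != NODATA]
        fill = sum(valid) / len(valid) if valid else 0.0
        out[band] = [fill if v == NODATA else v for v in series]
    return out


def compute_embedding(source: Pixel, encoder: Callable[[Pixel], List[float]]) -> List[float]:
    """Stage 3: run the encoder on the preprocessed inputs."""
    return encoder(preprocess(fetch_inputs(source)))


def mean_encoder(pixel: Pixel) -> List[float]:
    """Stand-in for the real feature extractor: one mean value per input group."""
    return [sum(pixel[b]) / len(pixel[b]) for b in REQUIRED_BANDS]


sample = {
    "optical": [1.0, NODATA, 3.0],
    "radar": [2.0, 2.0, 2.0],
    "meteo": [NODATA, NODATA, NODATA],  # entirely missing, silently filled with 0.0
}
embedding = compute_embedding(sample, mean_encoder)
```

Note that in this sketch an entirely missing input is filled with zeros without any warning, so the pipeline still runs while the output is no longer meaningful.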
yes, as discussed the UDF for this is worldcereal-specific and needs to import the |
Currently being tackled by @HansVRP.
UDFs for both Presto and Presto+classification were created and tested. Branch: hv_mvp_inferenceUDF. Locally, the correct results are obtained on the belgium_good_2020-12-01_2021-11-30.nc dataset. Remotely, the UDF runs but produces incorrect output, which could be attributed to missing METEO data or to how the missing data is being filled in. To verify this, the preprocessing pipeline needs to be adapted to include the AGERA5 dataset as well.
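One way to check the missing-data hypothesis is to report the nodata fraction per input before the model runs. The sketch below is only an illustration: the band names, the `NODATA` value, and the 0.5 threshold are assumptions, not values taken from the UDF on the branch.

```python
from typing import Dict, List

NODATA = 65535.0  # assumed nodata marker


def nodata_fraction(bands: Dict[str, List[float]], nodata: float = NODATA) -> Dict[str, float]:
    """Return the share of nodata samples per band (1.0 for an empty band)."""
    return {
        name: (sum(1 for v in series if v == nodata) / len(series) if series else 1.0)
        for name, series in bands.items()
    }


def suspicious_bands(bands: Dict[str, List[float]], threshold: float = 0.5) -> List[str]:
    """List the bands whose nodata share reaches the threshold, sorted by name."""
    fractions = nodata_fraction(bands)
    return sorted(name for name, frac in fractions.items() if frac >= threshold)


# Hypothetical inputs: one optical band and two meteo bands.
bands = {
    "B04": [0.1, 0.2, 0.3, 0.4],
    "temperature": [NODATA] * 4,
    "precipitation": [NODATA, 1.0, NODATA, 2.0],
}
flagged = suspicious_bands(bands)
```

Running the same check on the local file and on the remote inputs would show whether the two runs actually see the same meteo data.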
Functional UDF available here: https://github.com/WorldCereal/worldcereal-classification/blob/kvt_mvp_inferenceUDF/minimal_wc_presto/udf_long_worldcereal_inference.py
Next steps:
@kvantricht I suppose this needs to be done once the extraction pipeline (higher priority) is finished.