
Add inference API of AMR #40

Merged · 20 commits · Oct 22, 2024

Conversation

h-munakata (Contributor)

Summary

  1. Add inference API of AMR
  • Add msclap to the dependencies
  • Add configuration for AMR in lighthouse/models.py
  • Implement encode_audio() separately from encode_video():

    model.encode_audio("api_example/1a-ODBWMUAE.wav")

  2. Modularize PANNs and CLAP
  • The AudioEncoder class in lighthouse/feature_extractor/audio_encoder.py only holds the model selector, while lighthouse/feature_extractor/audio_encoders/{clap_a | pann}.py implement the individual models.
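
A minimal sketch of this selector pattern, assuming hypothetical class names (ClapAudioEncoder and PannAudioEncoder stand in for the real implementations in clap_a.py and pann.py; this is not the merged code):

class ClapAudioEncoder:  # stand-in for audio_encoders/clap_a.py
    def encode(self, audio_path):
        ...  # CLAP-specific feature extraction lives here

class PannAudioEncoder:  # stand-in for audio_encoders/pann.py
    def encode(self, audio_path):
        ...  # PANNs-specific feature extraction lives here

class AudioEncoder:
    # Only a selector: no model-specific logic in this class.
    _ENCODERS = {'clap': ClapAudioEncoder, 'pann': PannAudioEncoder}

    def __init__(self, feature_name):
        if feature_name not in self._ENCODERS:
            raise ValueError(f'Unsupported audio feature: {feature_name}')
        self._encoder = self._ENCODERS[feature_name]()

    def encode(self, audio_path):
        return self._encoder.encode(audio_path)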

Future work

  • Add Gradio demo

@h-munakata (Contributor, Author)

The test has failed... I'm checking.

setup.py (outdated)

@@ -5,6 +5,6 @@
     version='0.1',
     install_requires=['easydict', 'pandas', 'tqdm', 'pyyaml', 'scikit-learn', 'ffmpeg-python',
                       'ftfy', 'regex', 'einops', 'fvcore', 'gradio', 'torchlibrosa', 'librosa',
-                      'clip@git+https://github.com/openai/CLIP.git'],
+                      'clip@git+https://github.com/openai/CLIP.git', 'msclap'],
Contributor:

Could you add msclap before 'clip@git+https://github.com/openai/CLIP.git'?
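
Applying that suggestion, the dependency list in setup.py would read:

install_requires=['easydict', 'pandas', 'tqdm', 'pyyaml', 'scikit-learn', 'ffmpeg-python',
                  'ftfy', 'regex', 'einops', 'fvcore', 'gradio', 'torchlibrosa', 'librosa',
                  'msclap', 'clip@git+https://github.com/openai/CLIP.git'],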

@@ -439,7 +439,7 @@ def check_valid_combination(dataset, feature, domain):
     is_valid = check_valid_combination(args.dataset, args.feature, args.domain)

     if is_valid:
-        option_manager = BaseOptions(args.model, args.dataset, args.feature, args.domain)
+        option_manager = BaseOptions(args.model, args.dataset, args.feature, False, args.domain)
Contributor:

Could you remove the magic number False? Instead, please define a variable X = False and then use it here for readability.
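
One way to apply this suggestion; the descriptive name use_audio is purely hypothetical, since the reviewer's X is only a placeholder for whatever the fourth BaseOptions argument actually controls:

# 'use_audio' is a hypothetical name for the fourth BaseOptions argument;
# naming the boolean documents the call site better than a bare False.
use_audio = False
option_manager = BaseOptions(args.model, args.dataset, args.feature, use_audio, args.domain)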

@awkrail (Contributor)

awkrail commented Oct 21, 2024

@h-munakata In addition, no new tests are provided. Hence, could you add some tests for the CLAP feature extractor and the AMR inference API?

@h-munakata (Contributor, Author)

> @h-munakata In addition, no new tests are provided. Hence, could you add some tests for the CLAP feature extractor and the AMR inference API?

I see. I will add some tests in tests/test_models.py later.
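
A sketch of what such a test could look like; the predictor class, weight filename, and predict() call are assumptions rather than the tests that were actually merged:

# tests/test_models.py (illustrative sketch, not the merged tests)
import torch
from lighthouse.models import CGDETRPredictor  # assumed predictor class

def test_amr_inference_api():
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    # Hypothetical weight filename; the CLAP-trained weights are to be hosted on Zenodo.
    model = CGDETRPredictor('weights/clap_cg_detr_clotho_moment.ckpt',
                            device=device, feature_name='clap')
    model.encode_audio('api_example/1a-ODBWMUAE.wav')  # inference API added in this PR
    prediction = model.predict('water is flowing')     # free-form text query
    assert prediction is not None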

@h-munakata (Contributor, Author)

For the test and the Gradio demo, I think {moment | qd | cg | ...}-detr models with CLAP features would be useful.
Should I prepare the pre-trained weights of all seven models trained with Clotho-Moment?

@awkrail (Contributor)

awkrail commented Oct 21, 2024

> For the test and the Gradio demo, I think {moment | qd | cg | ...}-detr models with CLAP features would be useful.
> Should I prepare the pre-trained weights of all seven models trained with Clotho-Moment?

Yes, I agree. Could you prepare the pre-trained weights on your Zenodo?

@h-munakata (Contributor, Author)

h-munakata commented Oct 21, 2024

> Yes, I agree. Could you prepare the pre-trained weights on your Zenodo?

Sure. I will upload them after training.

@awkrail (Contributor)

awkrail commented Oct 21, 2024

@h-munakata Sorry, I misunderstood your question. I think that in your paper, CG-DETR (or QD-DETR) achieved the highest performance, so there is no need for training. All you need to do is upload the currently trained models to Zenodo and make them accessible from Lighthouse.

@h-munakata (Contributor, Author)

My intention in training all models was to add CLAP as a feature to the double for loop over FEATURE and MODEL used in the demo and tests, to make them easier to handle.
As you said, I will stop training all models and instead define an AUDIO_FEATURE variable in the demo and tests.
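
A sketch of the loop structure being discussed; the list contents and the run_demo_test helper are illustrative, not the actual demo or test code:

MODELS = ['moment_detr', 'qd_detr', 'cg_detr']               # illustrative subset of the seven models
FEATURES = ['clip', 'clip_slowfast', 'clip_slowfast_pann']   # visual features
AUDIO_FEATURES = ['clap']                                     # kept separate instead of folded into FEATURES

def run_demo_test(model, feature):
    print(f'{model} + {feature}')  # stub standing in for the per-combination demo/test body

for feature in FEATURES:
    for model in MODELS:
        run_demo_test(model, feature)

for feature in AUDIO_FEATURES:  # the separate AUDIO_FEATURE loop
    for model in MODELS:
        run_demo_test(model, feature)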

> All you need to do is upload the currently trained models to Zenodo and make them accessible from Lighthouse.

I see. I'll upload the model in the next commit for the test.

@awkrail (Contributor)

awkrail commented Oct 21, 2024

@h-munakata BTW, could you finish implementing the web demo by tomorrow? I will tag the current version as v1.0, and I am wondering whether you can finish this implementation by then.

@h-munakata (Contributor, Author)

h-munakata commented Oct 21, 2024

> BTW, could you finish implementing the web demo by tomorrow?

Yes, I want to make it in time for the DCASE workshop the day after tomorrow.
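
A minimal sketch of what such a Gradio demo could look like; the predictor class, weight filename, and predict() call are the same assumptions as in the test sketch above, not the demo that was shipped:

import gradio as gr
from lighthouse.models import CGDETRPredictor  # assumed predictor class

# Hypothetical weight filename; the actual weights are the Zenodo ones discussed above.
model = CGDETRPredictor('weights/clap_cg_detr_clotho_moment.ckpt',
                        device='cpu', feature_name='clap')

def predict_moments(audio_path, query):
    model.encode_audio(audio_path)      # audio-only encoding added in this PR
    return str(model.predict(query))    # assumed moment-retrieval call

demo = gr.Interface(
    fn=predict_moments,
    inputs=[gr.Audio(type='filepath', label='Audio'), gr.Textbox(label='Text query')],
    outputs=gr.Textbox(label='Retrieved moments'),
    title='Audio Moment Retrieval (AMR)',
)
demo.launch()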

@awkrail merged commit c2d1d3e into line:main on Oct 22, 2024
2 checks passed