Questions Evaluation #176
Thanks for the answer! Unfortunately, another problem occurred now: in the JSON file, many routes get the result "Failed - Agent couldn't be set up", while others are "Completed". I never got this error when I evaluated only a single model at a time. I am running CARLA 0.9.10.1 in a Docker container with the additional maps installed, on an A100 GPU with 1 TB RAM and a 128-core CPU. I started the Docker container with the following command: Do you have any explanation for this?
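One quick way to see how many routes were affected is to count the status strings in that JSON file. A minimal sketch, assuming the standard CARLA leaderboard checkpoint layout (per-route records under `_checkpoint`, each with a `status` field); the file name is a placeholder for whatever path was passed to the evaluator:

```python
import json
from collections import Counter

# Minimal sketch: count route statuses in a leaderboard results file.
# Assumes the standard layout where per-route entries live under
# _checkpoint -> records and each record has a "status" string such as
# "Completed" or "Failed - Agent couldn't be set up".
# "simulation_results.json" is a placeholder file name.
with open("simulation_results.json") as f:
    results = json.load(f)

status_counts = Counter(r["status"] for r in results["_checkpoint"]["records"])
for status, count in status_counts.most_common():
    print(f"{count:4d}  {status}")
```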
Well, you have to look at the Python error logs to see exactly why the code failed during agent setup. Having some failures on larger clusters is normal, I think.
The cases are all non-reproducible, but it might be the timeout that is too low at 60 seconds for a busy cluster. I think I might try 240 seconds for --timeout. Here are some of the error logs showing why the agent setup failed:

Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
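For context, the --timeout value is, as far as I can tell, the RPC timeout handed to the CARLA Python client, so raising it just gives a busy simulator more time to answer the first calls made during agent setup. A rough illustration with the plain CARLA API (host and port are example values):

```python
import carla  # CARLA Python API

# Rough illustration of what a larger --timeout changes: the client-side
# RPC timeout. On a busy cluster the 60 s default may expire before the
# server answers the first calls made during agent setup.
# "localhost" and 2000 are example values for --host/--port.
client = carla.Client("localhost", 2000)
client.set_timeout(240.0)  # seconds, analogous to --timeout=240
print(client.get_server_version())
```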
We typically use
I have one final question regarding this topic: your advice worked (thank you!) and the evaluation of your provided models brought the following results:
Hm, two things come to mind that might be the problem. Your numbers look a bit like they were not parsed with the result_parser. As for the blocked metric: do you use the leaderboard client from this repository for evaluation, or a different one? For the expert, the DS, RC and IS look the same. If the results from TransFuser are from a retrained model, then I think it's possible that you happened to end up with a model that is more passive and gets blocked instead of pushing other cars out of the way (which gets more RC and more collisions, so a lower IS).
The result parser gives the following results: I use this repository, except that I run the evaluation in a Docker container with a 0.9.10.1 CARLA image to which I added the additional maps, and I use your "leaderboard_evaluator_local.py".

> For the expert, the DS, RC and IS look the same. If the results from TransFuser are from a retrained model, then I think it's possible that you happened to end up with a model that is more passive and gets blocked instead of pushing other cars out of the way (which gets more RC and more collisions, so a lower IS).

This makes sense, but the results are from the three models that you provided. Shouldn't the evaluation result then be more similar to yours?
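For reference, the per-route scores that feed those numbers live in the same results JSON as the status strings above. A rough sketch of averaging them, assuming the standard leaderboard layout where each record's `scores` dict holds `score_composed` (DS), `score_route` (RC) and `score_penalty` (IS); the actual result_parser may weight or filter routes differently, and the file name is again a placeholder:

```python
import json
from statistics import mean

# Rough sketch of averaging per-route scores from a leaderboard results file.
# Assumes records under _checkpoint -> records with a "scores" dict holding
# "score_composed" (DS), "score_route" (RC) and "score_penalty" (IS).
# The real result_parser may aggregate differently.
with open("simulation_results.json") as f:
    records = json.load(f)["_checkpoint"]["records"]

for label, key in [("DS", "score_composed"), ("RC", "score_route"), ("IS", "score_penalty")]:
    print(label, mean(r["scores"][key] for r in records))
```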
The numbers look reasonable now with the result parser, I think. You can check that your simulator runs with the -opengl option, but I think there is likely no problem with your setup anymore.
I started CARLA in the Docker container with the -opengl option (see the third comment in this issue). Your explanation makes sense, thank you very much for the quick responses!
Have your evaluation results improved? To achieve a relatively good score, is it necessary to use the model from the 31st epoch and to ensemble three models for the evaluation?
Hello,
first of all, I want to thank you for your detailed code. I had the same problem as #175, but now I have some questions regarding the ensemble evaluation:
Thank you!