Runtime Error for Windows Setup #1

Closed
zhenchen-jay opened this issue Dec 13, 2024 · 4 comments

@zhenchen-jay

I attempted to run the code on my local machine following the provided instructions but encountered some issues.

I successfully set up Docker and compiled your code within the environment. However, when I ran it using Jupyter Notebook, I was unable to get the expected result previews and instead encountered some error messages.

For example, while running curtain.ipynb, I was able to correctly view the meshes after executing the first cell:
[screenshot]

But the second cell gave me a runtime error:
[screenshot]

I also tried running the drape.ipynb example but observed slightly different outcomes:

The first cell executed without issues:
[screenshot]

The second cell, however, produced different results. While the rendering appeared correct, the CCD operation failed with the following error:
[screenshots]

I am not sure what might be causing these discrepancies. I would greatly appreciate it if you could provide some guidance on resolving these issues. Here is my GPU information:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.02              Driver Version: 560.94         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        On  |   00000000:41:00.0  On |                  Off |
|  0%   39C    P8             20W /  450W |    1461MiB /  24564MiB |      5%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A        25      G   /Xwayland                                   N/A      |
|    0   N/A  N/A        31      G   /Xwayland                                   N/A      |
+-----------------------------------------------------------------------------------------+
@ryichando
Collaborator

Thank you for reporting the issue! After reviewing the log, it seems there is an issue with the contact collection/allocation: a collision should not occur at the very first simulation step, yet the contact count is reported as negative. This means things are already going wrong before the CCD is dispatched.

I’ve checked that all the demos are working on my end in my Linux environment. Please take a look at the long-shot video (https://drive.google.com/file/d/1MtOEOvm5KEPwvgO-oeMY736he7H2DoNp/view?usp=sharing). I usually don’t work on Windows, but I will set up my Windows environment again to see if I can reproduce the same issue.

In the meantime, here are some possible solutions:

  • Try accessing the Jupyter frontend from a different PC or a smartphone over the LAN (see the sketch after this list). This way, the GPU solver can use all of its computational resources during the simulation, which reduces the chance of issues.

  • Set up an instance on vast.ai. The cost is less than $1 per hour. If you encounter an issue with this setup, I should be able to reproduce it and expect to fix it quickly.
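
For the first option, here is a minimal sketch of exposing the Jupyter server over the LAN. The port number 8888, the publish flag, and the launch command are assumptions; adjust them to whatever the project's setup instructions actually use.

```bash
# When creating the container, publish the Jupyter port to the host,
# e.g. by adding `-p 8888:8888` to the `docker run` command.

# Inside the container, launch Jupyter bound to all interfaces
# instead of localhost only:
jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser

# From another PC or a smartphone on the same LAN, open:
#   http://<host-ip>:8888/?token=<token printed above>
# where <host-ip> is the host machine's LAN address (e.g. from `ipconfig`).
```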

@ryichando
Collaborator

I was able to reproduce the exact same issues with driver version 560.94, as shown in this video.

I have also confirmed that upgrading to the latest driver 566.36 fixed the issue, as demonstrated in this video.

I don't know what the problem is, but unless you have a specific reason to stick with version 560.94, could you try upgrading to the latest graphics driver on your host and see if that resolves the issue on your side?

@ryichando
Collaborator

More updates: I’ve removed the Thrust dependencies, and this did the trick: the solver now runs with driver 560.94 on Windows. Here’s a link to the recorded video.

To try this new fix, please delete the Docker container and recreate it from scratch. This ensures a clean state and avoids errors carried over from the previous setup.
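
As a rough sketch of that cleanup, assuming a standard Docker workflow (the <container> and <image> names below are placeholders for whatever your original setup created):

```bash
# Find the container created during the original setup
docker ps -a

# Stop and remove it (replace <container> with the name or ID from above)
docker stop <container>
docker rm <container>

# Pull or rebuild the image so the new fix is included, then recreate
# the container following the project's original setup instructions
docker pull <image>
```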

@zhenchen-jay
Author

The new fix works on my machine! Thanks for the help!
