
Feature Request: Software emulation of mlx.float64 on GPU #1905

Open
kyrollosyanny opened this issue Feb 25, 2025 · 3 comments

Comments

@kyrollosyanny

Hello,

I am very excited that MLX now supports mx.float64 on the CPU. I know that Metal does not support float64, but I believe it could be added with software emulation. This feature would be extremely helpful for optimization and inverse design problems. float32 is simply not enough for simulating physics (for example, optical ray tracing, image processing, and VR simulations), and running large workloads on the CPU is slow.
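To make the precision limitation concrete, here is a minimal sketch of an update that float32 silently drops but float64 keeps (NumPy is used only because the point is about the float32 format itself, not any particular framework):

```python
import numpy as np

# float32 has a 24-bit significand, so at a magnitude of 1e8 the gap
# between adjacent representable values is 8.0. Adding 1.0 rounds
# straight back to the original value.
big = np.float32(1e8)
small = np.float32(1.0)
print(big + small == big)             # True: the update is lost
print(np.float64(1e8) + 1.0 == 1e8)  # False: float64 keeps it
```

In a long iterative optimization, many small gradient-sized updates like this can vanish entirely in float32.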

In my opinion, having an option to run float64 on the GPU is one of the remaining big differences between PyTorch and MLX. I've mostly switched to MLX, but accuracy errors caused by float32 are becoming more of an issue.

Thank you

@awni
Member

awni commented Feb 26, 2025

Emulating FP64 on the GPU is going to be quite slow and there's a good chance it will wipe out any speed improvements you might expect from running on the GPU.
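For a sense of why emulation is slow: the standard trick is "double-single" arithmetic, which stores one high-precision value as an unevaluated sum of two float32s. A sketch of just the addition, built on Knuth's error-free two-sum (written in NumPy here; a Metal kernel would follow the same structure):

```python
import numpy as np

def two_sum(a, b):
    # Knuth's error-free transformation: s + e == a + b exactly,
    # with s the rounded float32 sum and e the rounding error.
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def ds_add(x, y):
    # Add two "double-single" values, each stored as a (hi, lo) float32 pair.
    s, e = two_sum(x[0], y[0])
    e = e + x[1] + y[1]
    hi = s + e
    lo = e - (hi - s)
    return hi, lo

f32 = np.float32
hi, lo = ds_add((f32(1e8), f32(0)), (f32(1), f32(0)))
print(float(hi) + float(lo))  # 100000001.0 — the emulated sum keeps the 1
```

Even this simplified addition costs roughly ten native float32 operations per emulated operation, and multiplication is worse, which is where the performance concern comes from.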

I think your best bet for running locally in higher precision is:

  • Find a way to make the CPU faster. If there are specific ops that are slow, file an issue and we can look into speeding them up.
  • Offload parts of your computation that can be lower precision (hopefully large matrix multiplies) to the GPU and then run the higher precision stuff on the CPU.
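The second suggestion can be sketched with classic mixed-precision iterative refinement: do the expensive solve in float32 (the piece you would offload to the GPU) and compute cheap residual corrections in float64 (the piece kept on the CPU). NumPy stands in for MLX here purely for illustration; in MLX you would place the float32 ops on the GPU device and the float64 ops on the CPU.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# A well-conditioned float64 system (diagonally dominant).
A64 = rng.standard_normal((n, n)) + n * np.eye(n)
b64 = rng.standard_normal(n)
A32 = A64.astype(np.float32)

# Heavy solve in float32 (the part you would offload to the GPU)...
x = np.linalg.solve(A32, b64.astype(np.float32)).astype(np.float64)

# ...then a few cheap float64 residual corrections (kept on the CPU).
for _ in range(3):
    r = b64 - A64 @ x                                # float64 residual
    dx = np.linalg.solve(A32, r.astype(np.float32))  # float32 correction
    x += dx.astype(np.float64)

# Relative residual ends up near float64 machine precision,
# even though every solve ran in float32.
print(np.linalg.norm(b64 - A64 @ x) / np.linalg.norm(b64))
```

The pattern only pays off when, as in this example, the bulk of the flops tolerate float32 and the float64 work is a small fraction of the total.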

@kyrollosyanny
Author

Got it. Naive question: when you say "locally", does that mean there is a way to run MLX in higher precision on the cloud? Thanks a lot.

@awni
Member

awni commented Feb 26, 2025

What I meant is that any framework using an Apple GPU will have the same problem, including PyTorch's MPS backend (which does not support double for the same reason).
