CMake: Enabled using Accelerate on x86_64 / x64 #1625

Merged: awni merged 2 commits into ml-explore:main from feature/cmake_x86_64 on Nov 28, 2024

stemann
Contributor

@stemann stemann commented Nov 26, 2024

Proposed changes

CMake: enable the use of Accelerate on x86_64 / x64.

Contributes to #1201.

Cf. JuliaPackaging/Yggdrasil#9761
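
For readers less familiar with the build side, here is a minimal sketch of what enabling Accelerate independently of the CPU architecture could look like in CMake. It is not the actual MLX CMakeLists.txt: the project name, the example_target target, the example.cpp source file, and the EXAMPLE_USE_ACCELERATE definition are all placeholders chosen for illustration. The only thing taken from this pull request is the idea in its title, that Accelerate should be usable on x86_64 as well.

```cmake
# Hypothetical sketch, NOT the MLX build script.
# example_target, example.cpp and EXAMPLE_USE_ACCELERATE are placeholder names.
cmake_minimum_required(VERSION 3.16)
project(accelerate_example LANGUAGES CXX)

# Assumes an example.cpp file exists next to this CMakeLists.txt.
add_library(example_target STATIC example.cpp)

# Decide on the platform rather than on the CPU architecture, so that
# both arm64 and x86_64 Apple builds can pick up Accelerate.
if(APPLE)
  # On Apple platforms find_library can locate frameworks by name.
  find_library(ACCELERATE_LIBRARY Accelerate)
  if(ACCELERATE_LIBRARY)
    message(STATUS "Accelerate found: ${ACCELERATE_LIBRARY}")
    target_link_libraries(example_target PUBLIC ${ACCELERATE_LIBRARY})
    # Lets the sources select an Accelerate code path at compile time.
    target_compile_definitions(example_target PUBLIC EXAMPLE_USE_ACCELERATE)
  else()
    message(STATUS "Accelerate not found, using the generic CPU path")
  endif()
endif()
```

Checking for the platform and the library, instead of a processor name, keeps a single code path for every Apple architecture. It also means any architecture-specific problem in the Accelerate code path surfaces on x86_64 too.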

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

@awni
Member

awni commented Nov 27, 2024

This change seems pretty reasonable, but I think we can just remove MLX_BUILD_ARM since it's entirely unused at this point. Do you mind removing it?

@awni
Member

awni commented Nov 27, 2024

Also, if we remove it, we can close #1626, which will then be irrelevant.

@stemann
Contributor Author

stemann commented Nov 28, 2024

Right, I will look into it.

@stemann stemann force-pushed the feature/cmake_x86_64 branch from 1c8faa0 to a63d05a on November 28, 2024 at 10:57
Member

@awni awni left a comment

Thanks, will merge when tests clear!

@awni awni merged commit 974bb54 into ml-explore:main Nov 28, 2024
6 checks passed
@stemann stemann deleted the feature/cmake_x86_64 branch November 30, 2024 10:46
@zcbenz
Contributor

zcbenz commented Dec 7, 2024

@stemann @awni This change makes the macOS x64 build crash when running the tests. You can reproduce it by running python -m unittest discover python/tests:

Crashed Thread:        5

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       UNKNOWN_0xD at 0x0000000000000000
Exception Codes:       0x000000000000000d, 0x0000000000000000

Termination Reason:    Namespace SIGNAL, Code 11 Segmentation fault: 11
Terminating Process:   exc handler [21651]

VM Region Info: 0 is not in any region.  Bytes before following region: 4312813568
      REGION TYPE                    START - END         [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      UNUSED SPACE AT START
--->  
      __TEXT                      101105000-101107000    [    8K] r-x/r-x SM=COW  /usr/local/Cellar/python@3.12/3.12.6/Frameworks/Python.framework/Versions/3.12/Resources/Python.app/Contents/MacOS/Python

Thread 0::  Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	    0x7ff804e9cf7a __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff804eda6f3 _pthread_cond_wait + 1211
2   libc++.1.dylib                	    0x7ff804e13d12 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
3   libmlx.dylib                  	       0x103d3336b mlx::core::Event::wait() + 75
4   libmlx.dylib                  	       0x103139b42 mlx::core::array::wait() + 50
5   libmlx.dylib                  	       0x103207099 mlx::core::eval(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>>) + 137
6   libmlx.dylib                  	       0x103139bf8 mlx::core::array::eval() + 152
7   core.cpython-312-darwin.so    	       0x101ad130e 0x101a8e000 + 275214
8   core.cpython-312-darwin.so    	       0x101ab17d6 0x101a8e000 + 145366
9   core.cpython-312-darwin.so    	       0x101a949f0 0x101a8e000 + 27120
10  Python                        	       0x101e53d1f _PyEval_EvalFrameDefault + 50484
11  Python                        	       0x101d5f290 method_vectorcall + 219
12  Python                        	       0x101e55c9b _PyEval_EvalFrameDefault + 58544
13  Python                        	       0x101d5bd5f _PyObject_FastCallDictTstate + 87
14  Python                        	       0x101d5d254 _PyObject_Call_Prepend + 116
15  Python                        	       0x101dceb71 slot_tp_call + 114
16  Python                        	       0x101d5bf3f _PyObject_MakeTpCall + 140
17  Python                        	       0x101e53ec1 _PyEval_EvalFrameDefault + 50902
18  Python                        	       0x101d5f290 method_vectorcall + 219
19  Python                        	       0x101e55c9b _PyEval_EvalFrameDefault + 58544
20  Python                        	       0x101d5bd5f _PyObject_FastCallDictTstate + 87
21  Python                        	       0x101d5d254 _PyObject_Call_Prepend + 116
22  Python                        	       0x101dceb71 slot_tp_call + 114
23  Python                        	       0x101d5bf3f _PyObject_MakeTpCall + 140
24  Python                        	       0x101e53ec1 _PyEval_EvalFrameDefault + 50902
25  Python                        	       0x101d5f290 method_vectorcall + 219
26  Python                        	       0x101e55c9b _PyEval_EvalFrameDefault + 58544
27  Python                        	       0x101d5bd5f _PyObject_FastCallDictTstate + 87
28  Python                        	       0x101d5d254 _PyObject_Call_Prepend + 116
29  Python                        	       0x101dceb71 slot_tp_call + 114
30  Python                        	       0x101d5bf3f _PyObject_MakeTpCall + 140
31  Python                        	       0x101e53ec1 _PyEval_EvalFrameDefault + 50902
32  Python                        	       0x101d5f290 method_vectorcall + 219
33  Python                        	       0x101e55c9b _PyEval_EvalFrameDefault + 58544
34  Python                        	       0x101d5bd5f _PyObject_FastCallDictTstate + 87
35  Python                        	       0x101d5d254 _PyObject_Call_Prepend + 116
36  Python                        	       0x101dceb71 slot_tp_call + 114
37  Python                        	       0x101d5bf3f _PyObject_MakeTpCall + 140
38  Python                        	       0x101e53ec1 _PyEval_EvalFrameDefault + 50902
39  Python                        	       0x101d5bdda _PyObject_FastCallDictTstate + 210
40  Python                        	       0x101dcf8b8 slot_tp_init + 209
41  Python                        	       0x101dc67ff type_call + 135
42  Python                        	       0x101d5bf3f _PyObject_MakeTpCall + 140
43  Python                        	       0x101e53ec1 _PyEval_EvalFrameDefault + 50902
44  Python                        	       0x101e4762f PyEval_EvalCode + 197
45  Python                        	       0x101e43da3 builtin_exec + 469
46  Python                        	       0x101dab4cb cfunction_vectorcall_FASTCALL_KEYWORDS + 94
47  Python                        	       0x101e53d1f _PyEval_EvalFrameDefault + 50484
48  Python                        	       0x101ecf89f pymain_run_module + 234
49  Python                        	       0x101ecef0e Py_RunMain + 553
50  Python                        	       0x101ecf615 pymain_main + 378
51  Python                        	       0x101ecf6c8 Py_BytesMain + 42
52  dyld                          	    0x7ff804b4e366 start + 1942

Thread 1:
0   libsystem_kernel.dylib        	    0x7ff804e9cf7a __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff804eda6f3 _pthread_cond_wait + 1211
2   libc++.1.dylib                	    0x7ff804e13d12 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
3   libmlx.dylib                  	       0x103cea31c ThreadPool::ThreadPool(unsigned long)::'lambda'()::operator()() const + 156
4   libmlx.dylib                  	       0x103cea232 void* std::__1::__thread_proxy[abi:ne180100]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, ThreadPool::ThreadPool(unsigned long)::'lambda'()>>(void*) + 50
5   libsystem_pthread.dylib       	    0x7ff804eda18b _pthread_start + 99
6   libsystem_pthread.dylib       	    0x7ff804ed5ae3 thread_start + 15

Thread 2:
0   libsystem_kernel.dylib        	    0x7ff804e9cf7a __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff804eda6f3 _pthread_cond_wait + 1211
2   libc++.1.dylib                	    0x7ff804e13d12 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
3   libmlx.dylib                  	       0x103cea31c ThreadPool::ThreadPool(unsigned long)::'lambda'()::operator()() const + 156
4   libmlx.dylib                  	       0x103cea232 void* std::__1::__thread_proxy[abi:ne180100]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, ThreadPool::ThreadPool(unsigned long)::'lambda'()>>(void*) + 50
5   libsystem_pthread.dylib       	    0x7ff804eda18b _pthread_start + 99
6   libsystem_pthread.dylib       	    0x7ff804ed5ae3 thread_start + 15

Thread 3:
0   libsystem_kernel.dylib        	    0x7ff804e9cf7a __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff804eda6f3 _pthread_cond_wait + 1211
2   libc++.1.dylib                	    0x7ff804e13d12 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
3   libmlx.dylib                  	       0x103cea31c ThreadPool::ThreadPool(unsigned long)::'lambda'()::operator()() const + 156
4   libmlx.dylib                  	       0x103cea232 void* std::__1::__thread_proxy[abi:ne180100]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, ThreadPool::ThreadPool(unsigned long)::'lambda'()>>(void*) + 50
5   libsystem_pthread.dylib       	    0x7ff804eda18b _pthread_start + 99
6   libsystem_pthread.dylib       	    0x7ff804ed5ae3 thread_start + 15

Thread 4:
0   libsystem_kernel.dylib        	    0x7ff804e9cf7a __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff804eda6f3 _pthread_cond_wait + 1211
2   libc++.1.dylib                	    0x7ff804e13d12 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
3   libmlx.dylib                  	       0x103cea31c ThreadPool::ThreadPool(unsigned long)::'lambda'()::operator()() const + 156
4   libmlx.dylib                  	       0x103cea232 void* std::__1::__thread_proxy[abi:ne180100]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, ThreadPool::ThreadPool(unsigned long)::'lambda'()>>(void*) + 50
5   libsystem_pthread.dylib       	    0x7ff804eda18b _pthread_start + 99
6   libsystem_pthread.dylib       	    0x7ff804ed5ae3 thread_start + 15

Thread 5 Crashed:
0   libmlx.dylib                  	       0x103d2d7d0 mlx::core::Reduce::eval_cpu(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>> const&, mlx::core::array&) + 6464
1   libmlx.dylib                  	       0x1032118bc std::__1::__function::__func<mlx::core::eval_impl(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>>, bool)::$_0, std::__1::allocator<mlx::core::eval_impl(std::__1::vector<mlx::core::array, std::__1::allocator<mlx::core::array>>, bool)::$_0>, void ()>::operator()() + 172
2   libmlx.dylib                  	       0x1032038e2 mlx::core::scheduler::StreamThread::thread_fn() + 482
3   libmlx.dylib                  	       0x103203a93 void* std::__1::__thread_proxy[abi:ne180100]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (mlx::core::scheduler::StreamThread::*)(), mlx::core::scheduler::StreamThread*>>(void*) + 67
4   libsystem_pthread.dylib       	    0x7ff804eda18b _pthread_start + 99
5   libsystem_pthread.dylib       	    0x7ff804ed5ae3 thread_start + 15

@awni
Member

awni commented Dec 7, 2024

I guess the only difference is that it's now using the Accelerate back-end (which should work). We'll have to debug that.
