
Added numa options to allow finer grained control as well as plumbing for a new mirror mode that will require numa.h #5377

Merged: 46 commits into ggerganov:master on Feb 16, 2024

Conversation

bmtwl
Contributor

@bmtwl bmtwl commented Feb 6, 2024

ref #5121

Attempt number two.

Removed sched.h from ggml.h, moved ggml_get_numa_affinity out of the public API and purely into ggml.c, removed trailing whitespace and fixed up a few inconsistent variables

More info: #5358 (comment)

@cebtenzzre
Collaborator

If mirror mode isn't implemented yet, the user should be shown a warning or error if they try to use it - "Mirror Mode Enabled" doesn't communicate that.

@bmtwl
Contributor Author

bmtwl commented Feb 7, 2024

> If mirror mode isn't implemented yet, the user should be shown a warning or error if they try to use it - "Mirror Mode Enabled" doesn't communicate that.

I figured that hiding it behind the #ifdef would be enough, but I can add a warning in for sure
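The warning discussed here can be a simple runtime check in the option handler. A minimal sketch, assuming a hypothetical strategy enum and handler function (the names below are illustrative, not the PR's actual identifiers):

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical option handler: warn and refuse a strategy whose
 * implementation only exists when GGML_NUMA_MIRROR is defined.
 * All names here are illustrative, not llama.cpp's real API. */
enum numa_strategy { NUMA_DISABLED, NUMA_DISTRIBUTE, NUMA_ISOLATE, NUMA_MIRROR };

bool numa_set_strategy(enum numa_strategy s) {
#ifndef GGML_NUMA_MIRROR
    if (s == NUMA_MIRROR) {
        fprintf(stderr, "warning: mirror mode is not implemented yet; ignoring --numa mirror\n");
        return false;   /* caller can fall back to NUMA_DISABLED */
    }
#endif
    /* ... apply the selected strategy ... */
    return true;
}
```

With this shape, an unimplemented mode produces an explicit warning instead of a misleading "Mirror Mode Enabled" message.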

Fixed a number of issues with the move from BOOL to ggml_numa_strategies. Added a note about mirror mode not being implemented yet
@cebtenzzre
Collaborator

> I figured that hiding it behind the #ifdef would be enough, but I can add a warning in for sure

If there is currently no use for #defining GGML_NUMA_MIRROR, then the code that depends on it shouldn't be committed yet.

@bmtwl
Contributor Author

bmtwl commented Feb 8, 2024

I have fixed the errors in the last test, and also fixed a few related errors in the "examples" folder.
All tests now pass with both make and cmake:

100% tests passed, 0 tests failed out of 22

@bmtwl
Contributor Author

bmtwl commented Feb 14, 2024

I'm currently installing Visual Studio on a Windows box to do local regression testing and clear up these errors before requesting a re-run.

@bmtwl
Contributor Author

bmtwl commented Feb 14, 2024

I'm trying to troubleshoot the build errors on Android and Vulkan under Windows.
I have a Windows build environment going, so I'm hopeful I can get to the bottom of the Vulkan error. The Android error, however, seems to point to a lingering ggml_backend_init(bool numa) call that I can't find in the codebase for the life of me, and I don't have an Android device to test against (or a cross-compiling environment set up).
Can someone with more experience with the automated regression-testing tools or Android development help me troubleshoot and get the final tests green?

@slaren
Collaborator

slaren commented Feb 15, 2024

The Android example fetches llama.cpp from the master branch, so it breaks when the API changes; you can ignore that error.

FetchContent_Declare(
    llama
    GIT_REPOSITORY https://github.com/ggerganov/llama.cpp
    GIT_TAG        master
)

The Windows error also seems unrelated to this PR; the Vulkan build is broken at the moment.
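For context, the lingering ggml_backend_init(bool numa) that the Android example trips over is the old-style call; this PR splits NUMA setup out into a dedicated init that takes a strategy enum. A stubbed before/after sketch of the call pattern (these are self-contained stubs for illustration, not the real llama.cpp declarations, and the getter is hypothetical):

```c
/* Stubbed sketch of the API split: NUMA setup moved out of the old
 * llama_backend_init(bool numa) into a dedicated llama_numa_init()
 * that takes a strategy enum. Illustrative stubs only. */
enum ggml_numa_strategy {
    GGML_NUMA_STRATEGY_DISABLED,
    GGML_NUMA_STRATEGY_DISTRIBUTE,
    GGML_NUMA_STRATEGY_ISOLATE,
};

static enum ggml_numa_strategy g_numa_strategy = GGML_NUMA_STRATEGY_DISABLED;

/* Old style (removed): llama_backend_init(bool numa); */

void llama_backend_init(void) {
    /* backend-only setup; no NUMA flag any more */
}

void llama_numa_init(enum ggml_numa_strategy strategy) {
    g_numa_strategy = strategy;  /* record the requested placement strategy */
}

enum ggml_numa_strategy llama_current_numa_strategy(void) {
    return g_numa_strategy;      /* hypothetical getter, for illustration */
}
```

Callers now make two calls (backend init, then NUMA init), which is why examples pinned to the old single-call signature fail to build.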

@bmtwl
Contributor Author

bmtwl commented Feb 15, 2024

Thanks @slaren
Is there anything else for me to do before this is committed?

@slaren
Collaborator

slaren commented Feb 15, 2024

Looks good to me, but let's wait for @ggerganov review.

bmtwl and others added 7 commits February 15, 2024 07:11
Align enum values

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Remove whitespace

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
align parameters

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
remove whitespace and align brace

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Remove whitespace and align brace

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
bmtwl and others added 2 commits February 15, 2024 09:39
simplified return for platforms without NUMA support

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
@bmtwl
Contributor Author

bmtwl commented Feb 15, 2024

I've made the final proposed code changes, brought the branch into sync with master, and built and ran the regression tests locally on both Linux and Windows.

@ggerganov ggerganov merged commit f486f6e into ggerganov:master Feb 16, 2024
51 of 54 checks passed
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
* Added numa options to allow finer grained control as well as plumbing for a new mirror mode that will require numa.h

* Reverted Makefile

* Fixed include

* Removed sched.h from ggml.h, moved ggml_get_numa_affinity into ggml.c, removed trailing whitespace and fixed up a few inconsistent variables

* removed trailing whitespace

* Added numa options to allow finer grained control as well as plumbing for a new mirror mode that will require numa.h

* Reverting Makefile

* Fixed a number of issues with the move from BOOL to ggml_numa_strategies. Added a note about mirror mode not being implemented yet

* Removing MIRROR_MODE code for this PR

* Removing last bit of MIRROR_MODE code for this PR

* Removing unneeded branch in server.cpp example and moving get_numa_affinity and making it static

* Fixed lingering init_llama_backend() bool calls in tests and examples

* Remove enum llama_numa_strategies

* Revert bad merge with dynatemp flags

* add missing enum ggml_numa_strategies declaration and revert sync problem with master

* add missing enum ggml_numa_strategies declaration

* fixed ggml_init_numa variable

* Update ggml.h

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>

* Update READMEs with info about numa flags, change INTERLEAVE strategy name to DISTRIBUTE everywhere, implement the improved distribution strategy from @rankaiyx, fix a spelling mistake and un-merge some bad merges

* split numa init out from llama_backend_init and created llama_numa_init. Updated all code paths and samples

* Fix up some boolean vs enum comparisons

* Added #ifdefs for non-Linux OS that don't have cpu_set_t datatype

* Update ggml.h

Align enum values

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update ggml.c

Remove whitespace

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update ggml.c

align parameters

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update examples/server/server.cpp

remove whitespace and align brace

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update common/common.cpp

Remove whitespace and align brace

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* unified ggml_numa_strategy enum and fixed text alignment in server.cpp example

* Update ggml.c

simplified return for platforms without NUMA support

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>

* removed redundant else from cli argument processing of --numa

* whitespace

---------

Co-authored-by: root <root@nenya.lothlorien.ca>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
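One portability fix in the log above guards the Linux-only cpu_set_t type behind #ifdefs. A minimal sketch of that pattern (illustrative only, not the PR's exact code; count_affinity_cpus is a made-up name):

```c
#define _GNU_SOURCE  /* glibc gates cpu_set_t and CPU_COUNT behind this */
#ifdef __linux__
#include <sched.h>   /* cpu_set_t, CPU_ZERO, CPU_SET, CPU_COUNT */
#endif

/* Illustrative portability guard: use cpu_set_t only where it exists,
 * and fall back to a no-op count on platforms without it. */
int count_affinity_cpus(void) {
#ifdef __linux__
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    CPU_SET(1, &set);
    return CPU_COUNT(&set);  /* two CPUs were marked in the set */
#else
    return 0;                /* no cpu_set_t on this platform */
#endif
}
```

Keeping the guard at the function level like this lets the non-Linux build compile without pulling in sched.h at all.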
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
@bartowski1182
Contributor

Sorry to necro this @bmtwl, but I'm wondering if you happen to know what the appropriate option is for a single 7702. I believe it has NUMA within a single socket, so I'm wondering which options, if any, I should use, and how to test them.

Labels: need feedback (Testing and feedback with results are needed)
6 participants