Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature]: improve distributed backend selection #8683

Closed
1 task done
youkaichao opened this issue Sep 20, 2024 · 4 comments · Fixed by #8823
Closed
1 task done

[Feature]: improve distributed backend selection #8683

youkaichao opened this issue Sep 20, 2024 · 4 comments · Fixed by #8823
Assignees

Comments

@youkaichao
Copy link
Member

🚀 The feature, motivation and pitch

We have three ways to start a new process:

  • multiprocessing by fork
  • multiprocessing by spawn
  • ray

by default, we use ray for multi-node serving, and multiprocessing by fork for single node setting.

however, if users initialize cuda context, multiprocessing by fork will not work.

if we set multiprocessing by spawn by default, it will not work when users don't have if __name__ == "__main__".

if we can figure out whether users have if __name__ == "__main__" automatically, we can improve the default user experience.

the proposed solution is:

if we find that cuda is initialized, we inspect the current function call stack, and trace back the stack until we reach the __main__ module, check the current line to see if we are under if __name__ == "__main__", if yes, switch the multiprocessing method from fork to spawn.

cc @russellb

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@russellb
Copy link
Member

if we find that cuda is initialized, we inspect the current function call stack, and trace back the stack until we reach the main module, check the current line to see if we are under if name == "main", if yes, switch the multiprocessing method from fork to spawn.

@youkaichao I looked into this a bit, and it doesn't seem straightforward. Let me know if you have a particular method in mind.

Another idea:

  • Set the default to spawn when the vllm CLI entry point is used. In that case, we know we are in full control, so we can set the default.
  • Add detection for the case where cuda is already initialized and the method is not spawn. In this case, add a warning log message to give clear instructions on what should be changed.
  • Expand developer documentation on the topic for those using the Python APIs.

(for reference, this was a related PR where I was changing the default: #8576)

@youkaichao
Copy link
Member Author

Add detection for the case where cuda is already initialized and the method is not spawn. In this case, add a warning log message to give clear instructions on what should be changed.

then maybe you can add this. I think it can be enough.

@russellb
Copy link
Member

OK - you can assign this to me.

@russellb
Copy link
Member

I thought this was interesting. While testing out the behavior of different scenarios, I see that Python already does a nice job of detecting the case where spawn is used when the parent code isn't protected by if __name__ == "__main__".

...
  File "/usr/lib64/python3.11/multiprocessing/spawn.py", line 140, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

        To fix this issue, refer to the "Safe importing of main module"
        section in https://docs.python.org/3/library/multiprocessing.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants