[CNCF LFX 2024 01-Mar-May]Volcano support multi-clusters AI workload scheduling. #3310

Closed
2 tasks
Monokaix opened this issue Jan 24, 2024 · 11 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@Monokaix
Member

Monokaix commented Jan 24, 2024

What would you like to be added:

Volcano should support multi-cluster AI workload scheduling and provide rich scheduling strategies for choosing an appropriate cluster for each job.

Why is this needed:

Volcano already provides rich AI workload scheduling capabilities within a single cluster. As multi-cluster management matures, more and more users manage and run their AI workloads across multiple clusters in a unified way. Volcano needs to support multi-cluster AI job scheduling and provide capabilities such as job management, gang scheduling, and queue management in order to select an appropriate cluster for each job. This is first-level scheduling. The scheduler in each cluster then selects appropriate nodes for the job, which is second-level scheduling. Here we only need first-level scheduling.
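
To illustrate the two-level split described above, here is a minimal Go sketch of first-level scheduling only. It is not Volcano's API or design: the `Cluster` and `Job` types, the `SelectCluster` function, and the "most free GPUs" policy are all hypothetical, chosen just to show a gang-aware cluster pick that leaves node placement to each cluster's own scheduler.

```go
package main

import "fmt"

// Cluster is a hypothetical summary of one member cluster's free capacity.
type Cluster struct {
	Name     string
	FreeGPUs int
}

// Job is a hypothetical gang job: all replicas must run in the same cluster.
type Job struct {
	Name           string
	Replicas       int
	GPUsPerReplica int
}

// SelectCluster does first-level scheduling only. Among the clusters that can
// hold the whole gang, it returns the one with the most free GPUs (an assumed
// policy). ok is false when no cluster can fit the job.
func SelectCluster(job Job, clusters []Cluster) (name string, ok bool) {
	need := job.Replicas * job.GPUsPerReplica
	best := -1
	for i, c := range clusters {
		if c.FreeGPUs < need {
			continue // gang constraint: never split the job across clusters
		}
		if best == -1 || c.FreeGPUs > clusters[best].FreeGPUs {
			best = i
		}
	}
	if best == -1 {
		return "", false
	}
	return clusters[best].Name, true
}

func main() {
	clusters := []Cluster{
		{Name: "cluster-a", FreeGPUs: 4},
		{Name: "cluster-b", FreeGPUs: 16},
		{Name: "cluster-c", FreeGPUs: 8},
	}
	job := Job{Name: "train", Replicas: 4, GPUsPerReplica: 2}
	if name, ok := SelectCluster(job, clusters); ok {
		// Second-level scheduling (node selection) would happen inside that cluster.
		fmt.Println("dispatch", job.Name, "to", name)
	} else {
		fmt.Println("no cluster fits", job.Name)
	}
}
```

A real implementation would also need queue management and richer strategies, which this sketch leaves out on purpose.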

@Monokaix Monokaix added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 24, 2024
@lowang-bh
Member

@Monokaix
Member Author

Repo is here: https://github.com/volcano-sh/federation

We should keep working on this: )

@RohanMishra315

RohanMishra315 commented Jan 31, 2024

Hey @Monokaix, I would love to work on this! I have previous experience working with Karmada. I would love to take it on as a challenge and am looking forward to it.

@Monokaix Monokaix changed the title from "Volcano support multi-clusters AI workload scheduling." to "[CNCF LFX 2024 01-Mar-May]Volcano support multi-clusters AI workload scheduling." on Feb 1, 2024
@Monokaix
Member Author

Monokaix commented Feb 1, 2024

Hi, thanks for your enthusiasm! Sorry that I didn't mention it's a CNCF LFX project, and you can apply for this project here : )

@SpringWiz11

Hey @Monokaix,

I just noticed that this project is a CNCF LFX project, and I am thrilled to work on this.

Having worked extensively on multi-cluster scheduling and AI, I bring valuable industry experience to the table. I have experience building scalable cloud-native and AI applications, ranging from traditional deep learning models to cutting-edge federated learning models deployed in production environments using frameworks like Flower, FedML, and PySyft.

I also have hands-on experience with Karmada and would love to explore more and make valuable contributions.

Given this opportunity, I would like to apply my multi-cloud, multi-cluster, and AI skill set under the guidance of the established engineers at Volcano.

@Monokaix
Member Author

Monokaix commented Feb 2, 2024

Welcome! And you can apply here.

@TrungBui59

Hi @Monokaix,

I just applied to the CNCF LFX Mentorship program for this project. I am very interested in it and would love to contribute. Do you have any advice on how to get familiar with the codebase and get started with the good-first-issue items?

@Vacant2333
Contributor

Hi! I'm very interested in this issue and have just applied through LFX. I'm currently a Karmada reviewer; feel free to take a look at my GitHub page~

@Monokaix
Member Author

Monokaix commented Dec 2, 2024

New repo and design: https://github.com/volcano-sh/volcano-global

@Monokaix
Member Author

Monokaix commented Dec 2, 2024

/close

@volcano-sh-bot
Contributor

@Monokaix: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


7 participants