feat: sched: Assigner experiments #10356
Conversation
Ran some tests for these here, and these definitely make things way better for --no-default workers (not every sector gets assigned to a single storage worker).
The only other idea I've heard floating around is a round robin through the list of workers, but I'm unsure whether that would be possible with the current limitations of the built-in scheduler. I do however think that the random assigner, with a possible fix for #9407 (comment), will be sufficient for spreading these sectors around.
Pushed a commit which implements the fix for #9407 (comment).
force-pushed from 13d9b3d to 2316363
force-pushed from 8ef4bec to c484c38
The one and only scheduler implementation I'd love to have would be "round-robin". For example, say our setup has this layout:

With the current scheduler, it assigns AP to the AP workers, decently spread, and the AP workers take work as they can. When the PC1 workers are set to prefer work from their local disk, you end up with PC1_1 getting a lot of work. If we pledge every 4 minutes, etc., only once both AP_1 and AP_2 are busy will PC1_2 receive jobs, because only then will the AP happen on MachineB. Eventually this generates a lot of network strain.

Best solution: round-robin the workers. Keep a list of workers, track an "assigned job count" per worker, sort the list by ascending job count (fewest jobs first), assign a job, and increment the counter. You end up with (no matter the timing / spread) all AP workers getting an even number of jobs, and therefore the PC1 workers getting an even number of jobs, as well as the PC2 / C2 workers. This holds up whether you pledge every 10 seconds or every 50 minutes. If a machine selected by round-robin doesn't have the resources (which should never happen with similar hardware / timing), you can simply skip it and take the next one in the list. That leaves a skewed list where the "overloaded" worker always comes first due to its low job count, but that shouldn't matter in my opinion.
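For concreteness, here is a minimal Go sketch of the round-robin-by-job-count selection described in this comment. It is only an illustration under assumed names (worker, pickWorker, hasResources), not the lotus scheduler API:

```go
// Sketch of round-robin selection by assigned-job count (hypothetical types).
package main

import (
	"fmt"
	"sort"
)

type worker struct {
	name     string
	assigned int // jobs handed to this worker so far
}

// pickWorker sorts candidates by ascending assigned-job count and returns the
// first one that currently has resources for the task, bumping its counter.
func pickWorker(workers []*worker, hasResources func(*worker) bool) *worker {
	sort.Slice(workers, func(i, j int) bool {
		return workers[i].assigned < workers[j].assigned
	})
	for _, w := range workers {
		if !hasResources(w) {
			continue // skip overloaded workers; they keep a low count and stay near the front
		}
		w.assigned++
		return w
	}
	return nil // no worker can take the task right now
}

func main() {
	ws := []*worker{{name: "AP_1"}, {name: "AP_2"}}
	for i := 0; i < 4; i++ {
		w := pickWorker(ws, func(*worker) bool { return true })
		fmt.Println("assigned to", w.name) // each worker ends up with an even share of the jobs
	}
}
```

Because the counter only increments on a successful assignment, a skipped (overloaded) worker keeps its low count and stays at the front of the list, as noted above.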
I skimmed this, though this part of the system is probably the most foreign to me, so not more than that. I can go over this with you in more depth later this week if you want a more extensive review. Approving now in the interest of getting this in by code freeze.
Related Issues
Proposed Changes
Currently lotus-miner can be configured with one of two assigners:
- utilization (default) - For each task, pick the worker with the lowest 'utilization factor', which is based on ratios of compute resources
- spread - In each assign loop, try to assign to the worker with the fewest tasks assigned so far

This PR adds a few experimental assigners:
- experiment-spread-qcount - Like spread, but also takes into account task counts which are running/preparing/queued
- experiment-spread-tasks - Like spread, but counts running tasks on a per-task-type basis
- experiment-spread-tasks-qcount - The two above combined: counts running tasks grouped by task type, and also takes into account tasks which are running/preparing/queued
- experiment-random - In each schedule loop, figure out the set of all workers which can handle the task, then pick a random one (see the sketch below)

Additional Info
experiment-random / experiment-spread-tasks-qcount work better for --no-default storage workers
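As a rough illustration of the experiment-random behaviour described above, the sketch below gathers the acceptable workers for a task and picks one uniformly at random. The names candidateWorker and canHandle are assumptions for the example, not the code this PR adds:

```go
// Sketch of the "collect acceptable workers, pick one at random" idea.
package main

import (
	"fmt"
	"math/rand"
)

type candidateWorker struct {
	name string
}

// assignRandom filters the known workers down to those that can handle the
// task and returns a uniformly random pick, or nil if none qualify.
func assignRandom(workers []candidateWorker, canHandle func(candidateWorker) bool) *candidateWorker {
	var ok []candidateWorker
	for _, w := range workers {
		if canHandle(w) {
			ok = append(ok, w)
		}
	}
	if len(ok) == 0 {
		return nil
	}
	pick := ok[rand.Intn(len(ok))]
	return &pick
}

func main() {
	ws := []candidateWorker{{"storage-1"}, {"storage-2"}, {"storage-3"}}
	w := assignRandom(ws, func(candidateWorker) bool { return true })
	fmt.Println("assigned to", w.name)
}
```

Over many assignments a uniform random pick spreads sectors roughly evenly across eligible workers, which matches the behaviour reported for --no-default storage workers above.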
Checklist
Before you mark the PR ready for review, please make sure that:
- The PR title is in the form of <PR type>: <area>: <change being made>, e.g. fix: mempool: Introduce a cache for valid signatures
  - PR type: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
  - area: e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps