-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: use -hl to enforce per job and per host memory limits #5
Conversation
Bad news: |
The issue with this is that it is actually (at least on my server) requesting the mem * threads in terms of resource request, but killing the job once it reaches mem. This is the worst of both worlds. Here is an example where I requested 128 MB of memory and 6 threads, and wrote a dummy rule with the help of chatGPT to use 256 MB of memory (tested and confirmed). You can see here that it was killed at 128 MB.
I really think we need to set it to use memory/core by default and change configuration either using an environment variable or resource request. |
Do you have
|
The recommendation by our local admins was to not try to parse too much of the configuration settings, as these can be configured in loads of different places. And some of them can even be set / altered during submission (by things like the So unless we find a restricted set of informative settings, we might have to resort to setting some kind of environment variable manually for each cluster configuration. This is annoying, because every user will have to first check out their cluster setup by trial and error, to find a working manual setting... |
What about a variable |
I made some changes here: https://github.com/BEFH/snakemake-executor-plugin-lsf/tree/flexible_mem_behavior I don't know if I should make a competing pull request or if you want to take a look? You should be able to set SNAKEMAKE_LSF_MEMFMT=perjob to get the desired behavior. |
With the other fixes and the added documentation, I think starting an alternative pull request with your branch is probably a good idea. One last thought: The So I just wanted to double-check if you have tried the version in this pull request, or only the version with the |
The output I sent where it requested mem*threads but enforced mem was with
-hl. I would appreciate if you could try my opinion without -hl, as asking
the cluster for extra enforcement is often counterproductive.
…On Fri, Mar 22, 2024, 9:45 AM David Laehnemann ***@***.***> wrote:
With the other fixes and the added documentation, I think starting an
alternative pull request with your branch is probably a good idea. One last
thought:
The -hl command line argument might be a way to get memory enforcement to
more generally work on a per-job (and not per task / cpu) basis. See the
option description here:
https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=options-hl
So I just wanted to double-check if you have tried the version in this
pull request, or only the version with the /job syntax? Because if the -hl
version at least works in both our setups, that would be much nicer than
having to add an environment variable to the mix...
—
Reply to this email directly, view it on GitHub
<#5 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAZ2Z2FVWJG42JRLL4MBNN3YZQYXZAVCNFSM6AAAAABE544GGSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMJVGE2DEOBUG4>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
I've merged the other pull request and made a release. Could you please approve this and test on your cluster: bioconda/bioconda-recipes#46693 I also updated the docs to more fully cover configuration. You can use |
Done. And I'll test the new version next week. |
Perfect, thanks. We can use the environment variable to make more changes if we need them for your cluster. I have tested that the env variable works to do what I say, so now we need to see if it will work for you. |
We might have to additionally add a check whether
LSB_RESOURCE_ENFORCE
contains the"memory"
string, and if it doesn't, we might have to resort back to the/job
syntax in therusage[]
statement.