Command line issues #10698
Actually, it should map to
Correct - should work. I can check it
I'll check it - it was working last I checked. Maps to
Hmmm...I've been using it extensively and it works fine, so probably just a translation issue. Maps to
The code is still there, but I'll have to look at why it isn't working
Code is unchanged, so I assume it is still working. Perhaps @jjhursey can check (since he wrote it)?
Yeah, that option is way old. I suspect it isn't being mapped correctly.
Actually,
It appears this option no longer exists:
Does it need to be re-implemented or should it just be removed?
We should also double check the
I'll take a look - might just be an error.
The PRRTE default param file is definitely working as I am using it and all get set as specified. I cannot speak to the OMPI one so that might need checking - it should get picked up by the pml/ompi component, but it might not be implemented there yet? Probably should first check that the OMPI personality is getting passed to PMIx (pretty sure that is happening, but worth checking).
For the problem listed in #10702 (comment), what MCA param is he trying to set? I suspect it is either an incorrect param (due to it being PRRTE and not ORTE) or something that no longer exists as a param due to PRRTE treating things as job-specific instead (so maybe it needs to be restored as setting a default policy that is overridden by cmd line directives).
I looked further at the description there and saw where he set the param - it looks okay. I suspect the problem is the same as was just reported to me offlist by the DDT folks - he isn't hitting a slot limit, but rather he is hitting the overload limit. Setting oversubscribe isn't turning off the check on CPUs. I'll have to fix that separately.
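To make the slot-limit versus overload distinction in the comment above concrete, here is a minimal sketch in Python. It is a simplified model, not PRRTE code: `Node`, `check_placement`, and the `oversubscribe` / `overload_allowed` flags are names invented for illustration. It treats the two checks as independent, which matches the behavior reported here (allowing oversubscription does not relax the CPU check); it does not show what the eventual fix changes.

```python
from dataclasses import dataclass


@dataclass
class Node:
    slots: int  # scheduling slots the node advertises (hypothetical field)
    cpus: int   # processing units available for binding (hypothetical field)


def check_placement(node, nprocs, oversubscribe=False, overload_allowed=False):
    """Return the list of limits that placing nprocs on node would violate."""
    problems = []
    # Slot limit: more processes than slots, unless oversubscription is allowed.
    if nprocs > node.slots and not oversubscribe:
        problems.append("slot limit exceeded")
    # Overload limit: more processes than CPUs, checked separately in this model.
    if nprocs > node.cpus and not overload_allowed:
        problems.append("cpu overload")
    return problems
```

Under this model, a node with 4 slots but only 2 CPUs still reports a CPU overload for 3 processes even when `oversubscribe=True`, which is the kind of failure that can be mistaken for a slot-limit problem.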
Should be fixed in openpmix/prrte#1463
Fixed in openpmix/prrte#1464
I'm sensing some confusion over the user-facing options vs what PRRTE accepts, particularly with regard to the So please give me a bit to split all this out. Should be done by end of week, but I need some flexibility on the timing as my back has really flared up and so my time-at-keyboard is becoming hit/miss.
I have implemented the
Just to keep track of things here: @awlauria implemented the fix for this (which has been committed)
Thanks @rhc54. Feel free to update the top level comment with links to PRs. I haven't marked any of them as complete until they make their way to release branches.
Afraid I lack permissions to check off your boxes, so please feel free to do so when you have time and see them as complete.
I have backported all the fixes to-date to PMIx and PRRTE release branches. Note that PRRTE v3.0 now requires a minimum of PMIx v4.2.1 due to all the fixes on both sides. I'm most of the way done with the refactor - will update here when complete.
openpmix/prrte#1468 completes the refactoring. Still need to decide what cmd line options to move into the I'll go back to looking into some of the above operations to ensure they are functional.
$ prterun --help N
Specify number of application processes per node to be started
Probably not in the schizo/ompi help file. Note that it is the OMPI community that required this option - IIRC, it might be part of the standard?
The only one left in your list that I can resolve is I don't have a way of testing
I'll look at the stream-buffering option. Once upon a time (way back in 1.7 in 4dd9f89a9) it was an MCA option (
@rhc54 the scale was fairly small with
Specifically the
The "show_progress" option has been re-enabled here: openpmix/prrte#1471. Note that it is no longer an MCA param and is now under the
Okay, I can look at that one. Should be easy to fix.
I believe it still is an MCA param - I'm not sure you can make it a per-job cmd line option, can you?
Stack trace issue is fixed here: openpmix/prrte#1472
Thanks!
I retested all of these using prrte/openpmix latest main this morning:
--display-devel-allocation, should map to --display devel-map
--show-progress, doesn't seem to do anything. Removal candidate?
-N doesn't appear in --help, possible removal candidate. Maps to --map-by ppr:1:node, so when used with --map-by this is confusing.
rankfile - some known issues already reported, needs verification
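The option mappings mentioned in this retest can be illustrated with a small Python sketch. This is only an illustration of the idea of translating deprecated options, not PRRTE's schizo/ompi code (which is written in C). The function names are hypothetical, and generalizing `-N <n>` to `--map-by ppr:<n>:node` is an assumption based on the `ppr:1:node` mapping quoted above.

```python
# Hypothetical translation table built only from mappings quoted in this thread.
SIMPLE_TRANSLATIONS = {
    "--display-devel-allocation": ["--display", "devel-map"],
    "--display-topo": ["--display", "topo"],
}


def translate(argv):
    """Return a new argument list with deprecated options rewritten."""
    out = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in SIMPLE_TRANSLATIONS:
            out.extend(SIMPLE_TRANSLATIONS[arg])
        elif arg == "-N" and i + 1 < len(argv):
            # Assumption: -N <n> is shorthand for a processes-per-node mapping.
            out.extend(["--map-by", f"ppr:{argv[i + 1]}:node"])
            i += 1  # consume the count argument as well
        else:
            out.append(arg)
        i += 1
    return out


def has_conflicting_mapping(argv):
    """True when -N and an explicit --map-by both appear on the command line."""
    return "-N" in argv and "--map-by" in argv
```

A check like `has_conflicting_mapping` shows why combining `-N` with `--map-by` is confusing: after translation the command line would carry two mapping directives, and it is not obvious which one should win.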
Re:
We can now set it via an MCA parameter, instead of only being able to set it from the PRRTE CLI. This restores some functionality from when it was originally introduced in 4dd9f89a9
Cross-reference Issue #10705
Closing this in favor of #10705
Here's an overview of issues @gpaulsen found when testing ompi command line options.
NOTE: Boxes should be checked as complete when fixes find their way to their release branches (openpmix v4.2/prrte v3.0).
As they come in, feel free to edit in links to PRs in this comment.
--display-devel-allocation, should map to --display devel-map
--display-topo/--display topo, does nothing. Should be removed? --display-topo should map to --display topo
--merge-stderr-to-stdout does not seem to work as expected. Fixes in master: "iof: Fix merging of stderr to stdout." openpmix/openpmix#2709 + "schizo/ompi: Add translation for --merge-stderr-to-stdout." openpmix/prrte#1462
--do-not-launch does not work as expected (processes are launched). Fixed in main: "schizo/ompi: Fix --do-not-launch." openpmix/prrte#1459
--show-progress, doesn't seem to do anything. Removal candidate?
--stream-buffering - Can't tell if this is working as expected
--app, deprecated translation is not working
-N doesn't appear in --help, possible removal candidate. Maps to --map-by ppr:1:node, so when used with --map-by this is confusing.
--get-stack-traces seems to fail at larger scales via a hang or crash
--stop-in-app shouldn't require an argument. Fixed in main: "Fix minor things" openpmix/prrte#1458
--mca mca_base_env_list does not work on recent prrte updates
--ppr - not well documented. Slot detection when used in conjunction with --host foo (without specifying slots) seems broken, for example:
Removing the --host and it works, or specifying hostA:44.
rankfile - some known issues already reported, needs verification
--cpu-set/--cpu-list Works - somewhat. If you ask for more ranks than cpus you get:
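On the --ppr item in the list above: the following Python sketch shows one plausible reading of why a bare `--host foo` can behave differently from `--host hostA:44`. It is not a diagnosis of the actual bug and not PRRTE code. `parse_host_list`, `fits`, and especially the `default_slots=1` fallback are assumptions made for illustration.

```python
def parse_host_list(spec, default_slots=1):
    """Parse a spec such as 'hostA:44,hostB' into [(name, slots), ...].

    default_slots is an assumption of this sketch; it is applied when a
    host entry carries no ':N' suffix.
    """
    hosts = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, sep, count = entry.partition(":")
        slots = int(count) if sep else default_slots
        hosts.append((name, slots))
    return hosts


def fits(hosts, procs_per_node):
    """True when every host has enough slots for procs_per_node ranks."""
    return all(slots >= procs_per_node for _, slots in hosts)
```

With these assumptions, `fits(parse_host_list("foo"), 2)` is false while `fits(parse_host_list("hostA:44"), 2)` is true, which mirrors the workaround of giving the slot count explicitly.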