mpirun binding says my first core is only used #3294

Closed
hmofrad opened this issue Jun 13, 2018 · 8 comments


hmofrad commented Jun 13, 2018

Hi, I'm using my laptop, which has an 8th-gen Core i7 processor with 12 logical CPUs, to write MPI programs in WSL. I noticed that mpirun launches the requested number of processes, each on a separate core (e.g. 8 processes on 8 different cores), but when I check which core each process is bound to, the answer is always core id 1. The htop screenshot below shows this: the meters at the top show almost all CPUs in use, yet the PROCESSOR column reports every one of these processes as running on core id 1.

[Screenshot: htop output]

To test this further, I also wrote a C program that reports which core each MPI process is on. It always reports core id 1 for all MPI ranks. This happens with both the MPICH and Open MPI libraries.

// Compile: mpicc getcpu.c -o getcpu
// Run: mpirun -np 4 ./getcpu
#define _GNU_SOURCE  // must come before the includes so <sched.h> declares sched_getcpu()
#include <mpi.h>
#include <stdio.h>
#include <sched.h>

int main(int argc, char** argv)
{
    int id, np;
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int processor_name_len;

    MPI_Init(&argc, &argv);

    MPI_Comm_size(MPI_COMM_WORLD, &np);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Get_processor_name(processor_name, &processor_name_len);

    // CPU this rank is executing on at the moment of the call
    int cpu_id = sched_getcpu();
    printf("Hello from process %03d out of %03d, hostname %s, cpu_id %d\n",
        id, np, processor_name, cpu_id);
    // while(1);  // uncomment to keep the ranks spinning, e.g. to watch them in htop
    MPI_Finalize();
    return 0;
}
@shoffmeister

Could this be an artefact of #1115?

Does lscpu show the expected number of cores on your system?


hmofrad commented Jun 15, 2018

Thanks for your response. It does, and here is the output:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                12
On-line CPU(s) list:   0-11
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 158
Model name:            Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
Stepping:              10
CPU MHz:               2208.000
CPU max MHz:           2208.0000
BogoMIPS:              4416.00
Virtualization:        VT-x
Hypervisor vendor:     vertical
Virtualization type:   full

Is there a way to fix this? I'm writing a program that needs to know the exact process affinity, and right now all processes appear to be bound to core id 1.


WSLUser commented Jun 15, 2018

This is a dupe. It's not supported and is considered a feature request per #1115 (comment).

@shoffmeister

I can only refer to #1115, as I only noticed the similarity.

Try following the suggestions in #1115 - i.e. add a "UserVoice" / provide some feedback etc?

What I sense is that the scientific community is starting to pick up WSL, and with that then come less common items such as needing to know about the max file handles (apparently for auto-tuning parallelism) or having acute awareness of sockets (presumably for tuning locality of reference). Try to raise those points, there, as a use case?


WSLUser commented Jun 15, 2018

> the scientific community is starting to pick up WSL

I don't think that'll be the case until the number one UserVoice is implemented as it's required for most use cases.

> Try to raise those points, there, as a use case?

Yes, clearly based on the age of the comment (despite all the refs), it fell off the tracker. It would be good to revive it but even better to vote on the UserVoice page.

@therealkenc
Collaborator

> I don't think that'll be the case until the number one UserVoice is implemented as it's required for most use cases.

A lot of the number crunching folks go the AVX route because it suits their use case. You can crunch plenty with 12 cores and AVX. But yeah you're not wrong.

While it might be tangentially related, I am not sure this is #1115. That was about (damn rare) two-socket architectures. If this is (say) a single-socket 8750/8850 then that's a little different. Or different enough that a hard dupe would be premature. To me anyway, #1115 got sent to UserVoice purgatory with a silently implied "yeah, you and your 44 core friends can go vote for that or something". This is about binding cores and is a little different. There's a well-formed test case here and that seems legit enough.

@shoffmeister

Reading again, I agree: there is exactly one socket here, while #1115 talks about more than one socket.

This here is all about binding to a specific CPU core on the socket, much as SetProcessAffinityMask and SetThreadAffinityMask do (https://msdn.microsoft.com/en-us/library/windows/desktop/ms686223.aspx). That would help the lower-level caches, I guess ...

sudo apt install mpich will install the packages required to build, by the way.

@therealkenc
Collaborator

Didn't actually confirm via repro of the OP, but this is near certainly #5239 and fixed in WSL2.
