Update documentation for setting Python environment #778

Merged 3 commits on Jan 17, 2020


Contributor

@ggsun commented Jan 16, 2020

This adds a step to the documentation that sets `OPENBLAS_NUM_THREADS=1`, which @tahorst pointed out in #448 as a fix for slow simulation times. On my local machine, adding this cut the simulation time from 16 minutes to 10 minutes, even when running a single simulation.

Contributor

@1fish2 left a comment

Thanks!

This brings up some questions, below, but those needn't slow you down.

@@ -77,7 +77,7 @@ This page goes through the Python environment setup steps in more detail and wit
WARNING: The Python readline extension was not compiled. Missing the GNU readline lib?
WARNING: The Python sqlite3 extension was not compiled. Missing the SQLite3 lib?

-2. Install the required version of Python via `pyenv`, and _remember to enable it as a shared library_ so Theano can call into it:
+1. Install the required version of Python via `pyenv`, and _remember to enable it as a shared library_ so Theano can call into it:
Contributor

Is the "shared library" part also needed on macOS? I forgot! I suspect that I installed 2.7.16 locally without it.
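
One way to check an existing install, as a minimal sketch: ask the interpreter itself. `sysconfig.get_config_var("Py_ENABLE_SHARED")` is a standard-library call; the helper name `python_is_shared_build` is made up for illustration, and I haven't verified how every macOS build flavor (for example, framework builds) reports this value, so treat a `False` there as a hint rather than proof.

```python
import sysconfig


def python_is_shared_build():
    """Return True if this interpreter reports an --enable-shared build.

    Py_ENABLE_SHARED is typically 1 for shared-library builds and 0
    otherwise; it can be None on platforms that do not define it.
    """
    return bool(sysconfig.get_config_var("Py_ENABLE_SHARED"))


if __name__ == "__main__":
    print("shared library build: %s" % python_is_shared_build())
    # LDLIBRARY names the library file this build links against.
    print("LDLIBRARY: %s" % sysconfig.get_config_var("LDLIBRARY"))
```

Run it with the pyenv-installed Python that the virtualenv uses, not the system Python.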


```bash
pip install --upgrade pip setuptools virtualenv virtualenvwrapper virtualenv-clone wheel
```

-3. Install OpenBLAS 0.3.5 or later.
+1. Install OpenBLAS 0.3.5 or later.
Contributor

Maybe add that OpenBLAS 0.3.7 (as installed by `brew install openblas`) works fine on macOS, but not inside Docker on macOS unless you compile it with `NO_AVX2=1`. Maybe this is too complicated to even get into. See OpenMathLib/OpenBLAS#2244.

1. Add the following line to your bash profile.
```
export OPENBLAS_NUM_THREADS=1
```
Contributor

This is a good thing. (And switching to plain `1.` everywhere is also good.) But do we always want this setting? I don't know. Or should we always configure it this way and sometimes override it with a different local value?

FYI, Intel's alternative implementation, the Math Kernel Library (MKL), has a Threading Building Blocks (TBB) feature to avoid oversubscribing threads without limiting itself to one thread per process. The last time I tried it, it was slower than OpenBLAS. It turns out there's a new release for 2020 and there were several releases in 2019, so maybe we should retest it. See #36.
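
The "configure it this way and sometimes override it" option could look like the sketch below: default to one thread, but respect any value already exported in the shell. The helper name `default_openblas_threads` is hypothetical and not part of this PR. As far as I know, OpenBLAS reads the variable when the library is loaded, so this would have to run before NumPy or SciPy is imported.

```python
import os


def default_openblas_threads(default="1"):
    """Set OPENBLAS_NUM_THREADS only if no value is set yet.

    Returns the value that is in effect afterwards, so a locally
    exported setting always wins over the default.
    """
    return os.environ.setdefault("OPENBLAS_NUM_THREADS", default)


if __name__ == "__main__":
    # Call this before importing numpy/scipy.
    print("OPENBLAS_NUM_THREADS=%s" % default_openblas_threads())
```

With this pattern, `OPENBLAS_NUM_THREADS=4 python script.py` keeps the local value of 4, while a plain `python script.py` falls back to 1.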

Member

I've been using the export for a while. Maybe it's Linux-specific, but you get much better performance. Maybe we could say it's optional, or add some info about why you'd want to add it.

Testing MKL again could be worthwhile, but do you think it's optimized for Intel processor instruction sets/architecture and not so much for AMD/others?

Contributor

Yuk. Very good point that we'd have to test on at least one Intel CPU and at least one AMD CPU, maybe more, and the Docker-on-Mac-on-Intel case. At least it only takes replacing some pip packages, on each of those system installations. I'm not excited to test it again.
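
If anyone does rerun the comparison on several machines, a small timing harness might make the numbers easier to collect. This is only a sketch: `time_command` is a hypothetical helper, and the command in the `__main__` block is a stand-in that would need to be replaced with the real simulation command.

```python
import os
import subprocess
import sys
import time


def time_command(cmd, num_threads):
    """Run cmd with OPENBLAS_NUM_THREADS=num_threads; return elapsed seconds.

    Pass num_threads=None to remove the variable so that OpenBLAS
    picks its own default thread count.
    """
    env = dict(os.environ)
    if num_threads is None:
        env.pop("OPENBLAS_NUM_THREADS", None)
    else:
        env["OPENBLAS_NUM_THREADS"] = str(num_threads)
    start = time.time()
    subprocess.check_call(cmd, env=env)  # raises if the command fails
    return time.time() - start


if __name__ == "__main__":
    # Stand-in workload (assumption): replace with the simulation command.
    cmd = [sys.executable, "-c", "pass"]
    for n in (None, 1):
        print("OPENBLAS_NUM_THREADS=%s: %.2f s" % (n, time_command(cmd, n)))
```

Each run gets its own copy of the environment, so the comparison does not depend on what the bash profile currently exports.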

@ggsun merged commit 65b45a5 into master on Jan 17, 2020