NVIDIA: fix wrong number of threads #2003

Merged 1 commit on Oct 25, 2018


psychocrypt
Collaborator

In the CUDA backend for Monero we always start twice as many threads as needed.
Those threads are then retired after the AES matrix is copied to shared memory.
Nevertheless, it is the result of a copy-paste bug.

  • start correct number of threads for monero
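
To make the effect of the fix concrete, here is a small Python model of the launch arithmetic. It is only an illustration: the function name, the `oversubscription` parameter, the assumption that one thread per hash does the real work, and the example values 128 and 8 are all hypothetical and not taken from the xmr-stak source.

```python
def launch_stats(blocks, threads_per_block, oversubscription=1):
    """Model a kernel launch where only `threads_per_block` threads per block
    do useful work and any additional launched threads exit early."""
    launched = blocks * threads_per_block * oversubscription
    working = blocks * threads_per_block
    return {"launched": launched, "working": working, "idle": launched - working}


# Before the fix (modelled): twice as many threads are launched as needed.
before = launch_stats(blocks=128, threads_per_block=8, oversubscription=2)
# After the fix (modelled): only the needed threads are launched.
after = launch_stats(blocks=128, threads_per_block=8)

print("before:", before)
print("after: ", after)
```

In this model the extra threads still take part in the shared-memory copy, so the result stays correct either way; they just occupy scheduler slots and registers that the working threads could otherwise use.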

@fireice-uk fireice-uk merged commit eefd057 into fireice-uk:dev Oct 25, 2018
@psychocrypt psychocrypt deleted the fix-cudaWrongNumberOfThreads branch October 25, 2018 20:31
@psychocrypt
Collaborator Author

CC-ing: @xmrig

@Spudz76
Contributor

Spudz76 commented Oct 25, 2018

@xmrig brings Fermi to 75% perf of CN1 ("v7") instead of 50%

@Simaex

Simaex commented Oct 26, 2018

At least 3% of the hashrate is back on Maxwell (GTX 970). Good job, thanks!

@sergneo

sergneo commented Oct 27, 2018

No, the hashrate fell. With "threads" : 8, "blocks" : 128 on a GTX 980 (Monero v8): 2.5.1 gives 513 H/s, 2.5.2 gives 500 H/s.

@psychocrypt
Collaborator Author

psychocrypt commented Oct 27, 2018 via email

@sergneo

sergneo commented Oct 27, 2018

The results on 2.5.2 were poor because I used a build compiled with CUDA 8.0. Before that, the results with CUDA 9.0 were quite bad: the hashrate was 1.5-2 times lower. It looks like this bug already existed and you only now fixed it in 2.5.2. I have now tested 2.5.2 with CUDA 9.0 and the hashrate increased to 522 H/s. Here is the same config on 2.5.1 with CUDA 9.0: Benchmark nvidia Thread 0: 286.9 H/S
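
For anyone comparing these numbers, the arithmetic can be checked with a few lines of Python. This is just a sketch using the hashrates quoted in this thread; the helper name `relative_change` is made up for the illustration.

```python
def relative_change(old_hs, new_hs):
    """Return the change from old_hs to new_hs as a percentage of old_hs."""
    return (new_hs - old_hs) / old_hs * 100.0


# First report in this thread: 2.5.1 at 513 H/s, 2.5.2 at 500 H/s.
first_report_change = relative_change(513.0, 500.0)

# CUDA 9.0 builds: 2.5.1 at 286.9 H/s, 2.5.2 at 522 H/s.
cuda9_change = relative_change(286.9, 522.0)
cuda9_ratio = 522.0 / 286.9

print(f"first report: {first_report_change:+.1f}%")
print(f"CUDA 9.0:     {cuda9_change:+.1f}% ({cuda9_ratio:.2f}x)")
```

The CUDA 9.0 ratio falls inside the "1.5-2 times" range mentioned above, while the first pair differs by only a few percent.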

@Spudz76
Contributor

Spudz76 commented Oct 27, 2018

CUDA 8.0 is not optimal for Maxwell; it is only really still supported for the sake of Fermi.
9.x is the proper choice, and 10.x also works very well.

CUDA 8.0 never worked very well via backward compatibility once CUDA 9.1 was in the drivers. So a build compiled for CUDA 8.0 works best on the last available driver that contains CUDA 8.0, which is around 386.28.

Compiling for whatever CUDA version the driver contains always works best (check the driver release notes under supported technologies, which show the bundled CUDA version). Backward compatibility is a crutch at best.

@sergneo

sergneo commented Oct 27, 2018

These are the latest drivers available in the Nvidia archive:
416.34 CUDA - 10.0
411.70 CUDA - 10.0
411.63 Added support for CUDA 10.0
399.24 CUDA - 9.2
399.07 CUDA - 9.2
398.82 CUDA - 9.2
398.36 CUDA - 9.2
398.11 CUDA - 9.2
397.93 Added support for CUDA 9.2
397.64 CUDA - 9.1
391.35 CUDA - 9.1
391.24 CUDA - 9.1
391.01 CUDA - 9.1
390.77 CUDA - 9.1
390.65 CUDA - 9.1
388.71 Added support for CUDA 9.1
388.59 CUDA - 9.0
388.43 CUDA - 9.0
388.31 CUDA - 9.0
388.13 CUDA - 9.0
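
The list above can be turned into a small lookup, which makes it easier to pick a driver for a given build. This is a sketch in Python; the table is copied from the list as posted (release-note data, so it may not match what a driver actually reports), and the function names are made up for the illustration.

```python
# Driver version -> CUDA version, copied from the list above.
DRIVER_CUDA = {
    "416.34": "10.0", "411.70": "10.0", "411.63": "10.0",
    "399.24": "9.2", "399.07": "9.2", "398.82": "9.2",
    "398.36": "9.2", "398.11": "9.2", "397.93": "9.2",
    "397.64": "9.1", "391.35": "9.1", "391.24": "9.1",
    "391.01": "9.1", "390.77": "9.1", "390.65": "9.1",
    "388.71": "9.1",
    "388.59": "9.0", "388.43": "9.0", "388.31": "9.0", "388.13": "9.0",
}


def cuda_for_driver(driver):
    """Return the listed CUDA version for an exact driver version, or None."""
    return DRIVER_CUDA.get(driver)


def newest_driver_for(cuda):
    """Return the highest listed driver version for a CUDA version, or None."""
    drivers = [d for d, c in DRIVER_CUDA.items() if c == cuda]
    # Compare versions numerically, not as strings.
    return max(drivers, key=lambda d: tuple(int(p) for p in d.split(".")), default=None)


print(cuda_for_driver("391.35"))
print(newest_driver_for("9.0"))
```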

@Spudz76
Contributor

Spudz76 commented Oct 27, 2018

Also pending accept, new docs section here

@sergneo

sergneo commented Oct 28, 2018

xmr-stak does not correctly detect the CUDA version of the Nvidia driver. With 388.13 the miner prints (9.1/9.0) at startup, and with 388.71 it prints (9.2/9.0). It should have been (9.0/9.0) and (9.1/9.0) respectively.
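
For context on where such a pair can come from: the CUDA runtime API functions `cudaDriverGetVersion` and `cudaRuntimeGetVersion` each report a version as one integer, 1000 × major + 10 × minor. The Python sketch below decodes that encoding. It assumes the startup pair is driver version followed by runtime version, which is my reading of the output and is not confirmed from the xmr-stak source; the function names are made up.

```python
def decode_cuda_version(value):
    """Split a CUDA version integer (1000 * major + 10 * minor) into (major, minor)."""
    major = value // 1000
    minor = (value % 1000) // 10
    return major, minor


def format_pair(driver_value, runtime_value):
    """Format two encoded versions as "(driver/runtime)" (assumed order)."""
    d_major, d_minor = decode_cuda_version(driver_value)
    r_major, r_minor = decode_cuda_version(runtime_value)
    return f"({d_major}.{d_minor}/{r_major}.{r_minor})"


# 9010 decodes to 9.1 and 9000 decodes to 9.0.
print(format_pair(9010, 9000))
```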

@psychocrypt
Collaborator Author

psychocrypt commented Oct 28, 2018 via email

@Spudz76
Contributor

Spudz76 commented Oct 28, 2018

The release notes PDFs for 388.13/388.71 do claim that; however, it must be documentation lag, because what the code reports comes from the driver itself. So 388.13 has 9.1 and 388.71 has 9.2 despite the docs. There was one other crossover noted in the new compile.md, but I did not notice these two as I did not install every driver (the table even says it is based on the PDFs, and even then for Windows, since there are no standard release notes for the Linux ones).


@sergneo

sergneo commented Oct 29, 2018

Indeed. For example, driver 388.19 is installed with CUDA Toolkit 9.1. The PDF is misleading.
CUDA Toolkit 9.0 -> Display Driver 385.54
CUDA Toolkit 9.1 -> Display Driver 388.19
CUDA Toolkit 9.2 -> Display Driver 398.75

gnagel pushed a commit to gnagel/xmr-stak that referenced this pull request Mar 23, 2019
…erOfThreads

NVIDIA: fix wrong number of threads