build error:apex #711

wxd1995 · 2019-04-23T05:54:15Z

❓ Questions and Help

When I install apex using the command "python setup.py install --cuda_ext --cpp_ext",
I get the following error:

torch.__version__  =  1.1.0.dev20190422
Compiling cuda extensions with
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
from /usr/local/cuda/bin

Pytorch binaries were compiled with Cuda 9.0.176

running install
running bdist_egg
running egg_info
writing apex.egg-info/PKG-INFO
writing dependency_links to apex.egg-info/dependency_links.txt
writing top-level names to apex.egg-info/top_level.txt
reading manifest file 'apex.egg-info/SOURCES.txt'
writing manifest file 'apex.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
building 'amp_C' extension
gcc -pthread -B /home/dy113/anaconda3/envs/maskrcnn_benchmark/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/dy113/anaconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/include -I/home/dy113/anaconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/dy113/anaconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/include/TH -I/home/dy113/anaconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/dy113/anaconda3/envs/maskrcnn_benchmark/include/python3.7m -c csrc/amp_C_frontend.cpp -o build/temp.linux-x86_64-3.7/csrc/amp_C_frontend.o -O3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=amp_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/local/cuda/bin/nvcc -I/home/dy113/anaconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/include -I/home/dy113/anaconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/dy113/anaconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/include/TH -I/home/dy113/anaconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/dy113/anaconda3/envs/maskrcnn_benchmark/include/python3.7m -c csrc/multi_tensor_scale_kernel.cu -o build/temp.linux-x86_64-3.7/csrc/multi_tensor_scale_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -lineinfo -O3 --use_fast_math -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=amp_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
csrc/type_shim.h(13): error: class "at::Type" has no member "scalarType"

1 error detected in the compilation of "/tmp/tmpxft_0000270c_00000000-6_multi_tensor_scale_kernel.cpp1.ii".
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1

Can anyone help me? Thank you very much!


Tegala · 2019-04-23T07:50:45Z

I ran into the same problem, too.

skzhang1 · 2019-04-23T08:44:51Z

what is your gcc version?

Tegala · 2019-04-23T13:25:13Z

what is your gcc version?

Mine is gcc 5.2.

Yuliang-Zou · 2019-04-24T00:24:58Z

According to this issue, it seems that apex can only be installed with CUDA 10. My gcc version is 7.3, Python is 3.6, and my PyTorch version is 1.0.0. It works.

mel-2445 · 2019-04-24T21:48:07Z

Same error with CUDA 10 on Ubuntu 16.
Tried with gcc 5 and gcc 7, Python 2.7 and Python 3.6.

mel-2445 · 2019-04-24T22:04:45Z

A workaround is to downgrade to a PyTorch nightly from a few days ago:
conda install pytorch-nightly=1.0.0.dev20190404 cudatoolkit=10.0 -c pytorch

fmassa · 2019-04-25T09:44:17Z

cc @mcarilli are you aware of any recent breakages of apex with latest PyTorch nightly?

mdsmith-cim · 2019-04-25T19:15:07Z

There's already an issue at Apex (#267) with a PR (#272) that fixes it, and the PR will apparently be merged soon.

So if you need an immediate fix, use the scalar_type branch of ptrblck's fork.

mcarilli · 2019-04-25T19:30:11Z

Thanks @mdsmith-cim for the concise, correct summary. Our fix will be merged tomorrow at the latest (I have some other commitments so I may not have time to review it in detail today).

skzhang1 · 2019-04-26T03:59:47Z

@mdsmith-cim Hi, sorry to bother you. I still get the same error after cloning your apex. Why is this? I look forward to your reply.

MC-devel-staudt · 2019-04-26T04:01:30Z

@zskadazhang Did you checkout the scalar_type branch?

skzhang1 · 2019-04-26T05:15:37Z

@DavidSPumpkins Oh! It works. Thank you very much!

ptrblck · 2019-04-26T09:05:02Z

@zskadazhang Good to hear it's working!
Please tag me in case you are running into issues related to this branch.

However, we should merge it to apex/master today so you can pull from the master branch again.

ptrblck · 2019-04-26T16:53:47Z

The PR was merged so the build should work again using apex master. :)

chengruizhe · 2019-04-27T09:17:02Z

With torch 1.1.0.dev20190425 and the latest apex fix, I still get an error when I try to compile with python setup.py install --cuda_ext --cpp_ext. I'm using gcc 5.5. Can anyone please help? Much appreciated!


/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11089): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11100): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11109): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11120): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11129): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11140): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11149): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11160): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11169): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11180): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11189): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11200): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11209): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11220): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11229): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11240): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11249): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11260): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11269): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11280): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11289): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11300): error: argument of type "void *" is incompatible with parameter of type "long long *"

92 errors detected in the compilation of "/tmp/tmpxft_00004034_00000000-6_multi_tensor_scale_kernel.cpp1.ii".
error: command '/usr/local/cuda-9.0/bin/nvcc' failed with exit status 1

skzhang1 · 2019-04-27T12:28:56Z

Sorry, I do not know why. My environment is CUDA 10.1 and GCC 7.3, and I got apex from @mdsmith-cim. Maybe you can ask him.

mcarilli · 2019-04-27T22:59:37Z

@chengruizhe I've never seen this error before. Does it point to a particular line in the file?

ptrblck · 2019-04-27T23:02:09Z

@chengruizhe @mcarilli
Could it be related to the gcc version?
Based on this information e.g. Ubuntu 16.04 should use GCC 5.3.1 for CUDA9.0.

Tegala · 2019-04-29T02:14:40Z

AttributeError: 'AmpState' object has no attribute 'opt_properties'

Has anyone else run into this problem? I built apex and maskrcnn-benchmark successfully, without any errors.
My version information:
cuda 9.0, gcc 5.2, pytorch-nightly 1.1 (CentOS)
(I can run it successfully under Ubuntu with cuda 9.0, gcc 5.2, pytorch-nightly 1.0.0.)

ptrblck · 2019-04-29T11:14:51Z

@Tegala Are you building apex from source or are you using an older version of apex?

Tegala · 2019-04-29T14:16:39Z

@Tegala Are you building apex from source or are you using an older version of apex?
Thanks for your reply!
I use these commands to build apex:

git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cuda_ext --cpp_ext

And I am sure I am using the latest version of apex. This is strange...

Tegala · 2019-04-29T14:19:04Z

@Tegala Are you building apex from source or are you using an older version of apex?
Thanks for your reply!
I use these commands to build apex:
git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cuda_ext --cpp_ext
And I am sure I am using the latest version of apex. This is strange...

The error output is:

2019-04-30 06:11:31,877 maskrcnn_benchmark.trainer INFO: Start training
Traceback (most recent call last):
  File "tools/train_net.py", line 177, in <module>
    main()
  File "tools/train_net.py", line 170, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_net.py", line 76, in train
    arguments,
  File "/home/hjz/projects/maskrcnn-benchmark/maskrcnn_benchmark/engine/trainer.py", line 79, in do_train
    with amp.scale_loss(losses, optimizer) as scaled_losses:
  File "/home/hjz/perl5/anaconda3/envs/deephaj-env/lib/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/home/hjz/perl5/anaconda3/envs/deephaj-env/lib/python3.7/site-packages/apex-0.1-py3.7-linux-x86_64.egg/apex/amp/handle.py", line 78, in scale_loss
    if not _amp_state.opt_properties.enabled:
AttributeError: 'AmpState' object has no attribute 'opt_properties'

ptrblck · 2019-04-29T15:09:10Z

Thanks for the information!
I'm trying to reproduce this issue. CC @mcarilli

mcarilli · 2019-04-29T15:32:17Z

@Tegala Amp requires that

model, optimizer = amp.initialize(model, optimizer, opt_level=...)

be called before any invocation of

with amp.scale_loss(losses, optimizer) as scaled_loss:

If your code is somehow invoking with amp.scale_loss without ever invoking amp.initialize, the above error will result.
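
To illustrate why the ordering matters, here is a minimal pure-Python sketch. It is a toy stand-in, not apex's actual code: the classes and functions below are hypothetical and only mimic the one line visible in the traceback (`_amp_state.opt_properties.enabled`), under the assumption that the attribute is created during initialization.

```python
from contextlib import contextmanager


class AmpState:
    """Toy stand-in for the state object named in the traceback (hypothetical)."""


class Properties:
    """Toy options holder; `enabled` mirrors the attribute read in the traceback."""

    def __init__(self, enabled=True):
        self.enabled = enabled


_amp_state = AmpState()


def initialize(model, optimizer, opt_level="O1"):
    # Assumption of this sketch: initialization is the only place that
    # creates `opt_properties`, so skipping it leaves the attribute missing.
    _amp_state.opt_properties = Properties(enabled=True)
    return model, optimizer


@contextmanager
def scale_loss(loss, optimizer):
    # Reads the attribute unconditionally, like the line in the traceback.
    if not _amp_state.opt_properties.enabled:
        yield loss
        return
    yield loss  # a real implementation would scale the loss here


def try_scale_loss(loss, optimizer):
    """Return "ok" if scale_loss can be entered, else the error message."""
    try:
        with scale_loss(loss, optimizer):
            return "ok"
    except AttributeError as exc:
        return str(exc)
```

In this sketch, calling `try_scale_loss(1.0, None)` before `initialize(...)` returns the same kind of AttributeError message as in the report, and returns "ok" once `initialize(...)` has run.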

Tegala · 2019-04-30T06:23:08Z

@Tegala Amp requires that
model, optimizer = amp.initialize(model, optimizer, opt_level=...)
be called before any invocation of
with amp.scale_loss(losses, optimizer
.

If your code is somehow invoking with amp.scale_loss without ever invoking amp.initialize, the above error will result.

Thanks so much!
I checked again and found that it is just as you said. Now it works!

aashokvardhan · 2019-05-10T20:43:33Z

SeanNaren/deepspeech.pytorch#376

git clone --recursive https://github.com/NVIDIA/apex.git
cd apex && pip install .

This worked for me.

mcarilli · 2019-05-10T20:54:59Z

@aashokvardhan pip install . will perform a Python-only install, which is not ideal for performance. You should install with cuda and c++ extensions via

pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .

and only fall back to pip install . if the extension build doesn't work.
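
One rough way to tell which kind of install you ended up with is to check whether the compiled module can be located. This is only a sketch: the module name amp_C is taken from the build log earlier in this thread ("building 'amp_C' extension"), and treating its presence as a sign that the extension build succeeded is an assumption, since other extensions may be built separately. The helper name `extension_available` is made up for illustration.

```python
import importlib.util


def extension_available(module_name="amp_C"):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if extension_available():
        print("amp_C found: the CUDA extension appears to be installed")
    else:
        print("amp_C not found: this looks like a Python-only install")
```

`importlib.util.find_spec` only locates the module without importing it, so the check itself does not load any CUDA code.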

joskaaaa · 2019-06-14T16:47:23Z

Encountered this when using Ubuntu 18.04 | CUDA 9.0 and the default GCC/G++ in Ubuntu, version 7. The CUDA compiler is incompatible with GCC >= 6.4.

Solved it by installing GCC 5 and G++ 5 ( sudo apt install gcc-5 g++-5 ) and giving them higher priority using update-alternatives:

sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 10
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 10

After this, Apex installed fine using the default instructions.
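
Before switching compilers, it can help to confirm which versions are actually in play. The sketch below parses the text printed by `nvcc --version` (in the format shown in the build log at the top of this thread) and by `gcc -dumpversion`. The function names are hypothetical, and the threshold `max_gcc_major=5` is an assumption based on the workaround above, not an official compatibility limit.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_gcc_major(dumpversion_output):
    """Extract the major version from `gcc -dumpversion` text, or None."""
    match = re.match(r"\s*(\d+)", dumpversion_output)
    return int(match.group(1)) if match else None


def needs_older_gcc(nvcc_output, gcc_output, max_gcc_major=5):
    """Flag the CUDA 9.0 plus newer-GCC combination reported in this thread.

    max_gcc_major=5 is an assumption taken from the workaround above,
    not an official compatibility limit.
    """
    cuda = parse_nvcc_release(nvcc_output)
    gcc = parse_gcc_major(gcc_output)
    if cuda is None or gcc is None:
        return False
    return cuda == (9, 0) and gcc > max_gcc_major
```

The two input strings could be collected with `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout` and the equivalent call for `gcc -dumpversion`.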

ying-cai-cd · 2019-12-12T19:54:58Z

My solution is:
sudo ln -sf /usr/bin/gcc-5 /usr/local/cuda-9.0/bin/gcc
sudo ln -sf /usr/bin/g++-5 /usr/local/cuda-9.0/bin/g++

This was on Ubuntu 18.04 with CUDA 9.0, PyTorch 1.1.0, and Python 3.6.

Mahhos · 2020-01-29T17:10:25Z

pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .

I am trying to install apex on Windows 10. I cloned apex from its repo, and when I run the above command, I get this error: ERROR: Directory '.' is not installable. Neither 'setup.py' nor 'pyproject.toml' found. Do you have any idea how to resolve the issue? My environment:
python 3.6
gcc 5.3.0
torch 1.0.1

ptrblck · 2020-02-05T22:41:34Z

@Mahhos Could you check your current working directory for the setup.py file?
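
A quick way to run that check is sketched below. It looks for the two file names mentioned in the pip error message; the helper name `missing_build_files` is made up for illustration.

```python
from pathlib import Path


def missing_build_files(directory="."):
    """Return True if neither setup.py nor pyproject.toml is in `directory`."""
    root = Path(directory)
    return not any(
        (root / name).is_file() for name in ("setup.py", "pyproject.toml")
    )


if __name__ == "__main__":
    print("current directory:", Path.cwd())
    if missing_build_files():
        print("No setup.py or pyproject.toml here; cd into the cloned apex folder first.")
```

If this reports that the files are missing, the pip command is most likely being run from outside the cloned repository.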

fmassa added the dependency bug label Apr 25, 2019
