Package Details: python-apex-git 0.1.r599-1

Git Clone URL: https://aur.archlinux.org/python-apex-git.git (read-only)
Package Base: python-apex-git
Description: A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
Upstream URL: https://github.com/NVIDIA/apex
Keywords: pytorch
Licenses: BSD
Conflicts: python-apex
Provides: python-apex
Submitter: leomao
Maintainer: None
Last Packager: leomao
Votes: 0
Popularity: 0.000000
First Submitted: 2018-12-14 06:07 (UTC)
Last Updated: 2019-11-27 05:42 (UTC)

Latest Comments

leomao commented on 2019-11-04 08:17 (UTC) (edited on 2019-11-04 08:17 (UTC) by leomao)

@petronny Yes, it is clear that you need qt5-base because the error message showed that it needed libQt5Test.so.5. But this error was triggered by import torch.

I have verified that, in a clean chroot environment, import torch does not work after installing only python-pytorch-cuda. So it seems that the dependencies of some packages in the official repo are not properly set...

petronny commented on 2019-11-04 07:26 (UTC)

I think the reason is that qt5-base is not in makedepends.

I can build the package with qt5-base in makedepends.

leomao commented on 2019-10-27 08:02 (UTC) (edited on 2019-10-27 08:03 (UTC) by leomao)

@petronny I think there is something wrong with your python-pytorch-cuda because the error message indicated that import torch failed...
Could you try python -c 'import torch; print(torch.__version__)' to verify that you can import it without the error?

I will add python-pip into the makedepends later.
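
The check suggested above can also be scripted. The following is a minimal sketch, not part of the PKGBUILD; the helper name try_import is hypothetical. It attempts the import and returns the error text, so a missing shared library such as libQt5Test.so.5 would show up as a short message instead of a full traceback.

```python
import importlib


def try_import(module_name):
    """Return (True, version_or_None) on success, (False, error_text) on failure."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # ImportError covers both a missing module and a missing shared library.
        return False, str(exc)
    return True, getattr(module, "__version__", None)
```

Calling try_import("torch") inside the build environment would then report either the installed version or the reason the import failed.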

petronny commented on 2019-10-24 05:00 (UTC) (edited on 2019-10-24 05:01 (UTC) by petronny)

Getting

==> Starting build()...
==> Building Python 3
Traceback (most recent call last):
  File "setup.py", line 1, in <module>
    import torch
  File "/usr/lib/python3.7/site-packages/torch/__init__.py", line 81, in <module>
    from torch._C import *
ImportError: libQt5Test.so.5: cannot open shared object file: No such file or directory
==> ERROR: A failure occurred in build().

now.

There might be something missing in makedepends.

petronny commented on 2019-08-16 03:59 (UTC)

There is one more thing to fix to pass the build.

==> Building Python 3
Traceback (most recent call last):
  File "setup.py", line 5, in <module>
    from pip._internal import main as pipmain
ModuleNotFoundError: No module named 'pip'
==> ERROR: A failure occurred in build().
    Aborting...

Please add python-pip to makedepends.

leomao commented on 2019-08-13 07:42 (UTC)

@petronny you're right. Fixed.

petronny commented on 2019-08-13 05:53 (UTC)

No CUDA runtime is found, using CUDA_HOME='/opt/cuda'
Traceback (most recent call last):
  File "setup.py", line 64, in <module>
    check_cuda_torch_binary_vs_bare_metal(torch.utils.cpp_extension.CUDA_HOME)
  File "setup.py", line 43, in check_cuda_torch_binary_vs_bare_metal
    torch_binary_major = torch.version.cuda.split(".")[0]
AttributeError: 'NoneType' object has no attribute 'split'

Please change python-pytorch to python-pytorch-cuda.
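
The traceback comes from calling .split on torch.version.cuda, which is None when pytorch is built without CUDA. As an illustration only (this is not apex's setup.py, and the helper name cuda_major is hypothetical), a guarded version of that line could look like this:

```python
def cuda_major(cuda_version):
    """Return the major CUDA version as a string, or None for a CPU-only build.

    cuda_version stands in for torch.version.cuda, which is None when
    pytorch was built without CUDA support.
    """
    if cuda_version is None:
        return None
    return cuda_version.split(".")[0]
```

With such a guard the script could print a clear message about the missing CUDA build instead of failing with an AttributeError.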

leomao commented on 2019-08-05 04:03 (UTC) (edited on 2019-08-05 04:04 (UTC) by leomao)

@petronny Thanks for the suggestion. I updated the PKGBUILD.

petronny commented on 2019-08-05 03:59 (UTC)

Also, git should be in makedepends.

petronny commented on 2019-08-04 05:48 (UTC)

It shouldn't be an 'any' package since it depends on cuda.
Please set arch to ('x86_64').

leomao commented on 2019-04-12 09:55 (UTC) (edited on 2019-04-12 09:57 (UTC) by leomao)

Please check https://github.com/NVIDIA/apex/issues/212. Currently, I don't have a solution with pytorch/pytorch-cuda in the community repo...

For now, I compile pytorch master myself...

drr21 commented on 2019-04-09 16:05 (UTC)

I get this warning when I use apex.amp:

'Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback. Original ImportError was: ImportError('/usr/lib/python3.7/site-packages/amp_C.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs')'

hottea commented on 2019-03-22 08:41 (UTC)

@leomao see syncbn for syncbn example. Actually, see this issue. It seems that pytorch appends -D_GLIBCXX_USE_CXX11_ABI=0 to the compiler flags by default, and I don't see a way to override it. According to pytorch's PKGBUILD, nothing there modifies this flag, so I believe that pytorch is built with -D_GLIBCXX_USE_CXX11_ABI=0, which is the default behavior of pytorch's official configuration. It should therefore be OK to build the apex extension with the same flag, i.e. -D_GLIBCXX_USE_CXX11_ABI=0. However, it's not.

I tried building apex with -D_GLIBCXX_USE_CXX11_ABI=1 by manually replacing every -D_GLIBCXX_USE_CXX11_ABI=0 with -D_GLIBCXX_USE_CXX11_ABI=1 in /usr/lib/python3.7/site-packages/torch/utils/cpp_extension.py, and it works as expected. However, one should not have to modify cpp_extension.py while building apex with devtools, right?

leomao commented on 2019-02-25 03:10 (UTC)

Hi @hottea, thanks for reporting the issue. Could you provide a code snippet for testing? I just checked that the examples and tests ran without errors.

hottea commented on 2019-02-25 02:42 (UTC)

c++filt _ZN3c105ErrorC1ENS_14SourceLocationERKSs gives me:

c10::Error::Error(c10::SourceLocation, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)

and then I found https://github.com/pytorch/pytorch/issues/13541, it seems we need to add -D_GLIBCXX_USE_CXX11_ABI=0 when compiling apex.
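
A side note on reading the mangled name: in the Itanium C++ ABI, the trailing Ss is the abbreviation for the pre-C++11 std::string, while strings compiled with _GLIBCXX_USE_CXX11_ABI=1 carry the std::__cxx11 namespace in their mangled form. The sketch below is only a heuristic with a hypothetical helper name (string_abi_hint); it looks for those markers in a symbol and is no substitute for c++filt.

```python
def string_abi_hint(mangled_symbol):
    """Guess which libstdc++ string ABI a mangled symbol refers to.

    Heuristic only: returns "cxx11", "pre-cxx11", or "unknown".
    """
    if "__cxx11" in mangled_symbol:
        return "cxx11"
    if "Ss" in mangled_symbol:
        # "Ss" is the Itanium ABI abbreviation for the old std::string.
        return "pre-cxx11"
    return "unknown"


# The undefined symbol reported in the warning:
print(string_abi_hint("_ZN3c105ErrorC1ENS_14SourceLocationERKSs"))  # prints "pre-cxx11"
```

An undefined reference in this form suggests that the extension itself was compiled against the old string ABI and that the library it links against does not export a matching symbol.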

hottea commented on 2019-02-25 02:04 (UTC)

I got this warning:

Warning:  using Python fallback for SyncBatchNorm, possibly because apex was installed without --cuda_ext.  The exception raised when attempting to import the cuda backend was:  /usr/lib/python3.7/site-packages/syncbn.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs

It seems that it failed to build with --cuda_ext? Or maybe there is something wrong with libs?