The officially official Devuan Forum!

#1 2018-08-11 14:13:11

chris2be8
Member
Registered: 2018-08-11
Posts: 264  

[SOLVED] Configuring CUDA support

Hello,

I'm trying to get CUDA working on a system running Devuan. The GPU should support it:

$ nvidia-detect 
Detected NVIDIA GPUs:
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF104 [GeForce GTX 460] [10de:0e22] (rev a1)

Checking card:  NVIDIA Corporation GF104 [GeForce GTX 460] (rev a1)
Your card is supported by all driver versions.
It is recommended to install the
    nvidia-driver
package.

nvidia-driver version 384.130-1 is installed.

But something seems to be missing:

# nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

When rebooting the system I got messages about nvidia-current-modeset and nvidia-current-drm not being in /lib/modules/4.9.0-6-amd64 (I had to pause the boot with control-S, write them down, then control-Q to continue the boot). Similar messages come from:

# modprobe nvidia
modprobe: FATAL: Module nvidia-current not found in directory /lib/modules/4.9.0-6-amd64
modprobe: ERROR: ../libkmod/libkmod-module.c:977 command_do() Error running install command for nvidia
modprobe: ERROR: could not insert 'nvidia': Operation not permitted

What packages provide nvidia-current-modeset and nvidia-current-drm?

Chris

Last edited by chris2be8 (2018-08-18 05:21:25)

#2 2018-08-14 15:53:34

chris2be8
Member
Registered: 2018-08-11
Posts: 264  

Re: [SOLVED] Configuring CUDA support

Hello,

Has anyone managed to get CUDA working on Devuan? If I can't get it working I'll reluctantly have to switch to another Linux distribution.

Chris

#3 2018-08-15 17:06:44

chris2be8
Member
Registered: 2018-08-11
Posts: 264  

Re: [SOLVED] Configuring CUDA support

Upgrading the kernel from 4.9.0-6-amd64 to 4.9.0-7-amd64 fixed the driver problems. Now nvidia-smi works OK.
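
For anyone hitting the same modprobe error, a quick check is whether any nvidia-current module files exist for the kernel that is actually running. The sketch below assumes the modules are installed somewhere under /lib/modules/<kernel version>, as the error message in post #1 suggests. has_module is a hypothetical helper name, not part of any package. One plausible reading, not confirmed in this thread, is that the modules had simply never been built for 4.9.0-6-amd64.

```shell
#!/bin/sh
# Sketch: report whether any kernel module whose file name starts with a
# given prefix exists under a modules directory.
# has_module is a hypothetical helper, not part of any package.
has_module() {
    moddir=$1   # e.g. /lib/modules/4.9.0-6-amd64
    prefix=$2   # e.g. nvidia-current
    # Module files may be compressed (.ko.xz etc.), hence the trailing wildcard.
    find "$moddir" -type f -name "${prefix}*.ko*" 2>/dev/null | grep -q .
}

kver=$(uname -r)
if has_module "/lib/modules/$kver" nvidia-current; then
    echo "nvidia-current modules found for $kver"
else
    echo "no nvidia-current modules for $kver"
fi
```

If the modules are built by DKMS on your system (an assumption), `dkms status` lists which kernel versions each module has been built for.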

Testing nvcc, I found it won't work with gcc 6.3.0 as the host compiler, but it does work with clang 3.8 (and probably 3.9 or 4.0).
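
A minimal sketch of picking a host compiler for nvcc, assuming clang is installed under one of the names listed. pick_host_compiler is a hypothetical helper and hello.cu is a placeholder file name. The script only prints the nvcc command it would run, so it is safe to try on a machine without CUDA.

```shell
#!/bin/sh
# Sketch: print the first candidate host compiler that is on PATH.
# pick_host_compiler is a hypothetical helper name.
pick_host_compiler() {
    for cc in "$@"; do
        if command -v "$cc" >/dev/null 2>&1; then
            echo "$cc"
            return 0
        fi
    done
    return 1
}

if ccbin=$(pick_host_compiler clang); then
    # -ccbin tells nvcc which host compiler to use.
    # hello.cu is a placeholder; the command is printed, not executed.
    echo "nvcc -ccbin $ccbin -o hello hello.cu"
else
    echo "no candidate host compiler found" >&2
fi
```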

Now all I need to do is to get the apps I want to run under CUDA to work.

Chris

#4 2018-08-17 16:54:18

chris2be8
Member
Registered: 2018-08-11
Posts: 264  

Re: [SOLVED] Configuring CUDA support

I've managed to get gmp-ecm GPU support working by recompiling it all with clang. See http://mersenneforum.org/showthread.php?t=23561 post 10.

So this issue is solved. How do I set the topic to say that?

Chris

#5 2018-08-17 17:38:48

golinux
Administrator
Registered: 2016-11-25
Posts: 3,137  

Re: [SOLVED] Configuring CUDA support

chris2be8 wrote:

    I've managed to get gmp-ecm GPU support working by recompiling it all with clang. See http://mersenneforum.org/showthread.php?t=23561 post 10.

Thanks for posting the fix.  Seems no one on this forum has played with CUDA.

chris2be8 wrote:

    So this issue is solved. How do I set the topic to say that?

    Chris

To mark the topic as [SOLVED], edit the subject in your first post.

#6 2018-08-27 08:09:49

chris2be8
Member
Registered: 2018-08-11
Posts: 264  

Re: [SOLVED] Configuring CUDA support

Hello,

I've found that the CUDA programs I'm using only work if compiled with clang instead of clang-3.8, which is surprising because both names point to the same program:

chris@rigel:~$ which clang
/usr/bin/clang
chris@rigel:~$ which clang-3.8
/usr/bin/clang-3.8
chris@rigel:~$ ls -l /usr/bin/clang /usr/bin/clang-3.8
lrwxrwxrwx 1 root root 25 Apr 18  2017 /usr/bin/clang -> ../lib/llvm-3.8/bin/clang
lrwxrwxrwx 1 root root 25 Jun  2  2017 /usr/bin/clang-3.8 -> ../lib/llvm-3.8/bin/clang

Comparing two test cases (test4 fails, test3 works):

chris@rigel:~/msieve-svn1022.test4/trunk/cub$ diff Makefile ~/msieve-svn1022.test3/trunk/cub/Makefile 
20c20
< 			-ccbin clang-3.8 -Xcompiler -fPIC -Xcompiler -fvisibility=hidden
---
> 			-ccbin clang -Xcompiler -fPIC -Xcompiler -fvisibility=hidden
chris@rigel:~/msieve-svn1022.test4/trunk/cub$ ldd sort_engine.so 
	linux-vdso.so.1 (0x00007ffc90ffe000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fa9cf8d2000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fa9cf6b5000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fa9cf4b1000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fa9cf29a000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa9ceefb000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fa9cfe30000)
chris@rigel:~/msieve-svn1022.test3/trunk/cub$ ldd sort_engine.so 
	linux-vdso.so.1 (0x00007ffd177ef000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f47b6618000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f47b63fb000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f47b61f7000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f47b5e75000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f47b5b71000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f47b595a000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f47b55bb000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f47b6b76000)

The error message is:

cannot load library '/home/chris/msieve-svn1022.cuda/trunk/cub/sort_engine.so': /home/chris/msieve-svn1022.cuda/trunk/cub/sort_engine.so: undefined symbol: _ZNSt8ios_base4InitD1Ev

So why does compiling with clang-3.8 produce different output from compiling with clang?
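
The missing symbol _ZNSt8ios_base4InitD1Ev demangles to std::ios_base::Init::~Init(), which is provided by libstdc++. That matches the ldd listings above: the failing test4 build has no libstdc++.so.6 (or libm.so.6) entry, while the working test3 build does. The sketch below automates that check. links_libstdcxx is a hypothetical helper name, and the default file name sort_engine.so is taken from this post.

```shell
#!/bin/sh
# Sketch: read `ldd` output on stdin and succeed if libstdc++ is listed.
# links_libstdcxx is a hypothetical helper name.
links_libstdcxx() {
    grep -q 'libstdc++\.so'
}

lib=${1:-sort_engine.so}
if [ -f "$lib" ]; then
    if ldd "$lib" | links_libstdcxx; then
        echo "$lib: libstdc++ is linked"
    else
        echo "$lib: libstdc++ is NOT linked"
    fi
    # List undefined dynamic symbols with C++ names demangled
    # (nm and c++filt both come from binutils).
    nm -D --undefined-only "$lib" | c++filt
fi
```

Run it in the directory containing sort_engine.so, or pass the path to another shared library as the first argument.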

Chris

#7 2018-09-07 16:26:25

chris2be8
Member
Registered: 2018-08-11
Posts: 264  

Re: [SOLVED] Configuring CUDA support

I've found out a bit more. Adding --verbose to nvcc's options makes it print the commands it runs to invoke the lower-level programs that do the work. Comparing what it does when called with -ccbin clang and with -ccbin clang-3.8, the only significant difference is the last line of output:

#$ clang++ -fPIC -fvisibility=hidden -O3 -m64 -o "sort_engine.so" -Wl,--start-group "/tmp/tmpxft_000002fc_00000000-18_sort_engine_dlink.o" "/tmp/tmpxft_000002fc_00000000-16_sort_engine.o" -shared   -L/usr/lib/x86_64-linux-gnu/stubs -lcudadevrt  -lcudart_static  -lrt -lpthread  -ldl  -Wl,--end-group 
#$ clang-3.8 -fPIC -fvisibility=hidden -O3 -m64 -o "sort_engine.so" -Wl,--start-group "/tmp/tmpxft_0000029c_00000000-18_sort_engine_dlink.o" "/tmp/tmpxft_0000029c_00000000-16_sort_engine.o" -shared   -L/usr/lib/x86_64-linux-gnu/stubs -lcudadevrt  -lcudart_static  -lrt -lpthread  -ldl  -Wl,--end-group 

It seems that when it calls clang++ for the final link the code works, and when it calls clang-3.8 it fails. That makes sense if it's C++ code.

So I think the fault is in nvcc, not in anything that is part of Devuan.
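
This also explains why two symlinks to the same binary can behave differently: the clang driver picks C or C++ mode from the name it was invoked under, and only C++ mode links the C++ standard library by default. Judging by the two command lines above, nvcc turns -ccbin clang into clang++ for the link step but passes clang-3.8 through unchanged. The sketch below imitates that name-based dispatch with a toy script. "driver" is a hypothetical stand-in, not clang itself.

```shell
#!/bin/sh
# Sketch: one program, two names, two behaviours.
# "driver" is a hypothetical stand-in for a compiler driver, not clang itself.
demo=$(mktemp -d)

cat > "$demo/driver" <<'EOF'
#!/bin/sh
# Choose the mode from the name this script was invoked under.
case "$(basename "$0")" in
    *++) echo "C++ mode: would link the C++ standard library" ;;
    *)   echo "C mode: would not link the C++ standard library" ;;
esac
EOF
ln -s driver "$demo/driver++"

# Run through sh so the demo does not depend on the temp directory
# allowing direct execution; $0 is still the name given on the command line.
c_mode=$(sh "$demo/driver")
cxx_mode=$(sh "$demo/driver++")
echo "driver   -> $c_mode"
echo "driver++ -> $cxx_mode"
rm -rf "$demo"
```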

Chris
