Rybka Chess Community Forum
The Rybka Lounge / Computer Chess / Guessing and hacking to get Lc0 on Ubuntu 21.04 (Part 1)
- - By MrKris (****) Date 2021-04-20 04:43 Upvotes 1
I cannot recommend this to Linux beginners (like myself!) because I may have just gotten lucky.
I thought it was an interesting experience, though.
-- There was a complication when Ubuntu upgraded to gcc 10.3 over a week later; see Part 2, next post.

I guessed at this as an alternative to the complex methods at https://developer.nvidia.com/cuda-zone .
There, Cuda Toolkit v11.3 is out already, but back in March of this year I noticed that v11.2 had requirements that Ubuntu 21.04 (Daily Build) seemed to have:
a) gcc/g++ 10.2  
b) (Ubuntu) nvidia-cuda-toolkit 11.2.

-- The idea here is:  ONLY the default Ubuntu repositories, NO keys, NO downloads, NO paths, NO complexities from https://developer.nvidia.com/cuda-zone !!

0) Fresh install of Ubuntu 21.04 (Daily Build) 29Mar2021 ;    ## The final version will be out soon - but then see Part 2!
$ sudo apt-get update ; sudo apt-get upgrade    ## <- also periodically rather than wait for the GUI notification.

1) $ sudo apt install nvidia-cuda-toolkit        ## This gets it from Ubuntu, not Nvidia's website!
Interesting excerpt from the terminal:
"... The following packages were automatically installed and are no longer required: ...<many>...
Use 'sudo apt autoremove' to remove them. ..."    ## I put off autoremove until Part 2, however it seems no harm done.
Just checking:
$ nvcc --version
"nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Jan_28_19:32:09_PST_2021
Cuda compilation tools, release 11.2, V11.2.142
Build cuda_11.2.r11.2/compiler.29558016_0"
And:
$ nvidia-smi
"Mon Mar 29 hr:mm:ss 2021      
+-----------------------------------------------------------------------------+-
| NVIDIA-SMI 460.56       Driver Version: 460.56       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+ ..."

Looking good.
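
A minimal sketch of how the two checks above could be scripted, so the toolkit release and the driver's CUDA version can be compared at a glance. The helper names cuda_release and smi_cuda_version are made up for this illustration, and the patterns assume the output looks like the excerpts quoted above.

```shell
# Hypothetical helper: print the CUDA release (e.g. "11.2") found in
# `nvcc --version` output supplied on stdin.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: print the "CUDA Version" field found in
# `nvidia-smi` output supplied on stdin.
smi_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Demo on lines copied from the terminal output above.
printf '%s\n' 'Cuda compilation tools, release 11.2, V11.2.142' | cuda_release
printf '%s\n' '| NVIDIA-SMI 460.56       Driver Version: 460.56       CUDA Version: 11.2     |' | smi_cuda_version
```

On a real machine the same helpers would be fed from the commands themselves, for example `nvcc --version | cuda_release` and `nvidia-smi | smi_cuda_version`.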
Then I noticed Ubuntu was using a "manually installed driver" - see screenshot below.
It seems fine for this use (Ubuntu installed it, not me). Checking further:
$ apt search nvidia-driver
"Sorting... Done
Full Text Search... Done ... <others> ...
xserver-xorg-video-nvidia-460/hirsute,now 460.56-0ubuntu1 amd64 [installed,auto-removable]
  NVIDIA binary Xorg driver
... <others> ..."
The 460.56 was Ubuntu's latest at the time so I left it as is.
(The meaning of "auto-removable" becomes clear in Part 2.)

1a) cudnn is optional, because Lc0 on RTX cards runs about as well with cuda-fp16 as with cudnn-fp16 (or better on 30xx).
I downloaded and followed the Nvidia instructions exactly (otherwise 'no go') at https://developer.nvidia.com/cudnn
libcudnn8-dev_8.1.1.33-1+cuda11.2_amd64.deb
libcudnn8_8.1.1.33-1+cuda11.2_amd64.deb

2) $ sudo apt install zlib1g    ## Ubuntu for zlib mentioned at https://github.com/LeelaChessZero/lc0
"--already newest version"
$ sudo apt install zlib1g-dev    ## It needs this also.

3-err) $ sudo apt install gcc
$ gcc --version
"gcc (Ubuntu 10.2.1-23ubuntu2) 10.2.1 20210320 ... ..."
## I read later I should have done 3-correct) instead of this.  Luckily no harm done.

3-correct) $ sudo apt install build-essential    ## On Ubuntu this would have installed everything needed.
After my 3-err), this installed the remaining critical build components, such as the matching g++ and make.
$ g++ --version
"g++ (Ubuntu 10.2.1-23ubuntu2) 10.2.1 20210320"

4) As on Lc0's github:
$ sudo apt install libstdc++-10-dev    ## Already in Ubuntu.
"...is already the newest version..."
$ sudo apt install meson
--The following additional packages will be installed:
  ninja-build    #Included with meson.
The following NEW packages will be installed:
  meson ninja-build
$ python3 --version    ## Python was installed also.
Python 3.9.2

5) On an older Ubuntu I had already created an account (name and email) at https://github.com/join , which I used here:
$ sudo apt install git
$ git --version  "git version 2.30.2"
$ git config --global user.name "Your Name"    ## with ""'s
$ git config --global user.email "youremail@domain.com"    ## with ""'s
$ git config --list  "...< was okay >..."

5b) Optional for Lc0 on CPU, adds blas and/or eigen to Lc0's options.
$ sudo apt install libopenblas-dev  ## and/or/neither:
$ sudo apt install libeigen3-dev

6) Get the source and compile:
$ git clone -b release/0.27 --recurse-submodules https://github.com/LeelaChessZero/lc0.git
Creates the folder lc0: /home/<username>/lc0    ## The <>'s hide my actual username.
$ cd lc0    ## Go to it
$ ./build.sh    ## run the compile script there.
The engine, a binary file also called lc0, will be in the folder /home/<username>/lc0/build/release .
I put a copy in the folder below and renamed it (in Ubuntu it can run from there without the ./ ).
$ /home/<username>/.local/bin/Lc0-0.27 -b cuda-fp16    ## On mine just slightly faster than the default, without "-b ...".
(Lc0 started without the path, but the path seems needed so it autodiscovers the net I put there with it.)
       _
|   _ | |
|_ |_ |_| v0.27.0 built Mar 30 2021
go nodes 10
Found pb network file: /home/<username>/.local/bin/J94-100
Creating backend [cuda-fp16]...
CUDA Runtime version: 11.2.0
Latest version of CUDA supported by the driver: 11.2.0
GPU: GeForce RTX 2060
GPU memory: 5.78766 Gb
GPU clock frequency: 1680 MHz
GPU compute capability: 7.5
info depth 1 seldepth 2 time 1556 nodes 6 score cp 10 nps 500 tbhits 0 pv d2d4 g8f6
info depth 2 seldepth 3 time 1567 nodes 18 score cp 7 nps 818 tbhits 0 pv g2g3 d7d5 g1f3
bestmove g2g3 ponder d7d5


My machine, RTX 2060 (R7 2700X), 200,000 nodes:
Lc0 0.26.3 from Cuda 10.0 (on Ub. 20.04), cudnn-fp16 = 6150 nps
Lc0 0.27.0 from Cuda 11.2 (on Ub. 21.04), cuda-fp16 (cudnn-fp16 is almost as fast on 20xx) = 8191 nps
About 33% faster, mainly from the Cuda change.
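
The "about 33%" figure follows directly from the two nps numbers above. A small sketch of the arithmetic, using awk for the floating-point division (the function name speedup_percent is only for illustration):

```shell
# Hypothetical helper: percentage gain of a new nps figure over an old one.
# Usage: speedup_percent OLD_NPS NEW_NPS
speedup_percent() {
    awk -v old_nps="$1" -v new_nps="$2" \
        'BEGIN { printf "%.1f\n", (new_nps - old_nps) / old_nps * 100 }'
}

# 6150 nps (Lc0 0.26.3, Cuda 10.0) versus 8191 nps (Lc0 0.27.0, Cuda 11.2).
speedup_percent 6150 8191   # prints 33.2
```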

See step 1): Ubuntu has since twice upgraded the "manually installed" Nvidia driver (which Ubuntu itself installed), now to 460.67 - so it's fine for now.
Parent - - By sarona (**) Date 2021-04-21 01:48
A very big thank you for this post.  That detailed information will be of immense help when I make an attempt to replicate this when returning to Calgary later in the month of April.

Instead of Linux Mint 20.1, I will use the Ubuntu 21.04 OS which I will download from here: https://releases.ubuntu.com/21.04/ or http://cdimage.ubuntu.com/daily-live/

If I fail, so be it. I am always up for a challenge and this would be a significant achievement for me.
Parent - By MrKris (****) Date 2021-04-21 19:51
You're welcome, but see my Part 2 - I am finishing it now - it should be done soon (something came up earlier).
I do not expect any problems though I had to find the trick in Part 2 to compile Lc0.

The official Ub. 21.04 starts tomorrow (earlier ones, like mine and if you get the Beta, are 'becoming' it with their normal upgrades).

My Ub. 21.04 was installed 29mar2021, then about 12apr2021 they changed to GCC 10.3 from 10.2
- that made my Part 1 not work until I added the trick in Part 2 to compile Lc0 (but of course Ub. itself was fine).

I had to install gcc 9.3 then trick Nvidia to use that because Ubuntu/Debian generally use only the first whole number for gcc.
--See Part 2 here soon.

I'll be here with my 21.04 and can try to help - but I am a beginner, no formal training.
(Just in case I have any problems, though it looks unlikely, I'll post in this thread.)
Parent - By MrKris (****) Date 2021-04-22 00:56
Just to double check my Part 1, above, and Part 2, below.
R7 2700X 16 threads | RTX 2060

## Ubuntu 21.04 default was gcc 10.2
## lc0 compiled fine (nothing done for it)
~/.local/bin$ Lc0-0.27 benchmark -b cuda-fp16
       _
|   _ | |
|_ |_ |_| v0.27.0 built Mar 30 2021
Found pb network file: ./J94-100
Creating backend [cuda-fp16]...
CUDA Runtime version: 11.2.0
Latest version of CUDA supported by the driver: 11.2.0
GPU: GeForce RTX 2060
GPU memory: 5.78766 Gb
GPU clock frequency: 1680 MHz
GPU compute capability: 7.5
===========================
Total time (ms) : 341634
Nodes searched  : 3099620
Nodes/second    : 9073

## Ubuntu 21.04 is now with gcc 10.3:
$ gcc --version
gcc (Ubuntu 10.3.0-1ubuntu1) 10.3.0 ...
## lc0 compile errors without 2)Hack
## With 2)Hack: force nvcc to gcc 9.3.0 (gcc 9 added in 1))
~/.local/bin$ lc0Ub10nvcc9 benchmark -b cuda-fp16
       _
|   _ | |
|_ |_ |_| v0.27.0+git.unknown built Apr 21 2021  ## unknown is download instead of git
Found pb network file: ./J94-100
Creating backend [cuda-fp16]...
CUDA Runtime version: 11.2.0
Latest version of CUDA supported by the driver: 11.2.0
GPU: GeForce RTX 2060
GPU memory: 5.78766 Gb
GPU clock frequency: 1680 MHz
GPU compute capability: 7.5
===========================
Total time (ms) : 341801
Nodes searched  : 3099918
Nodes/second    : 9069

- - By MrKris (****) Date 2021-04-21 22:32
Guessing and hacking to get Lc0 on Ubuntu 21.04 (Part 2) <<<

Part 1 lasted for 12 days.

Then Ubuntu 21.04 upgraded gcc/g++ to 10.3 from 10.2.

https://developer.nvidia.com/cuda-zone warns about that - though I did not use anything from there.
However I hoped Ubuntu 'knew what it was doing' since I installed Ubuntu's nvidia-cuda-toolkit (not Nvidia's) in Part 1.

Although I have a perfectly good Lc0 v0.27.0 as in Part 1 I did step 6) again, from there, to try to guess if I was ready for the next Lc0 versions.

--That caused the problem at the bottom. (Thanks to lc0's meson, etc., I got that good description in the terminal.)
--I can not use gcc/g++ 10.3 to compile lc0:
--At the bottom: the same '.../10/chrono...' Segmentation Fault 4 times?!
--See the error in the bottom text area.

--Very annoying, by default Ubuntu (and Debian) do not do "point" versions for gcc/g++, just the first whole number (for example gcc-10).
And, https://packages.ubuntu.com/ now says nothing about 10.2 for Ub. 21.04, as 10.2 is just for Ub. 20.10 now.

Fortunately Ubuntu allows adding prior gcc/g++ whole number versions and the Nvidia site says gcc 9.3.0 is okay for Cuda 11.2 (though, again, I have Cuda 11.2 from Ubuntu, not Nvidia).

1) $ sudo apt install gcc-9  ## gets Ubuntu's latest 9, which is 9.3.0
$ sudo apt install g++-9
The proper -10 versions are still the defaults for my Ub. 21.04.
Check: $ gcc --version
should still show 10.3.
Check all gcc's:
$ dpkg -l | grep gcc
Outputs a long list, you should see various gcc/g++ 9's and 10's.
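
Since Ubuntu's package names only carry the major number, the point version has to be read from the compiler banner. A minimal sketch, assuming the first line of `gcc --version` looks like the examples quoted in this thread; the helper names gcc_version_from_banner and major_minor are invented for the illustration.

```shell
# Hypothetical helper: print the full version (e.g. "10.3.0") from the
# first line of `gcc --version` output supplied on stdin.
gcc_version_from_banner() {
    sed -n '1s/^[^)]*) \([0-9][0-9.]*\).*/\1/p'
}

# Hypothetical helper: reduce "10.3.0" to "10.3".
major_minor() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Demo on a banner line copied from this thread.
full=$(printf '%s\n' 'gcc (Ubuntu 10.3.0-1ubuntu1) 10.3.0' | gcc_version_from_banner)
major_minor "$full"   # prints 10.3
```

On a real system, `gcc --version | gcc_version_from_banner` and `gcc-9 --version | gcc_version_from_banner` would show the default and the added compiler side by side.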

-----Optional: -----
It is optional because it worked (I did it for something in 19.04) but it did not help to compile Lc0.
See the web for the tedious method using "update-alternatives" to be able to manually switch the default gcc/g++.
I then switched to -9, tried step 6) Part 1 again - exactly the same errors in '.../10/chrono...'
So I switched back to the normal, for Ub. 21.04, -10's
- that way I am back to as if I had not done the --Optional:-- at all.

-----End optional -----

--See the screenshot of the 2 scripts. ('Read-Only' (open with Text Editor) showing I changed them back to their originals after the hack below.)
     /usr/lib/nvidia-cuda-toolkit/bin/g++
     /usr/lib/nvidia-cuda-toolkit/bin/gcc

--This is why I got the same error with --Optional-- above as without it:

Nvidia Cuda (nvcc) is using the latest gcc/g++ installed without regard to other settings (as in --Optional-- above).

Notice I have two 10's: one is 10.2 from my early Daily Build 30mar2021 and another from Ubuntu's upgrade to 10.3 about 12apr2021.
Note: there is no distinction between the 10.2 and 10.3 - an example of why I prefer the 2)Hack rather than even try any fallback which looks unsupported by Ubuntu and would seem to fail anyway.

2) -----Hack: use entirely at your own risk, etc. --See the 'Read-Only' screenshot below.
-----Make sure you have done 1) above to install gcc/g++ 9 first
a) Make a backup of the 2 original scripts and keep notes on what you did.
b) Use the following to carefully change the top 2 to "prog=g++-9" and "prog=gcc-9" (after doing 1) above, of course).
-- or just the top one on a newer Ub. 21.04 install; mine shows one for the early (Daily Build) that had gcc/g++ 10.2 and one for the upgrade to 10.3 --
$ sudo -i gedit /usr/lib/nvidia-cuda-toolkit/bin/g++
$ sudo -i gedit /usr/lib/nvidia-cuda-toolkit/bin/gcc
(gedit is just the Text Editor, but with sudo -i it runs as root, so be careful: change only the top 10's (or top 10) to 9, then save and close each file; undo the change afterwards.)
--See the screenshot below -which is Read-Only: with sudo -i they will be editable!
This forces nvcc to use g++/gcc 9
c) After building Lc0 change them back.
-----End Hack -----
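
For anyone who would rather not open a root text editor, the same edit could in principle be scripted. This is only a sketch under a loud assumption: that each wrapper contains a line of the form prog=gcc-10 (or prog=g++-10), which is inferred from the "prog=gcc-9" target above and not checked against the real files. The function name set_wrapper_major is made up, and the demo runs on a temporary file, not on the real scripts.

```shell
# Hypothetical helper: keep a one-time backup of FILE as FILE.orig, then
# rewrite every "prog=<name>-<number>" line to use NEW_MAJOR.
# ASSUMPTION: the wrapper has lines like "prog=gcc-10" (not verified).
# Usage: set_wrapper_major FILE NEW_MAJOR
set_wrapper_major() {
    file=$1
    major=$2
    # Only back up once, so a second run cannot overwrite the original.
    [ -e "$file.orig" ] || cp -p -- "$file" "$file.orig" || return 1
    # GNU sed in-place edit; [a-z+] covers both "gcc" and "g++".
    sed -i "s/^\(prog=[a-z+]*-\)[0-9][0-9]*/\1$major/" "$file"
}

# Demo on a throwaway file standing in for the real wrapper.
demo=$(mktemp)
printf 'prog=g++-10\n' > "$demo"
set_wrapper_major "$demo" 9
cat "$demo"                      # prints prog=g++-9
rm -f "$demo" "$demo.orig"
```

Applied to the real scripts it would need root, and step c) would amount to copying each .orig file back over its wrapper after the build.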

With 1) and 2)Hack I got the following after step 6, Part 1, git and build Lc0:

NO errors!! And it worked fine. I renamed it lc0g10nvg9 for:
Ub. left at its default gcc/g++ 10 (now 10.3; its old 10.2 was fine, as in Part 1),
nvcc forced by the 2)Hack above to use gcc/g++ 9 (9.3.0) from the Ubuntu standard addition in 1) above.

~/.local/bin$ lc0g10nvg9 -b cuda-fp16
       _
|   _ | |
|_ |_ |_| v0.27.0 built Apr 16 2021
go nodes 10
Found pb network file: ./J94-100
Creating backend [cuda-fp16]...
CUDA Runtime version: 11.2.0
Latest version of CUDA supported by the driver: 11.2.0
GPU: GeForce RTX 2060
GPU memory: 5.78766 Gb
GPU clock frequency: 1680 MHz
GPU compute capability: 7.5
info depth 1 seldepth 2 time 1589 nodes 6 score cp 10 nps 600 tbhits 0 pv d2d4 g8f6
info depth 2 seldepth 3 time 1599 nodes 18 score cp 7 nps 900 tbhits 0 pv g2g3 d7d5 g1f3
bestmove g2g3 ponder d7d5

Note: the date is after the near-disaster of Ubuntu 21.04's early Daily Build gcc 10.2 currently "up'd" to 10.3 (which, again, does not work for Lc0 without 1) and 2)Hack above).
--The speed difference from 2)Hack should be virtually insignificant*, the Cuda 11.2 is the main thing.
(*As in abrock.eu/stockfish still using GCC 7.)

-----Continued from Part 1, 'clean up'
Now that I can't do much more with it and it is okay I finally did:
$ sudo apt autoremove
(I presume most would have done it sooner. It is supposed to remove packages "no longer needed.")
Then double checking:
$ nvcc --version
Was okay, Cuda 11.2.
$ nvidia-smi
"Command 'nvidia-smi' not found, but can be installed with:
sudo apt install nvidia-utils-390         # version 390.141-0ubuntu2, or
sudo apt install nvidia-utils-418-server  # version 418.181.07-0ubuntu2
sudo apt install nvidia-utils-450         # version 450.102.04-0ubuntu2
sudo apt install nvidia-utils-450-server  # version 450.102.04-0ubuntu2
sudo apt install nvidia-utils-460         # version 460.67-0ubuntu1
sudo apt install nvidia-utils-460-server  # version 460.32.03-0ubuntu2"
So I used the latest (not server):
$ sudo apt install nvidia-utils-460
Part of the terminal output:
"Suggested packages:
  nvidia-driver-460"    So, I did that:
$ sudo apt install nvidia-driver-460
then a *LOT* of output but it went okay.

To check, also:
$ nvidia-smi
Sun Apr 18 hr:mm:ss 2021      
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.67       Driver Version: 460.67       CUDA Version: 11.2    
   ...
   ...

-----End 'clean up'----- Screenshot below of Software & Updates shows regular Nvidia driver now (latest).
-
-
10.3 Error - terminal after ./build.sh
Ubuntu 21.04 (Daily Build) had gcc/g++ 10.2 and Part 1 worked fine to compile Lc0.
Then (about 12apr2021) it upgraded to 10.3 --- same step 6) Part 1 caused:

[133/152] Generating 'liblc0_lib.so.p/common_kernels.o'.
FAILED: liblc0_lib.so.p/common_kernels.o 
/usr/bin/nvcc -c ../../src/neural/cuda/common_kernels.cu -o liblc0_lib.so.p/common_kernels.o -I /home/me/lc0/src --std=c++14 -Xcompiler -fPIC -I /opt/cuda/include/ -I /usr/local/cuda/include/ -I /usr/lib/cuda/include/
/usr/include/c++/10/chrono: In substitution of ‘template<class _Rep, class _Period> template<class _Period2> using __is_harmonic = std::__bool_constant<(std::ratio<((_Period2::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)) * (_Period::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den))), ((_Period2::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den)) * (_Period::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)))>::den == 1)> [with _Period2 = _Period2; _Rep = _Rep; _Period = _Period]’:
/usr/include/c++/10/chrono:473:154:   required from here
/usr/include/c++/10/chrono:428:27: internal compiler error: Segmentation fault
  428 |  _S_gcd(intmax_t __m, intmax_t __n) noexcept
      |                           ^~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-10/README.Bugs> for instructions.
---{same exact text for [134/152] , [142/152] , [143/152] , then:}
ninja: build stopped: subcommand failed.
$

--With 1) and 2)Hack it works but has no errors to show!
Don't do anything with the below, it is just the probable explanation:
Evidently it works because nvcc then uses .../9/chrono... which evidently is okay.
(Ub., after 1) above, has two folders .../c++/10/ and .../c++/9/ both with many items, same names, including chrono.) 
Parent - - By sarona (**) Date 2021-04-22 13:55 Upvotes 1
I return home from Edmonton on Tuesday and have a week off.  I will make an attempt to get a Ubuntu Linux system with a compiled Lc0 running.

I have taken screenshots of this post for archival purposes.

Thanks again!
Parent - By MrKris (****) Date 2021-04-23 03:47
*************************************************
https://www.reddit.com/r/pop_os/comments/mkpjfd/critical_bug_wiped_my_entire_install/

Posted byu/fintip 17 days ago

CRITICAL BUG, WIPED MY ENTIRE INSTALL

*************************************************

Regarding any new Ubuntu install (I presume Debian/others?! I do not know though).

1) I don't know if this is the best link for it - I ran into several links like it: some old, but many recent also.
A recent one said "there is no excuse for this [urgent rating] - its been a critical bug since almost 7 years ago now."

2) Many years ago 'everyone' had only 1 drive, basically large enough for 1 OS, and were content, when installing a new Ubuntu, to:
"wipe the drive and fresh install."

Nowadays people often have several drives, sometimes with various OS's, and want to install a new Ubuntu on just one of the partitions on one of the drives.

3) Many have, from years ago to currently, claimed that they set their custom install (I don't know the technical term) correctly - but Ubuntu:
one said "wiped the first partition it ran into" instead of the one set, another said "it bricked my machine" black screen, nothing, with power on button.

4) I did not save the links (wish I had now) because:
I had already unplugged my SSD/older Ubuntu before beginning with Part 1 just from intentional over-caution before I read any of the above.
And, I decided I would remember to do this:

Unplug every drive but the target drive before a fresh install

Even if that means temporarily removing my M.2* (when I probably someday find something more interesting for my SSD/old Ubuntu).
*(Put the computer on its 'back' so the tiny M.2 screw can't fall very far, like into your PSU. Use a magnetized screwdriver; I think the M.2 is impossible without one. Done correctly, removing the video card, if needed, should not take any 'force', etc.)
- - By MrKris (****) Date 2021-04-27 02:50
FAILURE THIS THREAD - Nvidia on Ubuntu 21.04

*********************************************
*                                           *
*     *****      *       ***     *          *
*     *         * *       *      *          *
*     ****     *****      *      *          *
*     *       *     *     *      *          *
*     *       *     *    ***     *****      *
*                                           *    
*********************************************

Lc0 stopped working in Cute Chess.
It looks like my Ubuntu 21.04 with sudo apt update & sudo apt upgrade and/or Software Updater (GUI) broke itself:


Software Updater [clicked icon:]
The software on this computer is up to date.

- - -

~/.local/bin$ Lc0-0.27 -b cuda-fp16
       _
|   _ | |
|_ |_ |_| v0.27.0 built Mar 30 2021
go nodes 10000
Found pb network file: ./net-68802
Creating backend [cuda-fp16]...
CUDA Runtime version: 32.65.1
WARNING: CUDA Runtime version mismatch, was compiled with version 11.2.0
Latest version of CUDA supported by the driver: 11.2.0
error CUDA error: unknown error (../../src/neural/cuda/network_cuda.cc:136)

- - -

$ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch

- - -

$ nvcc --version
Command 'nvcc' not found, but can be installed with:
sudo apt install nvidia-cuda-toolkit
 
 
(but that is one of the first things I installed)
 
VERY SORRY !!!
 

>>> I will stop posting any Linux instructions so I do not lead anyone else down a bad path. <<<
 
Parent - - By MrKris (****) Date 2021-04-27 03:57
So I ran:
$ sudo apt install nvidia-cuda-toolkit
again --having done that at first-- assuming it would make things worse.
Now it's all 'innocent looking' and even bumped up my driver from 460.67 to 460.73:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
me@me-RyRTX-314:~$ nvidia-smi
Mon Apr 26 20:18:53 2021      
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.73.01    Driver Version: 460.73.01    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+


I managed to get 2 games, @ G/150"+1", 40 moves and 52 moves, from Cute Chess:
Score of Lc0-0.27-net-68802 vs Stockfish_21042513: 0 - 0 - 2 [0.500]
...      Lc0-0.27-net-68802 playing White: 0 - 0 - 1  [0.500] 1
...      Lc0-0.27-net-68802 playing Black: 0 - 0 - 1  [0.500] 1
...      White vs Black: 0 - 0 - 2  [0.500] 2
Elo difference: 0.0 +/- 0.0, LOS: nan %, DrawRatio: 100.0 %
2 of 2 games finished.


The problems are:
what is wrong with it?
how long will it last this time?
will it really break itself next time?
Parent - - By sarona (**) Date 2021-04-27 18:33
All of your efforts (including the setbacks) are of great help.  You are clearly documenting everything - including the setbacks. Any reasonably intelligent person (except, perhaps, politicians) should be aware of the risks prior to proceeding! I can't speak for anyone else here, but I appreciate your detailed posts.
Parent - - By MrKris (****) Date 2021-04-28 19:59
***************************************************
*  Seems okay for now at least, fingers crossed.  *
***************************************************


Thanks again for your kind words!

I probably would have written my "FAIL" post differently had I guessed that the simple step, as in my reply to it, of
re-installing Ubuntu's nvidia-cuda-toolkit (though it is a mystery why it was needed again) would have worked so well - MAYBE, the future will tell.

> Any reasonably intelligent person (except, perhaps, politicians) should be aware of the risks prior to proceeding!


I did not check if I was over or not on the time to change my "FAIL" post and left it to be sure of "be aware of the risks".

> I can't speak for anyone else here, but I appreciate your detailed posts.


I have to admit to being in somewhat of a panic at my "FAIL" post. One thing is that my web search (though it all depends on how it is typed sometimes) had turned up some very 'bad news' advice before I tried the simple step above.
-Interesting about 'details': in that post the terminal had recommended what to do after '$ nvcc --version'. (Though for the other one there, '$ nvidia-smi', its "mismatch" message was too cryptic for me; however the time before, 4/21, towards the end of the very long post, it came up with "..can be installed with".)

>>> Okay. I will post in this thread if anything 'major' comes up, though obviously I am still a beginner at this.

Good luck if you still try it!
Parent - - By sarona (**) Date 2021-04-29 01:29
I downloaded the iso tonight and will begin the process tomorrow afternoon.
Parent - By MrKris (****) Date 2021-05-03 02:13
(Sorry, I've been involved in a different project.)
Did it go okay?
- By MrKris (****) Date 2021-06-04 05:47
NEW NVIDIA/UBUNTU problem, but luckily a quick solution.

Lc0 stopped working in GUI's, command line:
       _
|   _ | |
|_ |_ |_| v0.27.0 built Mar 30 2021
go movetime 10
Found pb network file: ./net-69272
Creating backend [cudnn-auto]...
Switching to [cudnn]...
CUDA Runtime version: 32.71.0
WARNING: CUDA Runtime version mismatch, was compiled with version 11.2.0
Cudnn version: 8.1.1
Latest version of CUDA supported by the driver: 11.2.0
error CUDA error: forward compatibility was attempted on non supported HW (../../src/neural/cuda/network_cudnn.cc:168)
quit


From the web:
$ sudo dmesg | grep NVRM
[    4.227618] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  460.73.01  Thu Apr  1 21:40:36 UTC 2021
[ 1580.909047] NVRM: API mismatch: the client has the version 460.80, but
               NVRM: this kernel module has the version 460.73.01.  Please
               NVRM: make sure that this kernel module and all NVIDIA driver
               NVRM: components have the same version.
....much more...
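
The two version numbers in that message are the whole story, so they can be pulled out directly. A minimal sketch, assuming the kernel log wording matches the NVRM lines quoted above; the function name nvrm_versions is invented for the illustration.

```shell
# Hypothetical helper: read kernel log text on stdin and print the client
# (user-space) and kernel-module driver versions from the most recent
# "API mismatch" message, if there is one.
nvrm_versions() {
    log=$(cat)
    client=$(printf '%s\n' "$log" |
        sed -n 's/.*client has the version \([0-9.]*[0-9]\).*/\1/p' | tail -n 1)
    module=$(printf '%s\n' "$log" |
        sed -n 's/.*kernel module has the version \([0-9.]*[0-9]\).*/\1/p' | tail -n 1)
    if [ -z "$client" ] || [ -z "$module" ]; then
        echo "no API mismatch found"
        return 0
    fi
    printf 'client=%s module=%s\n' "$client" "$module"
}

# Demo on the lines copied from the dmesg output above.
nvrm_versions <<'EOF'
[ 1580.909047] NVRM: API mismatch: the client has the version 460.80, but
               NVRM: this kernel module has the version 460.73.01.  Please
EOF
# prints client=460.80 module=460.73.01
```

On a live system the input would come from `sudo dmesg | nvrm_versions`. Differing numbers mean the loaded kernel module is older or newer than the user-space driver libraries, which is the state described here.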


Opened the main Ubuntu app "Additional Drivers" (also a tab in "Software & Updates") and it listed, among others, two "460" drivers (only the main number shows in this app) - usually there is just one each "xxx".
I ignored all the "xxx-server" drivers, because I have the regular Ubuntu.

FIX: I switched to the second "460" on the list, it said to restart to make the change.

*** Everything is fine again.
"Additional Drivers" now shows only one "460", which is 'on'/'active'/"in use" (also a "460-server" not in use).

Evidently when it upgraded from 460.73.01 to 460.80 it left the old one 'active' for some unknown reason;
then when I manually switched, as above, to the 2nd "460", that was the correct/new one, and now the old one no longer shows.
Just guessing, but it is fine now >>> NO fiddling with anything (that might be on the web for similar -but DIFFERENT- problems)

*** JUST the above click to the 2nd "460" that was on the list and restart ONLY, now it's FINE ***